Musings on PEP8

Musings on PEP8

kirby urner-4

I find myself thinking about PEP8 a lot, not that I have it memorized.

Now that Unicode reigns at the top-level, we've got an influx of
Chinese namespaces, Hindi namespaces, Cyrillic namespaces... 
a nice long list, and the PEP8 conventions regarding capitalization, 
while sensible in Latin-1, might not cover the new cases (I say 
"might not" with some sarcasm, or an innocent stare (playing 
it straight)).  
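To make the capitalization point concrete, here is a minimal sketch (the sample words and the helper name `case_report` are illustrative assumptions, not anything from PEP 8). It shows that Python's own case predicates have nothing to report for scripts without letter case, so a CapWords-versus-lowercase rule cannot tell a class name from a function name there.

```python
# Illustrative sample words only; none of these come from PEP 8 itself.
samples = {
    "Latin": "Mountain",
    "Cyrillic": "Гора",
    "Chinese": "山",
    "Hindi": "पर्वत",
}

def case_report(word):
    """Return (has_case, capitalized) for a candidate identifier."""
    # A script has letter case if upper- and lowercasing give different results.
    has_case = word.upper() != word.lower()
    # "Capitalized" in the PEP 8 sense: the first character is an uppercase letter.
    capitalized = word[:1].isupper()
    return has_case, capitalized

for script, word in samples.items():
    print(script, word, case_report(word))
```

Latin and Cyrillic words can follow the CapWords convention; the Chinese and Hindi samples report no case at all, which is where a style guide would need some other signal.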

I've seen arguments in diversity-minded circles that straying 
from Latin-1 top-level will obliterate the open source nature of 
open source, with many a Chinese engineer welcoming the 
advantages of a world around simple base cases, the old 
ASCII, a mother tongue of computer scientists (more so than 
EBCDIC even (sarcasm again)).


The inter-readability of Latin-1 means lots of headaches removed, 
like at least *something* positive came out of that Roman period 
(as a child of Rome, I get to sound chiding).

The flip side argument, which I find more persuasive, is that 
one of the biggest barriers to diversity is over-reliance on Latin-1,
and "just ASCII" in particular.  

I heard those cheers in Vilnius, when Guido talked about the brave
new Unicode world.  Google's blogger interface switches to 
Lithuanian automatically when that's your timezone, or however 
it's figured.  Lots of alphabetical markings you might not find in Latin-1.  


The whole point of Unicode was to open up source code writing, 
as an occupation, to more than just Euro-English speakers.  
The bridge has been built and Python has already crossed 
over it.


None of which is to say that knowledge of Latin-1 is dispensable.
My first chapters in Naming and Ordering per MathFuture threads
(also Cardinality vs Ordinality) start with "mappings" (the usual
approach to functions per Dolciani) with familiar glyphs (we're
learning them anyway in learning to read a native language), 
pairing with ASCII and Unicode code points.  

Yes, it's a long discussion (UTF8 vs UTF32 etc) but we're 
talking about time slices and repeated revisits in a spiraling 
trajectory (per Saxon treatments).  
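As a first slice of that long discussion, here is a small sketch of the glyph-to-number mapping and of how the same code point costs a different number of bytes in UTF-8 versus UTF-32. The choice of sample glyphs and the helper name `describe` are my own assumptions for illustration.

```python
# Sample glyphs written as escapes so the source itself stays ASCII:
# LATIN CAPITAL A, e with acute accent, a CJK ideograph, an Egyptian hieroglyph.
glyphs = ["A", "\u00e9", "\u5c71", "\U00013080"]

def describe(ch):
    """Return (code point, UTF-8 byte count, UTF-32 byte count) for one character."""
    # "utf-32-be" is used so no byte-order mark is added to the count.
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))

for ch in glyphs:
    cp, n8, n32 = describe(ch)
    print(f"{ch!r} U+{cp:04X} utf-8={n8} bytes utf-32={n32} bytes")
```

UTF-32 spends four bytes on every character, while UTF-8 spends one byte on the lower 128 and up to four elsewhere.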

So even if the bulk of your coding is in some Thai character set, 
you're quite familiar with the lower 128 in the Unicode codespace.  
Python itself has 33 keywords and a large number of builtins, 
such that "average Python" might look like Romaji-intensive 
Japanese, i.e. "heavy on the Latin-1 pepper, other spices" (yet 
lots of room for top-level class, function, variable names, libraries 
stuffed with them, all outside Latin-1).
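A short sketch of what that mix looks like in practice, assuming Python 3 (where non-ASCII identifiers are allowed). The class and attribute names are hypothetical, chosen only to illustrate the "Latin pepper" of keywords around names in another script.

```python
import keyword
import math

# Reserved words stay ASCII no matter which script the identifiers use.
# (The exact keyword count depends on the Python version.)
keywords_are_ascii = all(kw.isascii() for kw in keyword.kwlist)

class Круг:                          # "circle" (hypothetical example name)
    def __init__(self, радиус):      # "radius"
        self.радиус = радиус

    def площадь(self):               # "area"
        return math.pi * self.радиус ** 2

print(keywords_are_ascii, len(keyword.kwlist))
print(Круг(2).площадь())
```

Everything the interpreter reserves (`class`, `def`, `return`, `self` by convention, `__init__`) is still lower-128; only the names the author chose are not.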

These concerns have been a long term focus, and continue
to be, as Python students I encounter may be there for work
and that may mean using non-Latin-1 Python namespaces 
much of the time.

Kirby


_______________________________________________
Edu-sig mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/edu-sig

Re: Musings on PEP8

Vernon D. Cole
> From: kirby urner <[hidden email]>
> To: [hidden email]
>  
> I find myself thinking about PEP8 a lot, not that I have it memorized.
>
> Now that Unicode reigns at the top-level, we've got an influx of
> Chinese namespaces, Hindi namespaces, Cyrillic namespaces...
> a nice long list, and the PEP8 conventions regarding capitalization,
> while sensible in Latin-1, might not cover the new cases [...]

Remember that PEP 8 is a guideline, not a requirement, and that it is documented as:
"This document gives coding conventions for the Python code comprising the standard library in the main Python distribution. "
Any programmer may choose any style she prefers for her own use.  If she is writing modules for the standard library -- and expects to have them accepted by the community -- then she will follow PEP 8.

> The inter-readability of Latin-1 means lots of headaches removed,
> like at least *something* positive came out of that Roman period [...]

Yes, having taken a contract to maintain code where the author wrote his comments in Romanized Ukrainian, I have a great respect for non-English-native authors who take the time to learn English well.  It is a terrible language to have to learn, I am told, but is the only one _all_ software engineers can be expected to know.

Also PEP 8 states that: "Latin-1 (or
   UTF-8) should only be used when a comment or docstring needs to
   mention an author name that requires Latin-1; otherwise, using
   \x, \u or \U escapes is the preferred way to include non-ASCII
   data in string literals."
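For readers who have not used those escapes, a minimal sketch follows. The name "Jürgen" is just an illustrative example, not anyone from this thread.

```python
# The same text written with a literal character and with ASCII-only escapes.
literal = "Jürgen"            # requires a non-ASCII character in the source file
escaped = "J\u00fcrgen"       # \u takes four hex digits; the source stays pure ASCII
wide = "\U0001D360"           # \U takes eight hex digits, for code points above U+FFFF
byte_form = b"J\xfcrgen"      # \x in a bytes literal gives one raw byte (Latin-1 here)

print(literal == escaped)
print(escaped == byte_form.decode("latin-1"))
print(len(wide), hex(ord(wide)))
```

The escaped forms survive any editor or mail gateway that mangles non-ASCII bytes, which is presumably the practical motivation behind the guideline quoted above.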

Even then, I would hope that an author would include an Anglicized version of his name, so that I can recognize it when I see it again. The only alphabets I personally can read are Latin, Cyrillic, Hebrew, and Greek.  If your name is in Cherokee, then please put (John Standing Bear) in ASCII alongside it.

There is a very good reason for this:  standard library code must be readable for people all over the world.  That's why a Dutch software engineer wrote a language in which all the keywords and commentary are in English.  
>
> The flip side argument, which I find more persuasive, is that
> one of the biggest barriers to diversity is over-reliance on Latin-1,
> and "just ASCII" in particular.
>
> The whole point of Unicode was to open up source code writing,
> as an occupation, to more than just Euro-English speakers.

I disagree.  The whole point of Unicode is to open up application writing, so that _users_ can see computer output in their own languages.  A person who wishes to pursue code writing as an occupation must understand and use English -- or be relegated to producing work only for his own culture.  In the modern "flat" world, English is the language of commerce and computer programming.  Not being able to write understandable English is a severe handicap. My programs are written in Python, documented in English, and usable by persons of another language.  For example, see CaesarCalc.py from https://launchpad.net/romanclass , which assumes the user to be able to understand pidgin Latin. Even then, I give the result of (XVI - XVI) as "Nulla" because I expect that most users will not recognize "Nvlla" as meaning "nothing."

Here is sample output.  Notice that, when it blows up, the traceback is in Python with English explanations:
<console dump>
procer numerus hic:III - II
I
procer numerus hic:3 - 2
I
procer numerus hic:3 - 3
Nulla
procer numerus hic:2 - 3
Traceback (most recent call last):
  File "CaesarCalc.py", line 40, in <module>
    print (cvt(subtrahends[0]) - cvt(subtrahends[1]))
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 99, in __sub__
    return Roman(self.__int__() - other)
  File "/home/vernon/romanclass-1.0.1/romanclass.py", line 85, in __new__
    raise OutOfRangeError, 'Cannot store "%s" as Roman' % repr(N)
romanclass.OutOfRangeError: Cannot store "-1" as Roman
</console dump>
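For anyone reading along without the package installed, the behavior in that dump can be approximated with the sketch below. It is not the romanclass source: only the names `Roman` and `OutOfRangeError`, the "Nulla" result, and the error message are taken from the output above; everything else is an assumption, and it is written for Python 3 rather than the Python 2 syntax visible in the traceback.

```python
class OutOfRangeError(ValueError):
    """Raised when a value cannot be written as a Roman numeral."""

class Roman(int):
    """An int that prints as a Roman numeral (hypothetical sketch)."""
    _PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
              (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
              (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

    def __new__(cls, n):
        # Negative numbers have no Roman form, so refuse them up front.
        if n < 0:
            raise OutOfRangeError('Cannot store "%r" as Roman' % n)
        return super().__new__(cls, n)

    def __sub__(self, other):
        # Re-wrap the plain int result so range checking applies again.
        return Roman(int(self) - int(other))

    def __str__(self):
        if self == 0:
            return "Nulla"
        n, out = int(self), []
        # Greedy conversion: take the largest symbol value that still fits.
        for value, glyph in self._PAIRS:
            count, n = divmod(n, value)
            out.append(glyph * count)
        return "".join(out)

print(Roman(3) - Roman(2))
print(Roman(3) - Roman(3))
```

Subtracting past zero raises `OutOfRangeError`, and the message, like the traceback, stays in English whatever language the prompts use.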

IMHO, on the whole, PEP 8 is a pretty good document.
--
Vernon



Re: Musings on PEP8

kirby urner-4
Hi Vernon,

... not to be confused with Vern "the Watcher" Ceder.

On Mon, Jul 18, 2011 at 8:47 AM, Vernon Cole <[hidden email]> wrote:
 
> There is a very good reason for this:  standard library code must be readable for people all over the world.  That's why a Dutch software engineer wrote a language in which all the keywords and commentary are in English.  


Yes, the Standard Library is to be Anglicized for some time to come, 
maybe always, per Guido's talks.

Of course there's nothing to stop someone from writing a translator 
for the Standard Library, such that the source originals (as modified) 
might be rendered in myriad other character sets.  

Top-level names tend to be amenable to such treatment.  

This may be done down to the C family level, though I'm not suggesting 
that it should be (nor are all Python implementations C family I hasten 
to add, (a Jython is "C family" if the Java VM is)).

The same is not true for 3rd party modules which, as you say, 
may be written in any style.

Learning the Latin (English) alphabet, building a vocabulary, remains 
a good idea obviously, along with ASCII in the context of Unicode.  

I expect those focused in computer science will continue giving 
themselves the benefit of this learning.

I received Romanized Indonesian source code for quite a while, until 
the student moved to Japan and apparently stopped doing Python.

I'm impressed with all the alphabets you know.

3rd party modules written in Cyrillic with the peppering of 
Roman we know must be there, thanks to Standard Library
(untranslated) and the 33 keywords (so far), could be used 
in computer science to help English speakers learn a 
Cyrillic language.

 
>> The flip side argument, which I find more persuasive, is that
>> one of the biggest barriers to diversity is over-reliance on Latin-1,
>> and "just ASCII" in particular.
>>
>> The whole point of Unicode was to open up source code writing,
>> as an occupation, to more than just Euro-English speakers.
>
> I disagree.  The whole point of Unicode is to open up application writing, so that _users_ can see computer output in their own languages.  A person who wishes to pursue code writing as an occupation must understand and use English -- or be relegated to producing work only for his own culture.  In the modern "flat" world, English is the language of commerce and computer programming.  Not being able to write understandable English is a severe handicap. My programs are written in Python, documented in English, and usable by persons of another language.  For example, see CaesarCalc.py from https://launchpad.net/romanclass , which assumes the user to be able to understand pidgin Latin. Even then, I give the result of (XVI - XVI) as "Nulla" because I expect that most users will not recognize "Nvlla" as meaning "nothing."


Certainly the GUI needs to be intelligible yes.

Let's just say there's a school of thought that has 
no problem with a math, logic or grammar teacher 
using only Chinese characters for top level names 
in various exercises using Python or other 
Unicode aware computer language.  And no 
problem with another teacher using only Hebrew
characters for top level names and so on.

This school of thought hangs out on the Python
Diversity list and self-organizes there.  If you go
back in the archives, you'll find myself and a 
guy named Carl doing stuff in the Python wiki
to expand the language base, including at the 
source code level.  With Pycon / Tehran in the
planning, we want to be in a better position to 
address issues relating GeoDjango to Farsi, say.

These exercises (mentioned above) may have 
nothing to do with writing commercial applications.  
These may not be programmers in training 
(though some may be in commercial media, 
where "programming" also has meaning (e.g. 
in radio / TV)).  Instead of using a calculator 
or abacus to learn numeracy skills, people 
have laptops and internet access.

Having readable source code in languages 
that aren't in a Roman alphabet is already 
a spreading phenomenon, with many writers 
happily giving up that so-called "world readability" 
in favor of remaining intelligible to the girl or boy 
next door.  

The syntax of URIs and domain names has 
already taken this turn.  You will see http://[Arabic letters]/ 
quite frequently these days, thanks to 
Unicode support in domain names and URIs (which Python now needs 
to deal with, and does, as an http-aware language).
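A sketch of how such addresses are commonly carried as plain ASCII on the wire, using only the standard library. The host name and path here are made-up examples; the `idna` codec and `urllib.parse.quote` are the standard-library pieces I am assuming a reader would reach for first.

```python
from urllib.parse import quote, unquote

host = "b\u00fccher.example"                 # hypothetical non-ASCII host name
ascii_host = host.encode("idna")             # Punycode ("xn--") form of each label
path = "/\u0645\u0631\u062d\u0628\u0627"     # an Arabic word used as a path segment
ascii_path = quote(path)                     # UTF-8 bytes, then percent-encoding

print(ascii_host)
print(ascii_path)
print(unquote(ascii_path) == path)           # the mapping is reversible
```

The user sees native letters in the address bar while the underlying protocols keep exchanging lower-128 characters.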

CSS for Arabic is the kind of style concern for 
which we may have insufficient literature to date.
We may have people joining Diversity who want to
develop that literature (recruiting happening).


> Here is sample output.  Notice that, when it blows up the traceback is in Python with English explanations:
> <console dump>
> procer numerus hic:III - II
> I
> procer numerus hic:3 - 2
> I
> procer numerus hic:3 - 3
> Nulla
> procer numerus hic:2 - 3
> Traceback (most recent call last):
>   File "CaesarCalc.py", line 40, in <module>
>     print (cvt(subtrahends[0]) - cvt(subtrahends[1]))
>   File "/home/vernon/romanclass-1.0.1/romanclass.py", line 99, in __sub__
>     return Roman(self.__int__() - other)
>   File "/home/vernon/romanclass-1.0.1/romanclass.py", line 85, in __new__
>     raise OutOfRangeError, 'Cannot store "%s" as Roman' % repr(N)
> romanclass.OutOfRangeError: Cannot store "-1" as Roman
> </console dump>
>
> IMHO, on the whole, PEP 8 is a pretty good document.
> --
> Vernon

I'm not denigrating PEP8 in any way, even though 
I used some light sarcasm in my post.  That was 
not directed against PEP8, so much as against 
the idea that the "rule book" is somehow complete, 
just because we have it down that functions should 
generally not start with a capital letter, and 
l (lowercase L) is a terrible name for all purposes 
because it's so indistinguishable from uppercase 
I and the number 1 in many fonts.

I think as people start getting a lot more experience 
writing Python with different namespaces, with 
non-Roman top-level names etc., that the rule 
book is inevitably going to expand and that a 
Book of Styles could conceivably become enormous. 

But then think of English:  we acknowledge many 
styles as being appropriate and don't have just 
the one "book" where style is concerned (we have 
so many) -- not like the dictionary, with a goal of 
including every word in a finite and deliberately 
exclusive set of standard words.

I have some examples of Python source in my 
blogs, using kanji as top-level names (might be 
a Japanese program, as one of the kanji is for 
Mt. Fuji as I recall).  

Then there's some tracking down Stallman on 
a visit to Sri Lanka (a while back) and chatter 
about Python in Tamil and Sinhalese.  And yes, 
I am aware English is spoken in these parts as well,
as evidenced by Arthur C. Clarke's having lived
there for so long.  One of our CSN chiefs has a
track record there too, another English speaker.


Kirby



Re: Musings on PEP8

Vern Ceder-3
Since Kirby invoked me by name ;), I'll jump in with a quick top post, a) because I'm lazy and in a hurry, and b) because my comments are only generally related to the specifics of the previous posts. Apologies.

First of all, in general I would respond to Kirby's musings by invoking my own personal principle that PEP 20 (specifically, "Practicality beats purity") trumps PEP 8. I would think that would be true when it comes to naming conventions in other languages. If it's a library that is useful to, say, Klingon speakers only, it would make sense to name the library and its components in Klingon. OTOH, if they wanted to share their work and wanted it to be useful to the non-Klingon speaking Federation, Klingon might not be a practical or effective choice. (Of course, being Klingon, they may not care... ;) )

Personally, I run into this issue on a daily basis these days. As the current maintainer of an entire web platform developed by our Japanese sister company, I face emails and documentation (including code comments) in Japanese, giving me ample practice with both Google Translate and deciphering kana. When I chatted with the Japanese team (via Google Translate, gestures, and scrawling code on the whiteboard) about the new features of Python 3, support for Unicode in Python code got a cheer, and I certainly understand that. 

However, for sharing code, I'd have to agree with my namesake - diverging from the English standard is problematic. I'm finding what I think of as "technical Japanese" to be not that hard to understand, but that's exactly because so much of the vocabulary is borrowed from English - data, account, server, etc, etc, etc.

Finally, I have to note that both of us Vernons are conversant in Latin, which is the sort of coincidence sportscasters are prone to mis-label "ironic"... ;)

Cheers,
Vern

--
Vern Ceder
[hidden email], [hidden email]
The Quick Python Book, 2nd Ed - http://bit.ly/bRsWDW




Re: Musings on PEP8

kirby urner-4
On Mon, Jul 18, 2011 at 1:00 PM, Vern Ceder <[hidden email]> wrote:
> Since Kirby invoked me by name ;), I'll jump in with a quick top post, a) because I'm lazy and in a hurry, and b) because my comments are only generally related to the specifics of the previous posts. Apologies.
>
> First of all, in general I would respond to Kirby's musings by invoking my own personal principle that PEP 20 (specifically, "Practicality beats purity") trumps PEP 8. I would think that would be true when it comes to naming conventions in other languages. If it's a library that is useful to, say, Klingon speakers only, it would make sense to name the library and its components in Klingon. OTOH, if they wanted to share their work and wanted it to be useful to the non-Klingon speaking Federation, Klingon might not be a practical or effective choice. (Of course, being Klingon, they may not care... ;) )

I'm glad we're having this thread as it relates to my work concerns also, where many of my students, with whom I connect asynchronously, disclose to me their difficulties with English.

Struggling with translators and so on is a fact of life, but the time will come when hospital room LCDs illuminate with familiar glyphs, including simple things like the light switch, bed controls.  

Pictures and videos from home will fill the hospital room's picture frames, with pointers embedded right in the medical record (managed like a profile, by the patients themselves).

You'll be in a "language bubble" where your caretakers have spared no effort to have you not focused on phrase book deciphering.  This is your body we're talking about.  

It's ridiculous that you should have to learn an alien tongue to follow the action.

The menu in the dining room will be in your language.  

At least some of your fellow passengers will share your language also.

This might be a geek cruise, for people who code in Perl, mostly using Greek.

I'm not saying every hospital room will be this advanced, and perhaps not the ones Anglophones manage, as they tend to take that trademark "others should learn English" approach that so characterized the 113 years of Anglo-British rule.

Russian maybe, hospital cruise ships, with some US health plans providing access, but most too far behind the times.

Hotel management science is pioneering in these same directions.  

Universities may bring up the rear, I don't know.
 

> Personally, I run into this issue on a daily basis these days. As the current maintainer of an entire web platform developed by our Japanese sister company, I face emails and documentation (including code comments) in Japanese, giving me ample practice with both Google Translate and deciphering kana. When I chatted with the Japanese team (via Google Translate, gestures, and scrawling code on the whiteboard) about the new features of Python 3, support for Unicode in Python code got a cheer, and I certainly understand that. 
>
> However, for sharing code, I'd have to agree with my namesake - diverging from the English standard is problematic. I'm finding what I think of as "technical Japanese" to be not that hard to understand, but that's exactly because so much of the vocabulary is borrowed from English - data, account, server, etc, etc, etc.


It's problematic, but that's not going to stop it from happening in various communities. 

A population of a few hundred thousand might easily support a bevy of open source solutions that are encoded in the Klingon of that realm.

Think of a class definition with one Chinese ideograph for a name, and most methods at most two characters.  

The dot notation is still there, as are the calling parens.

The keywords, __init__, __repr__ -- all pretty familiar.  Use a translator then?

Of course the word "self" might actually be replaced, with the symbol for "used to be known as Prince" maybe (joke):
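A sketch of such a class, with hypothetical names throughout. It relies on two things I am sure of: Python 3 accepts CJK characters in identifiers, and `self` is only a convention, so the first parameter of a method can be spelled any valid way.

```python
class 山:                          # "mountain" (illustrative one-character class name)
    def __init__(我, 高):           # 我 ("I/me") stands in for self; 高 means "height"
        我.高 = 高

    def __repr__(我):
        return f"山({我.高})"

    def 更高(我, 他):               # "taller": compare with another mountain
        return 我.高 > 他.高

富士 = 山(3776)                     # Mt. Fuji, height in metres
print(富士, 富士.更高(山(1000)))
```

The dots, parentheses, colons, `class`, `def`, `return`, and the dunder names are all still there, which is what makes the code guessable even for a reader who cannot read the names.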




There's this dream of English always being some "lingua franca" (joke) of the Open Source world, but per recent PSF member threads (me a threader), not everyone dreams the same dream.

At one of the recent OSCONs, maybe five years ago, we had a panel on Open Source in Africa.  

The message from that corner was the open source tools were being remastered, with an eye to reinventing many wheels from scratch.



> Finally, I have to note that both of us Vernons are conversant in Latin, which is the sort of coincidence sportscasters are prone to mis-label "ironic"... ;)

I'm not bad with Latin cognates, having grown up watching Italian TV and movies (lived in Rome), studied French and Spanish.

I'm also aware of the importance of English as a supra-national language, in the Philippine Islands for example (my high school home), where so many small user groups use it to get along at meetups.

English itself is always morphing.  

Some have argued for the existence of a language called American (pronounce amer-IKAN, like puerto-RICAN) which goes even further towards accommodating its non-Anglo users.

Gene Fowler (poet) called it Amerish (same idea).

Kirby



Re: Musings on PEP8

mokurai-2
In reply to this post by kirby urner-4
On Sun, July 17, 2011 1:46 pm, kirby urner wrote:
> I find myself thinking about PEP8 a lot, not that I have it memorized.
>
> Now that Unicode reigns at the top-level, we've got an influx of
> Chinese namespaces, Hindi namespaces, Cyrillic namespaces...
> a nice long list, and the PEP8 conventions regarding capitalization,
> while sensible in Latin-1, might not cover the new cases (I say
> "might not" with some sarcasm, or an innocent stare (playing
> it straight)).

I got involved in the original RFC for Unicode URLs and more general URIs
when the discussion was mainly, "We have to! So many people desperately
need it!" "We can't, it'll break the Web!" We got it worked out. I am
pleased to see Unicode country name TLDs proliferating, too.

I was one of those who needed it. I was doing silly things like editing an
APL magazine and converting a Chinese-Korean-Japanese-European Go glossary
from ASCII to Unicode.

> I've seen arguments in diversity-minded circles that straying
> from Latin-1 top-level will obliterate the open source nature of
> open source, with many a Chinese engineer welcoming the
> advantages of a world around simple base cases, the old
> ASCII, a mother tongue of computer scientists (more so than
> EBCDIC even (sarcasm again)).
>
> http://en.wikipedia.org/wiki/Extended_Binary_Coded_Decimal_Interchange_Code

I remember a long time ago reading about Russian COBOL, combining Latin
and Cyrillic in EBCDIC. Fun times.

I have had the personal misfortune of attempting to explain to a Japanese
programmer why it was the Japanese character set definition that replaced
backslash with the Yen sign that was broken, not Unicode. They couldn't
just do a search and replace in Windows code where that code in a text
string usually meant Yen, but in code meant the Windows directory
separator in path expressions. So they blamed Western cultural
imperialists and didn't listen when we explained how many Japanese experts
were involved in the Japanese character set mappings.
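The root of that argument can be seen from the Unicode side with a few lines of Python. This is only a sketch of the distinction: Unicode gives the two characters separate code points, whereas (as I understand the history) the Japanese 8-bit standard JIS X 0201 put the Yen sign at position 0x5C, where ASCII has the backslash. The sample path is made up.

```python
import unicodedata

backslash, yen = "\\", "\u00a5"

# In Unicode the two characters are unrelated code points.
for ch in (backslash, yen):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A Windows path stores its separators as byte 0x5C in ASCII-compatible encodings,
# so a font or character set that draws 0x5C as a Yen sign changes the look, not the bytes.
path = "C:\\Windows\\System32"
separator_count = path.encode("ascii").count(b"\x5c")
print(separator_count)
```

Because the byte is the same in both roles, no search and replace can tell a currency amount from a directory separator without knowing the context.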

> The inter-readability of Latin-1 means lots of headaches removed,
> like at least *something* positive came out of that Roman period
> (as a child of Rome, I get to sound chiding).
>
> The flip side argument, which I find more persuasive, is that
> one of the biggest barriers to diversity is over-reliance on Latin-1,
> and "just ASCII" in particular.

The char = 8-bit or even 7-bit byte delusion, with hundreds of
incompatible 8-bit character sets and fonts. I have had a few things to
say about that.
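One quick way to see the problem, as a sketch using codecs that ship with Python: the same single byte means a different letter in each 8-bit character set, and is not valid text at all on its own in UTF-8.

```python
raw = bytes([0xE4])   # one byte, no indication of which character set it belongs to

# Decode the same byte under three different 8-bit character sets.
decodings = {codec: raw.decode(codec) for codec in ("latin-1", "cp1251", "iso8859_7")}
for codec, text in decodings.items():
    print(codec, text)

# In UTF-8 this byte only starts a multi-byte sequence, so alone it is an error.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)
```

Without out-of-band knowledge of the encoding, a reader of the bytes cannot know whether the author meant a Latin, Cyrillic, or Greek letter.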

> I heard those cheers in Vilnius, when Guido talked about the brave
> new Unicode world.  Google's blogger interface switches to
> Lithuanian automatically when that's your timezone, or however
> it's figured.  Lots of alphabetical markings you might not find in
> Latin-1.
>
> http://controlroom.blogspot.com/2007/07/blogger-control-panel-in-lithuanian.html
>
> The whole point of Unicode was to open up source code writing,
> as an occupation, to more than just Euro-English speakers.

Much more than source code. Are you becoming the man who pounds nails all
day, to whom everything looks like a hammer? ^_^

> The bridge has been built and Python has already crossed
> over it.

Which means that various parts of Sugar, including Turtle Art, have also
crossed over.

> http://controlroom.blogspot.com/2007/11/unicode.html

I was at the Unicode Conference where Jim Brown of IBM told that part of
the world that APL2 was fully Unicode-capable for data and identifiers.
There was also a proposal to put Unicode math into programming languages.

> None of which is to say that knowledge of Latin-1 is dispensable.

One of the points I had to make in that IETF discussion was that every
Japanese schoolchild was already learning to keyboard in romaji in
addition to kana, with kana-to-kanji conversion.

http://lists.w3.org/Archives/Public/uri/1997Apr/0109.html

> My first chapters in Naming and Ordering per MathFuture threads
> (also Cardinality vs Ordinality) starts with "mappings" (the usual
> approach to functions per Dolciani) with familiar glyphs (we're
> learning them anyway in learning to read a native language),
> pairing with ASCII and Unicode bytecodes.

I have just been working on a tutorial, not yet completed, on ancient
visual numerals now available in Unicode, including Egyptian Hieroglyphic
heqat measure symbols, which I am putting on Turtle Art variable name
tiles. (For our Egyptophiles, Egyptian fraction analysis was done with the
heqat as the unit.) Next I plan to teach the turtle how to write heqat,
cuneiform, and Counting Rod numerals, and then base-20 Mayan, which
unfortunately is not in Unicode yet. There are fonts using the Private Use
Area, however. I am also considering making a bunch of pie-slice fraction
numerals for my planned tutorial on fractions using Turtle Art.

http://wiki.sugarlabs.org/go/Activities/TurtleArt/Tutorials/Numerals
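
As a rough sketch of what "available in Unicode" buys you (this is plain
Python 3 with the standard unicodedata module, not Turtle Art code), the
Counting Rod unit digits carry their numeric values in the Unicode
character database:

```python
import unicodedata

# Counting Rod unit digits one through nine occupy U+1D360..U+1D368.
rods = [chr(cp) for cp in range(0x1D360, 0x1D369)]

# unicodedata.numeric() returns the value recorded for each character.
values = [unicodedata.numeric(ch) for ch in rods]

for ch, value in zip(rods, values):
    print(f"U+{ord(ch):05X} {unicodedata.name(ch)} = {value}")
```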

I have several of these tutorials in a reasonably finished state, and many
more written but not illustrated, with more outlined. Cardinals and
mappings are in the Counting tutorial, but I have not tackled ordinals
yet.

> Yes, it's a long discussion (UTF8 vs UTF32 etc) but we're
> talking about time slices and repeated revisits in a spiraling
> trajectory (per Saxon treatments).
>
> So even if the bulk of your coding is in some Thai characterset,
> you're quite familiar with the lower 128 in the Unicode codespace.
> Python itself has 33 keywords and a large number of builtins,
> such that "average Python" might look like Romaji-intensive
> Japanese, i.e. "heavy on the Latin-1 pepper, other spices" (yet
> lots of room for top-level class, function, variable names, libraries
> stuffed with them, all outside Latin-1).
>
> These concerns have been a long term focus, and continue
> to be, as Python students I encounter may be there for work
> and that may mean using non-Latin-1 Python namespaces
> much of the time.
>
> Kirby
> _______________________________________________
> Edu-sig mailing list
> [hidden email]
> http://mail.python.org/mailman/listinfo/edu-sig
>


--
Edward Mokurai
(默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر
ج) Cherlin
Silent Thunder is my name, and Children are my nation.
The Cosmos is my dwelling place, the Truth my destination.
http://wiki.sugarlabs.org/go/Replacing_Textbooks



Re: Musings on PEP8

Kirby Urner-6

> http://wiki.sugarlabs.org/go/Activities/TurtleArt/Tutorials/Numerals
>
> I have several of these tutorials in a reasonably finished state, and many
> more written but not illustrated, with more outlined. Cardinals and
> mappings are in the Counting tutorial, but I have not tackled ordinals
> yet.

 
Fascinating stuff sir, lots I didn't know.  

Python gives the backslash ( \ ) syntactic meaning as well: it is the line continuation character, and the escape character inside string literals.

So even in Python in Japanese, you would need \ to stay the same.

One doesn't find a lot of examples, at least not easily, of non-Latin-1 Python programs (source code).

I'm on the lookout for exhibits, collecting URIs.
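
For a sense of what such an exhibit might look like, here is a minimal sketch in Python 3. The Cyrillic and Chinese identifiers are hypothetical, chosen only for illustration. The keywords stay ASCII, the names do not, and the backslash still does its line-continuation job.

```python
import keyword

# Every reserved word in Python 3 is plain ASCII...
ascii_only = all(ord(c) < 128 for kw in keyword.kwlist for c in kw)

# ...but names chosen by the programmer need not be.
def площадь(ширина, высота):
    """Area of a rectangle, written with Cyrillic identifiers."""
    return ширина * высота

# The backslash continues this statement onto the next line.
总数 = площадь(3, 4) + \
    площадь(1, 2)

print(ascii_only, 总数)
```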

Kirby

