|
Hi,

On Sat, 25 Feb 2012 20:23:39 +0000
Armin Ronacher <[hidden email]> wrote:
>
> I just uploaded PEP 414 which proposes an optional 'u' prefix for string
> literals for Python 3.
>
> You can read the PEP online: http://www.python.org/dev/peps/pep-0414/

I don't understand this sentence:

> The automatic upgrading of binary strings to unicode strings that
> would be enabled by this proposal would make it much easier to port
> such libraries over.

What "automatic upgrading" is that talking about?

> For instance, the urllib module in Python 2 is using byte strings,
> and the one in Python 3 is using unicode strings.

Are you talking about urllib.parse perhaps?

> By leveraging a native string, users can avoid having to adjust for
> that.

What does "leveraging a native string" mean here?

> The following is an incomplete list of APIs and general concepts that
> use native strings and need implicit upgrading to unicode in Python
> 3, and which would directly benefit from this support

I'm confused. This PEP talks about unicode literals, not native string
literals, so why would these APIs "directly benefit from this support"?

Thanks

Antoine.

_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%2B1324100855712-1801473%40n6.nabble.com
|
In reply to this post by Guido van Rossum
On Sat, 25 Feb 2012 19:13:26 -0800
Guido van Rossum <[hidden email]> wrote:
> If this can encourage more projects to support Python 3 (even if it's
> only 3.3 and later) and hence improve adoption of Python 3, I'm all
> for it.
>
> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C.

Even without implementing it in C, caching the results makes it much
less prohibitive in tight loops:

    import sys

    if sys.version_info >= (3, 0):
        # Python 3: unprefixed literals are already unicode.
        def u(value):
            return value
    else:
        # Python 2: decode each distinct literal once, then reuse it.
        def u(value, _lit_cache={}):
            if value in _lit_cache:
                return _lit_cache[value]
            s = _lit_cache[value] = unicode(value, 'unicode-escape')
            return s

u'\N{SNOWMAN}barbaz'
-> 100000000 loops, best of 3: 0.00928 usec per loop
u('\N{SNOWMAN}barbaz')
-> 10000000 loops, best of 3: 0.15 usec per loop

u'foobarbaz_%d' % x
-> 1000000 loops, best of 3: 0.424 usec per loop
u('foobarbaz_%d') % x
-> 1000000 loops, best of 3: 0.598 usec per loop

Regards

Antoine.
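For readers who want to repeat this kind of measurement: the exact commands behind the numbers above are not shown in the message, so the following is only a rough sketch using the standard `timeit` module. It assumes Python 3.5 or later (for the `globals` argument), uses only the Python 3 branch of the wrapper, and the helper name `bench` is made up for illustration.

```python
import timeit


def u(value):
    # Python 3 branch of the wrapper: the literal is already unicode.
    return value


def bench(stmt, number=100000):
    """Return the best-of-3 time per loop, in microseconds (hypothetical helper)."""
    timings = timeit.repeat(stmt, globals={'u': u, 'x': 42},
                            repeat=3, number=number)
    return min(timings) / number * 1e6


if __name__ == '__main__':
    for stmt in ("u'foobarbaz_%d' % x", "u('foobarbaz_%d') % x"):
        print('%-28s %.4f usec per loop' % (stmt, bench(stmt)))
```

The absolute figures will differ from machine to machine; what matters for the discussion is the gap between the literal and the function call.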
|
In reply to this post by Antoine Pitrou
On 26 Feb 2012, at 17:45, Antoine Pitrou wrote:
>
> Hi,
>
> On Sat, 25 Feb 2012 20:23:39 +0000
> Armin Ronacher <[hidden email]> wrote:
>>
>> I just uploaded PEP 414 which proposes an optional 'u' prefix for string
>> literals for Python 3.
>>
>> You can read the PEP online: http://www.python.org/dev/peps/pep-0414/
>
> I don't understand this sentence:
>
>> The automatic upgrading of binary strings to unicode strings that
>> would be enabled by this proposal would make it much easier to port
>> such libraries over.
>
> What "automatic upgrading" is that talking about?

If you use native string syntax (no prefix) then moving from Python 2 to
Python 3 automatically "upgrades" (I agree, an odd choice of word) byte
string literals to unicode string literals.

>
>> For instance, the urllib module in Python 2 is using byte strings,
>> and the one in Python 3 is using unicode strings.
>
> Are you talking about urllib.parse perhaps?
>
>> By leveraging a native string, users can avoid having to adjust for
>> that.
>
> What does "leveraging a native string" mean here?

By using native string syntax (without the unicode literals future
import), APIs that take a binary string in Python 2 and a unicode string
in Python 3 "just work" with the same syntax. You are "leveraging" native
syntax to use the same APIs with different types across the different
versions of Python.

>
>> The following is an incomplete list of APIs and general concepts that
>> use native strings and need implicit upgrading to unicode in Python
>> 3, and which would directly benefit from this support
>
> I'm confused. This PEP talks about unicode literals, not native string
> literals, so why would these APIs "directly benefit from this support"?

Because sometimes in your code you want to specify "native strings" and
sometimes you want to specify Unicode strings. There is no single *syntax*
that is compatible with both Python 2 and Python 3 that permits this.

(If you use "u" for Unicode in Python 2 and no prefix for native strings
then your code is incompatible with Python 3; if you use the future import
so that your strings are unicode in both Python 2 and Python 3 then you
lose the syntax for native strings.)

Michael

>
> Thanks
>
> Antoine.

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
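The three kinds of literal discussed above can be sketched as follows. This is an illustrative example only, assuming Python 3.3 or later (where the 'u' prefix proposed by PEP 414 is accepted) or Python 2.6/2.7; the variable names are made up and do not come from the thread.

```python
import sys

# Explicit text literal: unicode on Python 2, str on Python 3.3+.
text = u'caf\xe9'

# Explicit bytes literal: str on Python 2.6+, bytes on Python 3.
data = b'GET / HTTP/1.1'

# "Native" string: no prefix and no unicode_literals future import.
# It is a byte string on Python 2 and a unicode string on Python 3,
# which is the "automatic upgrading" the PEP text refers to.
native = 'Content-Type'

# The name `unicode` is only evaluated on Python 2.
text_type = str if sys.version_info >= (3, 0) else unicode

print(type(text).__name__, type(data).__name__, type(native).__name__)
```

With `from __future__ import unicode_literals` at the top of the module, the unprefixed form would become text on Python 2 as well, so there would be no spelling left for a native string.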
|
In reply to this post by Antoine Pitrou
Hi,
On 2/26/12 5:45 PM, Antoine Pitrou wrote:
>> The automatic upgrading of binary strings to unicode strings that
>> would be enabled by this proposal would make it much easier to port
>> such libraries over.
>
> What "automatic upgrading" is that talking about?

The word "upgrade" is probably something that should be changed. It
refers to the fact that 'foo' is a bytestring in 2.x and the same syntax
means a unicode string in Python 3. This is exactly what is necessary for
interfaces that were promoted to unicode interfaces in Python 3 (for
instance Python identifiers, URLs etc.)

> Are you talking about urllib.parse perhaps?

Not only the parsing module. Headers on the urllib.request module are
unicode as well. What the PEP is referring to is the urllib/urlparse and
cgi module which was largely consolidated to the urllib package in Python 3.

> What does "leveraging a native string" mean here?

It means by using a native string to achieve the automatic upgrading
which "does the right thing" in a lot of situations.

> I'm confused. This PEP talks about unicode literals, not native string
> literals, so why would these APIs "directly benefit from this support"?

The native string literal already exists. It disappears if
`unicode_literals` are future imported which is why this is relevant
since the unicode literals future import in 2.x is recommended by some
for making libraries run in both 2.x and 3.x.

Regards,
Armin
|
In reply to this post by Nick Coghlan
Nick Coghlan wrote:
> Armin's straw poll was actually about whether or not people used the
> future import for division, rather than unicode literals. It is indeed
> the same problem

There are differences, though. Personally I'm very glad of the division
import -- it's the only thing that keeps me sane when using floats. The
alternative is not only butt-ugly but imposes an annoying performance
penalty. I don't mind occasionally needing to glance at the top of a
module in order to get the benefits.

On the other hand, it's not much of a burden to put 'u' in front of
string literals, and there is no performance difference.

--
Greg
|
In reply to this post by Nick Coghlan
On Feb 26, 2012, at 09:20 PM, Nick Coghlan wrote:
>It reduces the problem (compared to omitting the import and using a
>u() function), but it's still ugly and still involves the "action at a
>distance" of the unicode literals import.

Frankly, that doesn't bother me at all. I've been using the future import in
all my code pretty successfully for a long while now. It's much more
important for a project to use or not use the future import consistently, and
then there really should be no confusion when looking at the code for that
project.

I'm not necessarily saying I'm opposed to the purpose of the PEP. I do think
it's unnecessary for most Python problem domains, but can appreciate that WSGI
apps are feeling a special pain here that should be addressed somehow. It
would be nice however if the solution were in the form of a separate module
that could be used in earlier Python versions.

-Barry
|
On Sun, 2012-02-26 at 16:06 -0500, Barry Warsaw wrote:
> On Feb 26, 2012, at 09:20 PM, Nick Coghlan wrote:
>
> >It reduces the problem (compared to omitting the import and using a
> >u() function), but it's still ugly and still involves the "action at a
> >distance" of the unicode literals import.
>
> Frankly, that doesn't bother me at all. I've been using the future import in
> all my code pretty successfully for a long while now. It's much more
> important for a project to use or not use the future import consistently, and
> then there really should be no confusion when looking at the code for that
> project.

That's completely reasonable in a highly controlled project with
relatively few highly-bought-in contributors. In projects with lots of
hit-and-run contributors, though, it's more desirable to have things
meet a rule of least surprise.

Much of the software I work on is Python 3 compatible, but it's still
used primarily on Python 2. Because most people still care primarily
about Python 2, and most don't have a lot of Python 3 experience, it's
extremely common to see folks submitting patches with u'' literals in
them.

> I'm not necessarily saying I'm opposed to the purpose of the PEP. I do think
> it's unnecessary for most Python problem domains, but can appreciate that WSGI
> apps are feeling a special pain here that should be addressed somehow. It
> would be nice however if the solution were in the form of a separate module
> that could be used in earlier Python versions.

If we use the unicode_literals future import, or some other external
module strategy, it doesn't help much with the hit-and-run contributor
thing, I fear.

- C
|
In reply to this post by Guido van Rossum
On Sat, Feb 25, 2012 at 22:13, Guido van Rossum <[hidden email]> wrote:
> If this can encourage more projects to support Python 3 (even if it's
> only 3.3 and later) and hence improve adoption of Python 3, I'm all
> for it.

+1 from me for the same reasons.

If this were to go in then for Python 3.3 the section of the porting HOWTO
on what to do when you support Python 2.6 and later
(http://docs.python.org/howto/pyporting.html#python-2-3-compatible-source)
would change to:

* Use ``from __future__ import print_function`` OR use ``print(x)`` but
  always with a single argument OR use six
* Use ``from __future__ import unicode_literals`` OR make sure to use the
  'u' prefix for all Unicode strings (and then mention the concept of
  native strings) OR use six
* Use the 'b' prefix for byte literals OR use six

All understandable, and with either a __future__ import solution or a
syntactic support solution for every issue, giving people the choice of
whichever approach they prefer for each issue. I would also be willing to
move the Python 2/3 compatible source section to the top, so that it
implicitly becomes the preferred way to port, since people in the community
have seemingly been gravitating towards that approach even without this help.

-Brett

> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C.
|
In reply to this post by Chris McDonough
Chris McDonough <chrism <at> plope.com> writes:
> If we use the unicode_literals future import, or some other external
> module strategy, it doesn't help much with the hitnrun contributor
> thing, I fear.

Surely some curating of hit-and-run contributions takes place? If you accept
contributions from hit-and-run contributors without changes, ISTM that could
compromise the quality of the codebase somewhat. Also, is not the overall
impact on the codebase of hit-and-run contributors small compared to the
impact from more involved contributors?

Regards,

Vinay Sajip
|
In reply to this post by Brett Cannon-2
On Feb 26, 2012, at 05:44 PM, Brett Cannon wrote:
>On Sat, Feb 25, 2012 at 22:13, Guido van Rossum <[hidden email]> wrote:
>
>> If this can encourage more projects to support Python 3 (even if it's
>> only 3.3 and later) and hence improve adoption of Python 3, I'm all
>> for it.
>>
>>
>+1 from me for the same reasons.

Just to be clear, I'm solidly +1 on anything we can do to increase the pace of
Python 3 migration.

-Barry
|
In reply to this post by Vinay Sajip
On Sun, 2012-02-26 at 23:06 +0000, Vinay Sajip wrote:
> Chris McDonough <chrism <at> plope.com> writes:
>
> > If we use the unicode_literals future import, or some other external
> > module strategy, it doesn't help much with the hitnrun contributor
> > thing, I fear.
>
> Surely some curating of hit-and-run contributions takes place? If you accept
> contributions from hit-and-run contributors without changes, ISTM that could
> compromise the quality of the codebase somewhat.

Nah. Real developers just accept all pull requests and let god sort it
out. ;-)

But seriously, the less time it takes me to review and fix a pull
request from a casual contributor, the better.

- C
|
In reply to this post by Armin Ronacher
On 2/26/2012 7:46 AM, Armin Ronacher wrote:
I am not enthusiastic about adding duplication that is useless for
writing Python 3 code, but like others, I do want to encourage more
porting of libraries to run with Python 3. I understand that the unicode
transition seems to be the biggest barrier, especially for some
applications. It is OK with me if ported code only runs on 3.3+, with
its improved unicode.

If u'' is added, I would like it to be added as deprecated in the doc
with a note that it is only intended for multi-version Python 2/3 code.

> In case this PEP gets approved I will refactor the tokenize module while
> adding support for "u" prefixes and use that as the basis for a
> installation hook for older Python 3 versions.

I presume such a hook would simply remove 'u' prefixes and would run
*much* faster than 2to3. If such a hook is satisfactory for 3.2, why
would it not be satisfactory for 3.3?

--
Terry Jan Reedy
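To make the idea of a prefix-stripping transform concrete, here is a rough sketch built on the standard `tokenize` module. It is not Armin's hook (which is not shown in the thread) and it covers only the source rewrite, not the installation or import machinery; the function name `strip_u_prefixes` is hypothetical. It assumes it runs on a Python 3 version whose tokenizer already recognises u-prefixed strings as single STRING tokens.

```python
import io
import tokenize


def strip_u_prefixes(source):
    """Return `source` with the u/U prefix removed from string literals.

    Hypothetical helper: a sketch of the kind of transform described
    above, not the actual hook discussed in the thread.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and tok.string[:1] in ('u', 'U'):
            # Drop only the leading prefix character; the quotes and the
            # body of the literal are left untouched.
            tok = tok._replace(string=tok.string[1:])
        tokens.append(tok)
    return tokenize.untokenize(tokens)


if __name__ == '__main__':
    print(strip_u_prefixes("greeting = u'hello' + U' world'\n"))
```

Because it works on tokens rather than raw text, a `u'` sequence that appears inside another string or a comment is left alone.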
|
On Mon, Feb 27, 2012 at 11:55 AM, Terry Reedy <[hidden email]> wrote:
> I presume such a hook would simply remove 'u' prefixes and would run *much*
> faster than 2to3. If such a hook is satisfactory for 3.2, why would it not
> be satisfactory for 3.3?

Because an import hook is still a lot more complicated than "Write
modern code that runs on 2.6+ and follows certain guidelines and it
will also just run on 3.3+".

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
|
In reply to this post by Barry Warsaw
2012/2/27 Barry Warsaw <[hidden email]>
+1 I think this is a great proposal that has the potential to remove one of the (for me at least, _the_) main obstacles to writing code compatible with both 2.7 and 3.x.
/f

I reject your reality and substitute my own.
|
In reply to this post by Nick Coghlan
On 26.02.2012 07:06, Nick Coghlan wrote:
> On Sun, Feb 26, 2012 at 1:13 PM, Guido van Rossum <[hidden email]> wrote:
>> A small quibble: I'd like to see a benchmark of a 'u' function implemented in C.
>
> Even if it was quite fast, I don't think such a function would bring
> the same benefits as restoring support for u'' literals.

You claim that, but your argument doesn't actually support that claim
(or I fail to see the argument).

>
> Using myself as an example, my work projects (such as PulpDist [1])
> are currently written to target Python 2.6, since that's the system
> Python on RHEL 6. As a web application, PulpDist has unicode literals
> *everywhere*, but (as Armin pointed out to me), turning on "from
> __future__ import unicode_literals" in every file would be incorrect,

Right. So you shouldn't use the __future__ import, but the u() function.

> IIRC, I've previously opposed the restoration of unicode literals as a
> retrograde step. Looking at the implications for the future migration
> of PulpDist has changed my mind.

Did you try to follow the path of the u() function?

Regards,
Martin
|
In reply to this post by Chris McDonough
> Much of the software I work on is Python 3 compatible, but it's still
> used primarily on Python 2. Because most people still care primarily
> about Python 2, and most don't have a lot of Python 3 experience, it's
> extremely common to see folks submitting patches with u'' literals in
> them.

These can be easily fixed, right?

Regards,
Martin
|
In reply to this post by Armin Ronacher
>> There is no significant overhead to use converters.
> That's because what you're benchmarking here more than anything is the
> overhead of eval() :-) See the benchmark linked in the PEP for one that
> measures the actual performance of the string literal / wrapper.

There are a few other unproven performance claims in the PEP. Can you
kindly provide the benchmarks you have been using? In particular, I'm
interested in the claim "In many cases 2to3 runs one or two orders of
magnitude slower than the testsuite for the library or application
it's testing."

Regards,
Martin
|
In reply to this post by Barry Warsaw
On 27.02.2012 00:07, Barry Warsaw wrote:
> On Feb 26, 2012, at 05:44 PM, Brett Cannon wrote:
>
>> On Sat, Feb 25, 2012 at 22:13, Guido van Rossum <[hidden email]> wrote:
>>
>>> If this can encourage more projects to support Python 3 (even if it's
>>> only 3.3 and later) and hence improve adoption of Python 3, I'm all
>>> for it.
>>>
>>>
>> +1 from me for the same reasons.
>
> Just to be clear, I'm solidly +1 on anything we can do to increase the pace of
> Python 3 migration.

I find this rationale a bit sad: it's not that there is any (IMO) good
technical reason for the feature - only that people "hate" the many
available alternatives for some reason.

But then, practicality beats purity, so be it.

Regards,
Martin
|
In reply to this post by Armin Ronacher
On 25 February 2012 21:23, Armin Ronacher <[hidden email]> wrote:
> Hi,
>
> I just uploaded PEP 414 which proposes an optional 'u' prefix for string
> literals for Python 3.
>
> You can read the PEP online: http://www.python.org/dev/peps/pep-0414/
>
> This is a followup to the discussion about this topic here on the
> mailinglist and on twitter/IRC over the last few weeks.
>
>
> Regards,
> Armin

If the main point of this proposal is avoiding an explicit 2to3 run on
account of 2to3 being too slow then I'm -1.
That should be fixed at the 2to3 level, not at the Python syntax level.
A common strategy to distribute code able to run on both Python 2 and
Python 3 is using the following hack in setup.py:
http://docs.python.org/dev/howto/pyporting.html#during-installation
That's what I used in psutil and it works just fine. Also, I believe
it's the *right* strategy as it lets you freely write Python 2 code
and avoid using ugly hacks such as "sys.exc_info()[1]" and "if PY3:
..." all around the place.
2to3 might be slow but introducing workarounds encouraging not to use
it is only going to cause a proliferation of ugly and hackish code in
the Python ecosystem.

Now, psutil is a relatively small project and the 2to3 conversion
doesn't take much time. Having users "unawarely" run 2to3 at
installation time is an acceptable burden in terms of speed.
That's going to be different on larger code bases such as Twisted's.
One way to fix that might be making 2to3 generate and rely on a
"2to3.diff" file containing all the differences.
That would be generated the first time "python setup.py build/install"
is run and then partially re-calculated every time a file is modified.
Third-party library vendors can include 2to3.diff as part of the
tarball they distribute so that the end user won't experience any
slowdown resulting from the 2to3 conversion.

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/
|
On Mon, Feb 27, 2012 at 9:34 PM, Giampaolo Rodolà <[hidden email]> wrote:
> If the main point of this proposal is avoiding an explicit 2to3 run on
> account of 2to3 being too slow then I'm -1.

No, the main point is that adding a compile step to the Python
development process sucks. The slow speed of 2to3 is one factor, but
single source is just far, far, easier to maintain than continually
running 2to3 to get a working Python 3 version.

When we have the maintainers of major web frameworks and libraries
telling us that this is a painful aspect for their ports (and,
subsequently, the ports of their users), it would be irresponsible of
us to ignore their feedback.

Sure, some early adopters are happy with the 2to3 process, that's not
in dispute. However, many developers are not, and (just as relevant)
many folks that haven't started their ports yet have highlighted it as
one of the aspects that bothers them.

Is restoring support for unicode literals a small retrograde step that
partially undoes the language cleanup that occurred in 3.0? Yes, it
is. However, it really does significantly increase the amount of 2.x
code that will *just run* on Python 3 (or will run with minor tweaks).
I can live with that - as MvL said, this is a classic case of
practicality beating purity.

Cheers,
Nick.

--
Nick Coghlan | [hidden email] | Brisbane, Australia
