Should PEP 3333 be Python 3-only? What about transcoding?


PJ Eby
As I've been tidying up wsgiref in the stdlib for PEP 3333, I've been
noticing that there's a bit of an issue with the PEP as far as CGI variables.

Currently, the CGI example is the same as it is in PEP 333, which
means that it's correct code for Python 2.x, but wrong for 3.x due to
the environment transcoding issue.  (See
http://bugs.python.org/issue10155 for details.)

There are other code sample differences, too.  In effect, PEP 3333 is
still using Python 2 code samples, because it's trying to cover every
version of Python from 2.1 through 3.2.

Should we ditch that, and say, "hey, if you want Python 2.x code
samples, go see PEP 333?"

That will simplify a couple of things, but still won't address the
transcoding issue.

Specifically, the problem is that on Python 3, os.environ contains
*unicode*, not bytes masquerading as unicode.  Unfortunately, this
means that it very possibly contains garbage for CGI variables, as
the web server puts bytes in the environment, then Python converts
those bytes to unicode using the system encoding + surrogateescape.

To get back to bytes, then, we have to encode using the same
combination, then decode those bytes as latin-1 to get back to a
WSGI-compatible string.
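
A minimal sketch of that round trip, for illustration only. The
function name transcode() and the sample path are made up here, and
the encoding is passed explicitly so the example behaves the same on
any machine; real code would normally use the interpreter's own
filesystem encoding.

```python
import sys


def transcode(value, fs_encoding=None):
    """Turn an os.environ-style str into a WSGI-style latin-1 str.

    Hypothetical helper: re-encode with the encoding Python used to
    build os.environ (plus surrogateescape) to recover the original
    bytes, then decode those bytes as latin-1.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    raw = value.encode(fs_encoding, "surrogateescape")
    return raw.decode("iso-8859-1")


# Bytes the web server might have placed in the environment.
raw_path = "/caf\u00e9".encode("utf-8")

# Simulate what Python 3 does under an ASCII locale: undecodable
# bytes become lone surrogates instead of raising an error.
seen_in_environ = raw_path.decode("ascii", "surrogateescape")

# Undo it: the result is a str whose code points equal the raw bytes.
wsgi_value = transcode(seen_in_environ, "ascii")
```

Encoding wsgi_value as latin-1 gives back exactly the bytes the
server sent, which is what a WSGI application expects to be able to do.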

The hitch is this: not everything in os.environ comes from an HTTP
request, so not every value can safely be transcoded in such a
fashion.  For example, if you transcode TMP or HOME or even
DOCUMENT_ROOT that way, you're going to get rubbish.

In wsgiref for the stdlib, I've used a variation of And Clover's
patch in issue #10155 to implement something that *only* transcodes
CGI variables that come from the web client request, but it's
dreadfully complex.
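
To give a rough idea of the selective approach (this is not the
wsgiref code, which handles more cases), here is a sketch that only
transcodes keys assumed to come from the client request. The key set,
the HTTP_ prefix test and the function names are illustrative
assumptions.

```python
# Illustrative, deliberately incomplete set of request-derived keys.
REQUEST_KEYS = frozenset({
    "PATH_INFO", "QUERY_STRING", "SCRIPT_NAME",
    "REMOTE_USER", "CONTENT_TYPE",
})


def needs_transcode(key):
    """Guess whether a CGI variable carries bytes from the client."""
    return key in REQUEST_KEYS or key.startswith("HTTP_")


def read_cgi_environ(source, fs_encoding):
    """Copy `source`, transcoding only the request-derived values."""
    environ = {}
    for key, value in source.items():
        if needs_transcode(key):
            raw = value.encode(fs_encoding, "surrogateescape")
            value = raw.decode("iso-8859-1")
        environ[key] = value
    return environ


# Example: PATH_INFO is transcoded, HOME is left alone.
sample = {
    "PATH_INFO": b"/caf\xc3\xa9".decode("ascii", "surrogateescape"),
    "HOME": "/home/\u00e9",
}
result = read_cgi_environ(sample, "ascii")
```

Even this toy version needs a policy for which keys to touch; the
platform-specific cases are where the real complexity comes from.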

This isn't really a problem in wsgiref, because as far as I know,
nobody else has bothered to make another CGI WSGI runner besides the
one in wsgiref, and the sample in the PEP.

But it is a problem for the PEP, because the complexity involved is
high -- so high it would completely obscure the essential simplicity
of the CGI example, if it were written in-line.

There are many possible ways to address this, but my current leaning is to:

1. Change the PEP 3333 code samples to Python 3 only, and
backreference PEP 333 for Python 2 code samples

2. Make the CGI sample in 3333 do an indiscriminate transcode (which
only takes a few lines) and add a note to indicate that a robust CGI
implementation should only do it to CGI variables, suggesting the
wsgiref.handlers.read_environ() code as an example.
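
As a sketch of how small the indiscriminate version in option 2 could
be (the function name and its parameters are hypothetical, not the
text proposed for the PEP):

```python
import os
import sys


def indiscriminate_environ(source=None, fs_encoding=None):
    """Transcode every value, whether or not it came from the request."""
    if source is None:
        source = os.environ
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    return {
        key: value.encode(fs_encoding, "surrogateescape").decode("iso-8859-1")
        for key, value in source.items()
    }


# Explicit inputs so the example does not depend on the local machine.
demo = indiscriminate_environ(
    {"PATH_INFO": "/x", "HOME": "/home/\u00e9"}, "utf-8"
)
```

Here demo["HOME"] no longer equals the original "/home/\u00e9": the
non-request variable has been turned into rubbish, which is why the
accompanying note about robust implementations would be needed.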

Any thoughts?


_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: Should PEP 3333 be Python 3-only? What about transcoding?

and-py
On Wed, 2010-11-03 at 19:19 -0400, P.J. Eby wrote:
> Should we ditch that, and say, "hey, if you want Python 2.x code
> samples, go see PEP 333?"

That seems reasonable to me: if there is indeed never going to be a
Python 2.8, there is no way the PEP can ever be accepted for a Python 2
release anyway.

Given this, I might go further and suggest dropping all mention of
Python 2. It might make the wording issues easier. (Although, ahem, that
would mean a bunch of rewriting for some poor soul.)

> 2. Make the CGI sample in 3333 do an indiscriminate transcode (which
> only takes a few lines) and add a note to indicate that a robust CGI
> implementation should only do it to CGI variables

Or go straight for unmolested os.environ, as long as there is that note
that it's not really the Right Thing. If we're going to be wrong for
some cases either way, might as well go for the simplest. The PEP code
needs to be illustrative more than it needs to be 100% correct.

--
And Clover
mailto:[hidden email] http://www.doxdesk.com
skype:uknrbobince gtalk:chat?jid=[hidden email]



Re: Should PEP 3333 be Python 3-only? What about transcoding?

Lennart Regebro-2
On Thu, Nov 4, 2010 at 02:26, and-py <[hidden email]> wrote:
> On Wed, 2010-11-03 at 19:19 -0400, P.J. Eby wrote:
>> Should we ditch that, and say, "hey, if you want Python 2.x code
>> samples, go see PEP 333?"
>
> That seems reasonable to me: if there is indeed never going to be a
> Python 2.8, there is no way the PEP can ever be accepted for a Python 2
> release anyway.

That doesn't mean it should not work on Python 2. Major rule in Python
3 porting: Don't change your APIs. ;-)