|
I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official definition of WSGI 1.1, which describes how to implement WSGI adapters for both Python 2.x and 3.x. It may not be perfect, but it's been implemented twice, and seems to have no fatal flaws (it doesn't do any lossy transforms, so any issues are irritations at worst). The basis for this definition is also described in the "WSGI 1.0 Amendments" [2] page.

The definitions as they stand are clear enough to understand and implement, but are not currently in spec-worthy language (e.g. they say "should" and "may" in a colloquial fashion, but actually mean MUST in some places and SHOULD in others, as defined by RFC 2119).

Thus, I'd like to suggest that Graham (if he's willing?) should reformat the "Definition"/"Amendments" as an actual diff against the current PEP 333. Then, I will recommend adopting that document as an actual standard WSGI 1.1, to replace PEP 333.

This discussion has gone on long enough, and it doesn't really matter as much to have the perfect API as it does to have a standard.

James

[1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X
[2] http://www.wsgi.org/wsgi/Amendments_1.0

_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com |
|
I second the move, recorded here:
http://listtree.appspot.com/wsgi2/ICvaujouPxb2gfEhDS_aiw

-- Aaron Watters

--- On Thu, 11/26/09, James Y Knight <[hidden email]> wrote:
> From: James Y Knight <[hidden email]>
> Subject: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
> To: "Web SIG" <[hidden email]>
> Date: Thursday, November 26, 2009, 8:42 PM
>
> [...] |
|
In reply to this post by James Y Knight
At 08:42 PM 11/26/2009 -0500, James Y Knight wrote:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1] as the official definition of WSGI 1.1, which describes how to implement WSGI adapters for both Python 2.x and 3.x. It may not be perfect, but it's been implemented twice, and seems to have no fatal flaws (it doesn't do any lossy transforms, so any issues are irritations at worst). The basis for this definition is also described in the "WSGI 1.0 Amendments" [2] page.
>
> The definitions as they stand are clear enough to understand and implement, but are not currently in spec-worthy language (e.g. they say "should" and "may" in a colloquial fashion, but actually mean MUST in some places and SHOULD in others, as defined by RFC 2119).
>
> Thus, I'd like to suggest that Graham (if he's willing?) should reformat the "Definition"/"Amendments" as an actual diff against the current PEP 333. Then, I will recommend adopting that document as an actual standard WSGI 1.1, to replace PEP 333.

I'm +1, with a few caveats.

First, as you mention, it needs to be spec'd properly. In particular, it should be clarified that the main changes are to *allow byte strings* in certain places where WSGI 1.0 demands a unicode string w/latin-1 encoding.

Second, I do not think that the "additional guarantees/requirements" can be safely added to a 1.x version, as they make it impossible for an app to tell whether it's *really* running under 1.1 or under a broken piece of middleware that's passing through wsgi.version but not actually providing 1.1-level guarantees. I would therefore suggest that these additional guarantees and requirements be deferred to WSGI 2.0. |
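The second caveat can be made concrete with a small sketch. This is not code from the thread: `Wsgi10Input`, `replacing_middleware`, `app`, and `call` are hypothetical names, and the scenario assumes a server that advertises `wsgi.version` as `(1, 1)`. The middleware is valid under WSGI 1.0 (its `readline()` takes no size hint), yet it passes `wsgi.version` through unchanged, so the application cannot rely on the advertised version.

```python
import io


class Wsgi10Input:
    """Hypothetical input object that is valid under WSGI 1.0:
    readline() accepts no size hint."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size):
        return self._buf.read(size)

    def readline(self):
        return self._buf.readline()


def replacing_middleware(app):
    """1.0-era middleware: swaps wsgi.input, leaves wsgi.version untouched."""
    def wrapper(environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        environ["wsgi.input"] = Wsgi10Input(body)
        return app(environ, start_response)
    return wrapper


def app(environ, start_response):
    # The application trusts the advertised version and passes a size hint.
    if environ["wsgi.version"] >= (1, 1):
        try:
            environ["wsgi.input"].readline(16)
            result = b"size hint honoured"
        except TypeError:
            result = b"version said 1.1, stream was 1.0"
    else:
        result = b"plain 1.0"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(result)))])
    return [result]


def call(application, version):
    """Drive an application with a minimal, made-up environ."""
    environ = {
        "wsgi.version": version,
        "wsgi.input": io.BytesIO(b"a=1\n"),
        "CONTENT_LENGTH": "4",
    }
    return b"".join(application(environ, lambda status, headers: None))
```

Calling `call(replacing_middleware(app), (1, 1))` exercises the failure mode: the version tuple says 1.1, but the stream the application actually receives only honours the 1.0 contract.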
|
On Nov 27, 2009, at 10:20 AM, P.J. Eby wrote:
> Second, I do not think that the "additional guarantees/requirements" can be safely added to a 1.x version, as they make it impossible for an app to tell whether it's *really* running under 1.1 or under a broken piece of middleware that's passing through wsgi.version but not actually providing 1.1-level guarantees. I would therefore suggest that these additional guarantees and requirements be deferred to WSGI 2.0.

Okay, let's look at these additional requirements in more detail. I see 4 that should be kept, 1 that can be dispensed with, and 1 I'm not sure about.

> 1. The 'readline()' function of 'wsgi.input' may optionally take a size hint.

Already de-facto required. Leaving it out helps no-one. KEEP.

> 2. The 'wsgi.input' must provide an empty string as end of input stream marker.

I don't think this will be a problem. What would WSGI middleware do to break this requirement? It was only put in in the first place so that CGI adapters could pass through their input stream (which may not ever provide an EOF) without having to wrap it. I agree that was a mistake, and should be corrected. KEEP.

> 3. The size argument to 'read()' function of 'wsgi.input' would be optional and if not supplied the function would return all available request content. This would make 'wsgi.input' more file like as the WSGI specification suggests it is, but isn't really per original definition.

This one could be a problem with middleware, and that feature shouldn't ever be used, in any case: reading into memory an arbitrary amount of data from a client is not a good thing to encourage. OMIT.

> 4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must honour the Content-Length response header and must only return from the file that amount of content. This would guarantee that using wsgi.file_wrapper to return part of a file for byte range requests would work.

Given item #6, I suppose this is actually just a matter of efficiency, in case the file wrapper is sent to a middleware rather than directly to the WSGI gateway? If it goes directly to the gateway, that can of course stop reading by itself. ?undecided?

> 5. Any WSGI application or middleware should not return more data than specified by the Content-Length response header if defined.

As long as this is meant as "SHOULD", that's fine. It's not actually a requirement, but rather a suggestion of best practices. KEEP.

> 6. The WSGI adapter must not pass on to the server any data above what the Content-Length response header defines if supplied.

This is already required by HTTP. If the WSGI gateway doesn't make this happen somehow, it's generating invalid HTTP and that's a bug. Okay to clarify in the spec to ensure people don't miss the requirement when implementing. KEEP.

James |
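Items 5 and 6 describe the same rule from two sides: the application should not produce more than Content-Length bytes, and the gateway must not send more. A minimal sketch of the gateway side follows, assuming response headers arrive as the usual list of `(name, value)` tuples. `limit_to_content_length` is a hypothetical helper, not part of any WSGI server named in this thread, and it deliberately leaves out calling `close()` on the application's iterable.

```python
def limit_to_content_length(headers, body_iter):
    """Yield at most Content-Length bytes from body_iter.

    If no Content-Length header is present, the body passes through unchanged.
    """
    remaining = None
    for name, value in headers:
        if name.lower() == "content-length":
            remaining = int(value)

    if remaining is None:
        yield from body_iter
        return

    for chunk in body_iter:
        if remaining <= 0:
            break  # Everything the header promised has been sent.
        chunk = chunk[:remaining]  # Trim a chunk that would overshoot.
        remaining -= len(chunk)
        yield chunk
```

For example, with `[("Content-Length", "5")]` and the chunks `[b"hel", b"lo wor", b"ld"]`, the helper yields `b"hel"` and `b"lo"` and then stops. The same trimming applied to a file wrapper is what item 4 asks for in the byte-range case.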
|
At 12:34 PM 11/27/2009 -0500, James Y Knight wrote:
> On Nov 27, 2009, at 10:20 AM, P.J. Eby wrote:
> > Second, I do not think that the "additional guarantees/requirements" can be safely added to a 1.x version, as they make it impossible for an app to tell whether it's *really* running under 1.1 or under a broken piece of middleware that's passing through wsgi.version but not actually providing 1.1-level guarantees. I would therefore suggest that these additional guarantees and requirements be deferred to WSGI 2.0.
>
> Okay, let's look at these additional requirements in more detail. I see 4 that should be kept, 1 that can be dispensed with, and 1 I'm not sure about.

I agree with 2 of your keeps, and remain -0.5 to -1 on the others. See below...

> > 1. The 'readline()' function of 'wsgi.input' may optionally take a size hint.
>
> Already de-facto required. Leaving it out helps no-one. KEEP.

Fair enough, since it's a MAY. On the other hand, because it's a MAY, it actually *helps* no-one, from a spec compatibility POV. (That is, you have to test whether it's available, so it's no different than it not being in the spec to begin with.) So, putting it in doesn't *hurt*, but neither does it *help*... so I lean towards leaving it to 2.x, where it can actually help.

> > 2. The 'wsgi.input' must provide an empty string as end of input stream marker.
>
> I don't think this will be a problem. What would WSGI middleware do to break this requirement?

It could be reading the original input stream, and replacing it with another one. Not very common I would guess, but it's still possible for a piece of perfectly valid 1.0 middleware to fail this requirement for 1.1, leading to the condition where you really can't tell if you're running valid 1.1 or not.

> It was only put in in the first place so that CGI adapters could pass through their input stream (which may not ever provide an EOF) without having to wrap it. I agree that was a mistake, and should be corrected.

I agree... but only in 2.x.

> > 3. The size argument to 'read()' function of 'wsgi.input' would be optional and if not supplied the function would return all available request content. This would make 'wsgi.input' more file like as the WSGI specification suggests it is, but isn't really per original definition.
>
> This one could be a problem with middleware, and that feature shouldn't ever be used, in any case: reading into memory an arbitrary amount of data from a client is not a good thing to encourage. OMIT.

Agreed -- even in 2.x it's questionable if not harmful.

> > 4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must honour the Content-Length response header and must only return from the file that amount of content. This would guarantee that using wsgi.file_wrapper to return part of a file for byte range requests would work.
>
> Given item #6, I suppose this is actually just a matter of efficiency, in case the file wrapper is sent to a middleware rather than directly to the WSGI gateway? If it goes directly to the gateway, that can of course stop reading by itself. ?undecided?

I don't really see how this one helps anything in 1.x, and so lean towards leaving it out.

> > 5. Any WSGI application or middleware should not return more data than specified by the Content-Length response header if defined.
>
> As long as this is meant as "SHOULD", that's fine. It's not actually a requirement, but rather a suggestion of best practices. KEEP.
>
> > 6. The WSGI adapter must not pass on to the server any data above what the Content-Length response header defines if supplied.
>
> This is already required by HTTP. If the WSGI gateway doesn't make this happen somehow, it's generating invalid HTTP and that's a bug. Okay to clarify in the spec to ensure people don't miss the requirement when implementing. KEEP.

Good points - I agree with these two, and they can be considered 1.0 clarifications as well. After the first four (which I see no reason to include) I was probably a little over-inclined to throw these two out (especially since I was reading the "should" above as a "must", per your proposal). |
|
On Fri, Nov 27, 2009 at 12:20 PM, P.J. Eby <[hidden email]> wrote:
I think it was meant to be a must. The *caller* MAY pass in a size hint, the implementor MUST implement this optional argument. This is the de-facto requirement.

Middleware sometimes does this, but any time it does this it always replaces the input stream with something truly file-like, e.g., StringIO or a temp file. Nothing but servers really hands sockets around, and sockets are the only objects I'm aware of that don't act quite like a file.

Well, we need a way to handle content of unknown length, but if the file terminates with '' then this isn't that important.

I don't really understand this either, unless it was handling range responses as well. Content-Length alone isn't very interesting in this case.

In this context, maybe 4 is just an extension of these? Put 4 after 6 and maybe it'll seem more obvious...?

--
Ian Bicking | http://blog.ianbicking.org | http://topplabs.org/civichacker |
|
In reply to this post by James Y Knight
Please ensure you have also all read:
http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html

I will post again later in detail, when I have some time, to explain a few more points not mentioned in that post and where people aren't quite understanding the reasoning for doing things.

One very quick comment about read().

Allowing read() with no argument is no different to a user saying read(int(environ['CONTENT_LENGTH'])). Because a WSGI adapter/middleware is going to have to track bytes read to ensure it can return an empty string as end sentinel, it will know the length remaining and would internally, for read() with no argument, do read(remaining_bytes). As such, there is no real difference in memory use for implementing read(), because of the need to implement the end sentinel.

Also, you have concerns about read() with no argument, but frankly readline() with no argument, which is already required, is much worse, because an implementation can't simply track bytes read and read to the end of input. The caller only wants to read to the end of the line, so reading all input is going to blow out memory use unreasonably, as you speculate for read(). As such, a readline() implementation is likely to read in blocks and buffer internally, where read() doesn't necessarily have to.

It may also be pertinent to read:

http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

as from memory it talks about issues with not paying attention to Content-Length in output filtering middleware as well.

As I said, I will reply later when I have some time to focus. Right now I have a 2 year old to keep amused.

Graham |
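The bookkeeping Graham describes can be sketched as a wrapper around the raw input stream. This is an illustration under stated assumptions, not mod_wsgi's implementation: `LengthLimitedInput` is a hypothetical name, the underlying stream is assumed to offer file-like `read(size)` and `readline(size)`, and the Content-Length is assumed to be known up front. Once the wrapper tracks the bytes remaining, `read()` with no argument, the empty-string end sentinel, and the `readline()` size hint all fall out of the same counter.

```python
import io


class LengthLimitedInput:
    """Hypothetical wsgi.input wrapper that never reads past Content-Length."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def _clamp(self, size):
        # No size, a negative size, or an oversized request all mean
        # "whatever is left of the request body".
        if size is None or size < 0 or size > self._remaining:
            return self._remaining
        return size

    def read(self, size=-1):
        size = self._clamp(size)
        data = self._stream.read(size) if size else b""
        self._remaining -= len(data)
        return data  # b"" once the body is exhausted: the end sentinel.

    def readline(self, size=-1):
        size = self._clamp(size)
        line = self._stream.readline(size) if size else b""
        self._remaining -= len(line)
        return line
```

With a stream holding `b"ab\ncdefEXTRA"` and a content length of 7, `readline()` returns `b"ab\n"`, `read(2)` returns `b"cd"`, `read()` returns `b"ef"`, and every later call returns `b""` without touching the trailing bytes. Note that this sketch leans on the underlying stream's own `readline(size)`; a gateway reading from a raw socket would have to buffer blocks itself, which is the extra cost Graham attributes to readline().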
|
After reading my prior blog posts where I explained my reasoning
behind the changes, I will acknowledge that I haven't explained some stuff very well, and people are failing to understand, or getting the wrong idea about, why something is being suggested.

I still believe, though, that there are underlying problems in the WSGI specification, and that right now various things work more by luck than by design. In some cases, such as readline(), the majority of WSGI applications/frameworks are in violation of the WSGI 1.0 specification due to their reliance on cgi.FieldStorage, which makes calls to readline() with an argument.

Either way, since there seemed to be objections at some level on every point, and since I really, really have no enthusiasm for this stuff any more or for fighting for any change, I retract my personal interest in having any of the amendments as part of a WSGI 1.1 specification and will remove all that detail from the mod_wsgi documentation. I will instead replace it with a separate page describing mod_wsgi compliance with the WSGI 1.0 specification and highlighting those specific features which are in common, or not so common, use via mod_wsgi and which actually mean that people are writing applications incompatible with the WSGI 1.0 specification.

To ensure compliance I could well raise Python exceptions for any use which isn't WSGI 1.0 compliant, but I have already learnt, from when I tried to get people to write portable WSGI applications by giving errors on certain uses of stdin and stdout, that it is a pointless battle. All it got was a long list of users who believe mod_wsgi is broken, even though if they read the actual documentation they would find it was their own software which was suspect, or at least wasn't portable to all WSGI hosting mechanisms. This would only get worse if exceptions were raised for use of readline() with an argument and use of read() with no argument or an argument of -1. Short story is that there are a fair few people who are just lazy; they will always write stuff the way they want to and not how it should be written. They will always blame other people's code for being wrong before acknowledging they themselves are wrong.

The only answer I therefore need out of WEB-SIG is whether the qualifications about how Python 3.X is to be supported are going to be an amendment to WSGI 1.0 or a separate WSGI 1.1 update and, if the latter, whether the WSGI 1.1 tag will also have meaning for Python 2.X.

I need an answer to this so I know whether to withdraw mod_wsgi 3.0 from download and replace it with a mod_wsgi 4.0 which changes the wsgi.version tuple being passed, for both Python 2.X and Python 3.X, from (1, 1) back to the original (1, 0), given that some opinion seems to be that any interface changes can only really be performed as part of WSGI 2.0 and so I would be wrong in using (1, 1).

If I don't see an answer, then I guess I will just have to revert it back to (1, 0) to be safe and to avoid any accusations that I am hijacking the process.

An answer sooner rather than later would be appreciated on the wsgi.version issue.

Graham |
|
2009/11/29 Graham Dumpleton <[hidden email]>:
> [...]
>
> I need an answer to this so I know whether to withdraw mod_wsgi 3.0 from download and replace it with a mod_wsgi 4.0 which changes the wsgi.version tuple being passed, for both Python 2.X and Python 3.X, from (1, 1) back to the original (1, 0), given that some opinion seems to be that any interface changes can only really be performed as part of WSGI 2.0 and so I would be wrong in using (1, 1).
>
> If I don't see an answer, then I guess I will just have to revert it back to (1, 0) to be safe and to avoid any accusations that I am hijacking the process.
>
> An answer sooner rather than later would be appreciated on the wsgi.version issue.

Answering my own question, it is actually obvious that it has to be called (1, 0). This is because wsgiref in Python 3.X already calls it (1, 0), and I don't have much choice but to be in agreement with that.

I will therefore replace mod_wsgi 3.0 with a 4.0 release that reverts it to (1, 0) from (1, 1), and all the other stuff about amendments can be ignored.

Graham |
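The value wsgiref reports can be checked directly. The sketch below uses two real standard-library entry points, `wsgiref.util.setup_testing_defaults` and the `wsgi_version` attribute of `wsgiref.handlers.BaseHandler`; the `supports_at_least` helper is a hypothetical name added for illustration. It reflects current CPython releases rather than the specific Python 3.x version discussed in 2009.

```python
from wsgiref.handlers import BaseHandler
from wsgiref.util import setup_testing_defaults

# Build a minimal environ the way wsgiref's own test helpers do.
environ = {}
setup_testing_defaults(environ)


def supports_at_least(environ, required):
    """Hypothetical helper: compare the advertised version tuple."""
    return tuple(environ["wsgi.version"]) >= required


print(environ["wsgi.version"])   # version placed in the test environ
print(BaseHandler.wsgi_version)  # version wsgiref handlers advertise
```

An application that gates behaviour on `supports_at_least(environ, (1, 1))` would therefore take its 1.0 code path under wsgiref, which is the agreement Graham refers to. As noted earlier in the thread, the tuple only records what the server claims, not what intervening middleware actually provides.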
|
In reply to this post by Graham Dumpleton-2
On Nov 28, 2009, at 10:44 PM, Graham Dumpleton wrote:
> Either way, since there seemed to be objections at some level on every point, and since I really, really have no enthusiasm for this stuff any more or for fighting for any change, I retract my personal interest in having any of the amendments as part of a WSGI 1.1 specification and will remove all that detail from the mod_wsgi documentation

[...]

> If I don't see an answer, then I guess I will just have to revert it back to (1, 0) to be safe and to avoid any accusations that I am hijacking the process.
>
> An answer sooner rather than later would be appreciated on the wsgi.version issue.

I'd rather appreciate it if you held off on making such changes until this discussion either peters out or is resolved. You sound somewhat negative, but it seems to me that we are actually quite close to a consensus on adopting most of your proposal. Changing the proposal out from under us doesn't really help things.

The next step here is clearly for someone to redraft the changes as a diff against PEP 333. If you do not have any interest in being that person, please make that clear, so someone else can step up to do so.

James |
|
2009/11/29 James Y Knight <[hidden email]>:
> On Nov 28, 2009, at 10:44 PM, Graham Dumpleton wrote:
>> Either way, since there seemed to be objections at some level on every
>> point, and since I really really have no enthusiasm for this stuff any
>> more or of fighting for any change, I retract my personal interest in
>> having any of the amendments as part of a WSGI 1.1 specification and
>> will remove all that detail from mod_wsgi documentation
>
> [...]
>
>> If don't see an answer, then guess I will just have to revert it back
>> to (1, 0) to be safe and to avoid any accusations that am highjacking
>> the process.
>>
>> An answer sooner rather than later would be appreciated on the
>> wsgi.version issue.
>
> I'd rather appreciate it if you held off on making such changes until either this discussion either peters out or is resolved. You sound somewhat negative, but it seems to me that there's actually quite close to being a consensus on adopting most of your proposal. Changing the proposal out from under us doesn't really help things.
>
> The next step here is clearly for someone to redraft the changes as a diff against PEP 333. If you do not have any interest in being that person, please make that clear, so someone else can step up to do so.

No, I do not want a part in drafting any changes. I just want to move on from all this stuff and start working on other projects.

Since some don't seem to understand the reasons for the changes, you will find it hard to find someone who is in a position to do them.

You are probably really just better off worrying about Python 3.X support and accepting that tinkering at the edges of WSGI 1.0 on other issues is not going to solve all the WSGI issues. As PJE suggests, leave that to an interface-incompatible update so that you don't have this whole problem of what version existing components support.

Graham |
|
In reply to this post by Graham Dumpleton-2
Graham Dumpleton wrote:
> Answering my own question, it is actually obvious that it has to be
> called (1, 0). This is because wsgiref in Python 3.X already calls it
> (1, 0) and don't have much choice to be in agreement with that.

wsgiref.simple_server in Python 3 to date is not something that anyone should worry about being compatible with. It is a 2to3 hack that cannot meaningfully claim to represent wsgi version anything.

Careless use of urllib.parse.unquote causes 3.0's simple_server not to work at all, and 3.1's to mangle the path by treating it as UTF-8 instead of ISO-8859-1, as 'WSGI 1.1' proposed and mod_wsgi (and even mod_cgi via wsgiref.CGIHandler) delivered.

Yes, I'm always going on about Unicode paths. I'm fed up of shipping apps with a page-long deployment note about fixing them. It pains me that in so many years both this and "What do we do about Python 3?" still haven't been addressed. mod_wsgi 3.0 already has more traction than wsgiref 3.1 and I would prefer not to see more farcical reverse-progress at this point.

For what it's worth, here are my responses on the issues of this thread. But at this point I really just want a BDFL to come and do it, whatever it is. A new WSGI, whatever the version number, is massively overdue.

>> 1. The 'readline()' function of 'wsgi.input' may optionally take a size hint.

Yes. Obviously. Bad practice but unavoidable now. Should have been a 1.0 amendment a long time ago.

>> 2. The 'wsgi.input' must provide an empty string as end of input stream marker.
>> 3. The size argument to 'read()' function of 'wsgi.input' would be optional and if not supplied the function would return all available request content.
>> 4. The 'wsgi.file_wrapper' supplied by the WSGI adapter must honour the Content-Length response header and must only return from the file that amount of content.

+0. Seems reasonable but don't massively care. Presumably an application must refuse to run on 1.0 if it requires these behaviours?

>> 5. Any WSGI application or middleware should not return more data than specified by the Content-Length response header if defined.
>> 6. The WSGI adapter must not pass on to the server any data above what the Content-Length response header defines if supplied.

Yes.

--
And Clover
mailto:[hidden email]
http://www.doxdesk.com/ |
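
Points 5 and 6 in the list quoted above amount to capping the response body at the declared Content-Length. A minimal sketch of how an adapter might do that follows; the helper name `cap_to_content_length` and the sample data are hypothetical and not taken from mod_wsgi or any proposal text.

```python
def cap_to_content_length(chunks, content_length):
    """Yield bytes from `chunks`, never more than `content_length` in total.

    Hypothetical helper: `chunks` stands in for the iterable returned by a
    WSGI application, `content_length` for the parsed response header.
    """
    remaining = content_length
    for chunk in chunks:
        if remaining <= 0:
            break  # everything the header promised has been sent
        piece = chunk[:remaining]  # trim a chunk that would overshoot
        remaining -= len(piece)
        yield piece


# The application declared 11 bytes but produced more.
body = [b"hello ", b"world", b" and some extra"]
capped = b"".join(cap_to_content_length(body, 11))
```

The sketch only trims excess data; what an adapter should do when the application returns less than it declared is a separate question that the list above does not address.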
|
In reply to this post by James Y Knight
On Nov 29, 2009, at 12:40 AM, James Y Knight wrote:
> The next step here is clearly for someone to redraft the changes as a diff against PEP 333. If you do not have any interest in being that person, please make that clear, so someone else can step up to do so.

Okay, not sensing any other volunteers here... I guess it's all me.

The intention of this spec update is to be compatible with existing middleware/applications when running on Python 2.X. Apps/middleware running on Python 3.X require changes in any case, and this specification will tell them exactly what to expect. That Python 3.X middleware and WSGI adapters will have to deal with both bytestrings and unicode strings in many parts of the API (output status code, output headers, output response iterable/write callback) will add some complexity, but that's life.

Any WSGI implementations on Python 3.X claiming compliance with WSGI 1.0 are most likely broken, and their behavior cannot be relied upon. Too bad about wsgiref.

As self-appointed author, I am going to take a stand and say that both the Python 3-related string-type specifications, and the additional requirements except #3 (read() with no args) and #4 (file_wrapper looking at Content-Length), will be included. And it will be called WSGI 1.1.

Back to the list of "extra requirements":

#1 (readline with an arg) must be included, despite the potential for breakage. That ship has already sailed, the breakage has already occurred, it's already required. Disagreement here really is of no consequence.

#2 (wsgi.input must return EOF at EOF): I do not believe this will break any middleware. It will require some changes in some WSGI adapter implementations, but that's acceptable. If you have a real-life example of middleware that would break here, show it. So this will be included.

#3 is not actually required for anything; at best it's an extra convenience; repeatedly reading until EOF will work just as well. Furthermore, the API change has the potential to break some middleware in Python 2.X, so I'll take the safe road and not make the change.

The purpose behind #4 is essentially included in #6, and so is not needed as a separate requirement.

#5 and #6 are uncontroversial and of no impact to an already-correct implementation. They will be included.

I'll send a diff of the actual wording changes once I've written it.

James |
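
The remark that repeatedly reading until EOF works as well as an argument-less read() can be sketched as below. This assumes requirement #2 holds, that is, the stream returns an empty string once the input is exhausted. The name `read_to_eof` is hypothetical and `io.BytesIO` merely stands in for `wsgi.input`.

```python
import io


def read_to_eof(stream, chunk_size=8192):
    """Collect the whole request body using only read(size) calls.

    Relies on the stream returning an empty bytestring at end of input,
    which is what requirement #2 would guarantee.
    """
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty string marks EOF
            break
        parts.append(chunk)
    return b"".join(parts)


fake_input = io.BytesIO(b"x" * 20000)  # stand-in for environ['wsgi.input']
data = read_to_eof(fake_input)
```

On an adapter that does not honour #2, a loop like this could block at the end of the body, which is one reason #2 and #3 are discussed together.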
|
In reply to this post by James Y Knight
James Y Knight ha scritto:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1]
> [...]
>
> [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X

Hi. Just a few questions.

It is true that HTTP headers can be encoded assuming latin-1, and they can be encoded using PEP 383. However, what about URIs (that is, PATH_INFO and the like)?

For URIs (if I remember correctly) the suggested encoding is UTF-8, so URLs should be decoded using

    url.decode('utf-8', 'surrogateescape')

Is this correct?

Now another question. Let's consider the `wsgiref.util.application_uri` function:

    def application_uri(environ):
        url = environ['wsgi.url_scheme']+'://'
        from urllib.parse import quote

        if environ.get('HTTP_HOST'):
            url += environ['HTTP_HOST']
        else:
            url += environ['SERVER_NAME']

            if environ['wsgi.url_scheme'] == 'https':
                if environ['SERVER_PORT'] != '443':
                    url += ':' + environ['SERVER_PORT']
            else:
                if environ['SERVER_PORT'] != '80':
                    url += ':' + environ['SERVER_PORT']

        url += quote(environ.get('SCRIPT_NAME') or '/')
        return url

There is a potential problem here with the quote function. This function does the following:

    def quote(string, safe='/', encoding=None, errors=None):
        if isinstance(string, str):
            if encoding is None:
                encoding = 'utf-8'
            if errors is None:
                errors = 'strict'
            string = string.encode(encoding, errors)

This means that if we use surrogateescape, the information about the original bytes is lost here. This can be easily fixed by changing the application_uri function, but it also means that a WSGI application will not work with Python 3.1.x.

Finally, a question about cookies. Cookie data SHOULD be transparent to the server/gateway; however, WSGI is going to assume that the data is encoded in latin-1. I don't know what the HTTP/Cookie spec says about this.

However, from a WSGI application's point of view, the cookie data can, as an example, contain some text encoded in UTF-8. This means that the application must first encode the data:

    cookie_bytes = cookie.encode('latin-1', 'surrogateescape')

and then decode it using UTF-8:

    my_cookie_data = cookie_bytes.decode('utf-8')

This is a bit unreasonable, but I don't know if this is a common practice (I do this, to give one example).

Manlio Perillo |
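
The interaction between quote and surrogateescape described above can be reproduced with a short sketch. It assumes a current Python 3 `urllib.parse.quote`, which accepts `encoding` and `errors` arguments as well as bytes input; the sample path is made up for illustration.

```python
from urllib.parse import quote

raw = b"/caf\xe9"  # ISO-8859-1 bytes, not valid UTF-8
as_text = raw.decode("utf-8", "surrogateescape")  # byte 0xE9 becomes a lone surrogate

# With the defaults (utf-8, strict) the surrogate cannot be re-encoded.
try:
    quote(as_text)
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Passing the same error handler back in recovers the original byte.
quoted_text = quote(as_text, errors="surrogateescape")

# Quoting the bytes directly sidesteps the question of encodings.
quoted_bytes = quote(raw)
```

Both working variants produce the same percent-encoded result, so either would be a plausible change to application_uri; which one a specification should require is exactly what the thread is debating.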
|
Manlio Perillo wrote:
> However what about URI (that is, for PATH_INFO and the like)?
> For URI (if I remember correctly) the suggested encoding is UTF-8, so
> URLS should be decoded using
> url.decode('utf-8', 'surrogateescape')
> Is this correct?

The currently-discussed proposal is ISO-8859-1, allowing the real bytes to be trivially extracted. This is consistent with the other headers and would be my preferred approach.

Python 3.1's wsgiref.simple_server, on the other hand, blindly uses urllib.parse.unquote, which defaults to UTF-8 without surrogateescape, mangling any non-UTF-8 input.

I don't really care whether UTF-8+surrogateescape or ISO-8859-1 encoding is blessed. But *something* needs to be blessed. An encoding, an alternative undecoded path_info, both, something else... just *something*.

> Let's consider the `wsgiref.util.application_uri` function
> There is a potential problem, here, with the quote function.

Yes. wsgiref is broken in Python 3.1. Not quite as broken as it was in 3.0, but still broken. Until we can come to a Pronouncement on what WSGI *is* in Python 3, it is meaningless anyway.

> Cookie data SHOULD be transparent to the server/gateway; however WSGI is
> going to assume that data is encoded in latin-1.

Yeah. This is no big deal because non-ASCII characters in cookies are already broken everywhere(*). Given this and other limitations on what characters can go in cookies, they are habitually encoded using ad-hoc mechanisms handled by the application (typically a round of URL-encoding).

*: in particular:

- Opera and Chrome send non-ASCII cookie characters in UTF-8.
- IE encodes using the system codepage (which can never be UTF-8), mangling any characters that don't fit in the codepage through the traditional Windows 'similar replacement character' scheme.
- Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1 gets through but everything else is mangled).
- Safari refuses to send any cookie containing non-ASCII characters.

> I don't know what the HTTP/Cookie spec says about this.

The traditional interpretation of RFC2616 is that headers are ISO-8859-1.

You will notice that no browser correctly follows this.

...sigh.

--
And Clover
mailto:[hidden email]
http://www.doxdesk.com/ |
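
The difference between the two decoding strategies for PATH_INFO can be illustrated with `urllib.parse.unquote`, whose defaults in current Python 3 are UTF-8 with `errors='replace'`. This is only a sketch with made-up paths; it does not reproduce what wsgiref.simple_server 3.1 did line for line.

```python
from urllib.parse import unquote, unquote_to_bytes

# A request path whose percent-escapes are ISO-8859-1, not UTF-8.
legacy_quoted = "/caf%E9"
wire_bytes = unquote_to_bytes(legacy_quoted)

# Default unquote: UTF-8 with errors='replace', so the byte is lost.
mangled = unquote(legacy_quoted)

# ISO-8859-1 view: every byte maps to one code point and back again.
latin1_view = unquote(legacy_quoted, encoding="iso-8859-1")
recovered = latin1_view.encode("iso-8859-1")

# A UTF-8 path seen through the ISO-8859-1 view looks odd as text, but the
# application can still get the intended string by re-encoding.
utf8_quoted = "/caf%C3%A9"
environ_value = unquote(utf8_quoted, encoding="iso-8859-1")
intended = environ_value.encode("iso-8859-1").decode("utf-8")
```

The cost of the ISO-8859-1 approach is visible in `environ_value`: the string is not human-readable for UTF-8 paths until the application re-decodes it, which is the trade-off the thread accepts in exchange for losslessness.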
|
And Clover ha scritto:
> [...]
>> Cookie data SHOULD be transparent to the server/gateway; however WSGI is
>> going to assume that data is encoded in latin-1.
>
> Yeah. This is no big deal because non-ASCII characters in cookies are
> already broken everywhere(*). Given this and other limitations on what
> characters can go in cookies, they are habitually encoded using ad-hoc
> mechanisms handled by the application (typically a round of URL-encoding).
>
> *: in particular:
>
> - Opera and Chrome send non-ASCII cookie characters in UTF-8.
> - IE encodes using the system codepage (which can never be UTF-8),
> mangling any characters that don't fit in the codepage through the
> traditional Windows 'similar replacement character' scheme.
> - Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1
> gets through but everything else is mangled)
> - Safari refuses to send any cookie containing non-ASCII characters.
>

Thanks for this summary. I think it should go in a wiki or in a separate document (like a rationale) accompanying the WSGI spec.

However, this should never happen with cookies, since cookie data is opaque to the browser, which MUST send it "as is". What you describe happens with other headers containing TEXT.

And now I understand the strange behaviour of Firefox with non-latin-1 strings in the username in HTTP Basic Authentication.

> [...]

Regards
Manlio |
|
In reply to this post by and-py
On Dec 3, 2009, at 1:35 PM, And Clover wrote:
> Manlio Perillo wrote:
>
>> However what about URI (that is, for PATH_INFO and the like)?
>> For URI (if I remember correctly) the suggested encoding is UTF-8, so
>> URLS should be decoded using
>
>> url.decode('utf-8', 'surrogateescape')
>
>> Is this correct?
>
> The currently-discussed proposal is ISO-8859-1, allowing the real bytes to be trivially extracted. This is consistent with the other headers and would be my preferred approach.

Right, for WSGI 1.1 on Python 3.x, 8859-1 strings is the plan. Other, more ideologically pure options can be discussed for an incompatible revision of WSGI (e.g. the hypothetical 2.0).

BTW: I hope to have a first draft of the changes by Monday. (But don't beat up on me if it's delayed; I am working on it.)

James |
|
In reply to this post by and-py
On Thu, Dec 03, 2009 at 07:35:14PM +0100, And Clover wrote:
> >I don't know what the HTTP/Cookie spec says about this.
>
> The traditional interpretation of RFC2616 is that headers are ISO-8859-1.
>
> You will notice that no browser correctly follows this.

RFC 2109 and RFC 2965 say that a cookie's value can be anything:

> The VALUE is opaque to the user agent and may be anything the origin
> server chooses to send, possibly in a server-selected printable ASCII
> encoding.

Theoretically you could put something like 'foo\n\0bar' in a cookie.

Also, a cookie can include comments, which have to be encoded using ... UTF-8:

> Comment=value
> OPTIONAL. Because cookies can be used to derive or store
> private information about a user, the value of the Comment
> attribute allows an origin server to document how it intends to
> use the cookie. The user can inspect the information to decide
> whether to initiate or continue a session with this cookie.
> Characters in value MUST be in UTF-8 encoding.

--
Henry Prêcheur |
|
In reply to this post by and-py
And Clover ha scritto:
> Manlio Perillo wrote:
>
>> However what about URI (that is, for PATH_INFO and the like)?
>> For URI (if I remember correctly) the suggested encoding is UTF-8, so
>> URLS should be decoded using
>
>> url.decode('utf-8', 'surrogateescape')
>
>> Is this correct?
>
> The currently-discussed proposal is ISO-8859-1, allowing the real bytes
> to be trivially extracted. This is consistent with the other headers and
> would be my preferred approach.
>

There is something that I don't understand.

Some HTTP headers, like Accept-Language, contain data described as `token`, where:

    token = 1*<any CHAR except CTLs or separators>

So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded. In Python 3.x it SHOULD be a byte string.

Text content is described as `TEXT`, where:

    The TEXT rule is only used for descriptive field contents and values
    that are not intended to be interpreted by the message parser. Words
    of *TEXT MAY contain characters from character sets other than
    ISO-8859-1 [22] only when encoded according to the rules of
    RFC 2047 [14].

    TEXT = <any OCTET except CTLs, but including LWS>

The only type of data where TEXT can be used is `quoted-string`. A `quoted-string` only appears in well-specified portions of a header.

So, IMHO, it is *not* correct for WSGI middleware to return all HTTP headers as Unicode strings. This is up to the application/framework, which must parse each header, split it into components, and handle each as appropriate (as a byte string, a Unicode string, or an instance of some other data type).

> [...]

Regards
Manlio |
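
To make the `token` rule quoted above concrete, here is a minimal sketch of a check a framework might apply after splitting a header into components. The function name `is_token` is hypothetical; the separator set is the one listed in RFC 2616 section 2.2.

```python
# RFC 2616 separators: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
SEPARATORS = set('()<>@,;:\\"/[]?={} \t')


def is_token(value):
    """Return True if `value` matches 1*<any CHAR except CTLs or separators>."""
    if not value:
        return False  # the rule requires at least one character
    for ch in value:
        code = ord(ch)
        if code > 127:  # CHAR is limited to US-ASCII
            return False
        if code < 32 or code == 127:  # CTLs
            return False
        if ch in SEPARATORS:
            return False
    return True


checks = {
    "en-US": is_token("en-US"),
    "text/html": is_token("text/html"),  # contains the separator '/'
    "": is_token(""),
}
```

Because a token is pure ASCII, it has the same value whether it is held as a byte string or as an ISO-8859-1 decoded string, so a check like this works on either representation once the input is a str.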
|
On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
> There is something that I don't understand.
>
> Some HTTP headers, like Accept-Language, contains data described as
> `token`, where:
>
> token = 1*<any CHAR except CTLs or separators>
>
> So a token, IMHO, is an opaque string, and it SHOULD not decoded.
> In Python 3.x it SHOULD be a byte string.

I think this is more an issue that frameworks should deal with.

By decoding every header value as latin-1:

* It keeps WSGI simple. Simple is good.

* WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1) says. WSGI is about HTTP, but that doesn't necessarily include all other standards extending HTTP.

* It's possible to convert latin-1 strings to bytes without losing data.

--
Henry Prêcheur |
