PEP 444 Goals

Timothy Farrell-2
Hello web-sig.  My name is Timothy Farrell.  I'm the developer of the Rocket web server.  I understand that most of you are more experienced and passionate than I am, but I've come here because I want to see certain things standardized.  I'm pretty new to this forum but I've read through all the recent discussions on PEP 444.  That being said, I'll try to take a humble approach.

It seems to me that the spec that Alice is working on could be something great but the problems are not well defined (in the PEP).  This causes confusion about what the goals are.  There's some disagreement about whether or not certain features should be in PEP 444.  I think those people have a different idea for what PEP 444 ought to be.  The first thing that should be done is clearly defining the shortcomings with PEP 3333 that PEP 444 seeks to address and limit our PEP 444 discussions to solving those problems.

Since Alice is rewriting the PEP perhaps we should all sit back for a second until we have a PEP to work off of.  That will help the discussion be a little more focused.

Sorry if I've stepped on anyone's toes.

-tim
_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: PEP 444 Goals

Alice Bevan–McGregor
> It seems to me that the spec that Alice is working on could be
> something great but the problems are not well defined (in the PEP).  
> This causes confusion about what the goals are.

For completeness' sake, here's a slightly simplified Abstract:

:: A proposed second-generation standard interface between web servers
and Python 2.6+ and 3.1+ applications.

The rationale for even having such an interface is outlined in PEP 333.

Ignoring async for the moment, the goals of the PEP 444 rewrite are:

:: Clear separation of "narrative" from "rules to be followed".  This
allows developers of both servers and applications to easily run
through a conformance "check list".

:: Isolation of examples and rationale to improve readability of the
core rulesets.

:: Clarification of often mis-interpreted rules from PEP 333 (and those
carried over in 3333).

:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.

:: Massive simplification of call flow.  Replacing start_response with
a returned 3-tuple immensely simplifies the task of middleware that
needs to capture HTTP status or manipulate (or even examine) response
headers. [1]

:: Reduction of re-implementation / NIH syndrome by incorporating the
most common (1%) of features most often relegated to middleware or
functional helpers.  Unicode decoding of a small handful of values (CGI
values that pull from the request URI) is the biggest example. [2, 3]

:: Cross-compatibility considerations.  The definition and use of
native strings vs. byte strings is the biggest example of this in the
rewrite.

:: Making optional (and thus rarely-implemented) features non-optional.
 E.g. server support for HTTP/1.1 with clarifications for interfacing
applications to 1.1 servers.  Thus pipelining, chunked encoding, et
al. as per the HTTP 1.1 RFC.

There are likely others I can't think of at the moment.  ;)  If I
remember anything else as I wake up more fully (caffeine zombie, here)
I'll post an additional reply.
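
A minimal sketch of the call-flow goal listed above, contrasting middleware written against `start_response` with middleware written against a returned 3-tuple. This is an illustration only, not text from the PEP draft: the function names and the `X-Seen-Status` header are hypothetical, and native strings are used for status and headers purely to keep the example short.

```python
# PEP 333/3333 style: status and headers travel through a callback.
def wsgi1_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def wsgi1_add_header(app):
    """Middleware has to wrap start_response just to see the status."""
    def wrapper(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [("X-Seen-Status", status)]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapper


# Returned 3-tuple style (assumed shape: status, headers, body iterable).
def tuple_app(environ):
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


def tuple_add_header(app):
    """Middleware simply unpacks the return value and passes it on."""
    def wrapper(environ):
        status, headers, body = app(environ)
        return status, list(headers) + [("X-Seen-Status", status)], body
    return wrapper
```

In the second form the middleware needs no nested callback and no knowledge of when the application chooses to commit its headers.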

Footnotes:

[1] This also happens to be a very Pythonic solution.

[2] This does not mean WSGI 2 will attempt to "compete" with
frameworks; merely reduce the multiplication of effort for the common
denominator.

[3] Filters are covered under re-implementation.

> Since Alice is rewriting the PEP perhaps we should all sit back for a
> second until we have a PEP to work off of.  That will help the
> discussion be a little more focused.

I'll have a direct translation of my current rewritten draft into ReST
for incorporation on the Python.org website within a few hours.
Unfortunately, in the short term, it still doesn't include a high-level
goal overview, though it will incorporate the consensus (thus far) on
removing the ability to return unicode response data.

> Sorry if I've stepped on anyone's toes.

No worries; you do raise a very valid point.

        - Alice.



Re: PEP 444 Goals

James Y Knight
On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
> :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC.

Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI).

The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server.

James

Re: PEP 444 Goals

Alice Bevan–McGregor
On 2011-01-06 13:06:36 -0800, James Y Knight said:

> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>> :: Making optional (and thus rarely-implemented) features non-optional.
>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et.
>> al. as per the HTTP 1.1 RFC.
>
> Requirements on the HTTP compliance of the server don't really have any
> place in the WSGI spec. You should be able to be WSGI compliant even if
> you don't use the HTTP transport at all (e.g. maybe you just send
> around requests via SCGI).
> The original spec got this right: chunking etc are something which is
> not relevant to the wsgi application code -- it is up to the server to
> implement the HTTP transport according to the HTTP spec, if it's
> purporting to be an HTTP server.

Chunking is actually quite relevant to the specification, as WSGI and
PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;)
allow for chunked bodies regardless of higher-level support for
chunking, via the body iterator.  Previously you /had/ to define a
length; with chunked encoding at the server level, you don't.

I agree, however, that not all gateways will be able to implement the
relevant HTTP/1.1 features.  FastCGI does; SCGI, after a quick Google
search, seems to support it as well.  I should re-word it as:

"For those servers capable of HTTP/1.1 features the implementation of
such features is required."

+1

        - Alice.



Re: PEP 444 Goals

James Y Knight

On Jan 6, 2011, at 4:56 PM, Alice Bevan–McGregor wrote:

> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>
>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>> :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC.
>> Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI).
>> The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server.
>
> Chunking is actually quite relevant to the specification, as WSGI and PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for chunked bodies regardless of higher-level support for chunking.  The body iterator.  Previously you /had/ to define a length, with chunked encoding at the server level, you don't.

No you don't -- HTTP 1.0 allows indeterminate-length output. The server simply must close the connection to indicate the end of the response if either the client version is HTTP/1.0, or the server doesn't implement HTTP/1.1.
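
A rough sketch of the server-side decision described above, assuming a hypothetical helper `choose_framing` (not part of any WSGI server's API). It only illustrates the rule: an explicit length wins, chunked encoding is available when both sides speak HTTP/1.1, and otherwise the end of the response is signalled by closing the connection.

```python
def choose_framing(request_version, response_headers, server_supports_http11=True):
    """Return how the server would delimit the response body.

    request_version is the client's protocol string, e.g. "HTTP/1.0".
    response_headers is a list of (name, value) pairs from the application.
    """
    names = {name.lower() for name, _ in response_headers}
    if "content-length" in names:
        # The application declared a length; the connection can be reused.
        return "content-length"
    if request_version == "HTTP/1.1" and server_supports_http11:
        # Unknown length, but both ends understand chunked transfer coding.
        return "chunked"
    # Unknown length and no chunking available: close to mark the end.
    return "close"
```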

James

Re: PEP 444 Goals

Alice Bevan–McGregor
On 2011-01-06 14:01:09 -0800, James Y Knight said:

> No you don't -- HTTP 1.0 allows indeterminate-length output. The server
> simply must close the connection to indicate the end of the response if
> either the client version HTTP/1.0, or the server doesn't implement
> HTTP/1.1.

Ah, you are correct.  There was something, somewhere I was reading
related to WSGI about requiring content-length... but no matter.

Interestingly enough, HTTP/1.0 also supports persistent connections
(though obviously not if content-length is missing) via the `Connection:
keep-alive` header.  HTTP/1.1 mandates keep-alive by default (which is
a good thing, IMHO) and offers a work-around for missing content-length
to preserve the connection: chunked encoding.  Add to that 100-Continue
(allowing delayed /transfer/ of the request body until the first
wsgi.input.read() operation) and support for requesting proper, full
URLs, amongst other goodies.

Arguing against mandated HTTP/1.1 support (where possible) seems...
silly to me.  HTTP/1.1 has been around for a long time (adopted by the
major browsers in 1996), is well understood, is /trivial/ to implement
(I managed it as part of my 172 Python opcode HTTP server
implementation), and Just Makes Sense.

If there can be a good technical reason why the adapted language ("if
possible, it's required") can not be used, I'll definitely re-consider
this point.  Considering that detection is easy (SERVER_PROTOCOL ==
"HTTP/1.0"), adaptation by the application to either case is easy (detect
and, if not present, consume the body_iter and determine length) and it's
a 15 year old standard: welcome to the 20th century.  ;)
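
A minimal sketch of the "detect and adapt" step described above, assuming a returned (status, headers, body) shape and a hypothetical helper name `ensure_length`. It buffers the body only when the server reports HTTP/1.0 and the application has not already supplied a length.

```python
def ensure_length(environ, status, headers, body_iter):
    """Add Content-Length for HTTP/1.0 servers by consuming the body."""
    if environ.get("SERVER_PROTOCOL") != "HTTP/1.0":
        # An HTTP/1.1 server can fall back to chunked encoding on its own.
        return status, headers, body_iter
    if any(name.lower() == "content-length" for name, _ in headers):
        return status, headers, body_iter
    # Consume the iterator so the total length is known up front.
    body = b"".join(body_iter)
    headers = list(headers) + [("Content-Length", str(len(body)))]
    return status, headers, [body]
```

The trade-off is that streaming is lost for HTTP/1.0 clients, since the whole body must be held in memory before the first byte is sent.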

        - Alice.



Re: PEP 444 Goals

Graham Dumpleton-2
In reply to this post by Alice Bevan–McGregor
On 7 January 2011 08:56, Alice Bevan–McGregor <[hidden email]> wrote:

> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>
>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>
>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as
>>> per the HTTP 1.1 RFC.
>>
>> Requirements on the HTTP compliance of the server don't really have any
>> place in the WSGI spec. You should be able to be WSGI compliant even if you
>> don't use the HTTP transport at all (e.g. maybe you just send around
>> requests via SCGI).
>> The original spec got this right: chunking etc are something which is not
>> relevant to the wsgi application code -- it is up to the server to implement
>> the HTTP transport according to the HTTP spec, if it's purporting to be an
>> HTTP server.
>
> Chunking is actually quite relevant to the specification, as WSGI and PEP
> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
> chunked bodies regardless of higher-level support for chunking.  The body
> iterator.  Previously you /had/ to define a length, with chunked encoding at
> the server level, you don't.
>
> I agree, however, that not all gateways will be able to implement the
> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google search,
> seems to support it as well. I should re-word it as:
>
> "For those servers capable of HTTP/1.1 features the implementation of such
> features is required."

I would question whether FASTCGI, SCGI or AJP support the concept of
chunking of responses to the extent that the application can prepare
the final content including chunks as required by the HTTP
specification. Further, in Apache at least, the output from a web
application served via those protocols is still pushed through the
Apache output filter chain so as to allow the filters to modify the
response, e.g. to apply compression using mod_deflate. As a consequence,
the standard HTTP 'CHUNK' output filter is still a part of the output
filter stack. This means that were a web application to try and do
chunking itself, then Apache would rechunk such that the original
chunking became part of the content, rather than the transfer
encoding.
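
To make the rechunking problem concrete, here is a small sketch (not Apache code; `chunk_encode` and `chunk_decode` are hypothetical helpers that ignore chunk extensions and trailers). If the application applies chunked framing itself and the server then applies it again, a client that decodes the transfer encoding once ends up with the application's framing inside the content.

```python
def chunk_encode(pieces):
    """Apply HTTP/1.1 chunked framing to an iterable of byte strings."""
    out = b""
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    return out + b"0\r\n\r\n"


def chunk_decode(data):
    """Undo one layer of chunked framing (no extensions or trailers)."""
    body = b""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        if size == 0:
            return body
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk


app_output = chunk_encode([b"hello"])      # application chunks by itself
on_the_wire = chunk_encode([app_output])   # server chunks again
client_sees = chunk_decode(on_the_wire)    # client removes one layer only
```

Here `client_sees` equals `app_output`, framing bytes included, rather than the intended `b"hello"`.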

So, in order to be able to achieve what I think you want, with a web
application being able to do chunking itself, you would need to modify
the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
also likely mod_cgi and mod_cgid of Apache.

The only WSGI implementation I know of for Apache where you might even
be able to do what you want is uWSGI. This is because I believe from
memory it uses a mode in Apache by default called assbackwards. What
this allows is for the output from the web application to bypass the
Apache output filter stack and directly control the raw HTTP output.
This gives uWSGI a little bit less overhead in Apache, but at the loss
of the ability to actually use Apache output filters and for Apache to
fix up response headers in any way. There is a flag in uWSGI which can
optionally be set to make it use the more traditional mode and not use
assbackwards.

Thus, I believe you would be fighting against server implementations
such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
chunking to be supported at the level of the web application.

About all you can do is ensure that the WSGI specification doesn't
include anything in it which would prevent a web application
harnessing indirectly such a feature as chunking where the web server
supports it.

As it is, chunked responses aren't even the problem, because if an
underlying web server supports chunking for responses, all you need to
do is not set the content length.

The problem area with chunking is the request content as the way that
the WSGI specification is written prevents being able to have chunked
request content. I have described the issue previously and made
suggestions about an alternate way that wsgi.input could be used.

Graham


Re: PEP 444 Goals

Graham Dumpleton-2
One other comment about HTTP/1.1 features.

You will always be battling to have some HTTP/1.1 features work in a
controllable way. This is because WSGI gateways/adapters aren't often
directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
AJP, CGI etc. In this sort of situation you are at the mercy of what
the modules implementing those protocols do, or even are hamstrung by
how those protocols work.

The classic example is 100-continue processing. This simply cannot
work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
mechanisms where proxying is performed as the protocol being used
doesn't implement a notion of end to end signalling in respect of
100-continue.

The current WSGI specification acknowledges that by saying:

"""
Servers and gateways that implement HTTP 1.1 must provide transparent
support for HTTP 1.1's "expect/continue" mechanism. This may be done
in any of several ways:

* Respond to requests containing an Expect: 100-continue request with
an immediate "100 Continue" response, and proceed normally.
* Proceed with the request normally, but provide the application with
a wsgi.input stream that will send the "100 Continue" response if/when
the application first attempts to read from the input stream. The read
request must then remain blocked until the client responds.
* Wait until the client decides that the server does not support
expect/continue, and sends the request body on its own. (This is
suboptimal, and is not recommended.)
"""

If you are going to try and push for full visibility of HTTP/1.1 and
an ability to control it at the application level then you will fail
with 100-continue to start with.

So, although option 2 above would be the most ideal and gives the
application control, specifically the ability to send an error
response based on request headers alone, without reading the request
content and triggering the 100-continue, it isn't practical to require
it, as the majority of hosting mechanisms for WSGI wouldn't even be able
to implement it that way.
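
For reference, option 2 could be sketched roughly as below on a server that does talk raw HTTP. Everything here is hypothetical (the `LazyContinueInput` wrapper, the `send_continue` callback, the 1024-byte limit); it only shows why the approach depends on the gateway controlling the client connection directly.

```python
import io


class LazyContinueInput:
    """Hypothetical wsgi.input wrapper: sends "100 Continue" on first read."""

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue  # writes the interim response
        self._sent = False

    def _ensure_continue(self):
        if not self._sent:
            self._sent = True
            self._send_continue()

    def read(self, size=-1):
        self._ensure_continue()
        return self._stream.read(size)

    def readline(self, size=-1):
        self._ensure_continue()
        return self._stream.readline(size)


def reject_large_uploads(environ, start_response):
    # Decide from the headers alone; wsgi.input is untouched on this path,
    # so the client is never told to send the body.
    if int(environ.get("CONTENT_LENGTH") or 0) > 1024:
        start_response("413 Request Entity Too Large", [("Content-Length", "0")])
        return [b""]
    data = environ["wsgi.input"].read()
    start_response("200 OK", [("Content-Length", str(len(data)))])
    return [data]


def demo(content_length, payload=b"hello"):
    """Run the app once and report (status, interim responses sent)."""
    sent = []
    environ = {
        "CONTENT_LENGTH": str(content_length),
        "wsgi.input": LazyContinueInput(
            io.BytesIO(payload), lambda: sent.append("100 Continue")
        ),
    }
    statuses = []
    reject_large_uploads(environ, lambda status, headers: statuses.append(status))
    return statuses[0], sent
```

Behind FASTCGI, SCGI, AJP or CGI there is nothing for `send_continue` to write to, which is the limitation described above.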

The same goes for any other feature: there is no point mandating a
feature that can only realistically be implemented on a minority of
implementations. This would be even worse where dependence on such a
feature would mean that the WSGI application would no longer be
portable to another WSGI server, destroying the notion that WSGI
provides a portable interface.

This isn't just restricted to HTTP/1.1 features either, but also
applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
that are directly hooked into the URL parsing of the base HTTP server
can provide that information, which basically means that only pure
Python HTTP/WSGI servers are likely able to provide it without
guessing, and in that case such servers are usually used with the
WSGI application mounted at the root anyway.

Graham


Re: PEP 444 Goals

Alex Grönholm-3
On 07.01.2011 01:14, Graham Dumpleton wrote:
> One other comment about HTTP/1.1 features.
>
> You will always be battling to have some HTTP/1.1 features work in a
> controllable way. This is because WSGI gateways/adapters aren't often
> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
> AJP, CGI etc. In this sort of situation you are at the mercy of what
> the modules implementing those protocols do, or even are hamstrung by
> how those protocols work.
>
> The classic example is 100-continue processing. This simply cannot
> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
> mechanisms where proxying is performed as the protocol being used
> doesn't implement a notion of end to end signalling in respect of
> 100-continue.

I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1.
My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly:

- PlasmaDS (Flex Messaging implementation)
- WebDAV

The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it.

The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102):

The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed.

Again, I don't care how this is done as long as it's possible.
The current WSGI specification acknowledges that by saying:

"""
Servers and gateways that implement HTTP 1.1 must provide transparent
support for HTTP 1.1's "expect/continue" mechanism. This may be done
in any of several ways:

* Respond to requests containing an Expect: 100-continue request with
an immediate "100 Continue" response, and proceed normally.
* Proceed with the request normally, but provide the application with
a wsgi.input stream that will send the "100 Continue" response if/when
the application first attempts to read from the input stream. The read
request must then remain blocked until the client responds.
* Wait until the client decides that the server does not support
expect/continue, and sends the request body on its own. (This is
suboptimal, and is not recommended.)
"""

If you are going to try and push for full visibility of HTTP/1.1 and
an ability to control it at the application level then you will fail
with 100-continue to start with.

So, although option 2 above would be the most ideal and is giving the
application control, specifically the ability to send an error
response based on request headers alone, and with reading the response
and triggering the 100-continue, it isn't practical to require it, as
the majority of hosting mechanisms for WSGI wouldn't even be able to
implement it that way.

The same goes for any other feature, there is no point mandating a
feature that can only be realistically implementing on a minority of
implementations. This would be even worse where dependence on such a
feature would mean that the WSGI application would no longer be
portable to another WSGI server and destroys the notion that WSGI
provides a portable interface.

This isn't just restricted to HTTP/1.1 features either, but also
applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
that are directly hooked into the URL parsing of the base HTTP server
can provide that information, which basically means that only pure
Python HTTP/WSGI servers are likely able to provide it without
guessing, and in that case such servers usually are always used where
WSGI application mounted at root anyway.

Graham

On 7 January 2011 09:29, Graham Dumpleton [hidden email] wrote:
On 7 January 2011 08:56, Alice Bevan–McGregor [hidden email] wrote:
On 2011-01-06 13:06:36 -0800, James Y Knight said:

On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
:: Making optional (and thus rarely-implemented) features non-optional.
E.g. server support for HTTP/1.1 with clarifications for interfacing
applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as
per the HTTP 1.1 RFC.
Requirements on the HTTP compliance of the server don't really have any
place in the WSGI spec. You should be able to be WSGI compliant even if you
don't use the HTTP transport at all (e.g. maybe you just send around
requests via SCGI).
The original spec got this right: chunking etc are something which is not
relevant to the wsgi application code -- it is up to the server to implement
the HTTP transport according to the HTTP spec, if it's purporting to be an
HTTP server.
Chunking is actually quite relevant to the specification, as WSGI and PEP
444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
chunked bodies regardless of higher-level support for chunking.  The body
iterator.  Previously you /had/ to define a length, with chunked encoding at
the server level, you don't.

I agree, however, that not all gateways will be able to implement the
relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google search,
seems to support it as well. I should re-word it as:

"For those servers capable of HTTP/1.1 features the implementation of such
features is required."
I would question whether FASTCGI, SCGI or AJP support the concept of
chunking of responses to the extent that the application can prepare
the final content including chunks as required by the HTTP
specification. Further, in Apache at least, the output from a web
application served via those protocols is still pushed through the
Apache output filter chain so as to allow the filters to modify the
response, eg., apply compression using mod_deflate. As a consequence,
the standard HTTP 'CHUNK' output filter is still a part of the output
filter stack. This means that were a web application to try and do
chunking itself, then Apache would rechunk such that the original
chunking became part of the content, rather than the transfer
encoding.

So, in order to be able to achieve what I think you want, with a web
application being able to do chunking itself, you would need to modify
the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
also like mod_cgi and mod_cgid of Apache.

The only WSGI implementation I know of for Apache where you might even
be able to do what you want is uWSGI. This is because I believe from
memory it uses a mode in Apache by default called assbackwords. What
this allows is for the output from the web application to bypass the
Apache output filter stack and directly control the raw HTTP output.
This gives uWSGI a little bit less overhead in Apache, but at the loss
of the ability to actually use Apache output filters and for Apache to
fix up response headers in any way. There is a flag in uWSGI which can
optionally be set to make it use the more traditional mode and not use
assbackwords.

Thus, I believe you would be fighting against server implementations
such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
chunking to be supported at the level of the web application.

About all you can do is ensure that the WSGI specification doesn't
include anything in it which would prevent a web application
harnessing indirectly such a feature as chunking where the web server
supports it.

As it is, it isn't chunked responses which is even the problem,
because if a underlying web server supports chunking for responses,
all you need to do is not set the content length.

The problem area with chunking is the request content, as the way that
the WSGI specification is written prevents chunked request content. I
have described the issue previously and made suggestions about
alternate ways that wsgi.input could be used.
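
The difficulty can be sketched as follows. The first helper follows the WSGI 1.0 convention of reading no more than CONTENT_LENGTH bytes. The second reads until the stream returns an empty block, which is only an assumption here about what a relaxed rule might look like; the actual suggestions referred to above are not reproduced in this message. Both helper names are hypothetical.

```python
import io


def read_body_by_length(environ):
    """WSGI 1.0 style: trust CONTENT_LENGTH and never read past it."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


def read_body_until_eof(environ, block_size=8192):
    """Hypothetical relaxed style: read until the stream reports end of input."""
    stream = environ["wsgi.input"]
    parts = []
    while True:
        block = stream.read(block_size)
        if not block:  # assumes the server signals end of body with an empty read
            break
        parts.append(block)
    return b"".join(parts)


# A chunked request reaches the application without CONTENT_LENGTH set.
payload = b"streamed request data"
by_length = read_body_by_length({"wsgi.input": io.BytesIO(payload)})
until_eof = read_body_until_eof({"wsgi.input": io.BytesIO(payload)})
```

With no CONTENT_LENGTH, the first helper sees an empty body even though data is available, while the second only works if the server guarantees an end-of-input signal.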

Graham

+1

       - Alice.


Re: PEP 444 Goals

Graham Dumpleton-2
2011/1/7 Alex Grönholm <[hidden email]>:

> 07.01.2011 01:14, Graham Dumpleton wrote:
>
> One other comment about HTTP/1.1 features.
>
> You will always be battling to have some HTTP/1.1 features work in a
> controllable way. This is because WSGI gateways/adapters aren't often
> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
> AJP, CGI etc. In this sort of situation you are at the mercy of what
> the modules implementing those protocols do, or even are hamstrung by
> how those protocols work.
>
> The classic example is 100-continue processing. This simply cannot
> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
> mechanisms where proxying is performed as the protocol being used
> doesn't implement a notion of end to end signalling in respect of
> 100-continue.
>
> I think we need some concrete examples to figure out what is and isn't
> possible with WSGI 1.0.1.
> My motivation for participating in this discussion can be summed up in that
> I want the following two applications to work properly:
>
> - PlasmaDS (Flex Messaging implementation)
> - WebDAV
>
> The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS.
> Interoperability with the existing implementation requires that both the
> request and response use chunked transfer encoding, to achieve bidirectional
> streaming. I don't really care how this happens, I just want to make sure
> that there is nothing preventing it.

That can only be done by changing the rules around how wsgi.input is
used. I'll try and find a reference to where I have posted information
about this before, otherwise I'll write something up again about it.

> The WebDAV spec, on the other hand, says
> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the
> client that the server has accepted the complete request, but has not yet
> completed it. This status code SHOULD only be sent when the server has a
> reasonable expectation that the request will take significant time to
> complete. As guidance, if a method is taking longer than 20 seconds (a
> reasonable, but arbitrary value) to process the server SHOULD return a 102
> (Processing) response. The server MUST send a final response after the
> request has been completed.

I don't offhand see a way of doing that, as protocols like SCGI and CGI
definitely don't allow interim status responses. I suspect that FASTCGI
and AJP don't allow them either.

I'll even have to do some digging to work out how you would handle that
in Apache with a normal Apache handler.
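
For reference, this is roughly what an interim response looks like on the wire when the server speaks raw HTTP/1.1: a complete status block for 102, followed later by the final response on the same connection. The sketch below only builds the bytes; `status_block` is a hypothetical helper, and nothing here shows how a WSGI application could ask a server to send them.

```python
def status_block(code, reason, headers=()):
    """Build a status line plus headers, terminated by an empty line."""
    lines = ["HTTP/1.1 %d %s" % (code, reason)]
    lines += ["%s: %s" % (name, value) for name, value in headers]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


body = b"done\n"
wire = (
    status_block(102, "Processing")  # interim response, no body
    + status_block(200, "OK", [("Content-Length", str(len(body)))])
    + body                           # final response
)
```

A gateway protocol that carries only a single status per request has no place to put the first block, which is the limitation described above.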

Graham

Re: PEP 444 Goals

Graham Dumpleton-2
2011/1/7 Graham Dumpleton <[hidden email]>:

> That can only be done by changing the rules around how wsgi.input is used.
> I'll try and find a reference to where I have posted information about
> this before, otherwise I'll write something up again about it.

BTW, even if the WSGI specification were changed to allow handling of
chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
mod_wsgi daemon mode. It is not likely to work on uWSGI either.

This is because all of these work on the expectation that the complete
request body can be written across to the separate application process
before actually reading the response from the application.

In other words, both way streaming is not possible.
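
To make the requirement concrete, here is a sketch of an application that would need two-way streaming: it produces each response block before it has read the rest of the request. It assumes wsgi.input returns an empty block at end of input, which goes beyond what WSGI 1.0 promises. `LoggingInput` and `echo_app` are hypothetical names, and the loop at the end stands in for a server that can interleave reads and writes.

```python
import io

events = []


class LoggingInput(io.BytesIO):
    """Request body stream that records every read for inspection."""

    def read(self, size=-1):
        data = super().read(size)
        events.append(("read", data))
        return data


def echo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    stream = environ["wsgi.input"]

    def respond():
        while True:
            block = stream.read(4)
            if not block:  # assumed end-of-input signal
                return
            yield block.upper()  # answer before reading any further input

    return respond()


environ = {"wsgi.input": LoggingInput(b"abcdefgh")}
for block in echo_app(environ, lambda status, headers: None):
    events.append(("write", block))
```

Reads and writes alternate in `events`. A hosting mechanism that forwards the whole request body before reading any response would instead perform all the reads first, so the client could not receive anything until it had finished sending.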

The only solution which would allow this with Apache is mod_wsgi
embedded mode. mod_wsgi 3.X already has an optional feature which can
be enabled to let you step outside the current bounds of the WSGI
specification and use wsgi.input, as I will explain, to do this two-way
streaming.

Pure Python HTTP/WSGI servers which are front-facing servers could
also be modified to handle this if the WSGI specification were changed,
but whether they would still work when put behind a web proxy will
depend on how the front end web proxy works.

Graham

Re: PEP 444 Goals

Alex Grönholm-3
07.01.2011 04:09, Graham Dumpleton wrote:

> 2011/1/7 Graham Dumpleton<[hidden email]>:
> BTW, even if the WSGI specification were changed to allow handling of
> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
> mod_wsgi daemon mode. It is not likely to work on uWSGI either.
>
> This is because all of these work on the expectation that the complete
> request body can be written across to the separate application process
> before actually reading the response from the application.
>
> In other words, both way streaming is not possible.
>
> The only solution which would allow this with Apache is mod_wsgi
> embedded mode, which in mod_wsgi 3.X already has an optional feature
> which can be enabled so as to allow you to step out of current bounds
> of the WSGI specification and use wsgi.input as I will explain, to do
> this both way streaming.
>
> Pure Python HTTP/WSGI servers which are front-facing servers could
> also be modified to handle this if the WSGI specification were changed,
> but whether they would still work when put behind a web proxy will
> depend on how the front end web proxy works.
Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?

> Graham
>
>>> The WebDAV spec, on the other hand, says
>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>
>>> The 102 (Processing) status code is an interim response used to inform the
>>> client that the server has accepted the complete request, but has not yet
>>> completed it. This status code SHOULD only be sent when the server has a
>>> reasonable expectation that the request will take significant time to
>>> complete. As guidance, if a method is taking longer than 20 seconds (a
>>> reasonable, but arbitrary value) to process the server SHOULD return a 102
>>> (Processing) response. The server MUST send a final response after the
>>> request has been completed.
>> That I don't offhand see a way of being able to do as protocols like
>> SCGI and CGI definitely don't allow interim status. I am suspecting
>> that FASTCGI and AJP don't allow it either.
>>
>> I'll have to even do some digging as to how you would even handle that
>> in Apache with a normal Apache handler.
>>
>> Graham
>>
>>> Again, I don't care how this is done as long as it's possible.
>>>
>>> The current WSGI specification acknowledges that by saying:
>>>
>>> """
>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>> in any of several ways:
>>>
>>> * Respond to requests containing an Expect: 100-continue request with
>>> an immediate "100 Continue" response, and proceed normally.
>>> * Proceed with the request normally, but provide the application with
>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>> the application first attempts to read from the input stream. The read
>>> request must then remain blocked until the client responds.
>>> * Wait until the client decides that the server does not support
>>> expect/continue, and sends the request body on its own. (This is
>>> suboptimal, and is not recommended.)
>>> """
>>>
>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>> an ability to control it at the application level then you will fail
>>> with 100-continue to start with.
>>>
>>> So, although option 2 above would be the most ideal and is giving the
>>> application control, specifically the ability to send an error
>>> response based on request headers alone, and with reading the response
>>> and triggering the 100-continue, it isn't practical to require it, as
>>> the majority of hosting mechanisms for WSGI wouldn't even be able to
>>> implement it that way.
>>>
>>> The same goes for any other feature, there is no point mandating a
>>> feature that can only be realistically implementing on a minority of
>>> implementations. This would be even worse where dependence on such a
>>> feature would mean that the WSGI application would no longer be
>>> portable to another WSGI server and destroys the notion that WSGI
>>> provides a portable interface.
>>>
>>> This isn't just restricted to HTTP/1.1 features either, but also
>>> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
>>> that are directly hooked into the URL parsing of the base HTTP server
>>> can provide that information, which basically means that only pure
>>> Python HTTP/WSGI servers are likely able to provide it without
>>> guessing, and in that case such servers usually are always used where
>>> WSGI application mounted at root anyway.
>>>
>>> Graham
>>>
>>> On 7 January 2011 09:29, Graham Dumpleton<[hidden email]>
>>> wrote:
>>>
>>> On 7 January 2011 08:56, Alice Bevan–McGregor<[hidden email]>  wrote:
>>>
>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>
>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>
>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as
>>> per the HTTP 1.1 RFC.
>>>
>>> Requirements on the HTTP compliance of the server don't really have any
>>> place in the WSGI spec. You should be able to be WSGI compliant even if you
>>> don't use the HTTP transport at all (e.g. maybe you just send around
>>> requests via SCGI).
>>> The original spec got this right: chunking etc are something which is not
>>> relevant to the wsgi application code -- it is up to the server to implement
>>> the HTTP transport according to the HTTP spec, if it's purporting to be an
>>> HTTP server.
>>>
>>> Chunking is actually quite relevant to the specification, as WSGI and PEP
>>> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
>>> chunked bodies regardless of higher-level support for chunking.  The body
>>> iterator.  Previously you /had/ to define a length, with chunked encoding at
>>> the server level, you don't.
>>>
>>> I agree, however, that not all gateways will be able to implement the
>>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google search,
>>> seems to support it as well. I should re-word it as:
>>>
>>> "For those servers capable of HTTP/1.1 features the implementation of such
>>> features is required."
>>>
>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>> chunking of responses to the extent that the application can prepare
>>> the final content including chunks as required by the HTTP
>>> specification. Further, in Apache at least, the output from a web
>>> application served via those protocols is still pushed through the
>>> Apache output filter chain so as to allow the filters to modify the
>>> response, eg., apply compression using mod_deflate. As a consequence,
>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>> filter stack. This means that were a web application to try and do
>>> chunking itself, then Apache would rechunk such that the original
>>> chunking became part of the content, rather than the transfer
>>> encoding.
>>>
>>> So, in order to be able to achieve what I think you want, with a web
>>> application being able to do chunking itself, you would need to modify
>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>> also likely mod_cgi and mod_cgid of Apache.
>>>
>>> The only WSGI implementation I know of for Apache where you might even
>>> be able to do what you want is uWSGI. This is because I believe from
>>> memory it uses a mode in Apache by default called assbackwards. What
>>> this allows is for the output from the web application to bypass the
>>> Apache output filter stack and directly control the raw HTTP output.
>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>> of the ability to actually use Apache output filters and for Apache to
>>> fix up response headers in any way. There is a flag in uWSGI which can
>>> optionally be set to make it use the more traditional mode and not use
>>> assbackwards.
>>>
>>> Thus, I believe you would be fighting against server implementations
>>> such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
>>> chunking to be supported at the level of the web application.
>>>
>>> About all you can do is ensure that the WSGI specification doesn't
>>> include anything in it which would prevent a web application
>>> harnessing indirectly such a feature as chunking where the web server
>>> supports it.
>>>
>>> As it is, it isn't chunked responses which is even the problem,
>>> because if an underlying web server supports chunking for responses,
>>> all you need to do is not set the content length.
>>>
>>> The problem area with chunking is the request content as the way that
>>> the WSGI specification is written prevents being able to have chunked
>>> request content. I have described the issue previously and made
>>> suggestions about alternate ways that wsgi.input could be used.
>>>
>>> Graham
>>>
>>> +1
>>>
>>>         - Alice.
>>


Re: PEP 444 Goals

Graham Dumpleton-2
2011/1/7 Alex Grönholm <[hidden email]>:

> 07.01.2011 04:09, Graham Dumpleton wrote:
>>
>> 2011/1/7 Graham Dumpleton<[hidden email]>:
>>>
>>> 2011/1/7 Alex Grönholm<[hidden email]>:
>>>>
>>>> 07.01.2011 01:14, Graham Dumpleton wrote:
>>>>
>>>> One other comment about HTTP/1.1 features.
>>>>
>>>> You will always be battling to have some HTTP/1.1 features work in a
>>>> controllable way. This is because WSGI gateways/adapters aren't often
>>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>>> the modules implementing those protocols do, or even are hamstrung by
>>>> how those protocols work.
>>>>
>>>> The classic example is 100-continue processing. This simply cannot
>>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>>> mechanisms where proxying is performed as the protocol being used
>>>> doesn't implement a notion of end to end signalling in respect of
>>>> 100-continue.
>>>>
>>>> I think we need some concrete examples to figure out what is and isn't
>>>> possible with WSGI 1.0.1.
>>>> My motivation for participating in this discussion can be summed up in
>>>> that
>>>> I want the following two applications to work properly:
>>>>
>>>> - PlasmaDS (Flex Messaging implementation)
>>>> - WebDAV
>>>>
>>>> The PlasmaDS project is the planned Python counterpart to Adobe's
>>>> BlazeDS.
>>>> Interoperability with the existing implementation requires that both the
>>>> request and response use chunked transfer encoding, to achieve
>>>> bidirectional
>>>> streaming. I don't really care how this happens, I just want to make
>>>> sure
>>>> that there is nothing preventing it.
>>>
>>> That can only be done by changing the rules around how wsgi.input is used.
>>> I'll try and find a reference to where I have posted information about
>>> this before, otherwise I'll write something up again about it.
>>
>> BTW, even if WSGI specification were changed to allow handling of
>> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
>> mod_wsgi daemon mode. Also not likely to work on uWSGI either.
>>
>> This is because all of these work on the expectation that the complete
>> request body can be written across to the separate application process
>> before actually reading the response from the application.
>>
>> In other words, both way streaming is not possible.
>>
>> The only solution which would allow this with Apache is mod_wsgi
>> embedded mode, which in mod_wsgi 3.X already has an optional feature
>> which can be enabled so as to allow you to step out of current bounds
>> of the WSGI specification and use wsgi.input as I will explain, to do
>> this both way streaming.
>>
>> Pure Python HTTP/WSGI servers which are a front facing server could
>> also be modified to handle this if the WSGI specification were changed,
>> but whether those same will work if put behind a web proxy will depend
>> on how the front end web proxy works.
>
> Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?

Huh! Not sure you understand what I am saying. Even if you changed the
WSGI specification to allow for it, the bulk of implementations
wouldn't be able to support it. The WSGI specification has no
influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or
proxy implementations and so can't be used to force them to be changed.

So, as much as I would like to see WSGI specification changed to allow
it, others may not on the basis that there is no point if few
implementations could support it.
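
What a relaxed wsgi.input rule would mean for application code can be sketched as follows. This is only an illustration, not the mod_wsgi feature or any proposed specification text: the helper name `read_streamed_body` is hypothetical, and the sketch assumes the server strips the chunked framing itself and signals the end of the request body by returning an empty byte string from `read()`, which WSGI 1.0 does not guarantee.

```python
import io


def read_streamed_body(environ, block_size=8192):
    """Yield request body blocks until the input stream reports end of input.

    Assumption: the server de-chunks the request and returns b"" from
    read() once the body is exhausted. WSGI 1.0 instead tells applications
    not to read past CONTENT_LENGTH, so this steps outside that rule.
    """
    stream = environ["wsgi.input"]
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


# Stand-in for a server that has already removed the chunked framing.
environ = {"wsgi.input": io.BytesIO(b"first part, second part")}
body = b"".join(read_streamed_body(environ, block_size=8))
```

Note that no CONTENT_LENGTH is consulted at all; the end of the stream is the only termination signal, which is exactly what the hosting mechanisms listed above cannot promise.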

Graham

>> Graham
>>
>>>> The WebDAV spec, on the other hand, says
>>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>>
>>>> The 102 (Processing) status code is an interim response used to inform
>>>> the
>>>> client that the server has accepted the complete request, but has not
>>>> yet
>>>> completed it. This status code SHOULD only be sent when the server has a
>>>> reasonable expectation that the request will take significant time to
>>>> complete. As guidance, if a method is taking longer than 20 seconds (a
>>>> reasonable, but arbitrary value) to process the server SHOULD return a
>>>> 102
>>>> (Processing) response. The server MUST send a final response after the
>>>> request has been completed.
>>>
>>> That I don't offhand see a way of being able to do as protocols like
>>> SCGI and CGI definitely don't allow interim status. I am suspecting
>>> that FASTCGI and AJP don't allow it either.
>>>
>>> I'll have to even do some digging as to how you would even handle that
>>> in Apache with a normal Apache handler.
>>>
>>> Graham
>>>
>>>> Again, I don't care how this is done as long as it's possible.
>>>>
>>>> The current WSGI specification acknowledges that by saying:
>>>>
>>>> """
>>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>>> in any of several ways:
>>>>
>>>> * Respond to requests containing an Expect: 100-continue request with
>>>> an immediate "100 Continue" response, and proceed normally.
>>>> * Proceed with the request normally, but provide the application with
>>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>>> the application first attempts to read from the input stream. The read
>>>> request must then remain blocked until the client responds.
>>>> * Wait until the client decides that the server does not support
>>>> expect/continue, and sends the request body on its own. (This is
>>>> suboptimal, and is not recommended.)
>>>> """
>>>>
>>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>>> an ability to control it at the application level then you will fail
>>>> with 100-continue to start with.
>>>>
>>>> So, although option 2 above would be the most ideal and is giving the
>>>> application control, specifically the ability to send an error
>>>> response based on request headers alone, and with reading the response
>>>> and triggering the 100-continue, it isn't practical to require it, as
>>>> the majority of hosting mechanisms for WSGI wouldn't even be able to
>>>> implement it that way.
>>>>
>>>> The same goes for any other feature, there is no point mandating a
>>>> feature that can only be realistically implemented on a minority of
>>>> implementations. This would be even worse where dependence on such a
>>>> feature would mean that the WSGI application would no longer be
>>>> portable to another WSGI server and destroys the notion that WSGI
>>>> provides a portable interface.
>>>>
>>>> This isn't just restricted to HTTP/1.1 features either, but also
>>>> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
>>>> that are directly hooked into the URL parsing of the base HTTP server
>>>> can provide that information, which basically means that only pure
>>>> Python HTTP/WSGI servers are likely able to provide it without
>>>> guessing, and in that case such servers usually are always used where
>>>> WSGI application mounted at root anyway.
>>>>
>>>> Graham
>>>>
>>>> On 7 January 2011 09:29, Graham Dumpleton<[hidden email]>
>>>> wrote:
>>>>
>>>> On 7 January 2011 08:56, Alice Bevan–McGregor<[hidden email]>
>>>>  wrote:
>>>>
>>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>>
>>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>>
>>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et al.
>>>> as
>>>> per the HTTP 1.1 RFC.
>>>>
>>>> Requirements on the HTTP compliance of the server don't really have any
>>>> place in the WSGI spec. You should be able to be WSGI compliant even if
>>>> you
>>>> don't use the HTTP transport at all (e.g. maybe you just send around
>>>> requests via SCGI).
>>>> The original spec got this right: chunking etc are something which is
>>>> not
>>>> relevant to the wsgi application code -- it is up to the server to
>>>> implement
>>>> the HTTP transport according to the HTTP spec, if it's purporting to be
>>>> an
>>>> HTTP server.
>>>>
>>>> Chunking is actually quite relevant to the specification, as WSGI and
>>>> PEP
>>>> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow
>>>> for
>>>> chunked bodies regardless of higher-level support for chunking.  The
>>>> body
>>>> iterator.  Previously you /had/ to define a length, with chunked
>>>> encoding at
>>>> the server level, you don't.
>>>>
>>>> I agree, however, that not all gateways will be able to implement the
>>>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google
>>>> search,
>>>> seems to support it as well. I should re-word it as:
>>>>
>>>> "For those servers capable of HTTP/1.1 features the implementation of
>>>> such
>>>> features is required."
>>>>
>>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>>> chunking of responses to the extent that the application can prepare
>>>> the final content including chunks as required by the HTTP
>>>> specification. Further, in Apache at least, the output from a web
>>>> application served via those protocols is still pushed through the
>>>> Apache output filter chain so as to allow the filters to modify the
>>>> response, eg., apply compression using mod_deflate. As a consequence,
>>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>>> filter stack. This means that were a web application to try and do
>>>> chunking itself, then Apache would rechunk such that the original
>>>> chunking became part of the content, rather than the transfer
>>>> encoding.
>>>>
>>>> So, in order to be able to achieve what I think you want, with a web
>>>> application being able to do chunking itself, you would need to modify
>>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>>> also likely mod_cgi and mod_cgid of Apache.
>>>>
>>>> The only WSGI implementation I know of for Apache where you might even
>>>> be able to do what you want is uWSGI. This is because I believe from
>>>> memory it uses a mode in Apache by default called assbackwards. What
>>>> this allows is for the output from the web application to bypass the
>>>> Apache output filter stack and directly control the raw HTTP output.
>>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>>> of the ability to actually use Apache output filters and for Apache to
>>>> fix up response headers in any way. There is a flag in uWSGI which can
>>>> optionally be set to make it use the more traditional mode and not use
>>>> assbackwards.
>>>>
>>>> Thus, I believe you would be fighting against server implementations
>>>> such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
>>>> chunking to be supported at the level of the web application.
>>>>
>>>> About all you can do is ensure that the WSGI specification doesn't
>>>> include anything in it which would prevent a web application
>>>> harnessing indirectly such a feature as chunking where the web server
>>>> supports it.
>>>>
>>>> As it is, it isn't chunked responses which is even the problem,
>>>> because if an underlying web server supports chunking for responses,
>>>> all you need to do is not set the content length.
>>>>
>>>> The problem area with chunking is the request content as the way that
>>>> the WSGI specification is written prevents being able to have chunked
>>>> request content. I have described the issue previously and made
>>>> suggestions about alternate ways that wsgi.input could be used.
>>>>
>>>> Graham
>>>>
>>>> +1
>>>>
>>>>        - Alice.
>>>
>

Re: PEP 444 Goals

James Y Knight
In reply to this post by Alex Grönholm-3

On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote:

The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102):

The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed.

Again, I don't care how this is done as long as it's possible.

This pretty much has to be generated by the server implementation. One thing that could be done in WSGI is a callback function inserted into the environ to suggest to the server that it generate a certain 1xx response. That is, something like:
  if 'wsgi.intermediate_response' in environ:
    environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'})

If a server implements this, it should probably ignore any requests from the app to send a 100 or 101 response. The server should be free to ignore the request, or not implement it. Given that the only actual use case (WebDAV) is rather rare and marks it as a SHOULD, I don't see any real practical issues with it being optional.
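
The server side of such a callback might look roughly like the sketch below. Everything here beyond the `wsgi.intermediate_response` key and its `(status, headers)` call shape is an assumption made for illustration: the factory name `make_intermediate_response`, the boolean return value, and the choice to drop 100 and 101 silently are not part of any specification.

```python
import io
from http import HTTPStatus


def make_intermediate_response(wfile):
    """Build a hypothetical 'wsgi.intermediate_response' callable.

    wfile is the binary stream the server writes the HTTP response to.
    """
    def intermediate_response(status, headers):
        # 100 and 101 stay under server control, and anything outside
        # 1xx is not an interim response; such requests are ignored.
        if status in (100, 101) or not 100 <= status < 200:
            return False
        try:
            reason = HTTPStatus(status).phrase
        except ValueError:
            reason = "Unknown"
        lines = ["HTTP/1.1 %d %s" % (status, reason)]
        lines.extend("%s: %s" % item for item in headers.items())
        wfile.write(("\r\n".join(lines) + "\r\n\r\n").encode("latin-1"))
        return True
    return intermediate_response


# Application side, exactly as in the snippet above.
wire = io.BytesIO()
environ = {"wsgi.intermediate_response": make_intermediate_response(wire)}
if "wsgi.intermediate_response" in environ:
    environ["wsgi.intermediate_response"](102, {"Random-Header": "Whatever"})
```

Because the application checks for the key before calling it, a server that does not offer the callback costs the application nothing, which is what keeps the feature optional.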

The other thing that could be done is simply have a server-side configuration to allow sending 102 after *any* request takes > 20 seconds to process. That wouldn't require any changes to WSGI.

I'd note that HTTP/1.1 clients are *required* to be able to handle any number of 1xx responses followed by a final response, so it's supposed to be perfectly safe for a server to always send a 102 as a response to any request, no matter what the app is, or what client user-agent is (so long as it advertised HTTP/1.1), or even whether the resource has anything to do with WebDAV. Of course, I'm willing to bet that's patently false back here in the Real World -- no doubt plenty of "HTTP/1.1" clients incorrectly barf on 1xx responses.
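
What "handling any number of 1xx responses" amounts to on the client side can be sketched in a few lines. This is a deliberately simplified illustration with a hypothetical helper name, `final_response`; it assumes the whole response is already in memory and ignores 101, after which the connection switches to another protocol.

```python
def final_response(raw):
    """Return (status, head, rest) for the first non-1xx response in raw.

    Interim (1xx) responses carry no body, so the next response head
    follows immediately after the blank line that ends each of them.
    """
    while True:
        head, sep, rest = raw.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete response head")
        status_line = head.split(b"\r\n", 1)[0]
        status = int(status_line.split(b" ", 2)[1])
        if status >= 200:
            return status, head, rest
        raw = rest  # skip the interim response and look at the next head


raw = (
    b"HTTP/1.1 102 Processing\r\n\r\n"
    b"HTTP/1.1 102 Processing\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
)
status, head, rest = final_response(raw)
```

A client that instead treats the first status line it sees as the final one is the kind of implementation that would break on an unsolicited 102.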

James


Re: PEP 444 Goals

Graham Dumpleton-2
2011/1/7 James Y Knight <[hidden email]>:

>
> On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote:
>
> The WebDAV spec, on the other hand, says
> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the
> client that the server has accepted the complete request, but has not yet
> completed it. This status code SHOULD only be sent when the server has a
> reasonable expectation that the request will take significant time to
> complete. As guidance, if a method is taking longer than 20 seconds (a
> reasonable, but arbitrary value) to process the server SHOULD return a 102
> (Processing) response. The server MUST send a final response after the
> request has been completed.
>
> Again, I don't care how this is done as long as it's possible.
>
> This pretty much has to be generated by the server implementation. One thing
> that could be done in WSGI is a callback function inserted into the environ
> to suggest to the server that it generate a certain 1xx response. That is,
> something like:
>   if 'wsgi.intermediate_response' in environ:
>     environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'})
>
> If a server implements this, it should probably ignore any requests from the
> app to send a 100 or 101 response. The server should be free to ignore the
> request, or not implement it. Given that the only actual use case (WebDAV)
> is rather rare and marks it as a SHOULD, I don't see any real practical
> issues with it being optional.
> The other thing that could be done is simply have a server-side
> configuration to allow sending 102 after *any* request takes > 20 seconds to
> process. That wouldn't require any changes to WSGI.
> I'd note that HTTP/1.1 clients are *required* to be able to handle any
> number of 1xx responses followed by a final response, so it's supposed to be
> perfectly safe for a server to always send a 102 as a response to any
> request, no matter what the app is, or what client user-agent is (so long as
> it advertised HTTP/1.1), or even whether the resource has anything to do
> with WebDAV. Of course, I'm willing to bet that's patently false back here
> in the Real World -- no doubt plenty of "HTTP/1.1" clients incorrectly barf
> on 1xx responses.

FWIW, Apache provides ap_send_interim_response() to allow interim status.

This is used by mod_proxy, but nowhere else in Apache core code. So,
you would be fine if proxying to a pure Python HTTP/WSGI server which
could generate interim responses, but would be out of luck with
FASTCGI, SCGI, AJP, CGI and any modules which do custom proxying using
their own protocol such as uWSGI or mod_wsgi daemon mode.

In all the latter, the wire protocols for proxy connection would
themselves need to be modified as well as module implementation, which
isn't going to happen for any of those which are generic protocols.

Graham

Re: PEP 444 Goals

Alex Grönholm-3
In reply to this post by Graham Dumpleton-2
07.01.2011 04:55, Graham Dumpleton wrote:

> 2011/1/7 Alex Grönholm<[hidden email]>:
>> 07.01.2011 04:09, Graham Dumpleton wrote:
>>> 2011/1/7 Graham Dumpleton<[hidden email]>:
>>>> 2011/1/7 Alex Grönholm<[hidden email]>:
>>>>> 07.01.2011 01:14, Graham Dumpleton wrote:
>>>>>
>>>>> One other comment about HTTP/1.1 features.
>>>>>
>>>>> You will always be battling to have some HTTP/1.1 features work in a
>>>>> controllable way. This is because WSGI gateways/adapters aren't often
>>>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>>>> the modules implementing those protocols do, or even are hamstrung by
>>>>> how those protocols work.
>>>>>
>>>>> The classic example is 100-continue processing. This simply cannot
>>>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>>>> mechanisms where proxying is performed as the protocol being used
>>>>> doesn't implement a notion of end to end signalling in respect of
>>>>> 100-continue.
>>>>>
>>>>> I think we need some concrete examples to figure out what is and isn't
>>>>> possible with WSGI 1.0.1.
>>>>> My motivation for participating in this discussion can be summed up in
>>>>> that
>>>>> I want the following two applications to work properly:
>>>>>
>>>>> - PlasmaDS (Flex Messaging implementation)
>>>>> - WebDAV
>>>>>
>>>>> The PlasmaDS project is the planned Python counterpart to Adobe's
>>>>> BlazeDS.
>>>>> Interoperability with the existing implementation requires that both the
>>>>> request and response use chunked transfer encoding, to achieve
>>>>> bidirectional
>>>>> streaming. I don't really care how this happens, I just want to make
>>>>> sure
>>>>> that there is nothing preventing it.
>>>> That can only be done by changing the rules around how wsgi.input is used.
>>>> I'll try and find a reference to where I have posted information about
>>>> this before, otherwise I'll write something up again about it.
>>> BTW, even if WSGI specification were changed to allow handling of
>>> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
>>> mod_wsgi daemon mode. Also not likely to work on uWSGI either.
>>>
>>> This is because all of these work on the expectation that the complete
>>> request body can be written across to the separate application process
>>> before actually reading the response from the application.
>>>
>>> In other words, both way streaming is not possible.
>>>
>>> The only solution which would allow this with Apache is mod_wsgi
>>> embedded mode, which in mod_wsgi 3.X already has an optional feature
>>> which can be enabled so as to allow you to step out of current bounds
>>> of the WSGI specification and use wsgi.input as I will explain, to do
>>> this both way streaming.
>>>
>>> Pure Python HTTP/WSGI servers which are a front facing server could
>>> also be modified to handle this if the WSGI specification were changed,
>>> but whether those same will work if put behind a web proxy will depend
>>> on how the front end web proxy works.
>> Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?
> Huh! Not sure you understand what I am saying. Even if you changed the
> WSGI specification to allow for it, the bulk of implementations
> wouldn't be able to support it. The WSGI specification has no
> influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or
> proxy implementations and so can't be used to force them to be changed.
I believe I understand what you are saying, but I don't want to restrict
the freedom of the developer just because of some implementations that
can't support some particular feature. If you need to do streaming, use
a server that supports it, obviously! If Java can do it, why can't we? I
would hate having to rely on a non-standard implementation if we have
the possibility to standardize this in a specification.
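
For the response half of streaming, a plain WSGI 1.0 application already needs nothing special: as noted earlier in the thread, it can leave out Content-Length and yield the body piece by piece, leaving any chunked framing to the server. A minimal sketch, in which the application, its output and the stand-in `start_response` are all made up for illustration:

```python
def app(environ, start_response):
    # No Content-Length header: an HTTP/1.1 server is then free to apply
    # chunked transfer encoding to the response on its own.
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        for n in range(3):
            yield ("event %d\n" % n).encode("ascii")

    return body()


# Minimal stand-in for a server, just enough to drive the application.
captured = {}


def start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


chunks = list(app({}, start_response))
```

The part that has no equivalent today is the request half, where the application would have to read a body of unknown length.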

> So, as much as I would like to see WSGI specification changed to allow
> it, others may not on the basis that there is no point if few
> implementations could support it.
>
> Graham
>
>>> Graham
>>>
>>>>> The WebDAV spec, on the other hand, says
>>>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>>>
>>>>> The 102 (Processing) status code is an interim response used to inform
>>>>> the
>>>>> client that the server has accepted the complete request, but has not
>>>>> yet
>>>>> completed it. This status code SHOULD only be sent when the server has a
>>>>> reasonable expectation that the request will take significant time to
>>>>> complete. As guidance, if a method is taking longer than 20 seconds (a
>>>>> reasonable, but arbitrary value) to process the server SHOULD return a
>>>>> 102
>>>>> (Processing) response. The server MUST send a final response after the
>>>>> request has been completed.
>>>> That I don't offhand see a way of being able to do as protocols like
>>>> SCGI and CGI definitely don't allow interim status. I am suspecting
>>>> that FASTCGI and AJP don't allow it either.
>>>>
>>>> I'll have to even do some digging as to how you would even handle that
>>>> in Apache with a normal Apache handler.
>>>>
>>>> Graham
>>>>
>>>>> Again, I don't care how this is done as long as it's possible.
>>>>>
>>>>> The current WSGI specification acknowledges that by saying:
>>>>>
>>>>> """
>>>>> Servers and gateways that implement HTTP 1.1 must provide transparent
>>>>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>>>>> in any of several ways:
>>>>>
>>>>> * Respond to requests containing an Expect: 100-continue request with
>>>>> an immediate "100 Continue" response, and proceed normally.
>>>>> * Proceed with the request normally, but provide the application with
>>>>> a wsgi.input stream that will send the "100 Continue" response if/when
>>>>> the application first attempts to read from the input stream. The read
>>>>> request must then remain blocked until the client responds.
>>>>> * Wait until the client decides that the server does not support
>>>>> expect/continue, and sends the request body on its own. (This is
>>>>> suboptimal, and is not recommended.)
>>>>> """
>>>>>
>>>>> If you are going to try and push for full visibility of HTTP/1.1 and
>>>>> an ability to control it at the application level then you will fail
>>>>> with 100-continue to start with.
>>>>>
>>>>> So, although option 2 above would be the most ideal and is giving the
>>>>> application control, specifically the ability to send an error
>>>>> response based on request headers alone, and with reading the response
>>>>> and triggering the 100-continue, it isn't practical to require it, as
>>>>> the majority of hosting mechanisms for WSGI wouldn't even be able to
>>>>> implement it that way.
>>>>>
>>>>> The same goes for any other feature, there is no point mandating a
>>>>> feature that can only be realistically implemented on a minority of
>>>>> implementations. This would be even worse where dependence on such a
>>>>> feature would mean that the WSGI application would no longer be
>>>>> portable to another WSGI server and destroys the notion that WSGI
>>>>> provides a portable interface.
>>>>>
>>>>> This isn't just restricted to HTTP/1.1 features either, but also
>>>>> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
>>>>> that are directly hooked into the URL parsing of the base HTTP server
>>>>> can provide that information, which basically means that only pure
>>>>> Python HTTP/WSGI servers are likely able to provide it without
>>>>> guessing, and in that case such servers usually are always used where
>>>>> WSGI application mounted at root anyway.
>>>>>
>>>>> Graham
>>>>>
>>>>> On 7 January 2011 09:29, Graham Dumpleton<[hidden email]>
>>>>> wrote:
>>>>>
>>>>> On 7 January 2011 08:56, Alice Bevan–McGregor<[hidden email]>
>>>>>   wrote:
>>>>>
>>>>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>>>>
>>>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>>>
>>>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et al.
>>>>> as
>>>>> per the HTTP 1.1 RFC.
>>>>>
>>>>> Requirements on the HTTP compliance of the server don't really have any
>>>>> place in the WSGI spec. You should be able to be WSGI compliant even if
>>>>> you
>>>>> don't use the HTTP transport at all (e.g. maybe you just send around
>>>>> requests via SCGI).
>>>>> The original spec got this right: chunking etc are something which is
>>>>> not
>>>>> relevant to the wsgi application code -- it is up to the server to
>>>>> implement
>>>>> the HTTP transport according to the HTTP spec, if it's purporting to be
>>>>> an
>>>>> HTTP server.
>>>>>
>>>>> Chunking is actually quite relevant to the specification, as WSGI and
>>>>> PEP
>>>>> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow
>>>>> for
>>>>> chunked bodies regardless of higher-level support for chunking.  The
>>>>> body
>>>>> iterator.  Previously you /had/ to define a length, with chunked
>>>>> encoding at
>>>>> the server level, you don't.
>>>>>
>>>>> I agree, however, that not all gateways will be able to implement the
>>>>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google
>>>>> search,
>>>>> seems to support it as well. I should re-word it as:
>>>>>
>>>>> "For those servers capable of HTTP/1.1 features the implementation of
>>>>> such
>>>>> features is required."
>>>>>
>>>>> I would question whether FASTCGI, SCGI or AJP support the concept of
>>>>> chunking of responses to the extent that the application can prepare
>>>>> the final content including chunks as required by the HTTP
>>>>> specification. Further, in Apache at least, the output from a web
>>>>> application served via those protocols is still pushed through the
>>>>> Apache output filter chain so as to allow the filters to modify the
>>>>> response, eg., apply compression using mod_deflate. As a consequence,
>>>>> the standard HTTP 'CHUNK' output filter is still a part of the output
>>>>> filter stack. This means that were a web application to try and do
>>>>> chunking itself, then Apache would rechunk such that the original
>>>>> chunking became part of the content, rather than the transfer
>>>>> encoding.
>>>>>
>>>>> So, in order to be able to achieve what I think you want, with a web
>>>>> application being able to do chunking itself, you would need to modify
>>>>> the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
>>>>> also likely mod_cgi and mod_cgid of Apache.
>>>>>
>>>>> The only WSGI implementation I know of for Apache where you might even
>>>>> be able to do what you want is uWSGI. This is because I believe from
>>>>> memory it uses a mode in Apache by default called assbackwards. What
>>>>> this allows is for the output from the web application to bypass the
>>>>> Apache output filter stack and directly control the raw HTTP output.
>>>>> This gives uWSGI a little bit less overhead in Apache, but at the loss
>>>>> of the ability to actually use Apache output filters and for Apache to
>>>>> fix up response headers in any way. There is a flag in uWSGI which can
>>>>> optionally be set to make it use the more traditional mode and not use
>>>>> assbackwards.
>>>>>
>>>>> Thus, I believe you would be fighting against server implementations
>>>>> such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
>>>>> chunking to be supported at the level of the web application.
>>>>>
>>>>> About all you can do is ensure that the WSGI specification doesn't
>>>>> include anything in it which would prevent a web application
>>>>> harnessing indirectly such a feature as chunking where the web server
>>>>> supports it.
>>>>>
>>>>> As it is, it isn't chunked responses which is even the problem,
>>>>> because if a underlying web server supports chunking for responses,
>>>>> all you need to do is not set the content length.
>>>>>
>>>>> The problem area with chunking is the request content as the way that
>>>>> the WSGI specification is written prevents being able to have chunked
>>>>> request content. I have described the issue previously and made
>>>>> suggestions about an alternate way that wsgi.input could be used.
>>>>>
>>>>> Graham
>>>>>
>>>>> +1
>>>>>
>>>>>         - Alice.
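
Graham's point about responses can be sketched with a plain PEP 3333 application. This is only an illustration, not code from the thread: the application leaves Content-Length out and lets the server choose the framing (chunked on HTTP/1.1 where the server supports it, otherwise closing the connection). `call_app` is a hypothetical stand-in for a server, used here just to capture what the application emits.

```python
def app(environ, start_response):
    """PEP 3333 application that streams a body of unknown length."""
    # No Content-Length header is set: the server is left to decide how to
    # frame the body (chunked where supported, otherwise connection close).
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        yield ("part %d\n" % i).encode("ascii")


def call_app(application):
    """Hypothetical helper: drive the application like a trivial server."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    # Joining the iterable runs the generator, which calls start_response.
    body = b"".join(application({"REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], captured["headers"], body
```

The application never mentions chunking at all, which is the point: framing stays a server concern.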


Re: PEP 444 Goals

PJ Eby
In reply to this post by Timothy Farrell-2
At 12:52 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:

>Ignoring async for the moment, the goals of the PEP 444 rewrite are:
>
>:: Clear separation of "narrative" from "rules to be
>followed".  This allows developers of both servers and applications
>to easily run through a conformance "check list".
>
>:: Isolation of examples and rationale to improve readability of the
>core rulesets.
>
>:: Clarification of often mis-interpreted rules from PEP 333 (and
>those carried over in 3333).
>
>:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.
>
>:: Massive simplification of call flow.  Replacing start_response
>with a returned 3-tuple immensely simplifies the task of middleware
>that needs to capture HTTP status or manipulate (or even examine)
>response headers. [1]

A big +1 to all the above as goals.
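
To make the call-flow goal concrete, here is a rough sketch of the returned-3-tuple style. The exact shape is an assumption for illustration (native strings for status and headers, a list of byte strings for the body); the rewritten PEP may specify different types. `add_header_middleware` is a hypothetical name.

```python
def app(environ):
    """Application in the returned-3-tuple style (assumed shape)."""
    body = b"Hello, world!\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    return "200 OK", headers, [body]


def add_header_middleware(wrapped, name, value):
    """Hypothetical middleware: append one header to every response."""
    def middleware(environ):
        # Status and headers are ordinary return values, so there is no
        # start_response callable to intercept or wrap.
        status, headers, body = wrapped(environ)
        return status, headers + [(name, value)], body
    return middleware
```

Under PEP 333 the same middleware would have to wrap start_response to see the headers at all; here it is three lines.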


>:: Reduction of re-implementation / NIH syndrome by incorporating
>the most common (1%) of features most often relegated to middleware
>or functional helpers.

Note that nearly every application-friendly feature you add will
increase the burden on both server developers and middleware
developers, which ironically means that application developers
actually end up with fewer options.


>   Unicode decoding of a small handful of values (CGI values that
> pull from the request URI) is the biggest example. [2, 3]

Does that mean you plan to make the other values bytes, then?  Or
will they be unicode-y-bytes as well?  What happens for additional
server-provided variables?

The PEP 3333 choice was for uniformity.  At one point, I advocated
simply using surrogateescape coding, but this couldn't be made
uniform across Python versions and maintain compatibility.

Unfortunately, even with the move to 2.6+, this problem remains,
unless server providers are required to register a surrogateescape
error handler -- which I'm not even sure can be done in Python 2.x.
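
For readers unfamiliar with the two options, a small Python 3 sketch of each follows. The sample bytes are made up. The `surrogateescape` handler ships with Python 3.1+ (PEP 383); this sketch says nothing about whether an equivalent can be registered on 2.x.

```python
# Hypothetical request path containing a Latin-1 byte that is not valid UTF-8.
raw = b"/caf\xe9"

# surrogateescape: each undecodable byte becomes a lone surrogate code point,
# and encoding with the same handler restores the original bytes.
text = raw.decode("utf-8", "surrogateescape")
restored = text.encode("utf-8", "surrogateescape")

# The uniform PEP 3333 approach: bytes carried in a native string via
# latin-1, which also round-trips without loss.
native = raw.decode("latin-1")
round_tripped = native.encode("latin-1")
```

Both round-trip the bytes; the difference is that the latin-1 form behaves the same way on every Python version.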


>:: Cross-compatibility considerations.  The definition and use of
>native strings vs. byte strings is the biggest example of this in the rewrite.

I'm not sure what you mean here.  Do you mean "portability of WSGI 2
code samples across Python versions (esp. 2.x vs. 3.x)?"


Re: PEP 444 Goals

Alice Bevan–McGregor
In reply to this post by Alice Bevan–McGregor
On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said:
> There was something, somewhere I was reading related to WSGI about
> requiring content-length... but no matter.

Right, I remember now: the HTTP 1.0 specification.  (Honestly not
trying to sound sarcastic!)  See:

        http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body

However, after testing every browser on my system (from Links and
ELinks, through Firefox, Chrome, Safari, Konqueror, and Dillo) across
the following test code, I find that they all handle a missing
content-length in the same way: reading the socket until it closes.

        http://pastie.textmate.org/1435415
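
The pastie linked above is not reproduced here. As a rough stand-in, assuming only the standard library, the sketch below serves one response with no Content-Length and then closes the socket, while a minimal client reads until end-of-file. The function names `serve_once`, `fetch` and `demo` are made up for this example.

```python
import socket
import threading


def serve_once(listener):
    """Answer one request with no Content-Length, then close the socket."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(65536)  # read (and ignore) the request
        conn.sendall(
            b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"\r\n"
            b"body with no declared length\n"
        )
    # Leaving the with-block closes the connection; that marks end of body.


def fetch(port):
    """Minimal client: keep reading until the server closes the connection."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read means the peer closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run server and client on an ephemeral loopback port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        response = fetch(port)
    finally:
        worker.join(timeout=5)
        listener.close()
    head, _, body = response.partition(b"\r\n\r\n")
    return head, body
```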

        - Alice.



Re: PEP 444 Goals

James Y Knight

On Jan 7, 2011, at 12:13 AM, Alice Bevan–McGregor wrote:

> On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said:
>> There was something, somewhere I was reading related to WSGI about requiring content-length... but no matter.
>
> Right, I remember now: the HTTP 1.0 specification.  (Honestly not trying to sound sarcastic!)  See:
>
> http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body

You've misread that section. In HTTP/1.0, *requests* were required to have a Content-Length if they had a body (HTTP 1.1 fixed that with chunked request support). Responses have never had that restriction: they have always (even since before HTTP 1.0) been allowed to omit Content-Length and terminate by closing the socket.

HTTP 1.1 didn't really add any new functionality to *responses* by adding chunking, simply a bit of efficiency and error-detection ability.
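
The error-detection part can be illustrated with a simplified sketch of chunked framing (trailers are ignored, and the function names are made up for this example). A close-delimited body that is cut short looks the same as a complete one, whereas a chunked body missing its terminating zero-length chunk is detectably incomplete.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings using chunked transfer coding."""
    out = []
    for piece in pieces:
        if piece:  # skip empty pieces: a zero-length chunk ends the body
            size_line = ("%x\r\n" % len(piece)).encode("ascii")
            out.append(size_line + piece + b"\r\n")
    out.append(b"0\r\n\r\n")  # last chunk followed by an empty trailer
    return b"".join(out)


def decode_chunked(data):
    """Decode a chunked body; raise ValueError if it was cut short."""
    body = []
    pos = 0
    while True:
        line_end = data.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("truncated: missing chunk-size line")
        # Drop any chunk extensions after ';' before parsing the hex size.
        size_field = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            return b"".join(body)  # terminator seen: body is complete
        chunk = data[pos:pos + size]
        if len(chunk) < size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated: incomplete chunk")
        body.append(chunk)
        pos += size + 2
```

Dropping the final `0\r\n\r\n` from an encoded body makes `decode_chunked` raise, which is exactly the signal a close-delimited response cannot give.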

James

Re: PEP 444 Goals

Alice Bevan–McGregor
On 2011-01-06 21:26:32 -0800, James Y Knight said:
> You've misread that section. In HTTP/1.0, *requests* were required to
> have a Content-Length if they had a body (HTTP 1.1 fixed that with
> chunked request support). Responses have never had that restriction:
> they have always (even since before HTTP 1.0) been allowed to omit
> Content-Length and terminate by closing the socket.

Ah ha, that explains my confusion, then! Thank you.

        - Alice.

