Most WSGI servers close connections too early.

Most WSGI servers close connections too early.

Marcel Hellkamp-2
I just discovered a problem that affects most WSGI server
implementations and most current web-browsers (tested with wsgiref,
paste, firefox, chrome, wget and curl):

If the server closes the connection while the client is still uploading
data via POST or PUT, the browser displays an error message ('Connection
closed') and does not display the response sent by the server.

The error occurs if an application chooses not to process a form
submission before returning to the WSGI server. This is quite rare in
real world scenarios, but hard to debug because the server logs the
request as successfully sent to the client.

To reproduce the problem, run the following script, visit
http://localhost:8080/ and upload a big file::



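from wsgiref.simple_server import make_server

def application(environ, start_response):
    # Respond immediately without ever reading environ['wsgi.input'],
    # so the body of a POST is still unread when the server closes
    # the connection.
    start_response('200 OK', [('Content-Type', 'text/html')])
    # WSGI response bodies must be bytes on Python 3.
    return [b"""
    <form method='post' enctype='multipart/form-data'>
      Upload big file:
      <input type='file' name='file' />
      <input type='submit' />
    </form>
    """]

server = make_server('localhost', 8080, application)
server.serve_forever()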




I would like to add a warning to the WSGI/web3 specification to address
this issue:

"An application should read all available data from
`environ['wsgi.input']` on POST or PUT requests, even if it does not
process that data. Otherwise, the client might fail to complete the
request and not display the response."
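
For illustration, the proposed advice could look roughly like this in an application. This is a minimal sketch, not part of the proposal: `drain_input` and `CHUNK_SIZE` are hypothetical names, and it assumes the request carries a valid Content-Length and that nothing has been read from `wsgi.input` yet (otherwise, on a server that does not bound the stream, reading Content-Length bytes could block).

```python
CHUNK_SIZE = 64 * 1024  # read in bounded chunks so a big upload never sits in memory


def drain_input(environ, chunk_size=CHUNK_SIZE):
    """Read and discard the request body; return the number of bytes discarded.

    Assumes the body has not been touched yet (hypothetical helper).
    """
    try:
        remaining = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        remaining = 0  # malformed header: treat as "no body" in this sketch
    stream = environ['wsgi.input']
    discarded = 0
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # client went away before sending everything
            break
        remaining -= len(chunk)
        discarded += len(chunk)
    return discarded


def application(environ, start_response):
    if environ['REQUEST_METHOD'] in ('POST', 'PUT'):
        # The upload is ignored, but the client is allowed to finish sending it.
        drain_input(environ)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done\n']
```

Reading in chunks rather than calling `read()` once keeps memory use flat however large the ignored upload is.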

--
Kind regards
Marcel Hellkamp

_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: Most WSGI servers close connections too early.

Robert Brewer-4
Marcel Hellkamp wrote:

> I just discovered a problem that affects most WSGI server
> implementations and most current web-browsers (tested with wsgiref,
> paste, firefox, chrome, wget and curl):
>
> If the server closes the connection while the client is still uploading
> data via POST or PUT, the browser displays an error message ('Connection
> closed') and does not display the response sent by the server.
>
> The error occurs if an application chooses not to process a form
> submission before returning to the WSGI server. This is quite rare in
> real world scenarios, but hard to debug because the server logs the
> request as successfully sent to the client.
>
> To reproduce the problem, run the following script, visit
> http://localhost:8080/ and upload a big file::
>
>
>
> from wsgiref.simple_server import make_server
>
> def application(environ, start_response):
>     start_response('200 OK', [('Content-Type', 'text/html')])
>     return ["""
>     <form method='post' enctype='multipart/form-data'>
>       Upload big file:
>       <input type='file' name='file' />
>       <input type='submit' />
>     </form>
>     """]
>
> server = make_server('localhost', 8080, application)
> server.serve_forever()
>
>
>
>
> I would like to add a warning to the WSGI/web3 specification to address
> this issue:
>
> "An application should read all available data from
> `environ['wsgi.input']` on POST or PUT requests, even if it does not
> process that data. Otherwise, the client might fail to complete the
> request and not display the response."

Indeed. CherryPy has protected against this for some time. But it shouldn't be the burden of *applications* to do this; the WSGI "origin" server can do so quite easily.

However, the caveat requires a caveat: servers must still be able to protect themselves from malicious clients. In practice, that means allowing servers to close the connection without reading the entire request body if a certain number of bytes is exceeded.


Robert Brewer
[hidden email]

Re: Most WSGI servers close connections too early.

Benoit Chesneau-3
In reply to this post by Marcel Hellkamp-2
On Wed, Sep 22, 2010 at 2:46 PM, Marcel Hellkamp <[hidden email]> wrote:

> I just discovered a problem that affects most WSGI server
> implementations and most current web-browsers (tested with wsgiref,
> paste, firefox, chrome, wget and curl):
>
> If the server closes the connection while the client is still uploading
> data via POST or PUT, the browser displays an error message ('Connection
> closed') and does not display the response sent by the server.
>
> The error occurs if an application chooses not to process a form
> submission before returning to the WSGI server. This is quite rare in
> real world scenarios, but hard to debug because the server logs the
> request as successfully sent to the client.
>
> To reproduce the problem, run the following script, visit
> http://localhost:8080/ and upload a big file::
>
>
>
> from wsgiref.simple_server import make_server
>
> def application(environ, start_response):
>    start_response('200 OK', [('Content-Type', 'text/html')])
>    return ["""
>    <form method='post' enctype='multipart/form-data'>
>      Upload big file:
>      <input type='file' name='file' />
>      <input type='submit' />
>    </form>
>    """]
>
> server = make_server('localhost', 8080, application)
> server.serve_forever()
>
>
>
>
> I would like to add a warning to the WSGI/web3 specification to address
> this issue:
>
> "An application should read all available data from
> `environ['wsgi.input']` on POST or PUT requests, even if it does not
> process that data. Otherwise, the client might fail to complete the
> request and not display the response."
>
> --
> Kind regards
> Marcel Hellkamp
>
Your application and client should be aware of the Expect: 100-continue header:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

- benoît

(Resent, because web-sig doesn't set the default reply-to properly.)

Re: Most WSGI servers close connections too early.

Benoit Chesneau-3
In reply to this post by Robert Brewer-4
On Wed, Sep 22, 2010 at 5:34 PM, Robert Brewer <[hidden email]> wrote:

> However, the caveat requires a caveat: servers must still be able to protect themselves from malicious clients. In practice, that means allowing servers to close the connection without reading the entire request body if a certain number of bytes is exceeded.
>
I don't see how it could be the responsibility of the server. Can you
elaborate a little? The server shouldn't interfere in the HTTP request
imo.

- benoît

Re: Most WSGI servers close connections too early.

PJ Eby
In reply to this post by Robert Brewer-4
At 08:34 AM 9/22/2010 -0700, Robert Brewer wrote:

>Marcel Hellkamp wrote:
> > I would like to add a warning to the WSGI/web3 specification to address
> > this issue:
> >
> > "An application should read all available data from
> > `environ['wsgi.input']` on POST or PUT requests, even if it does not
> > process that data. Otherwise, the client might fail to complete the
> > request and not display the response."
>
>Indeed. CherryPy has protected against this for some time. But it
>shouldn't be the burden of *applications* to do this; the WSGI
>"origin" server can do so quite easily.
>
>However, the caveat requires a caveat: servers must still be able to
>protect themselves from malicious clients. In practice, that means
>allowing servers to close the connection without reading the entire
>request body if a certain number of bytes is exceeded.

We can certainly add warnings, although both are more of a
"best practices" advisory than a part of the spec per se.


Re: Most WSGI servers close connections too early.

Robert Brewer-4
In reply to this post by Benoit Chesneau-3
Benoit Chesneau wrote:

> On Wed, Sep 22, 2010 at 5:34 PM, Robert Brewer <[hidden email]>
> wrote:
> > However, the caveat requires a caveat: servers must still be able to
> > protect themselves from malicious clients. In practice, that means
> > allowing servers to close the connection without reading the entire
> > request body if a certain number of bytes is exceeded.
>
> I don't see how it could be the responsibility of the server. Can you
> elaborate a little? The server shouldn't interfere in the HTTP request
> imo.

Well since the "origin server" is the only component in the architecture
that's *actually* having an HTTP conversation with the client, calling
it "interference" seems a bit skewed. ;) RFC 2616 8.2.3 says:

"If an origin server receives a request that does not include an
Expect request-header field with the "100-continue" expectation,
the request includes a request body, and the server responds
with a final status code before reading the entire request body
from the transport connection, then the server SHOULD NOT close
the transport connection until it has read the entire request,
or until the client closes the connection. Otherwise, the client
might not reliably receive the response message. However, this
requirement is not be construed as preventing a server from
defending itself against denial-of-service attacks, or from
badly broken client implementations."

The way CherryPy implements this is to wrap the socket file before
handing it to wsgi.input. That wrapper understands Content-Length (and
another understands Transfer-Encoding), and won't allow any component
that calls wsgi.input.read(n) to read past the Content-Length limit.
[This also allows components to call read() without a size argument yet
not time out on the socket, as specified in recent proposals.]

The server can be configured to have a maximum number of bytes it will
allow to be read--if Content-Length exceeds that number, the server
immediately responds with 413 Request Entity Too Large. It doesn't read
the rest of the request entity, because it's too big and could cause a
DoS. If clients can't read the response because they're still blocked
sending a request that's too big, there's not really any way to get
around that if the client didn't send an Expect request header.
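
The size check described above can be sketched as WSGI middleware. This is an illustration under stated assumptions, not CherryPy's actual code: `limit_request_body` and `MAX_BODY_BYTES` are hypothetical names, and middleware can only refuse the request; whether the connection is then closed without reading the body remains the origin server's decision.

```python
MAX_BODY_BYTES = 10 * 1024 * 1024  # hypothetical limit; choose one that fits the deployment


def limit_request_body(app, max_bytes=MAX_BODY_BYTES):
    """Wrap a WSGI app and answer 413 when the declared body is too large."""
    def middleware(environ, start_response):
        try:
            length = int(environ.get('CONTENT_LENGTH') or 0)
        except ValueError:
            length = 0  # malformed header: passed through unchanged in this sketch
        if length > max_bytes:
            # Respond right away; the oversized body is deliberately not read.
            body = b'Request entity too large.\n'
            start_response('413 Request Entity Too Large', [
                ('Content-Type', 'text/plain'),
                ('Content-Length', str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware
```

Only the declared Content-Length is checked here; a request using Transfer-Encoding: chunked would need a running byte count instead.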

If the Content-Length is not too large, and the application returns
(normally or exceptionally), and the wrapper has not recorded that the
bytes read equals the Content-Length, then the server will consume the
remaining bytes and throw them away before sending the response headers.
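
A rough sketch of the wrapper-and-drain approach, written as middleware rather than inside a server. `LimitedInput` and `drain_unread_body` are hypothetical names, not CherryPy's classes; only `read()` is implemented (a complete `wsgi.input` also needs `readline()`, `readlines()` and iteration), and only Content-Length bodies are handled.

```python
class LimitedInput:
    """File-like wrapper that never reads past Content-Length and tracks what is left."""

    def __init__(self, stream, content_length):
        self.stream = stream
        self.remaining = content_length

    def read(self, size=-1):
        # Clamp every read, including read() with no size, to the unread part of the body.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        if size == 0:
            return b''
        data = self.stream.read(size)
        self.remaining -= len(data)
        return data

    def drain(self, chunk_size=64 * 1024):
        # Throw away whatever the application left unread.
        while self.remaining > 0:
            if not self.read(chunk_size):
                break  # client disconnected early


def drain_unread_body(app):
    def middleware(environ, start_response):
        try:
            length = int(environ.get('CONTENT_LENGTH') or 0)
        except ValueError:
            length = 0
        wrapper = LimitedInput(environ['wsgi.input'], max(length, 0))
        environ['wsgi.input'] = wrapper
        try:
            return app(environ, start_response)
        finally:
            # Runs whether the app returned normally or raised.
            wrapper.drain()
    return middleware
```

One limitation of this sketch: the drain happens as soon as the application callable returns, so an application that reads its input lazily while its response iterable is consumed would find the body already gone.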

I just noticed it doesn't do that if it's going to close the conn. Not
sure why. Maybe it should.


Robert Brewer
[hidden email]

Re: Most WSGI servers close connections too early.

and-py
In reply to this post by Marcel Hellkamp-2
On 09/22/2010 02:46 PM, Marcel Hellkamp wrote:
> "An application should read all available data from
> `environ['wsgi.input']` on POST or PUT requests, even if it does not
> process that data. Otherwise, the client might fail to complete the
> request and not display the response."

Oh, it's worse than that. In practice the application needs to read all
available data from the request body before producing output.

If you send too much response without reading the whole request body in
some environments, you can deadlock. The web server is buffering the
input stream for the request body and also the output stream from the
app. This needs to be done[1] to avoid sending an HTTP response before
the request is complete.

If those are limited-size buffers[2] and you fill the output buffer with
response without clearing enough of the input buffer that the browser
can finish sending the request, you'll be blocking indefinitely on write.

[1] possibly unless HTTP pipelining is in effect? not sure, haven't tested.

[2] and certainly in IIS they are. The output buffer is 8K IIRC. It's
easy to overflow that and get a mysterious non-responsive script because
an error happens and spits out a debugging page before the form-reading
library has had a chance to consume the input.

--
And Clover
mailto:[hidden email]
http://www.doxdesk.com/