wsgiserver and keep-alive timeouts

wsgiserver and keep-alive timeouts

JeffB-4

This post builds upon what was noted here:

http://groups.google.com/group/cherrypy-devel/browse_thread/thread/b3d88b5e1293e03a

but that was nearly a year ago and I thought I might offer up some
fresh thoughts.

In short, I am seeing the same issues mentioned in the above post,
where the WSGI server goes into a "blocked" state if the number of
concurrent connections exceeds the size of the thread pool.  This
leads to 10 second (or longer) delays on the browser side.  I have a
very short example if anyone is interested, but this behavior can be
easily duplicated in 3.1.2 simply by reducing the "server.thread_pool"
value down to 2 (or, even better, 1) and attempting to load a page with
a lot of static content.
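For reference, a config along these lines should reproduce the stalls
(the values are chosen to trigger the problem, not recommended
settings):

```ini
[global]
# Shrink the worker pool so that concurrent connections easily exceed it.
server.thread_pool = 2
# Keep-alive read timeout; each idle connection pins a worker thread
# for up to this long.
server.socket_timeout = 10
```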

Again, as stated in the aforementioned post, the crux of the issue is
how the WSGI server deals with keep-alive connections.

Having looked over the code it appears that the problem comes down to
the use of blocking sockets.  By blocking on a recv, the calling
thread is tied up until data arrives or a timeout occurs.  On a keep-
alive connection, once all pipelined requests have been received and
processed, it's unlikely that any further data will arrive.  However,
we can't know that for sure, so we have to keep reading.  How long to
keep reading seems to be an arbitrary decision; the current default is
10 seconds (configurable via "server.socket_timeout").
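A minimal illustration of the cost, using plain Python sockets rather
than the actual wsgiserver code (the 0.5s timeout here stands in for
the 10 second default):

```python
import socket
import time

# A blocking recv() pins the calling thread until data arrives or the
# timeout expires -- the same pattern the server hits while waiting on
# an idle keep-alive connection.
a, b = socket.socketpair()
a.settimeout(0.5)  # stand-in for server.socket_timeout (default 10s)

start = time.time()
try:
    a.recv(4096)  # no further pipelined request is coming
except socket.timeout:
    pass  # the thread was stuck for the entire timeout window
elapsed = time.time() - start
print("thread blocked for %.1fs" % elapsed)
```

Multiply that idle window by every keep-alive connection beyond the
pool size and the browser-side delays follow directly.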

This architecture fundamentally limits the number of concurrent
connections that can be processed in a reasonable amount of time to,
at most, the number of threads in the thread pool.  One can increase
the number of threads but this is not efficient, for various reasons.
One could also reduce the timeout value but on a slow connection, this
could lead to dropped requests and other ugliness.

I suppose one could argue that if performance is important, one should
use Apache + the CherryPy application framework and be done with it.
However, I happen to like the compactness and simplicity of the
all-in-one solution CherryPy provides out of the box - it's what drew
me to it in the first place.  So, I want something that will scale
reasonably well (beyond 10 concurrent users, which is where the system
would choke in the default config).

To fix this, I think two fundamental changes to the WSGI server are
required:

1. Use non-blocking sockets.  This will allow threads to return
immediately if there is no pending data to be read or if data cannot
be immediately sent, thereby potentially improving thread
utilization.  However, this requires a second change...

2. The use of non-blocking sockets fundamentally changes the flow of
certain parts of the server logic.  Specifically, there is no longer a
guarantee that any data will arrive when recv() is called.  In fact,
it may be tens of milliseconds (or longer) before something useful
arrives.  We don't want the thread just sitting there idle while it
waits.  Instead, it should relinquish control of the connection and
come back to it some time later, moving on to other pending
connections.  However, this means each connection needs to keep track
of its state so that when data finally does arrive, it can continue
where it left off.  This can be accomplished with a state machine
inside of the HTTPRequest class.
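To make the idea concrete, here is a rough sketch of what such a state
machine might look like.  All names here (HTTPRequestSM, feed, the
state labels) are hypothetical and greatly simplified - a real
HTTPRequest would also have to track body reads, chunked encoding, and
buffered writes:

```python
# Hypothetical sketch, not actual wsgiserver code: a resumable request
# parser.  feed() consumes whatever bytes the non-blocking socket had
# available and returns False if it needs more; the worker thread can
# then move on to other connections and resume here later.

class HTTPRequestSM(object):
    def __init__(self):
        self.state = "request-line"
        self.buf = b""
        self.request_line = None
        self.headers = {}

    def feed(self, data):
        """Consume available bytes; return True once a full request
        (request line plus headers) has been parsed."""
        self.buf += data
        while True:
            if self.state == "done":
                return True
            # Both remaining states consume one CRLF-terminated line.
            line, sep, rest = self.buf.partition(b"\r\n")
            if not sep:
                return False  # incomplete line: yield, resume later
            self.buf = rest
            if self.state == "request-line":
                self.request_line = line
                self.state = "headers"
            elif line == b"":
                self.state = "done"  # blank line ends the headers
            else:
                name, _, value = line.partition(b":")
                self.headers[name.strip().lower()] = value.strip()
```

A select()/poll() dispatch loop would then hand a connection back to a
worker thread only when more bytes are actually readable, so idle
keep-alive connections cost a small parser object instead of a blocked
thread.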

The WSGI server is already pretty compact.  While the changes above
are intrusive, they can easily be implemented without a full rewrite
of the file.  The majority of the changes would be to three classes:
HTTPConnection, HTTPRequest and CP_fileobject.

I would be interested in hearing the thoughts of the CP team in
regards to this problem.  I'm sure there are issues that I have not
considered in my analysis.

Thanks,

Jeff


Related tickets (there are probably others):

http://www.cherrypy.org/ticket/764
http://cherrypy.org/ticket/539            (btw, I don't think this is
a good idea...)

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "cherrypy-devel" group.
To post to this group, send email to [hidden email]
To unsubscribe from this group, send email to [hidden email]
For more options, visit this group at http://groups.google.com/group/cherrypy-devel
-~----------~----~----~----~------~----~------~--~---


Re: wsgiserver and keep-alive timeouts

Christian Wyglendowski-2

On Thu, Jul 30, 2009 at 1:50 PM, JeffB<[hidden email]> wrote:
> This architecture fundamentally limits the number of concurrent
> connections that can be processed in a reasonable amount of time to,
> at most, the number of threads in the thread pool.  One can increase
> the number of threads but this is not efficient, for various reasons.

I'm interested to hear why you think this isn't efficient.  I know
there are limits and async servers can handle many more concurrent
connections, but there are tradeoffs when you go that route.

How many concurrent connections are you looking to maintain?  I'd
suggest bumping up the # of threads and doing some load testing to see
if there are real issues (maybe you already have?).  I think that
might make for an interesting analysis.

Christian
http://www.dowski.com



Re: wsgiserver and keep-alive timeouts

JeffB-4


On Jul 30, 12:15 pm, Christian Wyglendowski <[hidden email]>
wrote:
> I'm interested to hear why you think this isn't efficient.  I know
> there are limits and async servers can handle many more concurrent
> connections, but there are tradeoffs when you go that route.

There are tradeoffs in both cases, to be sure, but I believe that, in
general, most servers out there (web or otherwise) that expect high
concurrent connection counts (say, >100) would not use a
one-thread-per-connection model.  The overhead of managing that many
threads becomes onerous, and the CPU starts spending more time
task-switching than doing real work.  I mean, the cost is small, and I
don't want to overstate the impact, but like all things it adds up over
time and ultimately leads to a system that is less efficient than it
could be.

For my work, 20 threads would probably be more than adequate for the
expected load - this is a simple in-house app, nothing fancy.  I could
just bump up the thread count and move on.

I guess it's a matter of principle.  The idea of a thread sitting
blocked for 10 seconds, waiting on data that is never going to arrive
and tying up a critical resource, just seems wrong.

Jeff