When must applications call the WSGI start_response callable.


Jim Fulton
I'm a bit unclear about the timing of the start_response call.
I think this is because the PEP is unclear, but perhaps I missed
something.

It doesn't appear that the PEP says when the start_response callable
must be called.  It gives several examples. In most, the callback is
called when the application is called, but in one example, the
callback is called in the __iter__ of the result of calling the
application.
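
For reference, the two timings described above can be sketched as follows. This is an illustrative reconstruction, not the PEP's own listing: the names eager_app, LazyApp and calls are made up here, and the bodies are bytes as in the later Python 3 revision of the spec (the thread itself predates that).

```python
calls = []  # records the order in which things happen


def start_response(status, headers, exc_info=None):
    # Stand-in for the callable a server would pass in.
    calls.append(("start_response", status))


def eager_app(environ, start_response):
    # start_response is called as soon as the application is called.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


class LazyApp:
    # start_response is called from __iter__, that is, only once the
    # server starts iterating over the result.
    def __init__(self, environ, start_response):
        self.environ = environ
        self.start_response = start_response

    def __iter__(self):
        self.start_response("200 OK", [("Content-Type", "text/plain")])
        yield b"Hello\n"


result = eager_app({}, start_response)
calls.append(("returned", "eager_app"))
body_eager = list(result)

result = LazyApp({}, start_response)
calls.append(("returned", "LazyApp"))
body_lazy = list(result)
```

With eager_app the start_response entry is recorded before the "returned" entry; with LazyApp it is recorded after, because nothing in __iter__ runs until the result is consumed.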

Here's what I think the PEP should say (something like):

"The start_response callback must be:

- called when the application is called,

- called when the result iterator is computed, or

- it must be called asynchronously, typically from an application
   thread.

Normally an application will call the start_response callable when the
application is called or when the result iterator is constructed, as
shown in the first 2 examples. An application, or more commonly, a
middleware component that provides its own thread management might
delay starting the response.  A server should not begin iterating
over the result until the start_response callable has been called."

Why do I want this?  It appears that this would be needed to enable
middleware components that manage application threads.  I can imagine
though that there aren't any existing servers that handle what I've
suggested correctly.

I do think it would be straightforward for servers to handle this
correctly, especially for asynchronous servers like Twisted
and asyncore-based servers.  Perhaps this could be an optional feature
of the servers.  Servers supporting this feature would be prepared to
delay response output until start_response is called.  Servers unable
to do this would generate errors if start_response hasn't been called
by the time the result iterator has been constructed.

In any case, I think the PEP needs to specify more clearly when
start_response can be called.

Jim

--
Jim Fulton           mailto:[hidden email]       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org
_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: When must applications call the WSGI start_response callable.

ianb
Jim Fulton wrote:

> I'm a bit unclear about the timing of the start_response call.
> I think this is because the PEP is unclear, but perhaps I missed
> something.
>
> It doesn't appear that the PEP says when the start_response callable
> must be called.  It gives several examples. In most, the callback is
> called when the application is called, but in one example, the
> callback is called in the __iter__ of the result of calling the
> application.
>
> Here's what I think the PEP should say (something like):
>
> "The start_response callback must be:
>
> - called when the application is called,
>
> - called when the result iterator is computed, or
>
> - it must be called asynchronously, typically from an application
>    thread.
>
> Normally an application will call the start_response callable when the
> application is called or when the result iterator is constructed, as
> shown in the first 2 examples. An application, or more commonly, a
> middleware component that provides its own thread management might
> delay starting the response.  A server should not begin iterating
> over the result until the start_response callable has been called."

My impression is that it is the application's responsibility to call
start_response before the first item is returned from the iterator, and
it is an error if it does not.

However, in paste.lint
(http://svn.pythonpaste.org/Paste/trunk/paste/lint.py) I check that
start_response is called before the application returns the iterator.
So I guess, at least where I've been inserting paste.lint, that I
haven't encountered other examples in practice.  But then most of the
places I've used it, I wrote the application, and so I've never felt
compelled to use a different order.

If that's not correct, I'd like to update paste.lint.
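
To make the two possible checks concrete, here is a rough sketch of a checking wrapper. It is not the paste.lint code: check_start_response, its strict flag and the helper _checked are hypothetical names chosen for illustration. With strict=True it insists on the call before the application returns its iterable; otherwise it only insists on the call before the first item comes back.

```python
def check_start_response(app, strict=False):
    """Wrap app and verify that start_response() was called in time.

    strict=True:  it must have been called before app returns its iterable.
    strict=False: it must have been called before the first item comes back.
    Both the function name and the strict flag are hypothetical.
    """
    def wrapper(environ, start_response):
        started = []

        def recording_start_response(status, headers, exc_info=None):
            started.append(status)
            return start_response(status, headers, exc_info)

        result = app(environ, recording_start_response)
        if strict and not started:
            raise AssertionError(
                "start_response() not called before the iterable was returned")
        return _checked(result, started)
    return wrapper


def _checked(result, started):
    # Generator: runs only while the server iterates over the response.
    try:
        for item in result:
            if not started:
                raise AssertionError(
                    "item yielded before start_response() was called")
            yield item
    finally:
        if hasattr(result, "close"):
            result.close()


def generator_app(environ, start_response):
    # start_response runs inside the first iteration, not at call time.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Foo"
```

A generator-based application such as generator_app passes the looser check but fails the strict one, which is the difference in question.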

> Why do I want this?  It appears that this would be needed to enable
> middleware components that manage application threads.  I can imagine
> though that there aren't any existing servers that handle what I've
> suggested correctly.
>
> I do think it would be straightforward for servers to handle this
> correctly, especially for asynchronous servers like Twisted
> and asyncore-based servers.  Perhaps this could be an optional feature
> of the servers.  Servers supporting this feature would be prepared to
> delay response output until start_response is called.  Servers unable
> to do this would generate errors if start_response hasn't been called
> by the time the result iterator has been constructed.

I suppose this wouldn't be particularly bad for threaded or multiprocess
servers either -- they use a thread/process until the request is
completed regardless of what happens.  I can see how it could be used to
greater effect in an asynchronous server.  However, I'd rather it not be
optional, as most WSGI apps won't do this, and so servers won't get good
testing on this or may just not implement it, and then some apps and
some servers won't be compatible.

--
Ian Bicking  /  [hidden email]  /  http://blog.ianbicking.org

Re: When must applications call the WSGI start_response callable.

James Y Knight
In reply to this post by Jim Fulton
On Dec 15, 2005, at 3:01 PM, Jim Fulton wrote:
> Normally an application will call the start_response callable when the
> application is called or when the result iterator is constructed, as
> shown in the first 2 examples. An application, or more commonly, a
> middleware component that provides its own thread management might
> delay starting the response.  A server should not begin iterating
> over the result until the start_response callable has been called."

But it's my understanding that this is valid:

     def test_calledStartResponseLate(self):
         def application(environ, start_response):
             # The headers argument must be a list of (name, value) tuples.
             start_response("200 OK", [])
             yield "Foo"

start_response is called _inside_ the first iteration of the result.  
So the server has to iterate at least once, even if start_response  
was not called...

I was led to believe this was a valid thing to do from the following  
wording:
> (Note: the application must invoke the start_response() callable  
> before the iterable yields its first body string, so that the  
> server can send the headers before any body content. However, this  
> invocation may be performed by the iterable's first iteration, so  
> servers must not assume that start_response() has been called  
> before they begin iterating over the iterable.)

James

Re: When must applications call the WSGI start_response callable.

PJ Eby
In reply to this post by Jim Fulton
At 03:01 PM 12/15/2005 -0500, Jim Fulton wrote:
>I'm a bit unclear about the timing of the start_response call.
>I think this is because the PEP is unclear, but perhaps I missed
>something.
>
>It doesn't appear that the PEP says when the start_response callable
>must be called.  It gives several examples. In most, the callback is
>called when the application is called, but in one example, the
>callback is called in the __iter__ of the result of calling the
>application.

Hm.  I thought there was something there saying that it had to be called by
the time the first value is yielded by the iterable, but it's not
explicit.  The example *server* in the PEP, however, raises an
AssertionError if you violate this rule.
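
A simplified sketch of that server-side behaviour, under stated assumptions: this is not the PEP's listing, run_once and the list standing in for the socket are made up for illustration, and exc_info handling is left out. The point is that the server does not check anything when the application returns, only when an item arrives.

```python
def run_once(application, environ):
    """Call application once; return the bytes a server would send."""
    headers_set = []    # [status, headers] once start_response has run
    headers_sent = []   # non-empty once the header block has been written
    wire = []           # stands in for the client socket

    def write(data):
        if not headers_sent:
            status, response_headers = headers_set
            headers_sent.append(True)
            wire.append(("HTTP/1.0 %s\r\n" % status).encode("latin-1"))
            for name, value in response_headers:
                wire.append(("%s: %s\r\n" % (name, value)).encode("latin-1"))
            wire.append(b"\r\n")
        wire.append(data)

    def start_response(status, response_headers, exc_info=None):
        headers_set[:] = [status, response_headers]
        return write

    result = application(environ, start_response)
    # No check here: start_response may legitimately not have run yet.
    try:
        for data in result:
            if not headers_set:
                raise AssertionError("body yielded before start_response()")
            write(data)
        if headers_set and not headers_sent:
            write(b"")  # empty body: still send the headers
    finally:
        if hasattr(result, "close"):
            result.close()
    return b"".join(wire)


def late_app(environ, start_response):
    # Calls start_response inside the first iteration.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Foo"


def never_started(environ, start_response):
    # Violates the rule: returns body data without calling start_response.
    return [b"oops"]
```

run_once(late_app, {}) produces a complete response, while run_once(never_started, {}) raises AssertionError on the first item.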


>Here's what I think the PEP should say (something like):
>
>"The start_response callback must be:
>
>- called when the application is called,
>
>- called when the result iterator is computed, or
>
>- it must be called asynchronously, typically from an application
>    thread.

-1 on enabling asynchrony here; it would enormously complicate the design
of servers.  WSGI is a purely synchronous protocol.  Any asynchrony within
an application must be masked from the server.


>Normally an application will call the start_response callable when the
>application is called or when the result iterator is constructed, as
>shown in the first 2 examples. An application, or more commonly, a
>middleware component that provides its own thread management might
>delay starting the response.  A server should not begin iterating
>over the result until the start_response callable has been called."

This would completely break the existing design.  Note in particular that
some applications do not call start_response until they're in their first
iterator next() call; notably any generator-based WSGI apps will do this.


>Why do I want this?  It appears that this would be needed to enable
>middleware components that manage application threads.

No, it's not needed.  Such middleware would simply have to return iterators
that communicate with the other threads (e.g. via a queue).  These
iterators would simply have to block until output is available.
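
One way such middleware could look, as a sketch only: threaded, worker and body are hypothetical names, the write() callable is not supported, and a real version would want a thread pool rather than one thread per request. The wrapped application runs in its own thread and reports through a queue; the iterator handed to the server blocks on that queue and makes the real start_response call itself, so the server sees an ordinary synchronous application.

```python
import queue
import threading


def threaded(app):
    """Run app in its own thread; the server only sees a blocking iterator."""
    def middleware(environ, start_response):
        q = queue.Queue()

        def worker():
            def queued_start_response(status, headers, exc_info=None):
                # Forwarded to the server's thread; write() is not
                # supported in this sketch, so nothing is returned.
                q.put(("start", (status, headers)))
            try:
                result = app(environ, queued_start_response)
                try:
                    for chunk in result:
                        q.put(("data", chunk))
                finally:
                    if hasattr(result, "close"):
                        result.close()
            except Exception as exc:
                q.put(("error", exc))
            finally:
                q.put(("done", None))

        def body():
            threading.Thread(target=worker, daemon=True).start()
            while True:
                kind, payload = q.get()  # blocks until the worker has output
                if kind == "start":
                    start_response(*payload)
                elif kind == "data":
                    yield payload
                elif kind == "error":
                    raise payload
                else:
                    return

        return body()
    return middleware


def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"a"
    yield b"b"
```

Because the queue preserves order, the real start_response is always called before the first chunk is yielded to the server, and it is called in the server's own thread.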


>   I can imagine
>though that there aren't any existing servers that handle what I've
>suggested correctly.

There probably aren't *any*, actually.


>I do think it would be straightforward for servers to handle this
>correctly, especially for asynchronous servers like Twisted
>and asyncore-based servers.  Perhaps this could be an optional feature
>of the servers.  Servers supporting this feature would be prepared to
>delay response output until start_response is called.  Servers unable
>to do this would generate errors if start_response hasn't been called
>by the time the result iterator has been constructed.

About a year ago, there was some discussion of designing such an optional
"async server" API extension to allow basically the same sort of thing.  The
only part of the idea that was incorporated is that an iterator is allowed
to yield empty strings to suggest to an async server that it should do
other things for a while before trying to get another block from the iterator.
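
As an illustration of the empty-string convention, assuming the output is produced elsewhere and handed over through a queue (polling_app, results and the None sentinel are hypothetical choices for this sketch):

```python
import queue


def polling_app(results):
    """Build an app whose body comes from results, a queue filled elsewhere."""
    def application(environ, start_response):
        # Still called before the first string, empty or not.
        start_response("200 OK", [("Content-Type", "text/plain")])
        while True:
            try:
                chunk = results.get_nowait()
            except queue.Empty:
                # Nothing ready yet: hand control back to the server.
                yield b""
                continue
            if chunk is None:  # sentinel: no more output
                return
            yield chunk
    return application
```

An async server can treat each empty string as a hint to service other connections before asking this iterator for its next block; a synchronous server would simply spin, so this only pays off where the server cooperates.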

The main thing that kept the async API from gelling was that there was
nobody with adequate use cases to motivate the definition.  Perhaps that
has changed now.


>In any case, I think the PEP needs to specify more clearly when
>start_response can be called.

It's tempting at this point to allow start_response() to occur at any time
until the first non-empty string is yielded, rather than the first
string.  This would make your thread-management middleware possible, but
unfortunately would require a protocol version change, from 1.0 to
1.1.  Servers in the field (especially those based on the wsgiref.handlers
module) currently require start_response() to be called before the first
string, so your middleware couldn't rely on this feature unless it was
either optional or a "1.1" feature.

On the other hand, it would probably make more sense to define a server
extension like 'wsgi_async.delayed_start'.  If present, this would be a
special value you could  return to indicate that you'll actually respond
later.  So the threading middleware might look like:

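     def threader_mw(environ, start_response):
         if 'wsgi_async.delayed_start' in environ:
             # add environ+start_response to threadqueue
             # (threadqueue: a hypothetical queue read by the worker threads)
             threadqueue.put((environ, start_response))
             return environ['wsgi_async.delayed_start']
         else:
             # run request synchronously
             # (application: the hypothetical wrapped WSGI application)
             return application(environ, start_response)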

The threads would then have to use write() to send data.

Anyway, this would allow async servers to let apps handle their own thread
pooling, although in the general case I think it's a lousy idea.  An async
server like Twisted already has a thread pooling facility, and
application-specific pools would just duplicate that and waste
resources.  Meanwhile, this hypothetical threading middleware seems like
useless overhead for synchronous servers.


Re: When must applications call the WSGI start_response callable.

Jim Fulton
In reply to this post by James Y Knight
James Y Knight wrote:

> On Dec 15, 2005, at 3:01 PM, Jim Fulton wrote:
>
>> Normally an application will call the start_response callable when the
>> application is called or when the result iterator is constructed, as
>> shown in the first 2 examples. An application, or more commonly, a
>> middleware component that provides its own thread management might
>> delay starting the response.  A server should not begin iterating
>> over the result until the start_response callable has been called."
>
>
> But it's my understanding that this is valid:
>
>     def test_calledStartResponseLate(self):
>         def application(environ, start_response):
>             start_response("200 OK", [])
>             yield "Foo"
>
> start_response is called _inside_ the first iteration of the result.  So
> the server has to iterate at least once, even if start_response  was not
> called...
>
> I was led to believe this was a valid thing to do from the following  
> wording:
>
>> (Note: the application must invoke the start_response() callable  
>> before the iterable yields its first body string, so that the  server
>> can send the headers before any body content. However, this  
>> invocation may be performed by the iterable's first iteration, so  
>> servers must not assume that start_response() has been called  before
>> they begin iterating over the iterable.)

Aargh, I didn't see that, despite looking for it.  I said I may have missed
it.

Hm.

Jim

--
Jim Fulton           mailto:[hidden email]       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

Re: When must applications call the WSGI start_response callable.

PJ Eby
In reply to this post by James Y Knight
At 03:29 PM 12/15/2005 -0500, James Y Knight wrote:
>I was led to believe this was a valid thing to do from the following
>wording:
> > (Note: the application must invoke the start_response() callable
> > before the iterable yields its first body string, so that the
> > server can send the headers before any body content. However, this
> > invocation may be performed by the iterable's first iteration, so
> > servers must not assume that start_response() has been called
> > before they begin iterating over the iterable.)

Aha!  I knew it was in there somewhere.  :)


Re: When must applications call the WSGI start_response callable.

Jim Fulton
In reply to this post by ianb
Ian Bicking wrote:
> Jim Fulton wrote:
...

>> Why do I want this?  It appears that this would be needed to enable
>> middleware components that manage application threads.  I can imagine
>> though that there aren't any existing servers that handle what I've
>> suggested correctly.
>>
>> I do think it would be straightforward for servers to handle this
>> correctly, especially for asynchronous servers like Twisted
>> and asyncore-based servers.  Perhaps this could be an optional feature
>> of the servers.  Servers supporting this feature would be prepared to
>> delay response output until start_response is called.  Servers unable
>> to do this would generate errors if start_response hasn't been called
>> by the time the result iterator has been constructed.
>
>
> I suppose this wouldn't be particularly bad for threaded or multiprocess
> servers either -- they use a thread/process until the request is
> completed regardless of what happens.

Except that it makes the implementation a bit more complex.

> I can see how it could be used to
> greater effect in an asynchronous server.  However, I'd rather it not be
> optional, as most WSGI apps won't do this, and so servers won't get good
> testing on this or may just not implement it, and then some apps and
> some servers won't be compatible.

I mostly agree, except that I think this feature may only be useful for
asynchronous servers.

Jim

--
Jim Fulton           mailto:[hidden email]       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org