WSGI and start_response

WSGI and start_response

Manlio Perillo-3
Hi.

Some time ago I objected to the decision to remove the start_response
function from the next version of WSGI, my rationale being that
without start_response, asynchronous extensions are impossible to
support.

Now I have found that removing start_response will also make it
impossible to support coroutines (or, at least, some coroutine usage).

Here is an example (the same example I posted a few days ago):
http://paste.pocoo.org/show/199202/

Forgetting about the write callable, the problem is that the
application starts to yield data when the tmpl.render_unicode function
is called.

Please note that this has *nothing* to do with asynchronous
applications.
The code should work with *all* WSGI implementations.


In the pasted example, the Mako render_unicode function is "turned"
into a generator, with a simple function that allows flushing the
current buffer.


Can someone else confirm that this code is impossible to support in
WSGI 2.0?

If my suspicion is correct, I once again object to removing
start_response.

WSGI 1.0 is really a well-designed protocol, since it is able to
support both asynchronous applications (with a custom extension) and
coroutines, *even though* this was not considered during protocol
design.


Thanks  Manlio
_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com
Re: WSGI and start_response

Aaron Watters-2
someone remind me: where is the canonical WSGI 2 spec?
I assume there is a way to "wrap" WSGI 1 applications
without breaking them?  Or is this the regex-->re fiasco
all over again?

   -- Aaron Watters


Re: WSGI and start_response

PJ Eby
In reply to this post by Manlio Perillo-3
At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:

>Can someone else confirm that this code is impossible to support in WSGI
>2.0?

I don't understand why it's a problem.  See my previous post here:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

for a sketch of a WSGI 1-to-2 converter.  It takes a WSGI 1
application callable as the input, and returns a WSGI 2 function.
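The linked post is not reproduced here, but a rough sketch of such a
converter, under the draft WSGI 2 calling convention assumed in this
thread (the app takes environ and returns a (status, headers, body)
triple), might look as follows. It includes a naive handling of
generator apps and ignores write() and exc_info, so it is only an
illustration, not PJ's actual code:

```python
import itertools

def wsgi1_to_wsgi2(app1):
    """Sketch: wrap a WSGI 1 app so it can be called WSGI 2 style
    (environ in, (status, headers, body) out).  write() and exc_info
    are deliberately not handled here."""
    def app2(environ):
        captured = []
        def start_response(status, headers, exc_info=None):
            captured[:] = [status, headers]
        body = app1(environ, start_response)
        if not captured:
            # Generator apps may not call start_response until the
            # first chunk is produced, so pull chunks until they do.
            body = iter(body)
            head = []
            while not captured:
                head.append(next(body))
            body = itertools.chain(head, body)
        status, headers = captured
        return status, headers, body
    return app2
```

A plain app wraps directly; a generator app gets its first chunks
buffered until start_response has fired.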


Re: WSGI and start_response

Manlio Perillo-3
In reply to this post by Aaron Watters-2
Aaron Watters wrote:
> someone remind me: where is the canonical WSGI 2 spec?

http://wsgi.org/wsgi/WSGI_2.0

> I assume there is a way to "wrap" WSGI 1 applications
> without breaking them?  Or is this the regex-->re fiasco
> all over again?
>

start_response can be implemented by a function that stores the
status code and response headers.

There should be a sample WSGI 2.0 implementation for CGI, and a sample
WSGI 1.0 -> 2.0 adapter.

This adapter should be able to support the coroutine example
(http://paste.pocoo.org/show/199202/), but I would like to test it.

The write callable, as far as I know, cannot be implemented.

> [...]


Regards  Manlio

Re: WSGI and start_response

PJ Eby
At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:

>write callable, as far as I know, can not be implemented.

Implementing it requires greenlets or threads, but it's implementable.  See:

http://mail.python.org/pipermail/web-sig/2009-September/003986.html

(Btw, I've noticed that this early sketch of mine doesn't support the
case where an application is a generator, because start_response
won't have been called when the application returns.  This can be
fixed, but it requires the addition of a wrapper class and a few
other annoying details.  It also doesn't support exc_info properly,
so it's still a ways from being a correct WSGI 1 server
implementation.  Getting rid of all these little variations, though,
is the goal of having a WSGI 2 - it's difficult to write *any*
middleware to be completely WSGI 1 compliant.)
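PJ's greenlet-based sketch lives at the link above; as a stdlib-only
illustration of the thread alternative he mentions (every name below
is an invention for the example, not code from that post), a worker
thread can turn write() calls into iterator chunks:

```python
import queue
import threading

_DONE = object()  # sentinel marking end of output

def iterate_with_write(wsgi1_app, environ):
    """Run a write()-using WSGI 1 app in a worker thread and expose
    its output as an iterator (sketch of the 'threads' option)."""
    q = queue.Queue(maxsize=1)  # maxsize=1 gives write() back-pressure

    def start_response(status, headers, exc_info=None):
        def write(data):
            q.put(data)  # each write() call becomes one yielded chunk
        return write

    def worker():
        try:
            for chunk in wsgi1_app(environ, start_response):
                q.put(chunk)
        finally:
            q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is _DONE:
            return
        yield item
```

The blocking put() in write() is what makes the inversion of control
work: the app thread pauses until the consuming loop is ready for the
next chunk.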


Re: WSGI and start_response

Manlio Perillo-3
P.J. Eby wrote:

> At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:
> [...]
>> There should be a sample WSGI 2.0 implementation for CGI, and a sample
>> WSGI 1.0 -> 2.0 adapter.
>>
>> This adapter should be able to support the coroutine example,
>> > http://paste.pocoo.org/show/199202/
>> but I would like to test.
>>
>> write callable, as far as I know, can not be implemented.
>
> Implementing it requires greenlets or threads, but it's implementable.
> See:
>
> http://mail.python.org/pipermail/web-sig/2009-September/003986.html
>

Right.
In fact, in the example I posted, I implemented the write callable using
greenlets (although the implementation is different).

> (Btw, I've noticed that this early sketch of mine doesn't support the
> case where an application is a generator [...] it's difficult to write
> *any* middleware to be completely WSGI 1 compliant.)
>

I agree that this is a good goal.
However, I don't like the idea of losing support for some features.

With WSGI 2.0 we will end up with:

- WSGI 1.0, a full-featured protocol, but with middleware that is hard
  to implement
- WSGI 2.0, a simple protocol, with middleware that is easier to
  implement, but without support for some "advanced" applications


WSGI 1.0 can be implemented on top of WSGI 2.0, and WSGI 2.0 on top
of WSGI 1.0.

The latter should be easier to implement.


I would like to have a WSGI 1.1 specification without the write
callable, and a *standard* adapter that exposes a simpler API (like
WSGI 2.0), so that applications and middleware can be implemented
using the simple API while the full-featured API remains available.

This is important, IMHO, because the next version of WSGI will also
add support for Python 3.x.
If the next version does not support the start_response function,
applications that need Python 3.x and want to use "advanced features"
will not be able to rely on a standard protocol.



Regards   Manlio

Re: WSGI and start_response

PJ Eby
At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:
>With WSGI 2.0 we will end up with:
>
>- WSGI 1.0, a full featured protocol, but with hard to implement
>   middlewares
>- WSGI 2.0, a simple protocol, with more easy to implement middlewares
>   but without support for some "advanced" applications

Let me see if I understand what you're saying.  You want to support
suspending an application, without using greenlets or threads.  Under
WSGI 1, you can do this by yielding empty strings before calling
start_response.  Under WSGI 2, you can only do this by directly
suspending execution, e.g. via greenlet or eventlets or some similar
API provided by the server.  Is this your objection?
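To make the WSGI 1 technique concrete, here is a minimal sketch (the
polling driver below is an invention standing in for an async-aware
server loop, not any real server API):

```python
def make_app(ready_after):
    """A WSGI 1 app written as a generator that 'suspends' by yielding
    empty strings until it can determine its headers (sketch only)."""
    def app(environ, start_response):
        polls = 0
        while polls < ready_after:  # headers not determined yet
            polls += 1
            yield ""                # nothing to send: "call me later"
        start_response("200 OK", [("Content-Type", "text/plain")])
        yield "hello"               # now real output can flow
    return app

def drive(app):
    """Toy driver standing in for a server loop that tolerates empty
    strings before start_response has been called."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
    body = "".join(app({}, start_response))
    return captured.get("status"), body
```

A real async server would return to its event loop on each empty
string instead of iterating eagerly as drive() does.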

As far as I know, nobody has actually implemented an async app
facility for WSGI 1, although it sounds like perhaps you're trying to
design or implement such a thing now.  If so, then there's nothing
stopping you from implementing a WSGI 1 server and providing a WSGI 2
adapter, since as you point out, WSGI 2 is easier to implement on top
of WSGI 1 than the other way around.

(Note, however, that if you simply use a greenlet or eventlet-based
API for your async server, then the problem is neatly solved whether
you are using WSGI 1 or 2, and the effective API is a lot cleaner
than yielding empty strings.)


Re: WSGI and start_response

Manlio Perillo-3
P.J. Eby wrote:

> At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> With WSGI 2.0 we will end up with:
>>
>> - WSGI 1.0, a full featured protocol, but with hard to implement
>>   middlewares
>> - WSGI 2.0, a simple protocol, with more easy to implement middlewares
>>   but without support for some "advanced" applications
>
> Let me see if I understand what you're saying.  You want to support
> suspending an application, without using greenlets or threads.

What I'm trying to do is:

* as in the example I posted, turn the Mako render function into a
  generator.

  The reason is that I would like to implement support for Nginx
  subrequests.
  During a subrequest, the generated response body is sent directly to
  the client, so it is necessary to be able to flush the Mako buffer.

* implement the simple suspend/resume extension, as described here:
  http://comments.gmane.org/gmane.comp.python.twisted.web/632

  Note that my ngx_http_wsgi_module already supports an asynchronous
  web server: when the application returns a generator and sending a
  yielded buffer to the client would block, execution of the WSGI
  application is suspended, and resumed when the socket is ready to
  send data.

  The suspend/resume extension allows an application to explicitly
  suspend/resume execution, so it is a nice complement to an
  asynchronous server.

  I would like to propose this extension for the wsgiorg namespace.
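
The extension itself is only described at the link above; as a toy
sketch of the shape of the idea (the SUSPEND marker and the inline
"parking" below are inventions for illustration, not the actual
proposal):

```python
SUSPEND = object()  # invented marker: "nothing to send, park me"

def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    yield "header"
    yield SUSPEND   # e.g. waiting for a subrequest to complete
    yield "body"    # resumed when the server re-iterates us

def serve(app):
    """Toy server loop: on SUSPEND, a real server would return to its
    event loop and resume this generator later; here we just count."""
    out, suspensions = [], 0
    def start_response(status, headers):
        out.append(status)
    for chunk in app({}, start_response):
        if chunk is SUSPEND:
            suspensions += 1
            continue
        out.append(chunk)
    return out, suspensions
```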


Note, however, that greenlets are still required, since they make the
code much more usable.

> Under
> WSGI 1, you can do this by yielding empty strings before calling
> start_response.  

No, in this case this is not what I need to do.

I need to call start_response, since the greenlet middleware will yield
data to the caller before the application returns.

> Under WSGI 2, you can only do this by directly
> suspending execution, e.g. via greenlet or eventlets or some similar API
> provided by the server.  Is this your objection?
>

In WSGI 2 what I want to do is not really possible.
The reason is that I don't use greenlets in the C module (I'm not even
sure greenlets can be used in my ngx_http_wsgi module).

Execution is suspended using the "normal" suspend extension.
The problem is with the greenlet middleware, which forces a different
code flow.

> As far as I know, nobody has actually implemented an async app facility
> for WSGI 1, although it sounds like perhaps you're trying to design or
> implement such a thing now.  

Right.

My previous attempt was a failure, since the extensions had severe
usability problems.

It is the same problem you have with Twisted Deferreds: every function
that calls a function using the async extension must itself be a
generator.

In my new attempt I plan to:

1) Implement the simple suspend/resume extension
2) Implement a Python extension module that wraps the Nginx event
   system.
3) Implement a pure Python WSGI middleware that, using greenlets, will
   enable normal applications to take advantage of Nginx async
   features.

   This middleware will have the same purpose as the Hub available in
   gevent.


> If so, then there's nothing stopping you
> from implementing a WSGI 1 server and providing a WSGI 2 adapter, since
> as you point out, WSGI 2 is easier to implement on top of WSGI 1 than
> the other way around.
>

Yes, this is what I would like to do.

Do you think it will be possible to implement all the requirements of
WSGI 2 (including Python 3.x support) in a simple adapter on top of
WSGI 1.0?

And what about applications that need to use the WSGI 1.0 API but are
required to run on Python 3.x?


Thanks  Manlio

Re: WSGI and start_response

PJ Eby
At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:
>What I'm trying to do is:
>
>* as in the example I posted, turn Mako render function in a generator.
>
>   The reason is that I would lite to to implement support for Nginx
>   subrequests.

By subrequest, do you mean that one request is invoking another, like
one WSGI application calling multiple other WSGI applications to
render one page containing contents from more than one?


>   During a subrequest, the generated response body is sent directly to
>   the client, so it is necessary to be able to flush the Mako buffer

I don't quite understand this, since I don't know what Mako is, or,
if it's a template engine, what flushing its buffer would have to do
with WSGI buffering.


> > Under
> > WSGI 1, you can do this by yielding empty strings before calling
> > start_response.
>
>No, in this case this is not what I need to do.

Well, if that's not when you're needing to suspend the application,
then I don't see what you're losing in WSGI 2.


>I need to call start_response, since the greenlet middleware will yield
>data to the caller before the application returns.

I still don't understand you.  In WSGI 1, the only way to suspend
execution (without using greenlets) prior to determining the headers
is to yield empty strings.

I'm beginning to wonder if maybe what you're saying is that you want
to be able to write an application function in the form of a
generator?  If so, be aware that any WSGI 1 app written as:

      def app(environ, start_response):
          start_response(status, headers)
          yield "foo"
          yield "bar"

can be written as a WSGI 2 app thus:

      def app(environ):
          def respond():
              yield "foo"
              yield "bar"
          return status, headers, respond()

This is also a good time for people to learn that generators are
usually a *very bad* way to write WSGI apps - yielding is for server
push or sending blocks of large files, not tiny strings.  In general,
if you're yielding more than one block, you're almost certainly doing
WSGI wrong.  The typical HTML, XML, or JSON output that's 99% of a
webapp's requests should be transmitted as a single string, rather
than as a series of snippets.

IOW, the absence of generator support in WSGI 2 is a feature, not a bug.


>In my new attempt I plan to:
>
>1) Implement the simple suspend/resume extension
>2) Implement a Python extension module that wraps the Nginx events
>    system.
>3) Implement a pure Python WSGI middleware that, using greenlets, will
>    enable normal applications to take advantage of Nginx async features.

I think maybe I'm understanding a little better now -- you want to
implement the WSGI gateway entirely in C, without using any Python,
and without using the greenlet API directly.

I think I've been unable to understand because I'm thinking in terms
of a server implemented in Python, or at least that has the WSGI part
implemented in Python.


>Do you think it will possible to implement all the requirements of WSGI
>2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0 ?

My practical experience with Python 3 is essentially nonexistent, but
being able to implement WSGI 2 in terms of WSGI 1 is a *design
requirement* for WSGI 2; it's likely that much early use and
development of WSGI 2 will be done through such an adapter.


>And what about applications that need to use the WSGI 1.0 API but
>require to run with Python 3.x?

That's a tougher nut to crack; again, my practical experience with
Python 3 is essentially nonexistent.


Re: WSGI and start_response

Manlio Perillo-3
P.J. Eby wrote:

> At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> What I'm trying to do is:
>>
>> * as in the example I posted, turn Mako render function in a generator.
>>
>>   The reason is that I would lite to to implement support for Nginx
>>   subrequests.
>
> By subrequest, do you mean that one request is invoking another, like
> one WSGI application calling multiple other WSGI applications to render
> one page containing contents from more than one?
>

Yes.

>
>>   During a subrequest, the generated response body is sent directly to
>>   the client, so it is necessary to be able to flush the Mako buffer
>
> I don't quite understand this, since I don't know what Mako is, or, if
> it's a template engine, what flushing its buffer would have to do with
> WSGI buffering.
>

Ah, sorry.

Mako is a template engine.
Suppose I have an HTML template file, and I want to use a sub request.

<html>
  <head>...</head>
  <body>
    <div>${subrequest('/header/')}</div>
    ...
  </body>
</html>


The problem with this code is that, since Mako buffers all generated
content, the resulting response body will contain incorrect data.

It will first contain the response body generated by the sub request,
then the content generated from the Mako template (XXX I have not
checked this, but I think it is how it works).

So, when executing a sub request, it is necessary to flush (that is,
send to Nginx, in my case) the content generated from the template
before the sub request is done.

Since Mako does not return a generator (I asked the author, and it was
too hard to implement), I use a greenlet in order to "turn" the Mako
render function into a generator.

>
>> > Under
>> > WSGI 1, you can do this by yielding empty strings before calling
>> > start_response.
>>
>> No, in this case this is not what I need to do.
>
> Well, if that's not when you're needing to suspend the application, then
> I don't see what you're losing in WSGI 2.
>
>
>> I need to call start_response, since the greenlet middleware will yield
>> data to the caller before the application returns.
>
> I still don't understand you.  In WSGI 1, the only way to suspend
> execution (without using greenlets) prior to determining the headers is
> to yield empty strings.
>

Ah, you are right, sorry.
But this is not required for the Mako example (I was focusing on that
example).

> I'm beginning to wonder if maybe what you're saying is that you want to
> be able to write an application function in the form of a generator?

The greenlet middleware returns a generator in order to work.

> If
> so, be aware that any WSGI 1 app written as:
>
>      def app(environ, start_response):
>          start_response(status, headers)
>          yield "foo"
>          yield "bar"
>
> can be written as a WSGI 2 app thus:
>
>      def app(environ, start_response):
>          def respond():
>              yield "foo"
>              yield "bar"
>          return status, headers, respond()
>

The problem, as I wrote, is that with the greenlet middleware, the
application need not return a generator.

def app(environ):
    tmpl = ...
    body = tmpl.render(...)

    return status, headers, [body]

This is a very simple WSGI application.

But when using the greenlet middleware, and when using the function
for flushing the Mako buffer, some data will be yielded *before* the
application returns and status and headers are passed to Nginx.


> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps

It's the only way to be able to suspend execution when the WSGI
implementation is embedded in an async web server not written in
Python.

The reason is that you cannot use (XXX check me) greenlets in C code;
you should probably use something like http://code.google.com/p/coev/

Greenlets can be used in gevent, as an example, because scheduling is
under the control of Python code.
This is not the case with Nginx.

> - yielding is for server push or
> sending blocks of large files, not tiny strings.  

Again, consider the use of sub requests.
Yielding a "not large" block is the only choice you have.

Unless, of course, you implement sub request support in pure Python
(or using SSI - Server Side Includes).

Another use case is when you have a very large page and you want to
return some data as soon as possible, to avoid the user aborting the
request if it takes some time.

Also, note that with Nginx (as with Apache, if I'm not wrong), even if
the application yields small strings, the server can still do some
buffering in order to increase performance.

In ngx_http_wsgi_module buffering is optional (and disabled by
default).

In the sub request example, it means that if both the main request
response body and the sub request response body are small, Nginx can
buffer all the data in memory before sending it to the client (XXX I
need to check this).

> In general, if you're
> yielding more than one block, you're almost certainly doing WSGI wrong.
> The typical HTML, XML, or JSON output that's 99% of a webapp's requests
> should be transmitted as a single string, rather than as a series of
> snippets.
>

> IOW, the absence of generator support in WSGI 2 is a feature, not a bug.
>

What do you mean by absence of generator support?
WSGI 2 applications can still return a generator.

>
>> In my new attempt I plan to:
>>
>> 1) Implement the simple suspend/resume extension
>> 2) Implement a Python extension module that wraps the Nginx events
>>    system.
>> 3) Implement a pure Python WSGI middleware that, using greenlets, will
>>    enable normal applications to take advantage of Nginx async features.
>
> I think maybe I'm understanding a little better now -- you want to
> implement the WSGI gateway entirely in C, without using any Python, and
> without using the greenlet API directly.
>

Right.

> I think I've been unable to understand because I'm thinking in terms of
> a server implemented in Python, or at least that has the WSGI part
> implemented in Python.
>

Yes.
I had a similar problem trying to explain how ngx_http_wsgi_module works
to another person (and I'm not even good at explaining things!).

> [...]


Thanks   Manlio

Re: WSGI and start_response

PJ Eby
At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:

>Suppose I have an HTML template file, and I want to use a sub request.
>
>...
>${subrequest('/header/'}
>...
>
>The problem with this code is that, since Mako will buffer all generated
>content, the result response body will contain incorrect data.
>
>It will first contain the response body generated by the sub request,
>then the content generated from the Mako template (XXX I have not
>checked this, but I think it is how it works).

Okay, I'm confused even more now.  It seems to me like what you've
just described is something that's fundamentally broken, even if
you're not using WSGI at all.


>So, when executing a sub request, it is necessary to flush (that is,
>send to Nginx, in my case) the content generated from the template
>before the sub request is done.

This only seems to make sense if you're saying that the subrequest
*has to* send its output directly to the client, rather than to the
parent request.  If the subrequest sends its output to the parent
request (as a sane implementation would), then there is no problem.
Likewise if the subrequest is sent to a buffer that's then inserted
into the parent invocation.

Anything else seems utterly insane to me, unless you're basically
taking a bunch of legacy CGI code using 'print' statements and
hacking it into something else.  (Which is still insane, just
differently. ;-) )


>Ah, you are right sorry.
>But this is not required for the Mako example (I was focusing on that
>example).

As far as I can tell, that example is horribly wrong.  ;-)


>But when using the greenlet middleware, and when using the function for
>flushing Mako buffer, some data will be yielded *before* the application
>returns and status and headers are passed to Nginx.

And that's probably because sharing a single output channel between
the parent and child requests is a bad idea.  ;-)

(Specifically, it's an increase in "temporal coupling", I believe.  I
know it's some kind of coupling between functions that's considered
bad, I just don't remember if that's the correct name for it.)


> > This is also a good time for people to learn that generators are usually
> > a *very bad* way to write WSGI apps
>
>It's the only way to be able to suspend execution, when the WSGI
>implementation is embedded in an async web server not written in Python.

It's true that dropping start_response() means you can't yield empty
strings prior to determining your headers, yes.


> > - yielding is for server push or
> > sending blocks of large files, not tiny strings.
>
>Again, consider the use of sub requests.
>yielding a "not large" block is the only choice you have.

No, it isn't.  You can buffer your output and yield empty strings
until you're ready to flush.



>Unless, of course, you implement sub request support in pure Python (or
>using SSI - Server Side Include).

I don't see why it has to be "pure", actually.  It's just that the
subrequest needs to send data to the invoker rather than sending it
straight to the client.

That's the bit that's crazy in your example -- it's not a scenario
that WSGI 2 should support, and I'd consider the fact that WSGI 1
lets you do it to be a bug, not a feature.  ;-)

That being said, I can see that removing start_response() closes a
loophole that allows async apps to *potentially* exist under WSGI 1
(as long as you were able to tolerate the resulting crappy API).

However, to fix that crappy API requires greenlets or threads, at
which point you might as well just use WSGI 2.  In the Nginx case,
you can either do WSGI 1 in C and then use an adapter to provide WSGI
2, or you can expose your C API to Python and write a small
greenlets-using Python wrapper to support suspending.  It would look
something like:

     def gateway(request_info, app):
         environ = {}  # ... set up environ from request_info ...
         run(greenlet(lambda: Finished(app(environ))))

     def run(child):
         while not child.dead:
             data = child.switch()
             if isinstance(data, Finished):
                 # application completed: ship the whole response
                 send_status(data.status)
                 send_headers(data.headers)
                 send_response(data.response)
             else:
                 # a command object from the app (e.g. a suspend request)
                 perform_appropriate_action_on(data)
                 if data.suspend:
                     # arrange for run(child) to be re-called later,
                     # then...
                     return

Suspension now works by switching back to the parent greenlet with
command objects (like Finished()) to tell the run() loop what to
do.  The run() loop is not stateful, so when the task is unsuspended,
you simply call run(child) again.

A similar structure would exist for send_response() - i.e., it's a
loop over the response, can break out of the loop if it needs to
suspend, and arranges for itself to be re-called at the appropriate time.

Voila - you now have asynchronous WSGI 2 support.

Now, whether you actually *want* to do that is a separate question,
but as (I hope) you can see, you definitely *can* do it, and without
needing any greenlet-using code to be in C.  From C, you just call
back into one of the Python top-level loops (run() and
send_response()), which then does the appropriate task switching.


>Another use case is when you have a very large page, and you want to
>return some data as soon as possible to avoid the user to abort request
>if it takes some time.

That's the server push case -- but of course that's not a problem
even in WSGI 2, since the "response" can still be a generator.
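To make the server-push point concrete, here is a minimal sketch, assuming the WSGI 2-style convention discussed in this thread (the app returns a (status, headers, body) triple); `push_app` and the little driver are illustrative names, not part of any spec:

```python
# Sketch of server push under the proposed WSGI 2-style convention:
# status and headers are returned up front, but the body can still be
# a lazy generator that produces chunks over time.
def push_app(environ):
    def body():
        for n in range(3):
            # Each chunk can be flushed to the client as it is produced.
            yield ("event %d\n" % n).encode("ascii")
    return "200 OK", [("Content-Type", "text/plain")], body()

# Minimal driver standing in for a server:
status, headers, body = push_app({})
chunks = list(body)
```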


>Also, note that with Nginx (as with Apache, if I'm not wrong), even if
>application yields small strings, the server can still do some buffering
>in order to increase performance.

In which case, it's in violation of the WSGI spec.  The spec requires
separately-yielded strings to be flushed to OS-level buffering.
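In sketch form, that flush requirement means a compliant server's write loop looks something like the following (`serve_body` is an illustrative name, and a `BytesIO` stands in for the client socket's file object):

```python
import io

def serve_body(body_iter, out):
    # WSGI requires each separately-yielded chunk to be handed to
    # OS-level buffering before the next chunk is pulled from the app,
    # i.e. no server-side buffering between chunks.
    for chunk in body_iter:
        out.write(chunk)
        out.flush()

buf = io.BytesIO()  # stands in for the client socket's file object
serve_body(iter([b"first ", b"second"]), buf)
```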


>What do you mean by absence of generator support?
>WSGI 2 applications can still return a generator.

Yes - but they can't *be* a generator - previously they could, due to
the separate start_response callable.
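The distinction can be shown side by side. This is a sketch: the WSGI 2 calling convention assumed here (a returned (status, headers, body) triple) is the proposal under discussion, not a finalized spec:

```python
# WSGI 1: the application callable itself may be a generator function,
# because status and headers travel out-of-band via start_response:
def wsgi1_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello "   # the app *is* the iterable
    yield b"world"

# Without start_response, status and headers must be returned before
# any body is produced; only the *body* may be a generator:
def wsgi2_app(environ):
    def body():
        yield b"hello "
        yield b"world"
    return "200 OK", [("Content-Type", "text/plain")], body()

collected = {}
wsgi1_body = wsgi1_app({}, lambda status, headers: collected.update(status=status))
```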


_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: WSGI and start_response

Graham Dumpleton-2
On 9 April 2010 07:53, P.J. Eby <[hidden email]> wrote:
>> Also, note that with Nginx (as with Apache, if I'm not wrong), even if
>> application yields small strings, the server can still do some buffering
>> in order to increase performance.
>
> In which case, it's in violation of the WSGI spec.  The spec requires
> separately-yielded strings to be flushed to OS-level buffering.

True, and Apache/mod_wsgi does best effort on that.

Output filters at Apache level can be a problem with that though as a
flush bucket in Apache bucket chains is a request only and an output
filter can decide not to flush through all data. For example,
mod_deflate may buffer partial data in order to get enough for next
block of compressed data.

This is the exception rather than the norm, and if no such output
filters exists, then separately yield strings should be flushed right
through to the socket.

So, one can try and satisfy that requirement in WSGI, but in practice
it cannot always be achieved because you may have absolutely no
control over the underlying web server.

Graham

wsgi and generators (was Re: WSGI and start_response)

chris.dent
In reply to this post by PJ Eby
On Thu, 8 Apr 2010, P.J. Eby wrote:

> This is also a good time for people to learn that generators are usually a
> *very bad* way to write WSGI apps - yielding is for server push or sending
> blocks of large files, not tiny strings.  In general, if you're yielding more
> than one block, you're almost certainly doing WSGI wrong.  The typical HTML,
> XML, or JSON output that's 99% of a webapp's requests should be transmitted
> as a single string, rather than as a series of snippets.

Now that the thread which included the quoted bit above has died down a
bit, I wanted to get back to this. I was surprised when I read it, as I
found it counter-intuitive, different from what I'm doing in practical
day-to-day WSGI app creation, and contrary to my old-school network
services instinct (start getting stuff queued for the pipe as soon as
possible).

The apps I'm creating tend to be HTTP APIs that are trying to be RESTful
and as such they have singular resources I call entities, and aggregates
of those entities I call collections. The APIs provide access to GETting
and PUTting entities and GETting collections.

Whenever a GET request is made on an entity or collection, the entity or
entities involved is serialized to some string form. When there are many
entities in a collection, yielding their serialized forms makes semantic
sense as well as (it appears) resource utilization sense.

I realize I'm able to build up a complete string or yield via a
generator, or a whole bunch of various ways to accomplish things
(which is part of why I like WSGI: that content is just an iterator,
that's a good thing) so I'm not looking for a statement of what is or
isn't possible, but rather opinions. Why is yielding lots of moderately
sized strings *very bad*? Why is it _not_ very bad (as presumably
others think)?

The model I have in my mind is an application where there is a fair
amount of layering and separation between the request handling, the
persistence layer, and the serialization system. When a GET for a
collection happens, it would call the persistence layer, which would
return a generator of entities, which would be passed to the
serialization, which would yield a block of output per entity.

Here's some pseudo code:

     def get_collection(environ, start_response):
         try:
             entities = store.get_collection('something')
         except NoSomething:
             start_response('404 Not Found', [])
             return ['sorry']
         start_response('200 OK', [('Content-Type', 'text/html')])
         # yield a block of html per entity
         return serializer.generate_html_from_entities(entities)

"In general, if you're yielding more than one block, you're almost
certainly doing WSGI wrong."

I don't understand how this is wrong. It appears to allow nice
conceptual separation between the store and serializer while still
allowing the memory (and sometimes cpu) efficiences of generators.

It may be that I'm a special case (some of the serializations can be
quite expansive and expensive), but I would be surprised if that were
so.

So what's going on?

P.S. Speaking of these things, can anyone point me to a JSON tool that
      can yield a string of JSON as a series of blocks? Assuming a data
      structure that is a long list of anonymous dicts,
      json.dumps(the_list) returns a single string. It would be nice, to
      fit in the model above I could yield each dict. Better if I could
      pass the_list as a generator. I can think of ways to create such a
      tool myself, but I'd like to use an existing one if it exists.
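For what it's worth, a streaming serializer of the kind asked about in the P.S. is straightforward to hand-roll on top of the stdlib `json` module. This is a sketch, not an existing tool, and `iter_json_list` is an invented name:

```python
import json

def iter_json_list(dicts):
    """Yield a JSON array one element at a time, so a long (possibly
    generator-produced) sequence of dicts never has to be serialized
    as one big string."""
    yield "["
    first = True
    for d in dicts:
        if not first:
            yield ","
        first = False
        yield json.dumps(d)
    yield "]"

# Works with a generator as input; join only for demonstration here.
result = "".join(iter_json_list({"n": i} for i in range(3)))
```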

Re: wsgi and generators (was Re: WSGI and start_response)

Dirkjan Ochtman
On Sat, Apr 10, 2010 at 15:04, Chris Dent <[hidden email]> wrote:
> On Thu, 8 Apr 2010, P.J. Eby wrote:
>> This is also a good time for people to learn that generators are usually a
>> *very bad* way to write WSGI apps - yielding is for server push or sending
>> blocks of large files, not tiny strings.  In general, if you're yielding
>> more than one block, you're almost certainly doing WSGI wrong.  The typical
>> HTML, XML, or JSON output that's 99% of a webapp's requests should be
>> transmitted as a single string, rather than as a series of snippets.

While I agree that it doesn't make sense to yield small strings, it
seems to make perfect sense to chunk up larger buffers (e.g. starting
at several kilobytes). This is something we do when transmitting
Mercurial changesets, for example.

Cheers,

Dirkjan

Re: wsgi and generators (was Re: WSGI and start_response)

PJ Eby
In reply to this post by chris.dent
At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:
>I realize I'm able to build up a complete string or yield via a
>generator, or a whole bunch of various ways to accomplish things
>(which is part of why I like WSGI: that content is just an iterator,
>that's a good thing) so I'm not looking for a statement of what is or
>isn't possible, but rather opinions. Why is yielding lots of moderately
>sized strings *very bad*? Why is it _not_ very bad (as presumably
>others think)?

How bad it is depends a lot on the specific middleware, server
architecture, OS, and what else is running on the machine.  The more
layers of architecture you have, the worse the overhead is going to be.

The main reason, though, is that alternating control between your app
and the server means increased request lifetime and worsened average
request completion latency.

Imagine that I have five tasks to work on right now.  Let us say each
takes five units of time to complete.  If I have five units of time
right now, I can either finish one task now, or partially finish
five.  If I work on them in an interleaved way, *none* of the tasks
will be done until twenty-five units have elapsed, and so all tasks
will have a completion latency of 25 units.

If I work on them one at a time, however, then one task will be done
in 5 units, the next in 10, and so on -- for an average latency of
only 15 units.  And that is *not* counting any task switching overhead.
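The arithmetic in the two paragraphs above, written out:

```python
# Five tasks, five time units each, no task-switching overhead.
tasks, cost = 5, 5

# Interleaved (round-robin): nothing finishes until the very end,
# so every task completes at t = 25.
interleaved = [tasks * cost] * tasks

# One at a time: task k completes at (k + 1) * cost, i.e. 5, 10, ..., 25.
sequential = [(k + 1) * cost for k in range(tasks)]

avg_interleaved = sum(interleaved) / len(interleaved)  # 25
avg_sequential = sum(sequential) / len(sequential)     # 15
```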

But it's *worse* than that, because by multitasking, my task queue
has five things in it the whole time...  so I am using more memory
and have more management overhead, as well as task switching overhead.

If you translate this to the architecture of a web application, where
the "work" is the server serving up bytes produced by the
application, then you will see that if the application serves up
small chunks, the web server is effectively forced to multitask, and
keep more application instances simultaneously running, with worsened
latency, increased memory usage, etc.

However, if the application hands its entire output to the
server, then the "task" is already *done* -- the server doesn't need
the thread or child process for that app anymore, and can have it do
something else while the I/O is happening.  The OS is in a better
position to interleave its own I/O with the app's computation, and
the overall request latency is reduced.

Is this a big emergency if your server's mostly idle?  Nope.  Is it a
problem if you're writing a CGI program or some other direct API that
doesn't automatically flush I/O?  Not at all.  I/O buffering works
just fine for making sure that the tasks are handed off in bigger chunks.

But if you're coding up a WSGI framework, you don't really want to
have it sending tiny chunks of data up a stack of middleware, because
WSGI doesn't *have* any buffering, and each chunk is supposed to be
sent *immediately*.

Well-written web frameworks usually do some degree of buffering
already, for API and performance reasons, so for simplicity's sake,
WSGI was spec'd assuming that applications would send data in
already-buffered chunks.

(Specifically, the simplicity of not needing to have an explicit
flushing API, which would otherwise have been necessary if middleware
and servers were allowed to buffer the data, too.)


Re: wsgi and generators (was Re: WSGI and start_response)

Graham Dumpleton-2
In reply to this post by chris.dent
On 10 April 2010 23:04, Chris Dent <[hidden email]> wrote:

> On Thu, 8 Apr 2010, P.J. Eby wrote:
>
>> This is also a good time for people to learn that generators are usually a
>> *very bad* way to write WSGI apps - yielding is for server push or sending
>> blocks of large files, not tiny strings.  In general, if you're yielding
>> more than one block, you're almost certainly doing WSGI wrong.  The typical
>> HTML, XML, or JSON output that's 99% of a webapp's requests should be
>> transmitted as a single string, rather than as a series of snippets.
>
> Now the thread that included the quoted bit above has died down a bit, I
> wanted to get back to this. I was surprised when I read this as I found
> it counter intuitive, different to what I'm doing in practical day to
> day WSGI app creation and contrary to what my old school network
> services thinking thinks (start getting stuff queued for the pipe as
> soon as possible).
>
> The apps I'm creating tend to be HTTP APIs that are trying to be RESTful
> and as such they have singular resources I call entities, and aggregates
> of those entities I call collections. The APIs provide access to GETting
> and PUTting entities and GETting collections.
>
> Whenever a GET request is made on an entity or collection, the entity or
> entities involved is serialized to some string form. When there are many
> entities in a collection, yielding their serialized forms makes semantic
> sense as well as (it appears) resource utilization sense.
>
> I realize I'm able to build up a complete string or yield via a
> generator, or a whole bunch of various ways to accomplish things
> (which is part of why I like WSGI: that content is just an iterator,
> that's a good thing) so I'm not looking for a statement of what is or
> isn't possible, but rather opinions. Why is yielding lots of moderately
> sized strings *very bad*? Why is it _not_ very bad (as presumably
> others think)?

Because for a WSGI application, you have absolutely no idea what
actual web server it may run on and what the overheads are of sending
a block of data, let alone many.

In Apache, for example, if the data is sent as small blocks, with a
flush between each, you have to call into the Apache output filter
bucket chain for every block. That in itself is a substantial overhead
if done many times. You also have the overhead of writing small blocks
onto the actual socket.

Let us take an extreme example of a hello world program.

import sys

def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

Say for this I can reliably get:

Requests per second:    2122.56 [#/sec] (mean)

Now change the last line of that hello world program, mirroring a
common mistake you see some make, to:

    return output

so that, instead of yielding a single string, it yields each character
of the string.
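The effect of that change can be seen without a web server at all, since a string is itself an iterable of one-character strings:

```python
# Returning the bare string hands the server an iterable of
# one-character chunks, each of which is written and flushed separately.
output = 'Hello World!'
as_intended = list([output])  # one 12-byte chunk
by_mistake = list(output)     # twelve 1-byte chunks
```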

About the best I can achieve now is:

Requests per second:    1973.51 [#/sec] (mean)

This example is only a small string, and so only a handful of flushes
had to be done. If you break up a large amount of data into many small
bits, the overheads will obviously become worse. More so if you
actually have Apache output filters installed, such as mod_deflate,
which actually do work on every flush. In the case above there were no
output filters installed.

So, you may get away with it, but you have to be a bit careful about
how fine-grained you make it. Also, since a lot of clients are going to
be slow at reading the response, it is questionable how much it would
help anyway. Delaying and sending a complete response may work just
as well or better. Certainly, if you are using a front end such as
nginx, returning a complete response will let the WSGI server offload
the full response quicker, because of the way nginx acts as a buffer.
Dribbling it out in bits just means the backend has to do more work.

Overall I would suggest you form complete responses and focus your
effort instead on better application caching so that you can deliver
responses from a cache and avoid the whole need to generate it in the
first place.

Graham

Re: wsgi and generators (was Re: WSGI and start_response)

Manlio Perillo-3
In reply to this post by PJ Eby
P.J. Eby ha scritto:

> At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:
>> I realize I'm able to build up a complete string or yield via a
>> generator, or a whole bunch of various ways to accomplish things
>> (which is part of why I like WSGI: that content is just an iterator,
>> that's a good thing) so I'm not looking for a statement of what is or
>> isn't possible, but rather opinions. Why is yielding lots of moderately
>> sized strings *very bad*? Why is it _not_ very bad (as presumably
>> others think)?
>
> How bad it is depends a lot on the specific middleware, server
> architecture, OS, and what else is running on the machine.  The more
> layers of architecture you have, the worse the overhead is going to be.
>
> The main reason, though, is that alternating control between your app
> and the server means increased request lifetime and worsened average
> request completion latency.
>

This is not completely true.
At least this is not how things will work on an asynchronous WSGI
implementation.

It is true that alternating control between your app and the server
decreases performance.
This can be verified with:
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_cooperative.py

However, yielding small strings in the application iterator, because
the application does not want to buffer data, will usually not cause
the problems you describe.

Instead, the possible performance problems have been described by Graham.


Moreover, when we speak about latency, we should also consider that
web pages are usually served to human users.
In this case, latency is not the only factor to consider.

Is it better for the user to wait 3 seconds for some text to appear in
the browser window and then another 5 seconds for the complete page to
render, or to wait 5 seconds before any text appears at all?

> [...]
> If you translate this to the architecture of a web application, where
> the "work" is the server serving up bytes produced by the application,
> then you will see that if the application serves up small chunks, the
> web server is effectively forced to multitask, and keep more application
> instances simultaneously running, with lowered latency, increased memory
> usage, etc.
>

Yielding small strings *will* not force multitasking.
This can be verified with:
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_producer.py

The WSGI application will be suspended *only* when data cannot be sent
to the OS socket buffer.

Yielding several small strings will *usually* not cause socket buffer
overflow, unless the client is very slow at reading data.

Instead, ironically, you will have a problem when the application yields
several big strings.

In this case it is better to yield only one very big string, but this is
not always feasible.
And I'm not sure whether it is worse to keep a very big buffer in
memory, or to send several large chunks to the client.

> [...]


Regards  Manlio

Re: WSGI and start_response

Manlio Perillo-3
In reply to this post by PJ Eby
P.J. Eby ha scritto:

> At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> Suppose I have an HTML template file, and I want to use a sub request.
>>
>> ...
>> ${subrequest('/header/'}
>> ...
>>
>> The problem with this code is that, since Mako will buffer all generated
>> content, the result response body will contain incorrect data.
>>
>> It will first contain the response body generated by the sub request,
>> then the content generated from the Mako template (XXX I have not
>> checked this, but I think it is how it works).
>
> Okay, I'm confused even more now.  It seems to me like what you've just
> described is something that's fundamentally broken, even if you're not
> using WSGI at all.
>

If you are referring to Mako being turned into a generator, yes, this
implementation is rather obscure.

I wrote it as a proof of concept.
Before this, I wrote a more polite implementation:
http://paste.pocoo.org/show/201324/

>
>> So, when executing a sub request, it is necessary to flush (that is,
>> send to Nginx, in my case) the content generated from the template
>> before the sub request is done.
>
> This only seems to make sense if you're saying that the subrequest *has
> to* send its output directly to the client, rather than to the parent
> request.  

Yes, this is how subrequests work in Nginx. And I assume the same is
true for Apache.

> If the subrequest sends its output to the parent request (as a
> sane implementation would), then there is no problem.

You are forgetting that Nginx is not an application server.
Why should the subrequest output be returned to the parent?

This would only make it less efficient.

> Likewise, if the
> subrequest is sent to a buffer that's then inserted into the parent
> invocation.
>
> Anything else seems utterly insane to me, unless you're basically taking
> a bunch of legacy CGI code using 'print' statements and hacking it into
> something else.  (Which is still insane, just differently. ;-) )
>

We are talking about subrequest implementations in an efficient web
server written in C, like Nginx or Apache.

>
>> Ah, you are right sorry.
>> But this is not required for the Mako example (I was focusing on that
>> example).
>
> As far as I can tell, that example is horribly wrong.  ;-)
>

I agree ;-)

>
>> But when using the greenlet middleware, and when using the function for
>> flushing Mako buffer, some data will be yielded *before* the application
>> returns and status and headers are passed to Nginx.
>
> And that's probably because sharing a single output channel between the
> parent and child requests is a bad idea.  ;-)
>

No, this is not specific to subrequests.

As an example, here you can find an up to date greenlet adapters:
http://bitbucket.org/mperillo/txwsgi/src/tip/txwsgi/greenlet.py

The ``write_adapter`` **needs** to yield some data before the WSGI
application returns, because this is how the write callable works.

The exposed ``gsuspend`` function, instead, will cause an empty string
to be yielded to the server, before the WSGI application returns.

> (Specifically, it's an increase in "temporal coupling", I believe.  I
> know it's some kind of coupling between functions that's considered bad,
> I just don't remember if that's the correct name for it.)
>

Nginx code contains some coupling; I assume this is done because it was
designed with efficiency in mind.

> [...]
> It's true that dropping start_response() means you can't yield empty
> strings prior to determining your headers, yes.
>
>
>> > - yielding is for server push or
>> > sending blocks of large files, not tiny strings.
>>
>> Again, consider the use of sub requests.
>> yielding a "not large" block is the only choice you have.
>
> No, it isn't.  You can buffer your output and yield empty strings until
> you're ready to flush.
>

As I wrote, this will not work if you want to use subrequest support
from Nginx.

>
>
>> Unless, of course, you implement sub request support in pure Python (or
>> using SSI - Server Side Include).
>
> I don't see why it has to be "pure", actually.  It just that the
> subrequest needs to send data to the invoker rather than sending it
> straight to the client.
>

You may say this, but it is not how subrequests are implemented in Nginx
;-).


> That's the bit that's crazy in your example -- it's not a scenario that
> WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do
> it to be a bug, not a feature.  ;-)
>

Are you referring to the bad Mako example, or to the
``greenlet_adapter`` idea?

> That being said, I can see that removing start_response() closes a
> loophole that allows async apps to *potentially* exist under WSGI 1 (as
> long as you were able to tolerate the resulting crappy API).
>
> However, to fix that crappy API requires greenlets or threads, at which
> point you might as well just use WSGI 2.  In the Nginx case, you can
> either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you
> can expose your C API to Python and write a small greenlets-using Python
> wrapper to support suspending.  

But this is already implemented using the ``greenlet_adapter`` in
txwsgi, and the x-wsgiorg.suspend extension.

And this implementation has the advantage that the greenlet_adapter
works on **every** WSGI implementation that supports the
x-wsgiorg.suspend extension.


> It would look something like:
>
>     def gateway(request_info, app):
>         # set up environ
>         run(greenlet(lambda: Finished(app(environ))))
>
>     def run(child):
>         while not child.dead:
>              data = child.switch()
>              if isinstance(data, Finished):
>                   send_status(data.status)
>                   send_headers(data.headers)
>                   send_response(data.response)
>              else:
>                  perform_appropriate_action_on(data)
>                  if data.suspend:
>                      # arrange for run(child) to be re-called later, then...
>                      return
>

I have to actually implement this to check if it works.

This can be done using my txwsgi implementation.

If it helps, I can also implement WSGI 2.0 in txwsgi.  The WSGI 1.0 and
WSGI 2.0 stacks will be independent; no adapter will be used (they will
just share most of the code).

> [...]


Regards  Manlio

Re: WSGI and start_response

Graham Dumpleton-2
On 13 April 2010 20:41, Manlio Perillo <[hidden email]> wrote:

>>> So, when executing a sub request, it is necessary to flush (that is,
>>> send to Nginx, in my case) the content generated from the template
>>> before the sub request is done.
>>
>> This only seems to make sense if you're saying that the subrequest *has
>> to* send its output directly to the client, rather than to the parent
>> request.
>
> Yes, this is how subrequests work in Nginx. And I assume the same is
> true for Apache.

No, that is not true for Apache. Apache content handlers write output
into what is called a bucket brigade. For a normal sub request this
may be the bucket brigade of the parent request, and so be processed by
the output filters of the parent request. You can, however, code the
mechanics of the sub request to override that and do something else
with the data pushed into that bucket brigade.

Although it can be done, it gets a bit complicated to have the data
written into the bucket brigade pulled back into the context of the
parent request. This is because the data is written from the context
of the sub request, whereas at the same time the parent request is
going to want to pull it. Thus you need to use threading, firing off
the sub request in its own thread, with a queue of some sort used to
communicate between the two.

So it is messy, but technically it should be possible, with custom
Python code specific to Apache, to fire off a subrequest and have its
result be an iterable yielding data, which could in turn be yielded
from the context of the parent application, so that the content could
then be processed and modified by a WSGI middleware wrapper.
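The thread-plus-queue arrangement described above can be sketched as follows; `run_subrequest` is a hypothetical stand-in for the Apache-specific subrequest machinery, and everything here is illustrative rather than working mod_wsgi code:

```python
import queue      # `Queue` on Python 2
import threading

_DONE = object()  # sentinel marking the end of the subrequest's output

def subrequest_iter(run_subrequest):
    """Run a subrequest in its own thread, pushing each chunk it writes
    onto a queue, and expose the output to the parent request as a
    plain iterable that a WSGI middleware wrapper could consume."""
    q = queue.Queue()

    def worker():
        try:
            run_subrequest(q.put)  # the subrequest "writes" by enqueuing
        finally:
            q.put(_DONE)

    threading.Thread(target=worker).start()
    while True:
        chunk = q.get()
        if chunk is _DONE:
            return
        yield chunk

# Usage with a fake subrequest that writes two chunks:
def fake_subrequest(write):
    write(b"<p>header")
    write(b"</p>")

chunks = list(subrequest_iter(fake_subrequest))
```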

Graham

Re: WSGI and start_response

Benoit Chesneau-3
In reply to this post by PJ Eby
On Thu, Apr 8, 2010 at 4:53 PM, P.J. Eby <[hidden email]> wrote:

> At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:
>>
>> Hi.
>>
>> Some time ago I objected the decision to remove start_response function
>> from next version WSGI, using as rationale the fact that without
>> start_callable, asynchronous extension are impossible to support.
>>
>> Now I have found that removing start_response will also make impossible
>> to support coroutines (or, at least, some coroutines usage).
>>
>> Here is an example (this is the same example I posted few days ago):
>> http://paste.pocoo.org/show/199202/
>>
>> Forgetting about the write callable, the problem is that the application
>> starts to yield data when tmpl.render_unicode function is called.
>>
>> Please note that this has *nothing* to do with asynchronous applications.
>> The code should work with *all* WSGI implementations.
>>
>>
>> In the pasted example, the Mako render_unicode function is "turned" into
>> a generator, with a simple function that allows to flush the current
>> buffer.
>>
>>
>> Can someone else confirm that this code is impossible to support in WSGI
>> 2.0?
>
> I don't understand why it's a problem.  See my previous post here:
>
> http://mail.python.org/pipermail/web-sig/2009-September/003986.html
>
> for a sketch of a WSGI 1-to-2 converter.  It takes a WSGI 1 application
> callable as the input, and returns a WSGI 2 function.
>
Where is the WSGI 2 PEP?  I would like to see it first rather than
seeing different implementations.

- benoit