Server-side async API implementation sketches

Server-side async API implementation sketches

PJ Eby
As a semi-proof-of-concept, I whipped these up:

   http://peak.telecommunity.com/DevCenter/AsyncWSGISketch

It's an expanded version of my Coroutine concept, updated with sample
server code for both a synchronous server and an asynchronous
one.  The synchronous "server" is really just a decorator that wraps
a WSGI2 async app with futures support, and handles pauses by simply
waiting for the future to finish.

The asynchronous server is a bit more hand-wavy, in that there are
some bits (clearly marked) that will be server/framework
dependent.  However, they should be straightforward for a specialist
in any given async framework to implement.

What is *most* handwavy at the moment, however, is in the details of
precisely what one is allowed to "yield to".  I've written the
sketches dealing only with PEP 3148 futures, but sockets were also
proposed, and IMO there should be simple support for obtaining data
from wsgi.input.

However, even this part is pretty easy to extrapolate: both server
examples just add more type-testing branches in their
"base_trampoline()" function, copying and modifying the existing
branches that deal with futures.
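For concreteness, a stripped-down, hypothetical version of the synchronous
trampoline might look like this (`run_coroutine` and `app` are invented
names, not the ones in the linked sketch):

```python
from concurrent.futures import Future, ThreadPoolExecutor

def run_coroutine(gen):
    """Drive a generator-based app, resolving yielded futures by blocking.

    Each isinstance() check corresponds to one type-testing branch in
    base_trampoline(); branches for sockets or wsgi.input would be added
    in the same style.
    """
    value = None
    while True:
        try:
            yielded = gen.send(value)
        except StopIteration:
            return value
        if isinstance(yielded, Future):
            value = yielded.result()  # the "pause": wait for completion
        else:
            value = yielded  # plain values pass straight back in

def app(environ):
    # A trivial async-style app: one executor call, then a final value.
    data = yield environ['wsgi.executor'].submit(lambda: b'hello')
    yield data.upper()

with ThreadPoolExecutor(max_workers=1) as pool:
    result = run_coroutine(app({'wsgi.executor': pool}))

print(result)  # b'HELLO'
```

An async server replaces the blocking `yielded.result()` branch with a
callback registration, but the dispatch loop stays the same shape.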

The entire result is surprisingly compact -- each server weighed in
at about 40 lines, and the common Coroutine class used by both adds
another 60-something lines.

In the limit case, it appears that any WSGI 1 server could provide an
(emulated) async WSGI2 implementation, simply by wrapping WSGI2 apps
with a finished version of the decorator in my sketch.

Or, since users could do it themselves, this would mean that WSGI2
deployment wouldn't be dependent on all server implementers
immediately turning out their own WSGI2 implementations.

True async API implementations would be more involved, of course --
using a WSGI2 decorator on say, Twisted's WSGI1 implementation, would
give you no performance advantages vs. using Twisted's APIs
directly.  But, as soon as someone wrote a Twisted-specific
translation of my async-server sketch, such an app would be portable.

More discussion is still needed, but at this point I'm convinced the
concept is *technically* feasible.  (Whether there's enough need in
the "market" to make it worthwhile, is a separate question.)

_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: Server-side async API implementation sketches

Alex Grönholm-3
08.01.2011 23:16, P.J. Eby wrote:

> What is *most* handwavy at the moment, however, is in the details of
> precisely what one is allowed to "yield to".  I've written the
> sketches dealing only with PEP 3148 futures, but sockets were also
> proposed, and IMO there should be simple support for obtaining data
> from wsgi.input.
>
> [snip]
I'm a bit unclear as to how this will work with async. How do you
propose that an asynchronous application receives the request body?

Re: Server-side async API implementation sketches

Alice Bevan–McGregor
On 2011-01-08 17:22:44 -0800, Alex Grönholm said:

>> On 2011-01-08 13:16:52 -0800, P.J. Eby said:
>>
>> I've written the sketches dealing only with PEP 3148 futures, but
>> sockets were also proposed, and IMO there should be simple support for
>> obtaining data from wsgi.input.
>
> I'm a bit unclear as to how this will work with async. How do you
> propose that an asynchronous application receives the request body?

In my example https://gist.github.com/770743 (which has been simplified
greatly by P.J. Eby in the "Future- and Generator-Based Async Idea"
thread) for dealing with wsgi.input, I have:

    future = environ['wsgi.executor'].submit(environ['wsgi.input'].read, 4096)
    yield future

While ugly, if you were doing this, you'd likely:

    submit = environ['wsgi.executor'].submit
    input_ = environ['wsgi.input']

    future = yield submit(input_.read, 4096)
    data = future.result()

That's a bit nicer to read, and simplifies things if you need to make a
number of async calls.

The idea here is that:

:: Your async server subclasses ThreadPoolExecutor.

:: The subclass overloads the submit method.

:: Your submit method detects bound methods on wsgi.input, sockets, and files.

:: If one of the above is detected, create a mock future that defines
'fd' and 'operation' attributes or similar.

:: When yielding the mock future, your async reactor can detect 'fd'
and do the appropriate thing for your async framework.  (Generally
adding the fd to the appropriate select/epoll/kqueue readers/writers
lists.)

:: When the condition is met, set_running_or_notify_cancel (when
internally reading or writing data), set_result, saving the value, and
return the future (filled with its data) back up to the application.

:: The application accepts the future instance as the return value of
yield, and calls result across it to get the data.  (Obviously writes,
if allowed, won't have data, but reads will.)

I hope that clearly identifies my idea on the subject.  Since async
servers will /already/ be implementing their own executors, I don't see
this as too crazy.

        - Alice.



Re: Server-side async API implementation sketches

Alex Grönholm-3
09.01.2011 04:15, Alice Bevan–McGregor wrote:

> On 2011-01-08 17:22:44 -0800, Alex Grönholm said:
>
> [snip]
>
> The idea here is that:
>
> :: Your async server subclasses ThreadPoolExecutor.
>
> :: The subclass overloads the submit method.
>
> :: Your submit method detects bound methods on wsgi.input, sockets,
> and files.
>
> :: If one of the above is detected, create a mock future that defines
> 'fd' and 'operation' attributes or similar.
>
> :: When yielding the mock future, your async reactor can detect 'fd'
> and do the appropriate thing for your async framework. (Generally
> adding the fd to the appropriate select/epoll/kqueue readers/writers
> lists.)
>
> :: When the condition is met, set_running_or_notify_cancel (when
> internally reading or writing data), set_result, saving the value, and
> return the future (filled with its data) back up to the application.
>
> :: The application accepts the future instance as the return value of
> yield, and calls result across it to get the data. (Obviously writes,
> if allowed, won't have data, but reads will.)
>
> I hope that clearly identifies my idea on the subject. Since async
> servers will /already/ be implementing their own executors, I don't
> see this as too crazy.
-1 on this. Those executors are meant for executing code in a thread
pool. Mandating a magical socket operation filter here would
considerably complicate server implementation.



Re: Server-side async API implementation sketches

PJ Eby
At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:
>09.01.2011 04:15, Alice Bevan–McGregor wrote:
>>I hope that clearly identifies my idea on the subject. Since async
>>servers will /already/ be implementing their own executors, I don't
>>see this as too crazy.
>-1 on this. Those executors are meant for executing code in a thread
>pool. Mandating a magical socket operation filter here would
>considerably complicate server implementation.

Actually, the *reverse* is true.  If you do it the way Alice
proposes, my sketches don't get any more complex, because the
filtering goes in the executor facade or submit function.

Truthfully, I don't really see the point of exposing the map() method
(which is the only other executor method we'd expose), so it probably
makes more sense to just offer a 'wsgi.submit' key...  which can be a
function as follows:

       def submit(callable, *args, **kw):
           ob = getattr(callable, '__self__', None)
           if isinstance(ob, ServerProvidedSocket):  # could be an ABC
               future = MockFuture()
               if callable == ob.read:
                   pass  # set up read callback to fire future
               elif callable == ob.write:
                   pass  # set up write callback to fire future
               return future
           return real_executor.submit(callable, *args, **kw)

Granted, this might be a rather long function.  However, since it's
essentially an optimization, a given server can decide how many
functions can be shortcut in this way.  The spec may wish to offer a
guarantee or recommendation for specific methods of certain
stdlib-provided types (sockets in particular) and wsgi.input.

Personally, I do think it might be *better* to offer extended
operations on wsgi.input that could be used via yield, e.g. "yield
input.nb_read()".  But of course then the trampoline code has to
recognize those values instead of futures.  Either way works, but
somewhere there is going to be some type-testing (explicit or
implicit) taking place to determine how to suspend and resume the app.
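Either convention boils down to a dispatch like the following (`NBRead` is a
hypothetical marker type standing in for whatever `input.nb_read()` would
return; `classify` just names the branches):

```python
from concurrent.futures import Future

class NBRead:
    """Hypothetical marker object that input.nb_read(n) might yield."""
    def __init__(self, nbytes):
        self.nbytes = nbytes

def classify(yielded):
    """The explicit type test a trampoline performs on yielded values."""
    if isinstance(yielded, Future):
        return 'future'
    if isinstance(yielded, NBRead):
        return 'nb_read'
    raise TypeError('unsupported yield: %r' % (yielded,))

done = Future()
done.set_result(b'x')
print(classify(done), classify(NBRead(4096)))  # future nb_read
```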

Note, too, that this complexity also only affects servers that want
to offer a truly async API.  A synchronous server has no reason to
pay particular attention to what's in a future, since it can't offer
any performance improvement.

I do think that this sort of API discussion, though, is the most
dangerous part of trying to do an async spec.  That is, I don't
expect that everyone will spontaneously agree on the exact same
API.  Alice's proposal (simply submitting object methods) has the
advantage of severely limiting the scope of API discussions.  ;-)


Re: Server-side async API implementation sketches

PJ Eby
In reply to this post by Alice Bevan–McGregor
At 06:15 PM 1/8/2011 -0800, Alice Bevan–McGregor wrote:

>On 2011-01-08 17:22:44 -0800, Alex Grönholm said:
>>>On 2011-01-08 13:16:52 -0800, P.J. Eby said:
>>>I've written the sketches dealing only with PEP 3148 futures, but
>>>sockets were also proposed, and IMO there should be simple support
>>>for obtaining data from wsgi.input.
>>I'm a bit unclear as to how this will work with async. How do you
>>propose that an asynchronous application receives the request body?
>
>In my example https://gist.github.com/770743 (which has been
>simplified greatly by P.J. Eby in the "Future- and Generator-Based
>Async Idea" thread) for dealing with wsgi.input, I have:
>
>    future = environ['wsgi.executor'].submit(environ['wsgi.input'].read, 4096)
>    yield future
>
>While ugly, if you were doing this, you'd likely:
>
>    submit = environ['wsgi.executor'].submit
>    input_ = environ['wsgi.input']
>
>    future = yield submit(input_.read, 4096)
>    data = future.result()

I don't quite understand the above -- in my sketch, the above would be:

     data = yield submit(input_.read, 4096)

It looks like your original sketch wants to call .result() on the
future, whereas in my version, the return value of yielding a future
is the result (or an error is thrown if the result was an error).

Is there some reason I'm missing, for why you'd want to explicitly
fetch the result in a separate step?

Meanwhile, thinking about Alex's question, ISTM that if WSGI 2 is
asynchronous, then the wsgi.input object should probably just have
read(), readline() etc. methods that simply return (possibly-mock)
futures.  That's *much* better than having to do all that submit()
crud just to read data from wsgi.input().
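A pre-buffered toy of that shape might look like this (`FutureInput` is a
made-up name; a real server would hand back *pending* futures completed from
its reactor rather than pre-resolved ones):

```python
from concurrent.futures import Future

class FutureInput:
    """Sketch of a wsgi.input whose read() returns futures directly."""

    def __init__(self, body):
        self._body = body
        self._pos = 0

    def read(self, size=-1):
        if size < 0:
            chunk, self._pos = self._body[self._pos:], len(self._body)
        else:
            chunk = self._body[self._pos:self._pos + size]
            self._pos += len(chunk)
        future = Future()
        future.set_result(chunk)  # returns b'' once the body is exhausted
        return future

inp = FutureInput(b'abcdef')
print(inp.read(4).result(), inp.read(4).result(), inp.read(4).result())
# b'abcd' b'ef' b''
```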

OTOH, if you want to use the cgi module to parse a form POST from the
input, you're going to need to write an async version of it in that
case, or else feed the entire operation to an executor...  but then
the methods would need to be synchronous...  *argh*.

I'm starting to not like this idea at all.  Alex has actually
pinpointed a very weak spot in the scheme, which is that if
wsgi.input is synchronous, you destroy the asynchrony, but if it's
asynchronous, you can't use it with any normal code that operates on a stream.

I don't see any immediate fixes for this problem, so I'll let it
marinate in the back of my mind for a while.  This might be the
achilles heel for the whole idea of a low-rent async WSGI.


Re: Server-side async API implementation sketches

Alex Grönholm-3
09.01.2011 05:45, P.J. Eby wrote:

> At 06:15 PM 1/8/2011 -0800, Alice Bevan–McGregor wrote:
>> [snip]
>>
>> While ugly, if you were doing this, you'd likely:
>>
>>     submit = environ['wsgi.executor'].submit
>>     input_ = environ['wsgi.input']
>>
>>     future = yield submit(input_.read, 4096)
>>     data = future.result()
>
> I don't quite understand the above -- in my sketch, the above would be:
>
>     data = yield submit(input_.read, 4096)
>
> It looks like your original sketch wants to call .result() on the
> future, whereas in my version, the return value of yielding a future
> is the result (or an error is thrown if the result was an error).
I cooked up a simple do-nothing middleware example which Alice decorated
with some comments:
https://gist.github.com/771398

A new feature here is that the application itself yields a (status,
headers) tuple and then chunks of the body (or futures).

>
>
> Is there some reason I'm missing, for why you'd want to explicitly
> fetch the result in a separate step?
>
> Meanwhile, thinking about Alex's question, ISTM that if WSGI 2 is
> asynchronous, then the wsgi.input object should probably just have
> read(), readline() etc. methods that simply return (possibly-mock)
> futures.  That's *much* better than having to do all that submit()
> crud just to read data from wsgi.input().
>
> OTOH, if you want to use the cgi module to parse a form POST from the
> input, you're going to need to write an async version of it in that
> case, or else feed the entire operation to an executor...  but then
> the methods would need to be synchronous...  *argh*.
>
> I'm starting to not like this idea at all.  Alex has actually
> pinpointed a very weak spot in the scheme, which is that if wsgi.input
> is synchronous, you destroy the asynchrony, but if it's asynchronous,
> you can't use it with any normal code that operates on a stream.
I liked the idea of having a separate async_read() method in wsgi.input,
which would set the underlying socket in nonblocking mode and return a
future. The event loop would watch the socket and read data into a
buffer and trigger the callback when the given amount of data has been
read. Conversely, .read() would set the socket in blocking mode. What
kinds of problems would this cause?
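For illustration, the shape of that dual-mode API might look like this toy
sketch (`DualModeInput` and both method names are invented; the event loop is
faked with a thread pool rather than a nonblocking socket and reactor):

```python
from concurrent.futures import ThreadPoolExecutor

class DualModeInput:
    """Blocking read() plus async_read() returning a future.

    A real server would put the socket in nonblocking mode and complete
    the future from its event loop once enough data is buffered; here a
    thread pool stands in for that machinery.
    """
    def __init__(self, body, pool):
        self._body, self._pos, self._pool = body, 0, pool

    def read(self, size):  # blocking flavour
        chunk = self._body[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

    def async_read(self, size):  # nonblocking flavour -> future
        return self._pool.submit(self.read, size)

with ThreadPoolExecutor(max_workers=1) as pool:
    inp = DualModeInput(b'hello world', pool)
    print(inp.read(5), inp.async_read(6).result())  # b'hello' b' world'
```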

>
>
> I don't see any immediate fixes for this problem, so I'll let it
> marinate in the back of my mind for a while.  This might be the
> achilles heel for the whole idea of a low-rent async WSGI.


Re: Server-side async API implementation sketches

Alice Bevan–McGregor
On 2011-01-08 20:06:19 -0800, Alex Grönholm said:

> I liked the idea of having a separate async_read() method in
> wsgi.input, which would set the underlying socket in nonblocking mode
> and return a future. The event loop would watch the socket and read
> data into a buffer and trigger the callback when the given amount of
> data has been read. Conversely, .read() would set the socket in
> blocking mode. What kinds of problems would this cause?

Manipulating the underlying socket is potentially dangerous
(pipelining) and, in fact, not possible AFAIK while being
PEP444-compliant.  When the request body is fully consumed, additional
attempts to read _must_ return empty strings.  Thus raw sockets are
right out at a high level; internal to the reactor this may be
possible, however.  It'd be interesting to adapt marrow.io to using
futures in this way as an experiment.

OTOH, if you utilize callbacks extensively (as m.s.http does) you run
into the problem of data passing.  Your application is called (wrapped
in middleware), sets up some futures and callbacks, then returns.  No
returned data.  Middleware just got shot in the foot.  The server,
also, got shot in the foot.  How can it get a response tuple back from
a callback?  How can middleware be utilized?  That's a weird problem to
wrap my head around.  Blocking the application pending the results of
various socket operations is something that would have to be mandated
to avoid this issue.  :/

Multiple in-flight reads would also be problematic; you may end up with
buffer interleaving issues.  (e.g. job A reads 128 bytes at a time and
has been requested to return 4KB, job B does the same... what happens
to the data?)  Then you begin to involve locking...

Notice that my write_body method [1] writes using async, passing the
iterable to the callback which is itself.  This is after-the-fact
(after the request has been returned) and is A-OK, though would need to
be updated heavily to support the ideas of async floating around right
now.  I'm also extremely careful to never have multiple async callbacks
pending (and thus never have multiple "jobs" for a single connection
working at once).

        - Alice.

[1]
https://github.com/pulp/marrow.server.http/blob/draft/marrow/server/http/protocol.py#L313-332 




Re: Server-side async API implementation sketches

Alice Bevan–McGregor
In reply to this post by PJ Eby
On 2011-01-08 19:34:41 -0800, P.J. Eby said:

> At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:
>> 09.01.2011 04:15, Alice Bevan–McGregor wrote:
>>> I hope that clearly identifies my idea on the subject. Since async
>>> servers will /already/ be implementing their own executors, I
>>> don't see this as too crazy.
>> -1 on this. Those executors are meant for executing code in a
>> thread pool. Mandating a magical socket operation filter here
>> would considerably complicate server implementation.
>
> Actually, the *reverse* is true.  If you do it the way Alice proposes,
> my sketches don't get any more complex, because the filtering goes in
> the executor facade or submit function.

Indeed; the executor is what then adds the file descriptor to the
underlying server async reactor (select/epoll/kqueue/other).  In the
case of the Marrow server, this would utilize a reactor callback (some
might say "deferred") to update the Future instance with the data,
setting completion status, executing callbacks, etc.  One might even be
able to use a threading.Event (or whatever is the opposite of a lock)
to wake up blocking .result() calls, even if not multi-threaded
(greenthreads, etc.).

Of course, adding the file descriptor to a pure async reactor then
.result() blocking on it from your application would result in a
deadlock; the .result() would never complete as the reactor would never
get a chance to perform the pending request.  (This is why Marrow
requires threading be enabled globally before adding an executor to the
environment; this requires rather explicit documentation.)  This
problem is solved completely by yielding the future instance (pausing
the application) to let the reactor do its thing.  (Yielding the future
becomes a replacement for the blocking behaviour of future.result().)

Effectively what I propose adds emulation of threading on top of async
by mutating an Executor.  (The Executor would be a mixed
threading+async executor.)

I suggest bubbling a future back up the yield stack instead of the
actual result to allow the application (or middleware, or whatever
happened to yield the future) to capture exceptions generated by the
future'd request.  Bubbling the future instance avoids excessive
exception handling cruft in each middleware layer; and I see no real
issue with this.  AFAIK, you can use a shorthand (possibly wrapped in a
try: block) if all you care about is the result:

    data = (yield my_future).result()

> Truthfully, I don't really see the point of exposing the map() method
> (which is the only other executor method we'd expose), so it probably
> makes more sense to just offer a 'wsgi.submit' key... which can be a
> function as follows: [snip]

True; the executor itself could easily be hidden behind the filter.  In
a multi-threaded environment, however, the map call poses no problem,
and can be quite useful.  (E.g. with one of my use cases for inclusion
of an executor in the environment: image scaling.)

> Granted, this might be a rather long function.  However, since it's
> essentially an optimization, a given server can decide how many
> functions can be shortcut in this way.  The spec may wish to offer a
> guarantee or recommendation for specific methods of certain
> stdlib-provided types (sockets in particular) and wsgi.input.

+1

> Personally, I do think it might be *better* to offer extended
> operations on wsgi.input that could be used via yield, e.g. "yield
> input.nb_read()".  But of course then the trampoline code has to
> recognize those values instead of futures.

Because wsgi.input is provided by the server, and the executor is
provided by the server, is there a reason why these extended functions
couldn't return... futures?  :)

> Note, too, that this complexity also only affects servers that want to
> offer a truly async API.  A synchronous server has no reason to pay
> particular attention to what's in a future, since it can't offer any
> performance improvement.

I feel a sync server and async server should provide the same API for
accessing the input.  E.g. the application/middleware must be agnostic
to the server in this regard.  This is why a little bit of magic goes a
long way.  The following code would work on any WSGI2 stack that offers
an executor (sync, async, or provided by middleware):

    data = (yield env['wsgi.submit'](env['wsgi.input'].read, 4096)).result()

In a sync server, the blocking read would execute in another thread.  
In an async one appropriate actions would be taken to request a socket
read from the client.  Both cases pause the application pending the
result.  (If you don't immediately yield the future the behaviour
between servers is the same!)

> I do think that this sort of API discussion, though, is the most
> dangerous part of trying to do an async spec.  That is, I don't expect
> that everyone will spontaneously agree on the exact same API.  Alice's
> proposal (simply submitting object methods) has the advantage of
> severely limiting the scope of API discussions.  ;-)

Since each async server will either implement or utilize a specific
async framework, each will offer its own "async-supported" featureset.  
What I mean is that all servers should make wsgi.input calls
async-able, some would go further to make all socket calls async.  Some
might go even further than that and define an API for external
libraries (e.g. DBs) to be truly cooperatively async.  I do believe my
solution is flexible enough for the majority of use cases, and where it
isn't (i.e. would block), "abusing" futures in this way will allow an
application to reasonably fake async without killing the performance of
async servers (which are internally single-threaded) by delegating
blocking calls.

I will have to experiment with determining the type of the class
instance a method is bound to from the bound method itself; this is the
crux of the implementation I suggest.  If you can't get that, the idea
is pooched for anything but wsgi.input which the server would have a
direct reference to anyway.
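That introspection turns out to be straightforward via the `__self__`
attribute, for both Python-level and built-in bound methods (`owner_of` is
just an illustrative helper name):

```python
import socket

def owner_of(method):
    """Recover the instance behind a bound method, or None."""
    return getattr(method, '__self__', None)

class Input:
    def read(self, n):
        return b''

inp = Input()
sock = socket.socket()

def plain(x):
    return x

print(owner_of(inp.read) is inp,    # True: Python-level bound method
      owner_of(sock.recv) is sock,  # True: built-in bound method
      owner_of(plain))              # None: plain functions lack __self__
sock.close()
```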

I hope the clarity of this post didn't degenerate too much over the few
hours I had it open and noodling around.

        - Alice.



Re: Server-side async API implementation sketches

Alice Bevan–McGregor
In reply to this post by PJ Eby
On 2011-01-08 13:16:52 -0800, P.J. Eby said:

> In the limit case, it appears that any WSGI 1 server could provide an
> (emulated) async WSGI2 implementation, simply by wrapping WSGI2 apps
> with a finished version of the decorator in my sketch.
>
> Or, since users could do it themselves, this would mean that WSGI2
> deployment wouldn't be dependent on all server implementers immediately
> turning out their own WSGI2 implementations.

This, if you'll pardon my language, is bloody awesome.  :D  That would
strongly drive adoption of WSGI2.  Note that adapting a WSGI1
application to WSGI2 server would likewise be very handy, and I
suspect, even easier to implement.

        - Alice.



Re: Server-side async API implementation sketches

exarkun
In reply to this post by Alice Bevan–McGregor
On 11:36 am, [hidden email] wrote:

>On 2011-01-08 19:34:41 -0800, P.J. Eby said:
>>At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:
>>>09.01.2011 04:15, Alice Bevan–McGregor wrote:
>>>>I hope that clearly identifies my idea on the subject. Since async
>>>>servers will /already/ be implementing their own executors, I
>>>>don't see this as too crazy.
>>>-1 on this. Those executors are meant for executing code in a thread
>>>pool. Mandating a magical socket operation filter here would
>>>considerably complicate server implementation.
>>
>>Actually, the *reverse* is true.  If you do it the way Alice proposes,
>>my sketches don't get any more complex, because the filtering goes in
>>the executor facade or submit function.
>
>Indeed; the executor is what then adds the file descriptor to the
>underlying server async reactor (select/epoll/kqueue/other).  In the
>case of the Marrow server, this would utilize a reactor callback (some
>might say "deferred") to [...]
Don't say it if it's not true.  Deferreds aren't tied to a reactor, and
Marrow doesn't appear to have anything called "deferred".  So this
parallel to Twisted's Deferred is misleading and confusing.
>
>Since each async server will either implement or utilize a specific
>async framework, each will offer its own "async-supported" featureset.
>What I mean is that all servers should make wsgi.input calls async-
>able, some would go further to make all socket calls async.  Some might
>go even further than that and define an API for external libraries
>(e.g. DBs) to be truly cooperatively async.

I think this effort would benefit from more thought on how exactly
accessing this external library support will work.  If async wsgi is
limited to performing a single read asynchronously, then it hardly seems
compelling.

Jean-Paul


Re: Server-side async API implementation sketches

Alice Bevan–McGregor
On 2011-01-09 07:04:49 -0800,
[hidden email] said:

> Don't say it if it's not true.  Deferreds aren't tied to a reactor, and
> Marrow doesn't appear to have anything called "deferred".  So this
> parallel to Twisted's Deferred is misleading and confusing.

It was merely a comparison to the "you schedule something, attach some
callbacks to it, and when it's finished your callbacks get executed"
feature.  I did not mention Twisted; also:

:: defer - postpone: hold back to a later time; "let's postpone the exam"

:: deferred - postponed: put off until a later time; "surgery has been
postponed"

Futures are very similar to deferreds with the one difference you
mention: future instances are created by the executor/reactor and are
(possibly) the internal representation instead of Twisted treating the
Deferred as the executor in terms of registering calls.  In most other
ways, they share the same goals, and similar methods, even.

Marrow's "deferred calls" code is buried in marrow.io, with IOStreams
accepting callbacks as part of the standard read/write calls and
registering these internally.  IOStream then performs read/writes
across the raw sockets utilizing callbacks from the IOLoop reactor.  
When an IOStream meets its criteria (e.g. written all of the requested
data, read a number of bytes >= the requested count, or read until a
marker has appeared in the stream, e.g. \r\n) IOLoop then executes the
callbacks registered with it, passing the data, if any.

I will likely expand this to include additional criteria and callback hooks.

IOStream, in this way, acts more like Twisted Deferreds than Futures.

> I think this effort would benefit from more thought on how exactly
> accessing this external library support will work.  If async wsgi is
> limited to performing a single read asynchronously, then it hardly
> seems compelling.

There appears to be a misunderstanding over how futures work.  Please
read PEP 3148 [1] carefully.  While there's not much there, here's the
gist: the executor schedules the callable passed to submit.  If the
"worker pool" is full, the underlying pooling mechanism will delay
execution of the callable until a slot is freed.  Pool and slot are
defined, by example only, as thread or process pools, but are not
restricted to such.

(There are three relevant classes defined by concurrent.futures:
Executor, ProcessPoolExecutor, and ThreadPoolExecutor.  Again, as long
as you implement the Executor duck-typed interface, you're good to go
and compliant to PEP 3148, regardless of underlying mechanics.)

If a "slot" is available at the moment of submission, the callable has
a reasonable expectation of being immediately executed.  The
future.result() method merely blocks awaiting completion of the already
running, not yet running, or already completed future.  If already
completed (a la the future sent back up to the application after
yielding it), the call to result is non-blocking / immediate.

Yielding the future is simply a way of safely "blocking" (usually done
by calling .result() before the future is complete), not some absolute
requirement for the future itself to run.  The future (and thus async
socket calls et al.) can, and should, be scheduled with the underlying
async reactor in the call to submit().
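The lifecycle described above can be sketched with the stdlib executor.
This is illustrative only: `load_resource`, the environ-less call, and
the single-future app shape are made-up stand-ins, not part of any spec.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)

def load_resource(path):
    # Stand-in for a blocking operation (file read, DB query, ...).
    return "contents of " + path

def app(environ):
    # submit() starts the work as soon as a worker slot is free;
    # yielding the future only tells the server when to resume us.
    future = executor.submit(load_resource, "/tmp/example")
    yield future
    # The server resumes us only after completion, so result()
    # returns immediately instead of blocking.
    body = future.result().encode("utf-8")
    yield "200 OK", [("Content-Type", "text/plain")], [body]

# A synchronous "server" just waits on each yielded future:
gen = app({})
future = next(gen)
future.result()                     # block until the work is done
status, headers, body = gen.send(None)
```

An async server would instead attach a callback to the yielded future
and resume the generator from its reactor.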

        - Alice.

[1] http://www.python.org/dev/peps/pep-3148/



Re: Server-side async API implementation sketches

Alice Bevan–McGregor
In reply to this post by exarkun
On 2011-01-09 07:04:49 -0800,
[hidden email] said:
> I think this effort would benefit from more thought on how exactly
> accessing this external library support will work.  If async wsgi is
> limited to performing a single read asynchronously, then it hardly
> seems compelling.

Apologies if the last e-mail was too harsh; I'm about to go to bed, and
it's been a long night/morning.  ;)

Here's a proposed solution: a generator API on top of futures.

If the async server implementing the executor can detect a generator
being submitted, then:

:: The executor accepts the generator and begins iteration (passing the
executor and the arguments supplied to submit).

:: The generator is expected to be /fast/.

:: The generator does work until it needs an operation over a file
descriptor, at which point it yields the fd and the operation (say,
'r', or 'w').

:: The executor schedules with the async reactor the generator to be
re-called when the operation is possible.

:: The Future is considered complete when the generator raises
StopIteration (GeneratorExit is only raised into a generator by
close()); the exception's first argument is used as the return value
of the Future.

Yielding a 2-tuple of readers/writers would work, too, and allow for
more concurrent utilization of sockets, though I'm not sure of the use
cases for this.  If so, the generator would be woken up when any of the
readers or writers are available and sent() a 2-tuple of
available_readers, available_writers.

The executor is passed along for any operations the generator cannot
accomplish safely without threads, and the executor, as it's running
through the generator, will accomplish the same semantics as iterating
the WSGI application: if a future instance is yielded, the generator is
suspended until the future is complete, allowing heavy processing to be
mixed with async calls in a fully async server.

The wsgi.input operations can be implemented this way, as can database
operations and pretty much anything that uses sockets, pipes, or
on-disk files.  In fact, the WSGI application -itself- could be called
in this way (with the omission of the executor or a simple wrapper that
saves the executor into the environ).
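A minimal trampoline for this protocol might look as follows.  A
blocking select() stands in for a real reactor, and `drive` is a
hypothetical name: the real executor would register the descriptor with
its event loop and resume the generator from a callback instead.

```python
import os
import select

def drive(gen):
    """Run a generator that yields (fd, 'r'|'w') pairs whenever it
    must wait on a file descriptor.  Sketch only: a real async
    executor would register fd with select/epoll/kqueue and resume
    the generator from the reactor rather than blocking here."""
    try:
        fd, op = next(gen)
        while True:
            select.select([fd] if op == 'r' else [],
                          [fd] if op == 'w' else [], [])
            fd, op = gen.send(None)      # resume; wait for the next fd
    except StopIteration:
        pass  # generator finished; a real executor resolves the Future

# Example: a generator that suspends until a pipe becomes readable.
r, w = os.pipe()
os.write(w, b"hi")

def reader(chunks):
    yield (r, 'r')                       # suspend until r is readable
    chunks.append(os.read(r, 2))

chunks = []
drive(reader(chunks))
```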

Just a quick thought before running off to bed.

        - Alice.



Re: Server-side async API implementation sketches

PJ Eby
In reply to this post by Alex Grönholm-3
At 06:06 AM 1/9/2011 +0200, Alex Grönholm wrote:
>A new feature here is that the application itself yields a (status,
>headers) tuple and then chunks of the body (or futures).

Hm.  I'm not sure if I like that.  The typical app developer really
shouldn't be yielding multiple body strings in the first place.  I
much prefer that the canonical example of a WSGI app just return a
list with a single bytestring -- preferably in a single statement for
the entire return operation, whether it's a yield or a return.

IOW, I want it to look like the normal way to do things is to just
return the whole response at once, and use the additional difficulty
of creating a second iterator to discourage people writing iterated
bodies when they should just write everything to a BytesIO and be done with it.

Also, it makes middleware simpler: the last line can just yield the
result of calling the app, or a modified version, i.e.:

     yield app(environ)

or:

     s, h, b = app(environ)
     # ... modify or replace s, h, b
     yield s, h, b

In your approach, the above samples have to be rewritten as:

     return app(environ)

or:

     result = app(environ)
     s, h = yield result
     # ... modify or replace s, h
     yield s, h

     for data in result:
          # modify b as we go
          yield result

Only that last bit doesn't actually work, because you have to be able
to send future results back *into* the result.  Try actually making
some code that runs on this protocol and yields to futures during the
body iteration.

Really, this modified protocol can't work with a full async API the
way my coroutine-based version does, AND the middleware is much more
complicated.  In my version, your do-nothing middleware looks like this:


class NullMiddleware(object):
     def __init__(self, app):
         self.app = app

     def __call__(environ):
         # ACTION: pre-application environ mangling

         s, h, body = yield self.app(environ)

         # modify or replace s, h, body here

         yield s, h, body


If you want to actually process the body in some way, it looks like:

class NullMiddleware(object):

     def __init__(self, app):
         self.app = app

     def __call__(environ):
         # ACTION: pre-application environ mangling

         s, h, body = yield self.app(environ)

         # modify or replace s, h, body here

         yield s, h, self.process(body)

     def process(self, body_iter):
         while True:
             chunk = yield body_iter
             if chunk is None:
                 break
             # process/modify chunk here
             yield chunk

And that's still a lot simpler than your sketch.

Personally, I would write both of the above as:

     def null_middleware(app):

         def wrapped(environ):
             # ACTION: pre-application environ mangling
             s, h, body = yield app(environ)

             # modify or replace s, h, body here
             yield s, h, process(body)

         def process(body_iter):
             while True:
                 chunk = yield body_iter
                 if chunk is None:
                     break
                 # process/modify chunk here
                 yield chunk

         return wrapped

But that's just personal taste.  Even as a class, it's much easier to
write.  The above middleware pattern works with the sketches I gave
on the PEAK wiki, and I've now updated the wiki to include an example
app and middleware for clarity.

Really, the only hole in this approach is dealing with applications
that block.  The elephant in the room here is that while it's easy to
write these example applications so they don't block, in practice
people read files and do database queries and whatnot in their
requests, and those APIs are generally synchronous.  So, unless they
somehow fold their entire application into a future, it doesn't work.


>I liked the idea of having a separate async_read() method in
>wsgi.input, which would set the underlying socket in nonblocking
>mode and return a future. The event loop would watch the socket and
>read data into a buffer and trigger the callback when the given
>amount of data has been read. Conversely, .read() would set the
>socket in blocking mode. What kinds of problems would this cause?

That you could never *call* the .read() method outside of a future,
or else you would block the server, thereby obliterating the point of
having the async API in the first place.


Re: Server-side async API implementation sketches

PJ Eby
In reply to this post by Alice Bevan–McGregor
At 04:25 AM 1/9/2011 -0800, Alice Bevan–McGregor wrote:

>On 2011-01-08 13:16:52 -0800, P.J. Eby said:
>
>>In the limit case, it appears that any WSGI 1 server could provide
>>an (emulated) async WSGI2 implementation, simply by wrapping WSGI2
>>apps with a finished version of the decorator in my sketch.
>>Or, since users could do it themselves, this would mean that WSGI2
>>deployment wouldn't be dependent on all server implementers
>>immediately turning out their own WSGI2 implementations.
>
>This, if you'll pardon my language, is bloody awesome.  :D  That
>would strongly drive adoption of WSGI2.  Note that adapting a WSGI1
>application to WSGI2 server would likewise be very handy, and I
>suspect, even easier to implement.

I very much doubt that.  You'd need greenlets or a thread with a
communication channel in order to support WSGI 1 apps that use write() calls.

By the way, I don't really see the point of the new sketches you're
doing, as they aren't nearly as general as the one I've already done,
but still have the same fundamental limitation: wsgi.input.

If wsgi.input offers any synchronous methods, then they must be used
from a future and must somehow raise an error when called from within
the application -- otherwise it would block, nullifying the point of
having a generator-based API.

If it offers only asynchronous methods, OTOH, then you can't pass
wsgi.input to any existing libraries (e.g. the cgi module).

The latter problem is the worse one, because it means that the
translation of an app between my original WSGI2 API and the current
sketch is no longer just "replace 'return' with 'yield'".

The only way this would work is if WSGI applications are still
allowed to be written in a blocking style.  Greenlet-based frameworks
would have no problem with this, of course, but servers like Twisted
would still have to run WSGI apps in a worker thread pool, just
because they *might* block.

If we're okay with this as a limitation, then adding _async method
variants that return futures might work, and we can proceed from there.

Mostly, though, it seems to me that the need to be able to write
blocking code does away with most of the benefit of trying to have a
single API in the first place.  Either everyone ends up putting their
whole app into a future, or else the server has to accept that the
app could block... and put it into a future for them.  ;-)

So, the former case will be unacceptable to app developers who don't
feel a need for async code, and the latter doesn't seem to offer
anything to the developers of non-blocking servers.

(The exception to these conditions, of course, are greenlet-based
servers, but they can run WSGI *1* apps in a non-blocking way, and so
have no need for a new protocol.)


Re: Server-side async API implementation sketches

Alex Grönholm-3
In reply to this post by PJ Eby
09.01.2011 19:03, P.J. Eby wrote:
> At 06:06 AM 1/9/2011 +0200, Alex Grönholm wrote:
>> A new feature here is that the application itself yields a (status,
>> headers) tuple and then chunks of the body (or futures).
>
> Hm.  I'm not sure if I like that.  The typical app developer really
> shouldn't be yielding multiple body strings in the first place.  I
> much prefer that the canonical example of a WSGI app just return a
> list with a single bytestring -- preferably in a single statement for
> the entire return operation, whether it's a yield or a return.
Uh, so don't yield multiple body strings then? How is that so difficult?
>
>
> IOW, I want it to look like the normal way to do things is to just
> return the whole response at once, and use the additional difficulty of
> creating a second iterator to discourage people writing iterated
> bodies when they should just write everything to a BytesIO and be done
> with it.
I fail to understand why a second iterator is necessary when we can get
away with just one.

>
>
> Also, it makes middleware simpler: the last line can just yield the
> result of calling the app, or a modified version, i.e.:
>
>     yield app(environ)
>
> or:
>
>     s, h, b = app(environ)
>     # ... modify or replace s, h, b
>     yield s, h, b
Asynchronous applications may not be ready to send the status line as
the first thing coming out of the generator. Consider an app that
receives a file. The first thing coming out of the app is a future. The
app needs to receive the entire file until it can determine what status
line to send. Maybe there was an I/O error writing the file, so it needs
to send a 500 response instead of 200. This is not possible with a body
iterator, and if we are already iterating the application generator, I
really don't understand why the body needs to be an iterator as well.

>
>
> In your approach, the above samples have to be rewritten as:
>
>     return app(environ)
>
> or:
>
>     result = app(environ)
>     s, h = yield result
>     # ... modify or replace s, h
>     yield s, h
>
>     for data in result:
>          # modify b as we go
>          yield result
>
> Only that last bit doesn't actually work, because you have to be able
> to send future results back *into* the result.  Try actually making
> some code that runs on this protocol and yields to futures during the
> body iteration.
Did you miss the gist posted by myself (and improved by Alice)?

>
> Really, this modified protocol can't work with a full async API the
> way my coroutine-based version does, AND the middleware is much more
> complicated.  In my version, your do-nothing middleware looks like this:
>
>
> class NullMiddleware(object):
>     def __init__(self, app):
>         self.app = app
>
>     def __call__(environ):
>         # ACTION: pre-application environ mangling
>
>         s, h, body = yield self.app(environ)
>
>         # modify or replace s, h, body here
>
>         yield s, h, body
>
>
> If you want to actually process the body in some way, it looks like:
>
> class NullMiddleware(object):
>
>     def __init__(self, app):
>         self.app = app
>
>     def __call__(environ):
>         # ACTION: pre-application environ mangling
>
>         s, h, body = yield self.app(environ)
>
>         # modify or replace s, h, body here
>
>         yield s, h, self.process(body)
>
>     def process(self, body_iter):
>         while True:
>             chunk = yield body_iter
>             if chunk is None:
>                 break
>             # process/modify chunk here
>             yield chunk
>
> And that's still a lot simpler than your sketch.
>
> Personally, I would write both of the above as:
>
>     def null_middleware(app):
>
>         def wrapped(environ):
>             # ACTION: pre-application environ mangling
>             s, h, body = yield app(environ)
>
>             # modify or replace s, h, body here
>             yield s, h, process(body)
>
>         def process(body_iter):
>             while True:
>                 chunk = yield body_iter
>                 if chunk is None:
>                     break
>                 # process/modify chunk here
>                 yield chunk
>
>         return wrapped
>
> But that's just personal taste.  Even as a class, it's much easier to
> write.  The above middleware pattern works with the sketches I gave on
> the PEAK wiki, and I've now updated the wiki to include an example app
> and middleware for clarity.
>
> Really, the only hole in this approach is dealing with applications
> that block.  The elephant in the room here is that while it's easy to
> write these example applications so they don't block, in practice
> people read files and do database queries and whatnot in their
> requests, and those APIs are generally synchronous.  So, unless they
> somehow fold their entire application into a future, it doesn't work.
>
>
>> I liked the idea of having a separate async_read() method in
>> wsgi.input, which would set the underlying socket in nonblocking mode
>> and return a future. The event loop would watch the socket and read
>> data into a buffer and trigger the callback when the given amount of
>> data has been read. Conversely, .read() would set the socket in
>> blocking mode. What kinds of problems would this cause?
>
> That you could never *call* the .read() method outside of a future, or
> else you would block the server, thereby obliterating the point of
> having the async API in the first place.
>
Outside of the application/middleware you mean? I hope there isn't any
more confusion left about what a future is. The fact is that you cannot
use synchronous API calls directly from an async app no matter what.
Some workaround is always necessary.



Re: Server-side async API implementation sketches

PJ Eby
At 08:09 PM 1/9/2011 +0200, Alex Grönholm wrote:
>Asynchronous applications may not be ready to send the status line
>as the first thing coming out of the generator.

So?  In the sketches that are the subject of this thread, it doesn't
have to be the first thing.  If the application yields a future
first, it will be paused...  and so will the middleware.  When this
line is executed in the middleware:

         status, headers, body = yield app(environ)

...the middleware is paused until the application actually yields its
response tuple.

Specifically, this yield causes the app iterator to be pushed on the
Coroutine object's .stack attribute, then iterated.  If the
application yields a future, the server suspends the whole thing
until it gets called back, at which point it .send()s the result back
into the app iterator.

The app iterator then yields its response, which is tagged as a
return value, so the app is popped off the .stack, and the response
is sent via .send() into the middleware, which then proceeds as if
nothing happened in the meantime.  It then yields *its* response, and
whatever body iterator is given gets put into a second coroutine that
proceeds similarly.

When the process_response() part of the middleware does a "yield
body_iter", the body iterator is pushed, and the middleware is paused
until the body iterator yields a chunk.  If the body yields a future,
the whole process is suspended and resumed.  The middleware won't be
resumed until the body yields another chunk, at which point it is
resumed.  If it yields a chunk of its own, then that's passed up to
any response-processing middleware further up the stack.

In contrast, middleware based on the 2+body protocol cannot process a
body without embedding coroutine management into the middleware
itself.   For example, you can't write a standalone body processor
function, and reuse it inside of two pieces of middleware, without
doing a bunch of send()/throw() logic to make it work.
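The push/pop mechanics described above can be condensed into a toy
trampoline.  This deliberately ignores futures and treats any
non-generator yield as that level's "return value"; the actual sketches
on the PEAK wiki add more type-testing branches.  All names here are
illustrative, not the wiki's API.

```python
import types

def run(gen):
    """Toy Coroutine stack: yielding a generator 'calls' it;
    yielding anything else 'returns' that value to the caller
    one level up the stack."""
    stack, value = [gen], None
    while True:
        try:
            yielded = stack[-1].send(value)
        except StopIteration:
            stack.pop()
            if not stack:
                return value
            continue
        if isinstance(yielded, types.GeneratorType):
            stack.append(yielded)        # push: iterate the callee
            value = None
        else:
            stack.pop()                  # pop: tagged as a return value
            if not stack:
                return yielded
            value = yielded              # send() it into the caller

# The middleware pauses at "yield app(environ)" until the app
# yields its response tuple, exactly as described above.
def app(environ):
    yield ("200 OK", [], [b"hello"])

def middleware(environ):
    s, h, b = yield app(environ)
    yield (s, h, b + [b" world"])

status, headers, body = run(middleware({}))
```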


>Outside of the application/middleware you mean? I hope there isn't
>any more confusion left about what a future is. The fact is that you
>cannot use synchronous API calls directly from an async app no
>matter what. Some workaround is always necessary.

Which pretty much kills the whole idea as being a single, universal
WSGI protocol, since most people don't care about async.


Re: Server-side async API implementation sketches

Alex Grönholm-3
09.01.2011 22:56, P.J. Eby wrote:

> At 08:09 PM 1/9/2011 +0200, Alex Grönholm wrote:
>> Asynchronous applications may not be ready to send the status line as
>> the first thing coming out of the generator.
>
> So?  In the sketches that are the subject of this thread, it doesn't
> have to be the first thing.  If the application yields a future first,
> it will be paused...  and so will the middleware.  When this line is
> executed in the middleware:
>
>         status, headers, body = yield app(environ)
>
> ...the middleware is paused until the application actually yields its
> response tuple.
>
> Specifically, this yield causes the app iterator to be pushed on the
> Coroutine object's .stack attribute, then iterated.  If the
> application yields a future, the server suspends the whole thing until
> it gets called back, at which point it .send()s the result back into
> the app iterator.
>
> The app iterator then yields its response, which is tagged as a return
> value, so the app is popped off the .stack, and the response is sent
> via .send() into the middleware, which then proceeds as if nothing
> happened in the meantime.  It then yields *its* response, and whatever
> body iterator is given gets put into a second coroutine that proceeds
> similarly.
>
> When the process_response() part of the middleware does a "yield
> body_iter", the body iterator is pushed, and the middleware is paused
> until the body iterator yields a chunk.  If the body yields a future,
> the whole process is suspended and resumed.  The middleware won't be
> resumed until the body yields another chunk, at which point it is
> resumed.  If it yields a chunk of its own, then that's passed up to
> any response-processing middleware further up the stack.
>
> In contrast, middleware based on the 2+body protocol cannot process a
> body without embedding coroutine management into the middleware
> itself.   For example, you can't write a standalone body processor
> function, and reuse it inside of two pieces of middleware, without
> doing a bunch of send()/throw() logic to make it work.
Some boilerplate code was necessary in WSGI 1 middleware too. Alice's
cleaned up example didn't look too bad, and it would not require that
Coroutine stack at all.

I think that at this point both sides need to present some code that
really works, and those implementations could then be compared. The
examples so far have been a bit too abstract to be fairly evaluated.
>
>
>> Outside of the application/middleware you mean? I hope there isn't
>> any more confusion left about what a future is. The fact is that you
>> cannot use synchronous API calls directly from an async app no matter
>> what. Some workaround is always necessary.
>
> Which pretty much kills the whole idea as being a single, universal
> WSGI protocol, since most people don't care about async.
I'm confused. Did you not know this? If so, why then were you at least
initially receptive to the idea?
Personally I don't think that this is a big problem. Async apps will
always have to take care not to block the reactor unreasonably long, and
that is never going to change. Synchronous apps just need to follow the
protocol, but beyond that they shouldn't have to care about the async
side of things.
>
>


Re: Server-side async API implementation sketches

Alice Bevan–McGregor
In reply to this post by PJ Eby
On 2011-01-09 09:26:19 -0800, P.J. Eby said:
> By the way, I don't really see the point of the new sketches you're doing...

I'm sorry.

> ...as they aren't nearly as general as the one I've already done, but
> still have the same fundamental limitation: wsgi.input.

You missed the point entirely, then.

> If wsgi.input offers any synchronous methods...

Regardless of whether or not wsgi.input is implemented in an async way,
wrap it in a future and eventually get around to yielding it.  Problem
/solved/.  Identical APIs for both sync and async, and if you have an
async server but haven't gotten around to implementing your own
executor yet, wrapping the blocking read call in a future also solves
the problem (albeit not in the most efficient way).

I.e. wrap every call to a wsgi.input method by passing it to wsgi.submit.
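Concretely, that might look like this, with 'wsgi.submit' as a
hypothetical environ key exposing the server's executor (the key name
and the single-read app are illustrative, not a settled spec):

```python
import io
from concurrent.futures import ThreadPoolExecutor

def app(environ):
    submit = environ['wsgi.submit']          # hypothetical environ key
    # Whether wsgi.input blocks underneath or not, the read is wrapped
    # in a future; an async server's executor can schedule it with the
    # reactor, a sync server can run it in a thread -- same app code.
    future = submit(environ['wsgi.input'].read, 4096)
    yield future
    data = future.result()                   # already complete here
    yield "200 OK", [("Content-Type", "text/plain")], [data]

# Emulating a synchronous server driving the app:
executor = ThreadPoolExecutor(max_workers=1)
environ = {'wsgi.submit': executor.submit,
           'wsgi.input': io.BytesIO(b"payload")}
gen = app(environ)
future = next(gen)
future.result()                              # wait, as a sync server would
status, headers, body = gen.send(None)
```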

> ...then they must be used from a future and must somehow raise an
> error when called from within the application -- otherwise it would
> block, nullifying the point of having a generator-based API.

See above.  No extra errors, nothing really that insane.

> If it offers only asynchronous methods, OTOH, then you can't pass
> wsgi.input to any existing libraries (e.g. the cgi module).

Describe to me how a function can be suspended (other than magical
greenthreads) if it does not yield; if I knew this, maybe I wouldn't be
so confused.

> The latter problem is the worse one, because it means that the
> translation of an app between my original WSGI2 API and the current
> sketch is no longer just "replace 'return' with 'yield'".

I've deviated from your sketch, obviously, and any semblance of
yielding a 3-tuple.  Stop thinking of my example code as conforming to
your ideas; it's a new idea, or, worst case, a narrowing of an idea
into its simplest form.

> The only way this would work is if WSGI applications are still allowed
> to be written in a blocking style.  Greenlet-based frameworks would
> have no problem with this, of course, but servers like Twisted would
> still have to run WSGI apps in a worker thread pool, just because they
> *might* block.

Then that is not acceptable and "would not work".  The mechanics of
yielding futures instances allows you to (in your server) implement the
necessary async code however you wish while providing a uniform
interface to both sync and async applications running on sync and async
servers.  In fact, you would be able to safely run a sync application
on an async server and vice-versa.  You can, on an async server:

:: Add a callback to the yielded future to re-schedule the application
generator.

:: If using greenthreads, just block on future.result() then
immediately wake up the application generator.

:: Do other things I can't think of because I'm still waking up.

The first solution is how Marrow HTTPd would operate.

> If we're okay with this as a limitation, then adding _async method
> variants that return futures might work, and we can proceed from there.

That is not optimum, because now you have an optional API that
applications who want to be compatible will need to detect and choose
between.

> Mostly, though, it seems to me that the need to be able to write
> blocking code does away with most of the benefit of trying to have a
> single API in the first place.

You have artificially created this need, ignoring the semantics of
using the server-specific executor to detect async-capable requests and
the yield mechanics I suggested; which happens to be a single, coherent
API across sync and async servers and applications.

        - Alice.



Re: Server-side async API implementation sketches

Alice Bevan–McGregor
In reply to this post by PJ Eby
On 2011-01-09 09:03:38 -0800, P.J. Eby said:
> Hm.  I'm not sure if I like that.  The typical app developer really
> shouldn't be yielding multiple body strings in the first place.

Wait; what?  So you want the app developer to load a 40MB talkcast MP3
into memory before sending it?  You want to completely eliminate the
ability to stream an HTML page to the client in chunks (e.g. <head>
block, headers + search box, search results, advertisements, footer --
the exact thing Google does with every search result)?  That sounds
like artificially restricting application developers, to me.
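The chunked-body pattern being defended here can be sketched in a few
lines; `run_search` is a hypothetical stand-in for the slow part of
the request, and the dictionary passed in stands in for a real WSGI
environ:

```python
def run_search(environ):
    # Stand-in for the expensive part of the request (DB query, etc.).
    return b'<ul><li>result</li></ul>'

def search_page(environ):
    # Each chunk can be flushed to the client as soon as it is
    # yielded, rather than buffering the whole page in memory.
    yield b'<head>...</head>'          # sent immediately
    yield b'<body><form>...</form>'    # headers + search box
    yield run_search(environ)          # search results, computed last
    yield b'</body>'                   # footer

chunks = list(search_page({}))
```

The client starts rendering the head and search box while the server
is still computing the results.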

> I much prefer that the canonical example of a WSGI app just return a
> list with a single bytestring...

Why is it wrapped in a list, then?

> IOW, I want it to look like the normal way to do things is to just
> return the whole request at once, and use the additional difficulty of
> creating a second iterator to discourage people writing iterated bodies
> when they should just write everything to a BytesIO and be done with it.

It sounds to me like your "should" doesn't cover an extremely large
range of common use cases.

> In your approach, the above samples have to be rewritten as:
>
>      return app(environ)
>
> [snip]

My code does not use return.  At all.  Only yield.

> Try actually making some code that runs on this protocol and yields to
> futures during the body iteration.

Sure.  I'll also implement my actual proposal of not having a separate
body iterable.

> The above middleware pattern works with the sketches I gave on the PEAK
> wiki, and I've now updated the wiki to include an example app and
> middleware for clarity.

I'll need to re-read the code on your wiki; I find it incredibly
difficult to grok.  However, you can help me out a bit by answering a
few questions about it: how does middleware trap exceptions raised by
the application?  (Specifically, how does the server pass the buck on
exceptions, and how does an exception raised in the application bubble
out towards the server, through middleware, as it does now?)

> Really, the only hole in this approach is dealing with applications that block.

That's what the executor in the environ is for.  If you have image
scaling or something else that will block, you submit it.  All
networking calls?  You submit them.
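A sketch of that executor-in-environ idea; the `'wsgi.executor'` key
and the sync driver loop below are assumptions for illustration, not
part of any published spec.  Anything that would block is submitted,
and the application yields the resulting future back to the server.
A synchronous server can emulate the async protocol by simply waiting
on each yielded future:

```python
from concurrent.futures import Future, ThreadPoolExecutor

def scale_image(data):
    # Stand-in for real blocking work (image scaling, DB query, ...).
    return data.lower()

def app(environ):
    submit = environ['wsgi.executor'].submit
    # Yield the future; the server resumes us once it is done.
    future = yield submit(scale_image, b'RAW BYTES')
    yield b'scaled: ' + future.result()

# Drive it the way a sync server might: block on each future.
environ = {'wsgi.executor': ThreadPoolExecutor(max_workers=1)}
gen = app(environ)
body, value = [], None
try:
    while True:
        item = gen.send(value)
        if isinstance(item, Future):
            item.result()   # block until done (sync emulation)
            value = item    # hand the finished future back in
        else:
            body.append(item)
            value = None
except StopIteration:
    pass
environ['wsgi.executor'].shutdown()
```

An async server would run the identical application code, replacing
only the driver loop with callback-based rescheduling.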

> The elephant in the room here is that while it's easy to write these
> example applications so they don't block, in practice people read files
> and do database queries and what not in their requests, and those APIs
> are generally synchronous.  So, unless they somehow fold their entire
> application into a future, it doesn't work.

Actually, that's how multithreading support in marrow.server[.http] was
implemented.  Overhead?  40-60 requests per second.  The option is
provided for those
who can do nothing about their application blocking, while still
maintaining the internally async nature of the server.

> That you could never *call* the .read() method outside of a future, or
> else you would block the server, thereby obliterating the point of
> having the async API in the first place.

See above re: your confusion over the calling semantics of wsgi.input
in regards to my (and Alex's) proposal.  Specifically:

    data = (yield submit(wsgi_input.read, 4096)).result()

This would work on sync and async servers, and with sync and async
applications, with no difference in the code.

        - Alice.

