PEP 444 feature request - Futures executor

PEP 444 feature request - Futures executor

Timothy Farrell-2
There has been much discussion about how to handle async in PEP 444 and that discussion centers around the use of futures.  However, I'm requesting that servers _optionally_ provide environ['wsgi.executor'] as a futures executor that applications can use for the purpose of doing something after the response is fully sent to the client.  This feature request is designed to be concurrency methodology agnostic.

Some example use cases are:

- send an email that might block on a slow email server (Alice, I read what you said about Turbomail, but one product is not the solution to all situations)
- initiate a database vacuum
- clean a cache
- build a cache
- compile statistics

When serving pages of an application, these are all things that could be done after the response has been sent.  Ideally these things don't need to be done in a request thread and aren't incredibly time-sensitive.  It seems to me that futures would be an ideal way of handling this.

Thoughts?
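
To make the request concrete, here is a minimal sketch of how an application might use such an executor. It assumes the WSGI 1.x calling convention rather than the PEP 444 draft, and send_welcome_email and the address are hypothetical stand-ins. Note that submit() only hands the job off; whether the job actually runs after the response has been fully sent would be up to the server.

```python
from concurrent.futures import ThreadPoolExecutor

sent = []  # records "delivered" mail so the sketch has something to observe


def send_welcome_email(address):
    # Hypothetical stand-in for a slow, blocking call to a mail server.
    sent.append(address)


def application(environ, start_response):
    # The key is optional under the proposal, so fall back to doing the
    # work inline when the server does not provide an executor.
    executor = environ.get('wsgi.executor')
    if executor is not None:
        executor.submit(send_welcome_email, 'user@example.com')
    else:
        send_welcome_email('user@example.com')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Welcome!\n']
```

A server (or a test) would place a ThreadPoolExecutor, or any object with a compatible submit(), under environ['wsgi.executor'] before calling the application.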
_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: PEP 444 feature request - Futures executor

Guido van Rossum
If it's optional, what's the benefit for the app of getting it through
WSGI instead of through importing some other standard module? The API
of the executor will require a lot of thought. I worry that this
weighs down the WSGI standard with the responsibility of coming up
with the perfect executor API, and if it's not quite perfect after
all, servers are additionally required to support the standard but
suboptimal API effectively forever. Or they can choose not to provide
it, in which case it was a waste of time putting it in WSGI.

--
--Guido van Rossum (python.org/~guido)

Re: PEP 444 feature request - Futures executor

Timothy Farrell-2
> If it's optional, what's the benefit for the app of getting it through WSGI instead of through importing some other standard module?

Performance primarily.  If you instantiate an executor at every page request, wouldn't that slow things down unnecessarily?  Aside from that, servers currently specify if they are multi-threaded and/or multi-process.  Having the server provide the executor allows it to provide an executor that most matches its own concurrency model...again for performance reasons.

Optional and not mandatory because not every application wants or needs such functionality.  Maybe this should be a server option instead of a spec option.  But since we already have the module available, it shouldn't be too much of a burden on server/gateway authors to add support for it.

> I worry that this weighs down the WSGI standard with the responsibility of coming up with the perfect executor API, and if it's not quite perfect after all, servers are additionally required to support the standard but suboptimal API effectively forever.

I'm not following you here.  What's wrong with executor.submit() that might need changing?  Granted, it would not be ideal if an application called executor.shutdown().  This doesn't seem difficult to my tiny brain.



Re: PEP 444 feature request - Futures executor

PJ Eby
In reply to this post by Timothy Farrell-2
At 11:47 AM 1/7/2011 -0600, Timothy Farrell wrote:

>There has been much discussion about how to handle async in PEP 444
>and that discussion centers around the use of futures.  However, I'm
>requesting that servers _optionally_ provide
>environ['wsgi.executor'] as a futures executor that applications can
>use for the purpose of doing something after the response is fully
>sent to the client.  This feature request is designed to be
>concurrency methodology agnostic.
>
>Some example use cases are:
>
>- send an email that might block on a slow email server (Alice, I
>read what you said about Turbomail, but one product is not the
>solution to all situations)
>- initiate a database vacuum
>- clean a cache
>- build a cache
>- compile statistics
>
>When serving pages of an application, these are all things that
>could be done after the response has been sent.  Ideally these
>things don't need to be done in a request thread and aren't
>incredibly time-sensitive.  It seems to me that futures would be an
>ideal way of handling this.
>
>Thoughts?

This seems like a potentially good way to do it; I suggest making it
a wsgi.org extension; see (and update)
http://www.wsgi.org/wsgi/Specifications with your proposal.

I would suggest including a simple sample executor wrapper that
servers could use to block all but the methods allowed by your
proposal.  (i.e., presumably not shutdown(), for example.)
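
A wrapper along those lines could be sketched as follows. The class name SubmitOnlyExecutor is hypothetical, and the choice to expose only submit() (and not map(), for instance) is an assumption, since the proposal has not yet said which methods are allowed.

```python
from concurrent.futures import ThreadPoolExecutor


class SubmitOnlyExecutor:
    """Expose only submit() of a wrapped executor (hypothetical name)."""

    __slots__ = ('_submit',)

    def __init__(self, executor):
        # Keep just the bound method, not the executor itself, so that
        # shutdown() and friends are not reachable as attributes.
        self._submit = executor.submit

    def submit(self, fn, *args, **kwargs):
        # Same calling convention as Executor.submit(); returns a Future.
        return self._submit(fn, *args, **kwargs)
```

This guards against accidental calls rather than determined ones: the underlying executor can still be reached through the bound method's __self__ attribute.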

There are some other issues that might need to be addressed, like
maybe adding an attribute or two for the level of reliability
guaranteed by the executor, or allowing the app to request a given
reliability level.  Specifically, it might be important to distinguish between:

* this will be run exactly once as long as the server doesn't crash
* this will eventually be run once, even if the server suffers a
fatal error between now and then

IOW, to indicate whether the thing being done is "transactional", so to speak.

I mean, I can imagine building a transactional service on top of the
basic service, by queuing task information externally, then just
using executor calls to pump the queue.  But IMO it seems pretty
intrinsic to want that kind of persistence guarantee for at least the
email case, or, say, sending off a charge to a credit card or
something like that.
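
As a rough sketch of that layering, assuming SQLite as the external store: task details are recorded durably first, and an executor call merely pumps whatever is pending. The table layout, the HANDLERS registry, and the function names are all hypothetical. This gives at-least-once behaviour (a crash between running a handler and marking its row done would cause a re-run), and it does not guard against two pumps running concurrently.

```python
import json
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing

HANDLERS = {}  # task name -> callable; hypothetical registry


def init_queue(path):
    # Create the task table if it does not exist yet.
    with closing(sqlite3.connect(path)) as db, db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "id INTEGER PRIMARY KEY, name TEXT, args TEXT, "
            "done INTEGER DEFAULT 0)"
        )


def enqueue(path, name, *args):
    # Record the task on disk before anything tries to run it.
    with closing(sqlite3.connect(path)) as db, db:
        db.execute(
            "INSERT INTO tasks (name, args) VALUES (?, ?)",
            (name, json.dumps(args)),
        )


def pump(path):
    # Run every pending task in insertion order. Each call opens its own
    # connection, so pump() can safely run on an executor thread.
    with closing(sqlite3.connect(path)) as db:
        rows = db.execute(
            "SELECT id, name, args FROM tasks WHERE done = 0 ORDER BY id"
        ).fetchall()
        for task_id, name, args in rows:
            HANDLERS[name](*json.loads(args))
            with db:  # commit the 'done' marker only after the handler returned
                db.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
```

An application would call enqueue() during the request and then executor.submit(pump, path); a server restart followed by another pump() picks up anything left undone.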

One other relevant use case: sometimes you want a long-running
process step that the user checks back in on periodically, so having
a way to get a "handle" for a future that can be kept in a session or
something might be important.  Like, say, you're preparing a report
that will be viewed in the browser, and using meta-refresh or some
such to poll.  The app needs to check on a previously queued future
and get its results.

I don't know how easy any of the above are to implement with the
futures API or your proposal, but they seem like worthwhile things to
have available, and actually would provide for some rich application
use cases.  But if they're implementable over the futures API at all,
it should be possible to implement them as WSGI 1.x middleware or as
a server extension.

A spec like that definitely needs some thrashing out, but I don't
think it need derail any PEPs in progress: the API of such an
extension doesn't affect the basic WSGI protocol at all.


Re: PEP 444 feature request - Futures executor

Alice Bevan–McGregor
In reply to this post by Timothy Farrell-2
On Fri, Jan 7, 2011 at 9:47 AM, Timothy Farrell wrote:
> There has been much discussion about how to handle async in PEP 444 and
> that discussion centers around the use of futures.  However, I'm
> requesting that servers _optionally_ provide environ['wsgi.executor']
> as a futures executor that applications can use for the purpose of
> doing something after the response is fully sent to the client.  This
> feature request is designed to be concurrency methodology agnostic.

+1

On 2011-01-07 11:07:36 -0800, Timothy Farrell said:
> On 2011-01-07 09:59:10 -0800, Guido van Rossum said:
>> If it's optional, what's the benefit for the app of getting it through
>> WSGI instead of through importing some other standard module?
>
> Aside from that, servers currently specify if they are multi-threaded
> and/or multi-process.  Having the server provide the executor allows it
> to provide an executor that most matches its own concurrency model...

I think that's the bigger point; WSGI servers do implement their own
concurrency model for request processing and utilizing a
server-provided executor which interfaces with whatever the internal
representation of concurrency is would be highly beneficial.  (Vs. an
application utilizing a more generic executor implementation that adds
a second thread pool...)

Taking futures to be separate and distinct from the rest of async
discussion, I still think it's an extremely useful feature.  I outlined
my own personal use cases in my slew of e-mails last night, and many of
them are also not time sensitive.  (E.g. image scaling, full text
indexing, etc.)

> Maybe this should be a server option instead of a spec option.

It would definitely fall under the Server API spec, not the application
one.  Being optional, and with simple (wsgi.executor) access via the
environ would also allow middleware developers to create executor
implementations (or just reference the concurrent.futures
implementation).
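
A middleware of that kind might look like the sketch below, which simply references the concurrent.futures thread pool. The class name and the max_workers default are hypothetical, and the WSGI 1.x calling convention is assumed rather than the PEP 444 draft.

```python
from concurrent.futures import ThreadPoolExecutor


class ExecutorMiddleware:
    """Provide environ['wsgi.executor'] when the server does not."""

    def __init__(self, app, max_workers=4):
        self.app = app
        # One pool for the lifetime of the middleware, not one per request.
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def __call__(self, environ, start_response):
        # setdefault() leaves a server-provided executor untouched.
        environ.setdefault('wsgi.executor', self.executor)
        return self.app(environ, start_response)
```

Wrapping an application as ExecutorMiddleware(application) then makes the key available to everything further down the stack.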

>> I worry that this weighs down the WSGI standard with the responsibility
>> of coming up with the perfect executor API, and if it's not quite
>> perfect after all, servers are additionally required to support the
>> standard but suboptimal API effectively forever.
>
> I'm not following you here.  What's wrong with executor.submit() that
> might need changing?  Granted, it would not be ideal if an application
> called executor.shutdown().  This doesn't seem difficult to my tiny
> brain.

The "perfect" executor API is already well defined in PEP 3148 AFAIK.
Specific methods with specific semantics implemented in a duck-typed
way.  The underlying implementation is up to the server, or the server
can utilize an external (or built-in in 3.2) futures implementation.

If WSGI 2 were to incorporate futures as a feature there would have to
be some mandate as to which methods applications and middleware are
allowed to call; similar to how we do not allow .close() across
wsgi.input or wsgi.errors.

        - Alice.



Re: PEP 444 feature request - Futures executor

Timothy Farrell-2
In reply to this post by PJ Eby


----- Original Message -----
From: "P.J. Eby" <[hidden email]>
To: "Timothy Farrell" <[hidden email]>, [hidden email]
Sent: Friday, January 7, 2011 2:14:20 PM
Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor

> This seems like a potentially good way to do it; I suggest making it
> a wsgi.org extension; see (and update)
> http://www.wsgi.org/wsgi/Specifications with your proposal.
>
> I would suggest including a simple sample executor wrapper that
> servers could use to block all but the methods allowed by your
> proposal.  (i.e., presumably not shutdown(), for example.)

OK, will do.




Re: PEP 444 feature request - Futures executor

Alex Grönholm-3
In reply to this post by Guido van Rossum
07.01.2011 19:59, Guido van Rossum wrote:
> If it's optional, what's the benefit for the app of getting it through
> WSGI instead of through importing some other standard module? The API
> of the executor will require a lot of thought. I worry that this
> weighs down the WSGI standard with the responsibility of coming up
> with the perfect executor API, and if it's not quite perfect after
> all, servers are additionally required to support the standard but
> suboptimal API effectively forever. Or they can choose not to provide
> it, in which case it was a waste of time putting it in WSGI.
The only plausible reason for having a wsgi.executor object is to make
writing (asynchronous) middleware easier. Otherwise the app could just
create its own executor (as it is done now).
If there is no wsgi.executor, how will the middleware get ahold of a
thread/process pool? Having individual middleware maintain their own
pools is pretty pointless, as I'm sure everyone will agree. On the other
hand, I believe allowing wsgi.executor to be either a process pool or a
thread pool is a recipe for disaster. I'm really not sure where to go
from here.



Re: PEP 444 feature request - Futures executor

Alice Bevan–McGregor
On 2011-01-07 16:39:46 -0800, Alex Grönholm said:
> I believe allowing wsgi.executor to be either a process pool or a
> thread pool is a recipe for disaster. I'm really not sure where to go
> from here.

Two solutions:

:: Don't pass an executor to the job such that it can not schedule its
own futures.

:: Mandate a thread pool, not process pool, to avoid deadlocking.

        - Alice.



Re: PEP 444 feature request - Futures executor

Alice Bevan–McGregor
In reply to this post by Timothy Farrell-2
On 2011-01-07 09:47:12 -0800, Timothy Farrell said:

> However, I'm requesting that servers _optionally_ provide
> environ['wsgi.executor'] as a futures executor that applications can
> use for the purpose of doing something after the response is fully sent
> to the client.  This feature request is designed to be concurrency
> methodology agnostic.

Done.  (In terms of implementation, not updating PEP 444.)  :3

The Marrow server now implements a thread pool executor using the
concurrent.futures module (or equiv. futures PyPI package).  The
following are the commits; the changes will look bigger than they are
due to cutting and pasting of several previously nested blocks of code
into separate functions for use as callbacks.  100% unit test coverage
is maintained (without errors), an example application is added, and
the benchmark suite updated to support the definition of thread count.

        http://bit.ly/gUL33v
        http://bit.ly/gyVlgQ

Testing this yourself requires Git checkouts of the
marrow.server/threaded branch and marrow.server.http/threaded branch,
and likely the latest marrow.io from Git as well:

        https://github.com/pulp/marrow.io
        https://github.com/pulp/marrow.server/tree/threaded
        https://github.com/pulp/marrow.server.http/tree/threaded

This update has not been tested under Python 3.x yet; I'll do that
shortly and push any fixes; I doubt there will be any.

On 2011-01-08 03:26:28 -0800, Alice Bevan–McGregor said in the [PEP
444] Future- and Generator-Based Async Idea thread:

> As a side note, I'll be adding threading support to the server... but I
> suspect the overhead will outweigh the benefit for speedy applications.

I was surprisingly quite wrong in this prediction.  The following is
the output of a C25 pair of benchmarks, the first not threaded, the
other with 30 threads  (enough so there would be no waiting).

        https://gist.github.com/770893

The difference is the loss of 60 RSecs out of 3280.  Note that the
implementation I've devised can pass the concurrent.futures executor to
the WSGI application (and, in fact, does), fulfilling the requirements
of this discussion.  :D

The use of callbacks internally to the HTTP protocol makes a huge
difference in overhead, I guess.

        - Alice.



Re: PEP 444 feature request - Futures executor

Timothy Farrell-2
In reply to this post by PJ Eby
----- Original Message -----
From: "P.J. Eby" <[hidden email]>
To: "Timothy Farrell" <[hidden email]>, [hidden email]
Sent: Friday, January 7, 2011 2:14:20 PM
Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor

> There are some other issues that might need to be addressed, like
> maybe adding an attribute or two for the level of reliability
> guaranteed by the executor, or allowing the app to request a given
> reliability level.  Specifically, it might be important to distinguish between:

> * this will be run exactly once as long as the server doesn't crash
> * this will eventually be run once, even if the server suffers a
> fatal error between now and then

> IOW, to indicate whether the thing being done is "transactional", so to speak.

I understand why this would be good (credit card transactions particularly), but how would this play out in the real world?  All servers will do their best to run the jobs given them.

Are you suggesting that there would be a property of the executor that would change based on the load of the server or some other metric?  Say the server has 100 queued jobs and only 2 worker threads, would it then have a way of saying, "I'll get to this eventually, but I'm pretty swamped."?

Is that what you're getting at or something more like database transactions..."I guarantee that I won't stop halfway through this process."

Thanks,
-t

Re: PEP 444 feature request - Futures executor

Timothy Farrell-2
In reply to this post by Timothy Farrell-2
Quartz is certainly powerful, but I think it's outside the scope of something we want in a WSGI spec.  Is there a specific feature you're referring to?

----- Original Message -----
From: "Nam Nguyen" <[hidden email]>
To: "Timothy Farrell" <[hidden email]>
Cc: "P.J. Eby" <[hidden email]>, [hidden email]
Sent: Tuesday, January 11, 2011 2:28:55 AM
Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor

On Tue, Jan 11, 2011 at 10:40 AM, Timothy Farrell
<[hidden email]> wrote:

> ----- Original Message -----
> From: "P.J. Eby" <[hidden email]>
> To: "Timothy Farrell" <[hidden email]>, [hidden email]
> Sent: Friday, January 7, 2011 2:14:20 PM
> Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor
>
>> There are some other issues that might need to be addressed, like
>> maybe adding an attribute or two for the level of reliability
>> guaranteed by the executor, or allowing the app to request a given
>> reliability level.  Specifically, it might be important to distinguish between:
>
>> * this will be run exactly once as long as the server doesn't crash
>> * this will eventually be run once, even if the server suffers a
>> fatal error between now and then
>
>> IOW, to indicate whether the thing being done is "transactional", so to speak.
>
> I understand why this would be good (credit card transactions particularly), but how would this play out in the real world?  All servers will do their best to run the jobs given them.
>
> Are you suggesting that there would be a property of the executor that would change based on the load of the server or some other metric?  Say the server has 100 queued jobs and only 2 worker threads, would it then have a way of saying, "I'll get to this eventually, but I'm pretty swamped."?
>
> Is that what you're getting at or something more like database transactions..."I guarantee that I won't stop halfway through this process."

Maybe in the same vein as Quartz (http://www.quartz-scheduler.org/) in
Java world.

Nam

Re: PEP 444 feature request - Futures executor

PJ Eby
In reply to this post by PJ Eby
At 09:11 PM 1/10/2011 -0600, Timothy Farrell wrote:

>PJ,
>
>You seem to be old-hat at this so I'm looking for a little advice as
>I draft this spec.  It seems a bad idea to me to just say
>environ['wsgi.executor'] will be a wrapped futures executor because
>the api handled by the executor can and likely will change over
>time.  Am I right in thinking that a spec should be more specific in
>saying that the executor object will have "these specific methods"
>so that, as futures changes, the spec is not in danger of invalidation
>due to the changes?

I'd actually just suggest something like:

     future = environ['wsgiorg.future'](func, *args, **kw)

(You need to use the wsgiorg.* namespace for extension proposals like
this, btw.)
