A Python Web Application Package and Format

A Python Web Application Package and Format

ianb
Hi all.  I wrote a blog post.  I would be interested in reactions from this crowd.

http://blog.ianbicking.org/2011/03/31/python-webapp-package/

Copied to allow responses:

At PyCon there was an open space about deployment, and the idea of drop-in applications (Java-WAR-style).

I generally get pessimistic about 80% solutions, and dropping in a WAR file feels like an 80% solution to me. I’ve used the Hudson/Jenkins installer (which I think is specifically a project that got WARs on people’s minds), and in a lot of ways that installer is nice, but it’s also kind of wonky, it makes configuration unclear, it’s not always clear when it installs or configures itself through the web, and when you have to do this at the system level, nor is it clear where it puts files and data, etc. So a great initial experience doesn’t feel like a great ongoing experience to me — and it doesn’t have to be that way. If those were necessary compromises, sure, but they aren’t. And because we don’t have WAR files, if we’re proposing to make something new, then we have every opportunity to make things better.

So the question then is what we’re trying to make. To me: we want applications that are easy to install, that are self-describing, self-configuring (or at least guide you through configuration), reliable with respect to their environment (not dependent on system tweaking), upgradable, and respectful of persistence (the data that outlives the application install). A lot of this can be done by the "container" (to use Java parlance; or "environment") — if you just have the app packaged in a nice way, the container (server environment, hosting service, etc) can handle all the system-specific things to make the application actually work.

At which point I am of course reminded of my Silver Lining project, which defines something very much like this. Silver Lining isn’t just an application format, and things aren’t fully extracted along these lines, but it’s pretty close and it addresses a lot of important issues in the lifecycle of an application. To be clear: Silver Lining is an application packaging format, a server configuration library, a cloud server management tool, a persistence management tool, and a tool to manage the application with respect to all these services over time. It is a bunch of things, maybe too many things, so it is not unreasonable to pick out a smaller subset to focus on. Maybe an easy place to start (and good for Silver Lining itself) would be to separate at least the application format (and tools to manage applications in that state, e.g., installing new libraries) from the tools that make use of such applications (deploy, etc).

Some opinions I have on this format, exemplified in Silver Lining:

  • It’s not zipped or a single file, unlike WARs. Uploading zip files is not a great API. Geez. I know there’s this desire to "just drop in a file"; but there’s no getting around the fact that "dropping a file" becomes a deployment protocol and it’s an incredibly impoverished protocol. The format is also not subtly git-based (ala Heroku) because git push is not a good deployment protocol.
  • But of course there isn’t really any deployment protocol inferred by a format anyway, so maybe I’m getting ahead of myself ;) I’m saying a tool that deploys should take as an argument a directory, not a single file. (If the tool then zips it up and uploads it, fine!)
  • Configuration "comes from the outside". That is, an application requests services, and the container tells the application where those services are. For Silver Lining I’ve used environmental variables. I think this one point is really important — the container tells the application. As a counter-example, an application that comes with a Puppet deployment recipe is essentially telling the server how to arrange itself to suit the application. This will never be reliable or simple! (A minimal sketch of the application’s side of this contract follows this list.)
  • The application indicates what "services" it wants; for instance, it may want to have access to a MySQL database. The container then provides this to the application. In practice this means installing the actual packages, but also creating a database and setting up permissions appropriately. The alternative is never having any dependencies, meaning you have to use SQLite databases or ad hoc structures, etc. But in fact installing databases really isn’t that hard these days.
  • All persistence has to use a service of some kind. If you want to be able to write to files, you need to use a file service. This means the container is fully aware of everything the application is leaving behind. All the various paths an application should use are given in different environmental variables (many of which don’t need to be invented anew, e.g., $TMPDIR).
  • It uses vendor libraries exclusively for Python libraries. That means the application bundles all the libraries it requires. Nothing ever gets installed at deploy-time. This is in contrast to using a requirements.txt list of packages at deployment time. If you want to use those tools for development that’s fine, just not for deployment.
  • There is also a way to indicate other libraries you might require; e.g., you might need lxml, or even something that isn’t quite a library, like git (if you are making a GitHub clone). You can’t do those as vendor libraries (they include non-portable binaries). Currently in Silver Lining the application description can contain a list of Ubuntu package names to install. Of course that would have to be abstracted some.
  • You can ask for scripts or a request to be invoked for an application after an installation or deployment. It’s lame to try to test if is-this-app-installed on every request, which is the frequent alternative. Also, it gives the application the chance to signal that the installation failed.
  • It has a very simple (possibly/probably too simple) sense of configuration. You don’t have to use this if you make your app self-configuring (i.e., build in a web-accessible settings screen), but in practice it felt like some simple sense of configuration would be helpful.
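
To make the configuration-from-the-outside point concrete, here is a minimal sketch of the application's side of the contract. Every variable name except TMPDIR is made up for illustration; a real container would document its own:

    import os

    # The container, not the app, decides where services live and hands the
    # app that information through the environment.  Every name here except
    # TMPDIR is hypothetical.
    DATABASE_URL = os.environ["APP_DATABASE_URL"]    # e.g. a MySQL DSN the container created
    FILES_DIR    = os.environ["APP_FILES_DIR"]       # writable path granted by a "file" service
    TMP_DIR      = os.environ.get("TMPDIR", "/tmp")  # standard variable; per-app in practice

    def save_upload(name, data):
        # All persistence goes through the file service's directory, so the
        # container knows exactly what the application leaves behind.
        with open(os.path.join(FILES_DIR, name), "wb") as f:
            f.write(data)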

Things that could be improved:

  • There are some places where you might be encouraged to use routines from the silversupport package. There are very few! But maybe an alternative could be provided for these cases.
  • A little convention-over-configuration is probably suitable for the bundled libraries; silver includes tools to manage things, but it gets a little twisty. When creating a new project I find myself creating several .pth files, special customizing modules, etc. Managing vendor libraries is also not obvious.
  • Services are IMHO quite important and useful, but also need to be carefully specified.
  • There’s a bunch of runtime expectations that aren’t part of the format, but in practice would be part of how the application is written. For instance, I make sure each app has its own temporary directory, and that it is cleared on update. If you keep session files in that location, and you expect the environment to clean up old sessions — well, either all environments should do that, or none should.
  • The process model is not entirely clear. I tried to simply define one process model (unthreaded, multiple processes), but I’m not sure that’s suitable — most notably, multiple processes have a significant memory impact compared to threads. An application should at least be able to indicate what process models it accepts and prefers.
  • Static files are all convention over configuration — you put static files under static/ and then they are available. So static/style.css would be at /style.css. I think this is generally good, but putting all static files under one URL path (e.g., /media/) can be good for other reasons as well. Maybe there should be conventions for both.
  • Cron jobs are important. Though maybe they could just be yet another kind of service? Many extra features could be new services.
  • Logging is also important; Silver Lining attempts to handle that somewhat, but it could be specified much better.
  • Silver Lining also supports PHP, which seemed to cause a bit of stress. But just ignore that. It’s really easy to ignore.

There is a description of the configuration file for apps. The environmental variables are also notably part of the application’s expectations. The file layout is explained (together with a bunch of Silver Lining-specific concepts) in Development Patterns. Besides all that there is admittedly some other stuff that is only really specified in code; but in Silver Lining’s defense, specified in code is better than unspecified ;) App Engine provides another example of an application format, and would be worth using as a point of discussion or contrast (I did that myself when writing Silver Lining).

Discussing WSGI stuff with Ben Bangert at PyCon he noted that he didn’t really feel like the WSGI pieces needed that much more work, or at least that’s not where the interesting work was — the interesting work is in the tooling. An application format could provide a great basis for building this tooling. And I honestly think that the tooling has been held back more by divergent patterns of development than by the difficulty of writing the tools themselves; and a good, general application format could fix that.




Re: A Python Web Application Package and Format

Daniel Holth-3
+1

I think this is a fantastic idea. In the same way that distutils2 intends to specify a static configuration format for packages, having a good static configuration format for web applications could make deployment easier while encouraging healthy competition among 'paste deploy' type projects.

I think this is much more interesting than WSGI, since a 5-line back-to-WSGI adapter will likely make caring about any changes entirely optional.
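
To see why such an adapter can be so small: suppose a hypothetical future interface passed a single request mapping and expected a (status, headers, body) tuple back. The new-style interface here is entirely invented; the point is only that wrapping it back into plain WSGI is a few lines:

    def back_to_wsgi(app):
        """Adapt a hypothetical new-style app -- one that takes a request
        mapping and returns (status, headers, body_iter) -- back to WSGI."""
        def wsgi_app(environ, start_response):
            # For this sketch the WSGI environ stands in for the "request".
            status, headers, body = app(environ)
            start_response(status, headers)
            return body
        return wsgi_app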

Daniel


Re: A Python Web Application Package and Format

James Mills-3
+1 too. I would however like to see this idea developed in a generic
and usable way, i.e. no zope/twisted deps or making it fit around
Django :)
Ideally it should be usable by the most basic (plain old WSGI).

cheers
James

--
-- James Mills
--
-- "Problems are solved by method"

Re: A Python Web Application Package and Format

Alice Bevan–McGregor
On 2011-04-10 16:25:21 -0700, James Mills said:

> +1 too. I would however like to see this idea developed in a generic
> and usable way, i.e. no zope/twisted deps or making it fit around
> Django :)
> Ideally it should be usable by the most basic (plain old WSGI).

The following are the collected ideas of myself and a few other users
in the WebCore chat room:

        https://gist.github.com/911991

Being generic (i.e. using WSGI under-the-hood) and allowing generic
port assignments for other (non-web) networked applications is a design
goal.

The aversion to packaged zips is not entirely understandable to us; in
this case, a packaged copy of the application is produced via a
setup.py command, though in theory one could develop with that model
and just zip everything up in the end by hand.

Silver Lining seems to require too much in the way of hacking
(modifying .pth files, etc) to be reasonable.

        — Alice.



Re: A Python Web Application Package and Format

James Mills-3
On Mon, Apr 11, 2011 at 9:40 AM, Alice Bevan–McGregor
<[hidden email]> wrote:
> The following are the collected ideas of myself and a few other users in the
> WebCore chat room:
>
>        https://gist.github.com/911991

A couple of comments:

4. It would be nice to also support web applications that provide
their own web server (for whatever reason). chroot/jail them into a
virtualenv of their own (maybe?)

6. It would be nice to also support other standard UNIX-ish logging, e.g. syslog (see the sketch below).
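
For what it's worth, the standard library already covers the syslog case; something along these lines is all an app or container would need. The socket path and message format are just illustrative defaults:

    import logging
    from logging.handlers import SysLogHandler

    # Send the application's log records to the local syslog daemon.
    handler = SysLogHandler(address="/dev/log")
    handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))

    log = logging.getLogger("myapp")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("application started")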

> Being generic (i.e. using WSGI under-the-hood) and allowing generic port
> assignments for other (non-web) networked applications is a design goal.

Good :)

cheers
James

--
-- James Mills
--
-- "Problems are solved by method"

Re: A Python Web Application Package and Format

ianb
In reply to this post by Alice Bevan–McGregor
On Sun, Apr 10, 2011 at 6:40 PM, Alice Bevan–McGregor <[hidden email]> wrote:
> On 2011-04-10 16:25:21 -0700, James Mills said:
>
>> +1 too. I would however like to see this idea developed in a generic
>> and usable way, i.e. no zope/twisted deps or making it fit around
>> Django :)
>> Ideally it should be usable by the most basic (plain old WSGI).
>
> The following are the collected ideas of myself and a few other users in the WebCore chat room:
>
>        https://gist.github.com/911991
>
> Being generic (i.e. using WSGI under-the-hood) and allowing generic port assignments for other (non-web) networked applications is a design goal.

There's a significant danger that you'll be creating a configuration management tool at that point, not simply a web application description.  The escape valve in Silver Lining for these sort of things is services, which can kind of implement anything, and presumably ad hoc services could be allowed for.
 
> The aversion to packaged zips is not entirely understandable to us; in this case, a packaged copy of the application is produced via a setup.py command, though in theory one could develop with that model and just zip everything up in the end by hand.

You create a build process as part of the deployment (and development and everything else), which I think is a bad idea.  My model does not use setup.py as the basis for the process (you could build a tool that uses setup.py, but it would be more a development methodology than a part of the packaging).

Also lots of libraries don't work when zipped, and an application is typically an aggregate of many libraries, so zipping everything just adds a step that probably has to be undone later.  If a deploy process uses a zip file that's fine, but adding zipping to deployment processes that don't care for zip files is needless overhead.  A directory of files is the most general case.  It's also something a developer can manipulate, so you don't get a mismatch between developers of applications and people deploying applications -- they can use the exact same system and format.

> Silver Lining seems to require too much in the way of hacking (modifying .pth files, etc) to be reasonable.

The pattern that it implements is fairly simple, and in several models you have to lay things out somewhat manually.  I think some more convention and tool support (e.g., in pip) would be helpful.

Though there are quite a few details, the result is more reliable, stable, and easier to audit than anything based on a build process (which any use of "dependencies" would require -- there are *no* dependencies in a Silver Lining package, only the files that are *part* of the package).

Some notes from your link:

- There seems to be both the description of a format, and a program based on that format, but it's not entirely clear where the boundary is.  I think it's useful to think in terms of a format and a reference implementation of particular tools that use that format (development management tools, like installing into the format; deployment tools; testing tools; local serving tools; etc).
- In Silver Lining I felt no need at all for shared libraries.  Some disk space can be saved with clever management (hard links), but only when it's entirely clear that it's just an optimization.  Adding a concept like "server-packages" adds a lot of operational complexity and room for bugs without any real advantages.
- I avoided exposing the concept of daemonization because it's not really an application concern; or at least it certainly is not appropriate for a WSGI application.  There are other applications that might need this, mostly because they have no standard protocol equivalent to WSGI, but a generic container is almost certain to be of higher quality and better situated to its environment than a generic daemon.  (PID files, ugh)  At least supervisord I think has a better representation of how to express daemon configuration, but still I'm not a big fan of exposing this until it really feels necessary.
- All dependencies are always version-sensitive; I think it's delusional that people think otherwise.  Build the tooling to manage that process (e.g., finding and testing newer versions), not the deployment.
- I try to avoid error conditions in the deployment, which is a big part of not having any build process involved, as build processes are a source of constant errors -- you can do a stage deployment, then five minutes later do a production deployment, and if you have a build process there is a significant chance that the two won't match.

  Ian



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
Howdy!

On 2011-04-10 19:06:52 -0700, Ian Bicking said:

> There's a significant danger that you'll be creating a configuration
> management tool at that point, not simply a web application description.

Unless you have the tooling to manage the applications, there's no
point having a "standard" for them.  Part of that tooling will be some
form of configuration management allowing you to determine the
requirements and configuration of an application /prior/ to
installation.  Better to have an application rejected up-front ("Hey,
this needs my social insurance number? Hells no!") than after it's
already been extracted and potentially littered the landscape with its
children.

> The escape valve in Silver Lining for these sort of things is services,
> which can kind of implement anything, and presumably ad hoc services
> could be allowed for.

Generic services are useful, but not useful enough.

> You create a build process as part of the deployment (and development
> and everything else), which I think is a bad idea.

Please elaborate.  There is no requirement for you to use the
"application packaging format" and associated tools (such as an
application server) during development.  In fact, like 2to3, that type
of process would only slow things down to the point of uselessness.  
That's not what I'm suggesting at all.

> My model does not use setup.py as the basis for the process (you could
> build a tool that uses setup.py, but it would be more a development
> methodology than a part of the packaging).

I know.  And the end result is you may have to massage .pth files
yourself.  If a tool requires you to, at any point during "normal
operation", hand modify internal files… that tool has failed at its
job.  One does not go mucking about in your Git repo's .git/ folder, as
an example.

How do you build a release and upload it to PyPI?  Upload docs to
packages.python.org?  setup.py commands.  It's a convenient hook with
access to metadata in a convenient way that would make an excellent
"let's make a release!" type of command.

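For illustration, the sort of hook meant here is just a distutils command subclass; the command name and its bundling logic below are invented:

    # setup.py -- sketch of a "let's make a release!" command.
    from distutils.core import setup, Command

    class bundle(Command):
        description = "collect the app and its dependencies into a deployable artifact"
        user_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run(self):
            # A real implementation would snapshot dependencies, copy static
            # files, write the application metadata, and so on.
            self.announce("bundling %s for deployment..." % self.distribution.get_name())

    setup(name="exampleapp", version="0.1", cmdclass={"bundle": bundle})

Run as "python setup.py bundle", the command gets the project metadata for free.
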
> Also lots of libraries don't work when zipped, and an application is
> typically an aggregate of many libraries, so zipping everything just
> adds a step that probably has to be undone later.

Of course it has to be un-done later.  I had thought I had made that
quite clear in the gist.  (Core Operation, point 1, possibly others.)

> If a deploy process uses a zip file that's fine, but adding zipping to
> deployment processes that don't care for zip files is needless
> overhead.  A directory of files is the most general case.  It's also
> something a developer can manipulate, so you don't get a mismatch
> between developers of applications and people deploying applications --
> they can use the exact same system and format.

So, how do you push the updated application around?  Using a full
directory tree leaves you with Rsync and SFTP, possibly various SCM
methods, but then you'd need a distinct repo (or rootless branch) just
for releasing and you've already mentioned your dislike for SCM-based
deployment models.

Zip files are universal -- to the point that most modern operating
systems treat zip files /as folders/.  If you have to, consider it a
transport encoding.
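
As a rough sketch of the transport-encoding view: the directory stays the canonical format, and the transfer tool zips and unzips at either end. All paths here are invented:

    import shutil
    import zipfile

    # Sending side: wrap the application directory up for transfer.
    archive = shutil.make_archive("/tmp/myapp-1.0", "zip", root_dir="myapp/")

    # Receiving side: unpack it again immediately; nothing ever runs from
    # inside the zip, it was only the envelope.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall("/var/server/apps/myapp-1.0/")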

> The pattern that it implements is fairly simple, and in several models
> you have to lay things out somewhat manually.  I think some more
> convention and tool support (e.g., in pip) would be helpful.

+1

> Though there are quite a few details, the result is more reliable,
> stable, and easier to audit than anything based on a build process
> (which any use of "dependencies" would require -- there are *no*
> dependencies in a Silver Lining package, only the files that are *part*
> of the package).

It might be just me (and the other people who seem to enjoy WebCore and
Marrow) but it is fully possible to do install-time dependencies in
such a way that things won't break accidentally.  Also, you missed
Application Spec #4.

> Some notes from your link:
>
> - There seems to be both the description of a format, and a program
> based on that format, but it's not entirely clear where the boundary
> is.  I think it's useful to think in terms of a format and a reference
> implementation of particular tools that use that format (development
> management tools, like installing into the format; deployment tools;
> testing tools; local serving tools; etc).

Indeed; this gist was some really quickly hacked together ideas.

> - In Silver Lining I felt no need at all for shared libraries.  Some
> disk space can be saved with clever management (hard links), but only
> when it's entirely clear that it's just an optimization.  Adding a
> concept like "server-packages" adds a lot of operational complexity and
> room for bugs without any real advantages.

±0

> - I try to avoid error conditions in the deployment, which is a big
> part of not having any build process involved, as build processes are a
> source of constant errors -- you can do a stage deployment, then five
> minutes later do a production deployment, and if you have a build
> process there is a significant chance that the two won't match.

I have never, in my life, encountered that particular problem.  I may
be more careful than most in defining dependencies with version number
boundaries, I may be more careful in utilizing my own package
repository (vs. the public PyPI), but I don't think I'm unique in
having few to no issues in development/sandbox/production deployment
processes.

Hell, I'm still able to successfully deploy a TurboGears 0.9
application without dependency issues.

However, the package format I describe in that gist does include the
source for the dependencies as "snapshotted" during bundling.  If your
application is working in development, after snapshotting it /will/
work on sandbox or production deployments.

        — Alice.



Re: A Python Web Application Package and Format

Eric Larson-5
Hi,

On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:

> However, the package format I describe in that gist does include the source for the dependencies as "snapshotted" during bundling.  If your application is working in development, after snapshotting it /will/ work on sandbox or production deployments.

I wanted to chime in on this one aspect b/c I think the concept is somewhat flawed. If your application is working in development and you "snapshot" the dependencies, that is no guarantee that things will work in production. The only way to say that a snapshot or bundle is guaranteed to work is if you snapshot the entire system and make it available as a production system.

Using a real world example, say you develop your application on OS X and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two different operating systems with entirely different system calls. If you use something like lxml and simplejson, you have no choice but to repackage or install from source on the production server. While it is fair to say that generally you could avoid packages that use C, both lxml and simplejson are rather obvious choices for web development. You could use the json module and ElementTree, but if you want more speed (and who doesn't like to go fast!), lxml and simplejson are both better options.

It sounds like Ian doesn't want to have any build steps, which I think is a bad mantra. A build step lets you prepare things for deployment. A deployment package is different than a development package, and mixing the two by forcing builds on the server seems like asking for trouble. I'm not saying this is what you (Alice) are suggesting, but rather pointing out that as a model, depending on virtualenv + pip's bundling capabilities seems slightly flawed.

Personally, and I don't expect folks to take my opinions very seriously b/c I haven't offered any code, what I'd like to see is a simple format that helps install and uninstall web applications. I think it should offer hooks for running tests, learning basic status and allow simple configuration for typical sysadmin needs (logging via syslog, process management, Nagios checks, etc.). Instead of focusing on what format that should take in terms of packages, it seems more effective to spend time defining a standard means of managing WSGI apps and piggyback on or plain old copy some format like RPM or dpkg.

Just my .02. Again, I haven't offered code, so feel free to ignore me. But I do hope that if there are others who suspect this model of putting source on the server is a problem, they pipe up. If I were to add a requirement it would be that Python web applications help system administrators become more effective. That means finding consistent ways of deploying apps that play well with other languages / platforms. After all, keeping a C compiler on a public server is rarely a good idea.

Eric


Re: A Python Web Application Package and Format

Ionel Cristian Mărieș
In reply to this post by ianb
Hello,

I have a few comments:
  • That file layout basically forces you to keep your development environment as close to the production environment as possible. This is especially visible if you're relying on Python C extensions. Since you don't want to have the same environment constraints as App Engine, it should be more flexible in this regard and offer a way to generate the project dependencies somewhere other than the developer's machine.
  • There's no built-in support for logging configuration.
  • The update_fetch feels like a hack, as it's not extensible to the rest of the lifecycle (hooks for shutdown, start, etc.). Also, it shouldn't be an application URL, because you'd want to run a hook before starting the app or after stopping it. I guess you could accomplish that with a WSGI wrapper, but there should be a clear separation between the app and the hooks that manage the app (see the sketch after this list).
  • I'm not entirely clear on why you avoid a build process (WAR-like) prior to deployment. It works fine for App Engine, but you don't have its constraints.
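
For illustration, the separation meant above could look something like this; the file names and hook names are invented, and the point is only that the container, not the WSGI app, invokes them:

    import subprocess

    # Hypothetical piece of the application description, living outside the
    # app's own code: hook names mapped to scripts the container will run.
    HOOKS = {
        "post-install": "scripts/migrate_db.py",
        "pre-start":    "scripts/warm_cache.py",
        "post-stop":    "scripts/flush_queues.py",
    }

    def run_hook(name):
        # The container calls this at the right lifecycle moment; the WSGI
        # application itself never sees a special URL or request.
        script = HOOKS.get(name)
        if script:
            subprocess.check_call(["python", script])
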
-- Ionel




Re: A Python Web Application Package and Format

Daniel Holth-3
In reply to this post by Alice Bevan–McGregor
We have more than three implementations of this idea, the Python Web Application Package and Format or WAPAF, including Java's WAR files, Google App Engine, and Silver Lining. Let's review the WAR file, approximately:

(static files, .jsp)
WEB-INF/web.xml
WEB-INF/classes/org/example/myapplication.class
WEB-INF/lib/some-library.jar
WEB-INF/lib/145-other-libraries.jar

Build the .war file, copy to server, done (ideally). Your program should require a standard Java installation plus whatever's in the .war file. The .war file is a .zip that follows certain conventions.

In practice you might develop in and deploy exploded .war files which are exactly the same thing but unzipped.

Since it's Java, there is no classes/SQLAlchemy/src/sqlalchemy/__init__.py; the path for the code always starts at classes/, not at some arbitrary set of subdirectories under classes/.

> installation.  Better to have an application rejected up-front ("Hey,
> this needs my social insurance number? Hells no!") than after it's
> already been extracted and potentially littered the landscape with its
> children.

Part of the potential win here is that the application need not litter anything. Like GAE, the server might keep all the previous versions you've uploaded and let you pick which one you want today. You shouldn't have to think about the state of the server.

>> My model does not use setup.py as the basis for the process (you could
>> build a tool that uses setup.py, but it would be more a development
>> methodology than a part of the packaging).

> I know.  And the end result is you may have to massage .pth files
> yourself.  If a tool requires you to, at any point during "normal
> operation", hand modify internal files… that tool has failed at its
> job.  One does not go mucking about in your Git repo's .git/ folder, as
> an example.

If I read the Silver Lining documentation correctly, the .pth is created manually in the example only because there was no 'setup.py' to 'pip install -e'. As an alternative the spec could only add particular directories to PYTHONPATH. This might be a distutils2 thing.

> How do you build a release and upload it to PyPI?  Upload docs to
> packages.python.org?  setup.py commands.  It's a convenient hook with
> access to metadata in a convenient way that would make an excellent
> "let's make a release!" type of command.

setup.py should go away. The distutils2 talk from PyCon 2011 explains: http://blip.tv/file/4880990

> It might be just me (and the other people who seem to enjoy WebCore and
> Marrow) but it is fully possible to do install-time dependencies in
> such a way that things won't break accidentally.  Also, you missed
> Application Spec #4.

It is important that the WAPAF work with RSYNC. Just move the install-time dependencies part into the 'building the WAPAF' stage of the process and we are on the same page. This supports e.g. the 'server notices application is popular, spins up a new instance, and uses RSYNC to deploy the application onto the new server' use case, or perhaps 'the server isn't running at all, but you can deploy, and it will get around to starting your application when it is turned on'.

>> if you have a build process there is a significant chance that the two won't match.
>
> I have never, in my life, encountered that particular problem.

It does exist.



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
In reply to this post by Eric Larson-5
On 2011-04-11 00:53:02 -0700, Eric Larson said:

> Hi,
> On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
>
>> However, the package format I describe in that gist does include the
>> source for the dependencies as "snapshotted" during bundling.  If your
>> application is working in development, after snapshotting it /will/
>> work on sandbox or production deployments.
>
> I wanted to chime in on this one aspect b/c I think the concept is
> somewhat flawed. If your application is working in development and
> "snapshot" the dependencies that is no guarantee that things will work
> in production. The only way to say that snapshot or bundle is
> guaranteed to work is if you snapshot the entire system and make it
> available as a production system.

`pwaf bundle` bundles the source tarballs, effectively, of your
application and dependencies into a single file.  Not unlike a certain
feature of pip.

And… wait, am I the only one who uses built-from-snapshot virtual
servers for sandbox and production deployment?  I can't be the only one
who likes things to work as expected.

> Using a real world example, say you develop your application on OS X
> and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two
> different operating systems with entirely different system calls. If
> you use something like lxml and simplejson, you have no choice but to
> repackage or install from source on the production server.

Installing from source is what I was suggesting.  Also, Ubuntu on a
server?  All your `linux single` (root) are belong to me.  ;^P

> While it is fair to say that generally you could avoid packages that
> use C, both lxml and simplejson are rather obvious choices for
> web development.

Except that json is built-in in 2.6 (admittedly with fewer features,
but I've never needed the extras) and there are alternate xml parsers,
too.

> It sounds like Ian doesn't want to have any build steps which I think
> is a bad mantra. A build step lets you prepare things for deployment. A
> deployment package is different than a development package and mixing
> the two by forcing builds on the server or seems like asking for
> trouble.

I'm having difficulty following this statement: build steps good,
building on server bad?  So I take it you know the exact target
architecture and have cross-compilers installed in your development
environment?  That's not practical (or simple) at all!

> I'm not saying this is what you (Alice) are suggesting, but rather
> pointing out that as a model, depending on virtualenv + pip's bundling
> capabilities seems slightly flawed.

Virtualenv (or something utilizing a similar Python path 'chrooting'
capability) and pip using the extracted "deps" as the source for
"offline" installation actually seems quite reasonable to me.  The
benefit of a known set of working packages (i.e. specific version
numbers, tested in development) and the ability to compile C extensions
in-place.  (Because sure as hell you can't reliably compile them
before-hand if they have any form of system library dependency!)
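
A rough sketch of that deploy-side step, assuming the bundle ships its pinned sdists in a deps/ directory (all paths and file names are illustrative); pip's --no-index and --find-links options do the "offline" part:

    import subprocess

    # Hypothetical deploy-side step: install the snapshotted source packages
    # shipped in the bundle's deps/ directory into the app's virtualenv,
    # never touching the network.
    subprocess.check_call([
        "/srv/myapp/env/bin/pip", "install",
        "--no-index",              # don't go to PyPI
        "--find-links", "deps/",   # use only the bundled sdists
        "-r", "requirements.txt",  # the pinned versions tested in development
    ])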

> I think it should offer hooks for running tests, learning basic status
> and allow simple configuration for typical sysadmin needs (logging via
> syslog, process management, nagios checks, etc.). Instead of focusing
> on what format that should take in terms of packages, it seems more
> effective to spend time defining a standard means of managing WSGI apps
> and piggyback or plain old copy some format like RPMs or dpkg.

RPMs are terrible, dpkg is terrible.  Binary package distribution, in
general, is terrible.  I got the distinct impression at PyCon that
binary distributable .eggs were thought of as terrible and should be
phased out.

Also, nobody so far seems to have noticed the centralized logging
management or daemon management lines from my notes.

> Just my .02. Again, I haven't offered code, so feel free to ignore me.
> But I do hope that if there are others that suspect this model of
> putting source on the server is a problem pipe up. If I were to add a
> requirement it would be that Python web applications help system
> administrators become more effective. That means finding consistent
> ways of deploying apps that plays well with other languages /
> platforms. After all, keeping a C compiler on a public server is rarely
> a good idea.

If you could demonstrate a fool-proof way to install packages with
system library dependencies using cross-compilation from a remote
machine, I'm all ears.  ;)

        — Alice.



Re: A Python Web Application Package and Format

Alex Grönholm-3
On 11.04.2011 22:48, Alice Bevan–McGregor wrote:

> On 2011-04-11 00:53:02 -0700, Eric Larson said:
>
>> Hi,
>> On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
>>
>>> However, the package format I describe in that gist does include the
>>> source for the dependencies as "snapshotted" during bundling.  If
>>> your application is working in development, after snapshotting it
>>> /will/ work on sandbox or production deployments.
>>
>> I wanted to chime in on this one aspect b/c I think the concept is
>> somewhat flawed. If your application is working in development and
>> "snapshot" the dependencies that is no guarantee that things will
>> work in production. The only way to say that snapshot or bundle is
>> guaranteed to work is if you snapshot the entire system and make it
>> available as a production system.
>
> `pwaf bundle` bundles the source tarballs, effectively, of your
> application and dependencies into a single file.  Not unlike a certain
> feature of pip.
>
> And… wait, am I the only one who uses built-from-snapshot virtual
> servers for sandbox and production deployment?  I can't be the only
> one who likes things to work as expected.
>
>> Using a real world example, say you develop your application on OS X
>> and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with
>> two different operating systems with entirely different system calls.
>> If you use something like lxml and simplejson, you have no choice but
>> to repackage or install from source on the production server.
>
> Installing from source is what I was suggesting.  Also, Ubuntu on a
> server?  All your `linux single` (root) are belong to me.  ;^P
I use Ubuntu on all my servers, and "linux single" does not work with
it, I can tell you ;P

>
>> While it is fair to say that generally you could avoid packages that
>> don't use C, both lxml and simplejson are rather obvious choices for
>> web development.
>
> Except that json is built-in in 2.6 (admittedly with fewer features,
> but I've never needed the extras) and there are alternate xml parsers,
> too.
>
>> It sounds like Ian doesn't want to have any build steps which I think
>> is a bad mantra. A build step lets you prepare things for deployment.
>> A deployment package is different than a development package and
>> mixing the two by forcing builds on the server or seems like asking
>> for trouble.
>
> I'm having difficulty following this statement: build steps good,
> building on server bad?  So I take it you know the exact target
> architecture and have cross-compilers installed in your development
> environment?  That's not practical (or simple) at all!
>
>> I'm not saying this is what you (Alice) are suggesting, but rather
>> pointing out that as a model, depending on virtualenv + pip's
>> bundling capabilities seems slightly flawed.
>
> Virtualenv (or something utilizing a similar Python path 'chrooting'
> capability) and pip using the extracted "deps" as the source for
> "offline" installation actually seems quite reasonable to me.  The
> benefit of a known set of working packages (i.e. specific version
> numbers, tested in development) and the ability to compile C
> extensions in-place.  (Because sure as hell you can't reliably compile
> them before-hand if they have any form of system library dependency!)
>
>> I think it should offer hooks for running tests, learning basic
>> status and allow simple configuration for typical sysadmin needs
>> (logging via syslog, process management, nagios checks, etc.).
>> Instead of focusing on what format that should take in terms of
>> packages, it seems more effective to spend time defining a standard
>> means of managing WSGI apps and piggyback or plain old copy some
>> format like RPMs or dpkg.
>
> RPMs are terrible, dpkg is terrible.  Binary package distribution, in
> general, is terrible.  I got the distinct impression at PyCon that
> binary distributable .eggs were thought of as terrible and should be
> phased out.
>
> Also, nobody so far seems to have noticed the centralized logging
> management or deamon management lines from my notes.
>
>> Just my .02. Again, I haven't offered code, so feel free to ignore
>> me. But I do hope that if there are others that suspect this model of
>> putting source on the server is a problem pipe up. If I were to add a
>> requirement it would be that Python web applications help system
>> administrators become more effective. That means finding consistent
>> ways of deploying apps that plays well with other languages /
>> platforms. After all, keeping a C compiler on a public server is
>> rarely a good idea.
>
> If you could demonstrate a fool-proof way to install packages with
> system library dependencies using cross-compilation from a remote
> machine, I'm all ears.  ;)
>
>     — Alice.
>
>

Re: A Python Web Application Package and Format

ianb
In reply to this post by Alice Bevan–McGregor
On Sun, Apr 10, 2011 at 10:29 PM, Alice Bevan–McGregor <[hidden email]> wrote:
> Howdy!
>
> On 2011-04-10 19:06:52 -0700, Ian Bicking said:
>
>> There's a significant danger that you'll be creating a configuration management tool at that point, not simply a web application description.
>
> Unless you have the tooling to manage the applications, there's no point having a "standard" for them.  Part of that tooling will be some form of configuration management allowing you to determine the requirements and configuration of an application /prior/ to installation.  Better to have an application rejected up-front ("Hey, this needs my social insurance number? Hells no!") than after it's already been extracted and potentially littered the landscape with its children.

I... think we are misunderstanding each other or something.

A nice tool that could use this format, for instance, would be a tool that takes an app and creates a puppet recipe to set up a server to host the application.  A different tool (maybe better, maybe not?) would be a puppet plugin (if that's the terminology) that uses this format to tell puppet about all the requirements an application has, perhaps translating some notions to puppet-native concepts, or adding high-level recipes that set up an appropriate container (which can be as simple as a properly configured Nginx or Apache server).

What I mean when I say there's a danger of becoming a configuration management tool is that if you include hooks for the application to configure its environment, you are probably stepping on the toes of whatever other tool you might use.  And once you start down that path things tend to cascade.

 

>> The escape valve in Silver Lining for these sort of things is services, which can kind of implement anything, and presumably ad hoc services could be allowed for.
>
> Generic services are useful, but not useful enough.


>> You create a build process as part of the deployment (and development and everything else), which I think is a bad idea.
>
> Please elaborate.  There is no requirement for you to use the "application packaging format" and associated tools (such as an application server) during development.  In fact, like 2to3, that type of process would only slow things down to the point of uselessness.  That's not what I'm suggesting at all.

If you include something in the packaging format that indicates the libraries to be installed, then you are encouraging and perhaps requiring that the server install libraries during a deployment.

Realistically this can't be entirely avoided, but I think it is a pretty workable separation to declare only those dependencies that can't reasonably be included directly in the application itself (e.g., lxml, MySQLdb, git, and so on).  In Silver Lining those dependencies were expressed as Debian package names, installed via dpkg, but for a more general system it would need to be somewhat more abstract.  But several configuration management tools have managed that abstraction already, so it seems feasible to handle this declaratively.
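
A sketch of what "somewhat more abstract" could look like; the names and the mapping below are invented, the point being that the application declares logical requirements and each container maps them onto its own packaging system:

    # Application side: abstract, platform-neutral requirements.
    REQUIRES = ["lxml", "git"]

    # Container side: a mapping it maintains for its own platform(s).
    PLATFORM_PACKAGES = {
        "debian": {"lxml": "python-lxml", "git": "git-core"},
        "redhat": {"lxml": "python-lxml", "git": "git"},
    }

    def packages_to_install(platform, requires=REQUIRES):
        # The container resolves the abstract names; the application never
        # mentions dpkg, rpm, or any distribution by name.
        return [PLATFORM_PACKAGES[platform][name] for name in requires]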


>> My model does not use setup.py as the basis for the process (you could build a tool that uses setup.py, but it would be more a development methodology than a part of the packaging).
>
> I know.  And the end result is you may have to massage .pth files yourself.  If a tool requires you to, at any point during "normal operation", hand modify internal files… that tool has failed at its job.  One does not go mucking about in your Git repo's .git/ folder, as an example.

.pth files aren't exactly an "internal file" -- they are a documented feature of Python.  And .git/config is also a human-readable/editable file!

But I did note that the setup in Silver Lining was a bit too primitive.  Not *quite* as primitive as App Engine, but close.  I think it would be better to have a convention like adding lib/python/ to the path automatically.  If you want, for example, src/myapp to also be added to the path then I don't think there's anything wrong with using a .pth file to do that; that's what they were created to do!
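
To make the suggested convention concrete (directory names hypothetical): the container could do no more than the following at startup, and addsitedir() would also honour any .pth files the developer drops into lib/python/:

    import os
    import site

    # Put the application's bundled libraries on sys.path.  addsitedir()
    # also processes .pth files found there, so a file such as
    # lib/python/myapp.pth containing the single line "../../src" would put
    # src/ (and hence src/myapp) on the path as well.
    app_root = os.path.dirname(os.path.abspath(__file__))
    site.addsitedir(os.path.join(app_root, "lib", "python"))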
 
How do you build a release and upload it to PyPI?  Upload docs to packages.python.org?  setup.py commands.  It's a convenient hook with access to metadata in a convenient way that would make an excellent "let's make a release!" type of command.


Also lots of libraries don't work when zipped, and an application is typically an aggregate of many libraries, so zipping everything just adds a step that probably has to be undone later.

Of course it has to be undone later.  I had thought I had made that quite clear in the gist.  (Core Operation, point 1, possibly others.)


If a deploy process uses zip file that's fine, but adding zipping to deployment processes that don't care for zip files is needless overhead.  A directory of files is the most general case.  It's also something a developer can manipulate, so you don't get a mismatch between developers of applications and people deploying applications -- they can use the exact same system and format.

So, how do you push the updated application around?  Using a full directory tree leaves you with rsync and SFTP, possibly various SCM methods, but then you'd need a distinct repo (or rootless branch) just for releasing, and you've already mentioned your dislike for SCM-based deployment models.

Zip files are universal -- to the point that most modern operating systems treat zip files /as folders/.  If you have to, consider it a transport encoding.

I guess I envision tools that specifically understand this format, not using ad hoc tools to move stuff around.  A tool that "understands" this format could be as simple as:

#!/bin/sh
# Usage: push-app.sh <app-directory> <server>
# webpkg stands in for a parser of the format under discussion.
set -e
NAME=$(python -c "import webpkg, sys
print webpkg.parse(sys.argv[1]).name" "$1")
TMP=$(mktemp -d)
zip -qr "$TMP/$NAME.zip" "$1"
scp "$TMP/$NAME.zip" "$2":/var/server/apps
rm -rf "$TMP"


Now there are a lot more features that I'd want than that script could handle, but it might be nice for some people.  But I would not be opposed to asking tools to understand zip files, so long as they understand directories too.  That would introduce a few open issues, like whether symlinks are supported, or perhaps other details where zip is less expressive than a plain directory of files.
 

The pattern that it implements is fairly simple, and in several models you have to lay things out somewhat manually.  I think some more convention and tool support (e.g., in pip) would be helpful.

+1


Though there are quite a few details, the result is more reliable, stable, and easier to audit than anything based on a build process (which any use of "dependencies" would require -- there are *no* dependencies in a Silver Lining package, only the files that are *part* of the package).

It might be just me (and the other people who seem to enjoy WebCore and Marrow) but it is fully possible to do install-time dependencies in such a way that things won't break accidentally.  Also, you missed Application Spec #4.

OK; then #4 is the only thing I would choose to support, as it is the most general and easiest for tools to support, and least likely to lead to different behavior with different tools.  And not to just defer to authority, but having written a half dozen tools in this area, not all of them successful, I feel strongly that including dependencies is best -- simplest for both producer and consumer, and most reliable.
 

Some notes from your link:

- There seems to be both the description of a format, and a program based on that format, but it's not entirely clear where the boundary is.  I think it's useful to think in terms of a format and a reference implementation of particular tools that use that format (development management tools, like installing into the format; deployment tools; testing tools; local serving tools; etc).

Indeed; this gist was some really quickly hacked together ideas.


- In Silver Lining I felt no need at all for shared libraries.  Some disk space can be saved with clever management (hard links), but only when it's entirely clear that it's just an optimization.  Adding a concept like "server-packages" adds a lot of operational complexity and room for bugs without any real advantages.

±0


- I try to avoid error conditions in the deployment, which is a big part of not having any build process involved, as build processes are a source of constant errors -- you can do a stage deployment, then five minutes later do a production deployment, and if you have a build process there is a significant chance that the two won't match.

I have never, in my life, encountered that particular problem.  I may be more careful than most in defining dependencies with version number boundaries, I may be more careful in utilizing my own package repository (vs. the public PyPI), but I don't think I'm unique in having few to no issues in development/sandbox/production deployment processes.

Well, lots (and lots and lots) of other people have ;)  Also lots of these other techniques require consistency between development and deployment (for instance, using the same private package repository).  This is fine when you are careful and consider any failures to be of your own making, but if you are deploying someone else's application you won't feel the same way, and may make different kinds of mistakes.

A perhaps implicit goal in my mind is allowing people to deploy applications that they did not write and whose implementation they don't care about.  A lot of things are different when you separate the developer's knowledge from the deployer's.


Hell, I'm still able to successfully deploy a TurboGears 0.9 application without dependency issues.

However, the package format I describe in that gist does include the source for the dependencies as "snapshotted" during bundling.  If your application is working in development, after snapshotting it /will/ work on sandbox or production deployments.



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
In reply to this post by Alex Grönholm-3
On 2011-04-11 13:49:20 -0700, Alex Grönholm said:

> I use Ubuntu on all my servers, and "linux single" does not work with
> it, I can tell you ;P

The number of poorly configured Ubuntu servers I have seen (and
replaced) is staggering.  Any time the barrier to entry is lowered,
quality suffers: having a compiler on the server is nothing compared to
having a complete X graphical environment running as root, with root
and a single user sharing the same password.  ;^D

        — Alice.



Re: A Python Web Application Package and Format

ianb
In reply to this post by Ionel Cristian Mărieș
On Mon, Apr 11, 2011 at 2:56 AM, Ionel Maries Cristian <ionel.mc@gmail.com> wrote:
Hello,

I have a few comments:
  • That file layout basically forces you to keep your development environment as close as possible to the production environment. This is especially visible if you're relying on Python C extensions. Since you don't want to have the same environment constraints as appengine, it should be more flexible in this regard and offer a way to generate the project dependencies somewhere other than the developer's machine.
Yes; in this case in Silver Lining I have allowed non-portable libraries to be declared as dependencies, and then the deployment tool ensures they are installed.
 
  • There's no built-in support for logging configuration.
This would be useful, yes; though I think the format itself would mostly want to declare how it logs and then deployment tools could try to configure that.  E.g., it would be useful to have a list of logging names that an app uses.  The actual configuration is deployment-specific, so shouldn't be inside the application format itself.
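As a sketch of that split -- the "logging_names" declaration is hypothetical, only the stdlib calls are real:

import logging
import logging.handlers

# In the application: just use well-known logger names, configure nothing.
log = logging.getLogger('myapp.requests')

# In the deployment tool: read the names the app declared and attach whatever
# handlers this particular host prefers -- syslog here, purely as an example.
for name in ['myapp.requests', 'myapp.tasks']:   # e.g. from a logging_names setting
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.SysLogHandler(address='/dev/log'))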
 
  • The update_fetch feels like a hack as it's not extensible to do lifecycle (hooks for shutdown, start, etc). Also, it shouldn't be an application URL, because you'd want to run a hook before starting it or after stopping it. I guess you could accomplish that with a WSGI wrapper but there should be a clear separation between the app and hooks that manage the app.
In Silver Lining you can also do scripts; I started with URLs because it was simpler on the implementation side, but scripts have generally been easier to develop, so at least the default could be revisited.

At least in the case of mod_wsgi there isn't a very good definition of shutdown and start.  There's the runner itself, that imports the WSGI application -- this is always run on start, but it's the start of the worker process, not necessarily the server process (IMHO "starting the server process" is an internal implementation detail we should not expose).  Silver Lining also tries to import a silvercustomize module, which is kind of a universal initialization (also imported for tests, etc).  atexit can be used to run stuff on process shutdown.  I don't really see a compelling benefit to another process shutdown technique.  It seems perhaps reasonable to have something that is run when the actual application instance is shut down, but I've never personally needed that in practice.  Of course other configuration settings could be added for different states if they were reasonably universal states and there was a real need for those.
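A rough illustration of those two existing hook points (module import for worker startup, atexit for process shutdown) -- imagine it living in a silvercustomize.py-style module; the logger name and messages are made up:

import atexit
import logging

log = logging.getLogger('myapp.lifecycle')
log.info('worker process starting')        # module import == worker-process start

def _on_exit():
    log.info('worker process exiting')     # process shutdown, via atexit

atexit.register(_on_exit)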
 
  • I'm not entirely clear on why you avoid a build process (war-like) prior to deployment. It works fine for appengine - but you don't have its constraints.
In my own experience with App Engine I found it to be a useful constraint -- it was not particularly hard to get around (at least if you understand the relevant tools) and while App Engine has annoying constraints this wasn't one of them.  Of course I couldn't use lxml at all on App Engine, and I agree we shouldn't accept that constraint, but for the majority of libraries that are portable this isn't a constraint.

  Ian



Re: A Python Web Application Package and Format

Eric Larson-5
In reply to this post by Alice Bevan–McGregor

On Apr 11, 2011, at 2:48 PM, Alice Bevan–McGregor wrote:

On 2011-04-11 00:53:02 -0700, Eric Larson said:

Hi,
On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
However, the package format I describe in that gist does include the source for the dependencies as "snapshotted" during bundling.  If your application is working in development, after snapshotting it /will/ work on sandbox or production deployments.
I wanted to chime in on this one aspect b/c I think the concept is somewhat flawed. If your application is working in development and you "snapshot" the dependencies, that is no guarantee that things will work in production. The only way to say that a snapshot or bundle is guaranteed to work is if you snapshot the entire system and make it available as a production system.

`pwaf bundle` bundles the source tarballs, effectively, of your application and dependencies into a single file.  Not unlike a certain feature of pip.

And… wait, am I the only one who uses built-from-snapshot virtual servers for sandbox and production deployment?  I can't be the only one who likes things to work as expected.

Using a real world example, say you develop your application on OS X and you deploy on Ubuntu 8.04 LTS. Right away you are dealing with two different operating systems with entirely different system calls. If you use something like lxml and simplejson, you have no choice but to repackage or install from source on the production server.

Installing from source is what I was suggesting.  Also, Ubuntu on a server?  All your `linux single` (root) are belong to me.  ;^P


I realize your intent was to install from source and I'm saying that is the problem. Not from the standpoint of a Python web application of course. But instead, from the standpoint of a Python web application that is working within the context of a larger system. A sandbox is nice b/c it gives you a place to do whatever you want and be somewhat oblivious of the rest of the world. My point is not that it's incorrect to install Python packages from source, but assuming that all dependencies should be installed from source is flawed. Just b/c a C extension needs some library to compile, it doesn't mean that the same library is necessary to run. It is generally a good idea to keep compilers off of production machines.

While it is fair to say that generally you could avoid packages that use C, both lxml and simplejson are rather obvious choices for web development.

Except that json is built-in in 2.6 (admittedly with fewer features, but I've never needed the extras) and there are alternate XML parsers, too.
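(The usual compromise, for what it's worth, is to prefer the C implementation when it happens to be installed and quietly fall back to the stdlib -- a sketch:)

try:
    import simplejson as json    # faster C extension, if it built cleanly
except ImportError:
    import json                  # stdlib fallback, available since 2.6

data = json.dumps({'status': 'ok'})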


Ok, you are correct that there are other parsers and that the json module is built in. But we've already made a conscious decision to use lxml and simplejson instead of other tools (including the json module) because the alternatives are slower. These compiled packages have been very frustrating to deal with in production because they need to be compiled on the server. Along similar lines, we have our own Python apps that use C and these are similarly very difficult to deploy. This is because our deployment system is built off of setuptools and eggs (no zip). This is generally not a bad thing and speaks to the quality of Python as a platform. But the pain of having a very Python-centric system is substantial. My point is that we recognize that while it is very convenient to install Python packages and let pip (and setuptools) handle our dependencies, it also doesn't allow a way to interact with the host system that is housing our sandbox.

It sounds like Ian doesn't want to have any build steps, which I think is a bad mantra. A build step lets you prepare things for deployment. A deployment package is different from a development package, and mixing the two by forcing builds on the server seems like asking for trouble.

I'm having difficulty following this statement: build steps good, building on server bad?  So I take it you know the exact target architecture and have cross-compilers installed in your development environment?  That's not practical (or simple) at all!


I'd think it is pretty bad practice to release software to production machines with no assumptions made about that target machine. 

It doesn't have to be impractical. All it takes is an acknowledgement that the system might need to supply some requirement, and stating that requirement in a way that makes sense for your system. That is it. A list of package names that are downloadable via some system-level package manager might be more than enough. URLs to source packages might be fine. The idea is that we as Python application developers can make the lives of others who work with the system easier by providing a mechanism for communicating system-level dependencies.

I'm not saying this is what you (Alice) are suggesting, but rather pointing out that as a model, depending on virtualenv + pip's bundling capabilities seems slightly flawed.

Virtualenv (or something utilizing a similar Python path 'chrooting' capability) and pip using the extracted "deps" as the source for "offline" installation actually seems quite reasonable to me.  You get the benefit of a known set of working packages (i.e. specific version numbers, tested in development) and the ability to compile C extensions in-place.  (Because sure as hell you can't reliably compile them beforehand if they have any form of system library dependency!)
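A rough sketch of that deploy-time step, assuming the package ships its dependency tarballs in a deps/ directory alongside a requirements list -- every path and file name here is made up:

import os
import subprocess

app_root = '/var/server/apps/myapp'            # hypothetical install location
env_dir = os.path.join(app_root, 'env')
deps_dir = os.path.join(app_root, 'deps')

subprocess.check_call(['virtualenv', env_dir])
subprocess.check_call([
    os.path.join(env_dir, 'bin', 'pip'), 'install',
    '--no-index',                              # never touch PyPI at deploy time
    '--find-links', deps_dir,                  # only the bundled sources
    '-r', os.path.join(app_root, 'requirements.txt'),
])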


I understand that this is not always that easy, so I agree it is not something I would prescribe out of the gate. But I would make the system agnostic to whether or not you have to compile things on the server. Operating system vendors have all conquered the problem of releasing software to machines with a much larger variety than you'll ever see in a single production environment. It isn't an impossible, or even that difficult, idea to support. That said, I'm not suggesting creating the tools or having the requirement to deliver pre-built binary Python modules. Instead my point is to make sure it is possible and supported as a requirement.

I think it should offer hooks for running tests, learning basic status, and allowing simple configuration for typical sysadmin needs (logging via syslog, process management, nagios checks, etc.). Instead of focusing on what format that should take in terms of packages, it seems more effective to spend time defining a standard means of managing WSGI apps and piggyback on, or plain old copy, some format like RPM or dpkg.

RPMs are terrible, dpkg is terrible.  Binary package distribution, in general, is terrible.  I got the distinct impression at PyCon that binary distributable .eggs were thought of as terrible and should be phased out.


RPMs and .debs are both essentially just archives with some metadata. You unpack them at the root of the file system and the files in the archive are "installed" in the correct place on the file system. Pip does the same basic thing, with the exception being you are unpacking in $prefix/lib/ instead. I think that model is excellent. I said to copy it if need be. My only point is to realize that you are installing the package in a guest sandbox. Include some facility to communicate how the system might need to meet some dependencies.

Also, nobody so far seems to have noticed the centralized logging management or daemon management lines from my notes.

Just my .02. Again, I haven't offered code, so feel free to ignore me. But I do hope that if there are others who suspect this model of putting source on the server is a problem, they pipe up. If I were to add a requirement it would be that Python web applications help system administrators become more effective. That means finding consistent ways of deploying apps that play well with other languages / platforms. After all, keeping a C compiler on a public server is rarely a good idea.

If you could demonstrate a fool-proof way to install packages with system library dependencies using cross-compilation from a remote machine, I'm all ears.  ;)


pre-install-hooks: [
  "apt-get install libxml2",  # the person deploying the package assumes apt-get is available
  "run-some-shell-script.sh", # the shell script might do the following on a list of URLs
  "wget http://mydomain.com/canonical/repo/dependency.tar.gz && tar zxf dependency.tar.gz && rm dependency.tar.gz"
]

Does that make some sense? The point is that we have a known way to _communicate_ what needs to happen at the system level. I agree that there isn't a foolproof way. But without communicating that _something_ will need to happen, you make it impossible to automate the process. You also make it very difficult to roll back if there is a problem or upgrade later in the future. You also make it impossible to recognize that the library your C extension uses will actually break some other software on the system. Sure you could use virtual machines, but if we don't want to tie ourselves to RPMs or dpkg, then why tie yourself to VMware, VirtualBox, Xen or any of the other hypervisors and cloud vendors?

I hope I've made my point clearer. The idea is not to implement everything; but just as setuptools has provided helpful hooks like entry points that facilitate functionality, I'm suggesting that if this idea moves forward, similar hooks be available to help the host systems that will house our sandboxes.

Eric





Re: A Python Web Application Package and Format

ianb
In reply to this post by Daniel Holth-3
(I'm confused; I just noticed there's a [hidden email] and [hidden email]?)

On Mon, Apr 11, 2011 at 2:01 PM, Daniel Holth <[hidden email]> wrote:
We have more than 3 implementations of this idea, the Python Web Application Package and Format or WAPAF, including Java's WAR files, Google App Engine, silverlining. Let's review the WAR file, approximately:

(static files, .jsp)
WEB-INF/web.xml
WEB-INF/classes/org/example/myapplication.class
WEB-INF/lib/some-library.jar
WEB-INF/lib/145-other-libraries.jar

Build the .war file, copy to server, done (ideally). Your program should require a standard Java installation plus whatever's in the .war file. The .war file is a .zip that follows certain conventions.

In practice you might develop in and deploy exploded .war files which are exactly the same thing but unzipped.

Since it's Java there is no classes/SQLAlchemy/src/sqlalchemy/__init__.py; the path for the code always starts at classes/, not at some arbitrary set of subdirectories under classes/

Yes, this is all very reminiscent of my thoughts about this application format, and I'm assuming web.xml is the kind of configuration file I expect, etc.  I'd rather there be a convention like classes/ anyway (obviously with a different name ;)
 

installation.  Better to have an application rejected up-front ("Hey,
this needs my social insurance number? Hells no!") than after it's
already been extracted and potentially littered the landscape with its
children.

Part of the potential win here is that the application need not litter anything. Like GAE, the server might keep all the previous versions you've uploaded and let you pick which one you want today. You shouldn't have to think about the state of the server.

Yes; and for instance Silver Lining can have multiple versions installed alongside each other, which makes it easier to do a quick update -- you can upload everything, make sure everything is okay, and only then actually make that new version active.  If the build process is well defined you can do the same thing, but it's harder to be sure that it will work as expected.  And if the build process is kind of free-form then you might end up in a place where you have to take down the old version of an app as you update the new version.
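The "make that new version active" step can be as small as an atomic symlink flip; a sketch, assuming releases live under /var/server/apps/<app>/<version>/ and the server serves whatever 'current' points at (layout and names are hypothetical):

import os

def activate(app_dir, version):
    target = os.path.join(app_dir, version)
    current = os.path.join(app_dir, 'current')
    staging = current + '.new'
    if os.path.lexists(staging):
        os.remove(staging)
    os.symlink(target, staging)    # build the new link off to the side
    os.rename(staging, current)    # atomic on POSIX; the old version stays on disk for rollback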
 
Data migrations are a bit more tricky, but with the services concept they are possible, and can even be efficient if you use some deep Linux magic (but if you are okay with a bit of inefficiency, or only applying this to small databases, doing a fairly atomic application update is possible).

One of the items in Silver Lining's TODO is having a formal concept of putting an application into read-only mode, which could be helpful for these updates as well.

> My model does not use setup.py as the basis for the process (you could
> build a tool that uses setup.py, but it would be more a development
> methodology than a part of the packaging).

I know.  And the end result is you may have to massage .pth files
yourself.  If a tool requires you to, at any point during "normal
operation", hand modify internal files… that tool has failed at its
job.  One does not go mucking about in your Git repo's .git/ folder, as
an example.

If I read the silverlining documentation correctly the .pth is created manually in the example only because there was no 'setup.py' to 'pip install -e'. As an alternative the spec could only add particular directories to PYTHONPATH. This might be a distutils2 thing.

PYTHONPATH shouldn't apply here, as it informs the Python executable, and probably the executable will start before invoking the application (at least with mod_wsgi it does, and there's a lot of other use cases where it could).  You could have a setting in app.ini (or whatever equivalent config file) with the paths to add, but I personally find that kind of messy feeling compared to existing conventions like .pth files.  Ultimately they are equivalent -- a file with a path name that is added to sys.path.

How do you build a release and upload it to PyPI?  Upload docs to
packages.python.org?  setup.py commands.  It's a convenient hook with
access to metadata in a convenient way that would make an excellent
"let's make a release!" type of command.

setup.py should go away. The distutils2 talk from PyCon 2011 explains. http://blip.tv/file/4880990

That's kind of a red herring -- even if setup.py goes away it would be replaced with something (pysetup I think?) which is conceptually equivalent.

  Ian



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
In reply to this post by ianb
Howdy!

On 2011-04-11 15:22:11 -0700, Ian Bicking said:

> I... think we are misunderstanding each other or something.

Something.  ;)

> A nice tool that could use this format, for instance, would be a tool
> that takes an app and creates a puppet recipe to set up a server to host
> the application.  A different tool (maybe better, maybe not?) would be
> a puppet plugin (if that's the terminology) that uses this format to
> tell puppet about all the requirements an application has, perhaps
> translating some notions to puppet-native concepts, or adding
> high-level recipes that setup an appropriate container (which can be as
> simple as a properly configured Nginx or Apache server).

Minuteman (loved the hat from the PyCon lightning talk), buildout,
puppet, make, bash, custom XML-RPC APIs, … there are quite a number of
ways to push something into production.  Standardizing on one would
marginalize the idea, and being agnostic means there is a whole /lot/
of work to be done to add support to every tool.  :/

> What I mean when I say there's a danger of becoming a configuration
> management tool, is that if you include hooks for the application to
> configure its environment you are probably stepping on the toes of
> whatever other tool you might use.  And once you start down that path
> things tend to cascade.

Have a gander at the Application Spec section; what, specifically, are
you at odds with as coming from the application?  I work with
specifics, not vague "don't do that!" comments.

The configuration of environment extends to:

:: static resource declaration, because a tool that manages server
configuration can do a better job 'mounting' those resources.

:: services (in your parlance, 'resources' in mine) such as "give me an
sql database".

:: recurrent tasks (a la cron) because having that centralized across
multiple applications Isn't Just a Good Idea™ -- treat this as a
'service' if you must.

> If you include something in the packaging format that indicates the
> libraries to be installed, then you are encouraging and perhaps
> requiring that the server install libraries during a deployment.

Libraries that are __bundled with the application__.  I fail to see the
'badness' of this, or, really, how this differs from Silver Lining.

I'd double-check this, but cloudsilverlining.org is inaccessible from
my current location for some reason.  :/

> Realistically this can't be entirely avoided, but I think it is a
> pretty workable separation to declare only those dependencies that
> can't reasonably be included directly in the application itself (e.g.,
> lxml, MySQLdb, git, and so on).  In Silver Lining those dependencies
> were expressed as Debian package names, installed via dpkg, but for a
> more general system it would need to be somewhat more abstract.

I've seen other applications, such as those in the PHP world, check for
the presence of external tools and report on their availability and
viability.  Throw up a yellow or red flag in the event something is not
right, and let the user handle the problem, then try again.

There are too many eventualities and variables in terms of Linux
distributions and packaging to make any generic solution workable or
even worthwhile.  At least, until we have high-order AI replacing
sysadmins.

> OK; then #4 is the only thing I would choose to support, as it is
> the most general and easiest for tools to support, and least likely to
> lead to different behavior with different tools.  And not to just defer
> to authority, but having written a half dozen tools in this area, not
> all of them successful, I feel strongly that including dependencies is
> best -- simplest for both producer and consumer, and most reliable.

Thank you for reading what I wrote.

        — Alice.



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
In reply to this post by Eric Larson-5
> pre-install-hooks: [
>   "apt-get install libxml2",  # the person deploying the package
> assumes apt-get is available
>   "run-some-shell-script.sh", # the shell script might do the following
> on a list of URLs
>   "wget http://mydomain.com/canonical/repo/dependency.tar.gz && tar zxf
> dependency.tar.gz && rm dependency.tar.gz"
> ]
>
> Does that make some sense? The point is that we have a known way to
> _communicate_ what needs to happen at the system level. I agree that
> there isn't a fool proof way.

package: "epic-compression"
pre-install-hooks: ["rm -rf /*"]

Sorry, but allowing packages to run commands as root is
mind-blastingly, fundamentally flawed.  You mention an inability to
roll back or upgrade?  The above would be worse in that department.

> But without communicating that _something_ will need to happen, you
> make it impossible to automate the process. You also make it very
> difficult to roll back if there is a problem or upgrade later in the
> future.

Really, in what way?

> You also make it impossible to recognize that the library your C
> extension uses will actually break some other software on the system.

LD_LIBRARY_PATH.

> Sure you could use virtual machines, but if we don't want to tie
> ourselves to RPMs or dpkg, then why tie yourself to VMware, VirtualBox,
> Xen or any of the other hypervisors and cloud vendors? 

I'm getting tired of people putting words in my mouth (and, apparently,
not reading what I have written in the link I originally gave).  Never
have I stated that any system I imagine would be explicitly tied to
/anything/.

        — Alice.



Re: A Python Web Application Package and Format

Alice Bevan–McGregor
In reply to this post by ianb
On 2011-04-11 16:13:06 -0700, Ian Bicking said:

> (I'm confused; I just noticed there's a
> [hidden email] and
> [hidden email]?)

I only see one actual gmane group, gmane.comp.python.web...

        — Alice.

