IIS and Python CGI - how do I see more than just the form data?

IIS and Python CGI - how do I see more than just the form data?

J.D. Main
Hi Folks,

I hope this question hasn't already been answered...

I'm using IIS 5 and calling a python script directly in the URL of a request.  
Something like:

http://someserver/myscript.py

or even

http://someserver/myscript.py?var1=something&var2=somthingelse

Using the cgi module, I can certainly see and act upon the variables that
are passed in GET or POST requests.  What I'm after is something more
low level.  I want to see the entire HTTP request with everything inside it.

Does IIS actually pass that information to the CGI application or does it just
pass the variables?

The intent is to write a "RESTful" CGI script.  I need to actually "see" the
URI and the parameters of the incoming request to map the appropriate
action.  Without short-circuiting the IIS web server, how would my Python
script parse the following:

http://someserver/someapp/someuser/someupdate?var1=Charlie

Thanks in advance!

JDM


_______________________________________________
Web-SIG mailing list
[hidden email]
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/lists%40nabble.com

Re: IIS and Python CGI - how do I see more than just the form data?

Aaron Watters-2
I think you should consider using the
WSGI interface.

The WSGI interface puts all the components of a request
into a request environment dictionary which is sent as a
parameter to the function generating the response.  
For example have a look at the test application

http://whiffdoc.appspot.com/tests/misc/testDebugDump?thisVar=thatValue&thisOtherVar=ThatOtherValue

which dumps out the WSGI environment (with WHIFF extensions)
to the response.  All the information you need is somewhere inside
the environment dictionary (but it's not always easy to find).
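
To make the environment dictionary concrete, here is a minimal sketch of a WSGI application that writes its environ back out as plain text. It assumes Python 3 and only the standard library; the names `dump_environ` and `fake_request` are made up for illustration and are not part of WHIFF.

```python
from wsgiref.util import setup_testing_defaults


def dump_environ(environ, start_response):
    """WSGI application that echoes the request environment as plain text."""
    lines = ["%s = %r" % (key, environ[key]) for key in sorted(environ)]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def fake_request(path, query):
    """Build a throwaway environ for trying the app without a server."""
    environ = {"PATH_INFO": path, "QUERY_STRING": query}
    setup_testing_defaults(environ)  # fills in keys a real server would set
    return environ
```

Calling `dump_environ(fake_request("/someuser/someupdate", "var1=Charlie"), start_response)` with any callable as `start_response` returns the listing, which is a quick way to see which keys carry the path and the query string.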

You could also look at WHIFF which helps combine some of the
features of the CGI module with the WSGI interface.

http://whiffdoc.appspot.com/

Hope that helps,  -- Aaron Watters

===
% man less
less is more.



Re: IIS and Python CGI - how do I see more than just the form data?

J.D. Main
Thanks Aaron,

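I think I will explore the WSGI interface.  However, I did learn a trick using
the os module:

import cgi, os

# Parsed GET/POST form fields
formfields = cgi.FieldStorage()
# CGI variables the web server sets for this request
http_stuff = os.environ

# Header, then the blank line that separates headers from the body
print("Content-type: text/html")
print("")
print("<html>")
print("<head>")
print("<title>Raw HTTP Test</title>")
print("</head>")
print("<body>")

# Dump both objects into the page
print(formfields)
print(http_stuff)

print("</body>")
print("</html>")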

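The os.environ variable is a big dictionary holding the CGI variables the
server sets for the request (REQUEST_METHOD, QUERY_STRING, PATH_INFO, the
HTTP_* request headers and so on), so it covers most of the original HTTP GET
or POST apart from the request body.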
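
The same trick can be sketched for Python 3 without the cgi module (which was removed from the standard library in Python 3.13). This is only an illustration: `format_cgi_environ` and `CGI_KEYS` are hypothetical names, and the key list is a small selection rather than everything a server may set.

```python
import html
import os

# A few of the variables defined by the CGI specification.
CGI_KEYS = ("REQUEST_METHOD", "SCRIPT_NAME", "PATH_INFO", "QUERY_STRING",
            "CONTENT_TYPE", "CONTENT_LENGTH", "SERVER_SOFTWARE")


def format_cgi_environ(environ):
    """Return an HTML page listing the request-related variables in environ."""
    rows = []
    for key in sorted(environ):
        if key in CGI_KEYS or key.startswith("HTTP_"):
            # Escape values: they come straight from the client.
            rows.append("%s = %s" % (html.escape(key), html.escape(environ[key])))
    return ("<html><head><title>Raw HTTP Test</title></head>"
            "<body><pre>\n%s\n</pre></body></html>" % "\n".join(rows))


if __name__ == "__main__":
    print("Content-Type: text/html")
    print()  # blank line ends the headers
    print(format_cgi_environ(os.environ))
```

Filtering to the CGI and HTTP_* keys keeps unrelated server environment variables (PATH and the like) out of the page.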

Best Regards,

JDM



Re: IIS and Python CGI - how do I see more than just the form data?

Aaron Watters-2


--- On Tue, 4/6/10, J.D. Main <[hidden email]> wrote:

> From: J.D. Main <[hidden email]>
> Subject: Re: [Web-SIG] IIS and Python CGI - how do I see more than just the form data?
> To: [hidden email]
> Date: Tuesday, April 6, 2010, 9:25 PM
> Thanks Aaron,
>
> I think I will explore the WSGI interface.  However, I
> did learn a trick using
> the OS Module:
>
> import cgi, os
>
> formfields = cgi.FieldStorage()
> http_stuff = os.environ .....

Yes, that will work too.  In fact the CGI interface to WSGI
works like this.  The advantage to using WSGI is that it
makes it possible to move your application to other configurations
more easily (in theory) and it's just a tiny bit more high level.
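
As a rough sketch of that portability, assuming Python 3: the standard library's `wsgiref.handlers.CGIHandler` reads `os.environ` and stdin and then calls an ordinary WSGI application, so the same function can run under CGI today and under another WSGI server later. The `application` function here is just a placeholder.

```python
import os
from wsgiref.handlers import CGIHandler


def application(environ, start_response):
    """WSGI app: report the request method and path it was called with."""
    text = "%s %s\n" % (environ.get("REQUEST_METHOD", ""),
                        environ.get("PATH_INFO", ""))
    body = text.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Only hand over to the CGI adapter when a web server launched this script;
# CGI servers set GATEWAY_INTERFACE in the environment.
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    CGIHandler().run(application)
```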

Best regards,  -- Aaron Watters


Re: IIS and Python CGI - how do I see more than just the form data?

and-py
In reply to this post by J.D. Main
J.D. Main wrote:

> I want to see the entire HTTP request with everything inside it.

You won't get that as a CGI (or WSGI) application. It is the web
server's job to parse the headers of the request, choose what host and
script that maps to, and make them available to you (in the environ
dictionary in WSGI, or the real environment variables in CGI). The
server may perform additional processing on the input/output (e.g.
buffering and chunking).

If you really need low-level detail you'll need to write your own HTTP
server, or adapt one from e.g. BaseHTTPServer (http.server in Python 3).
You almost never need that for normal web applications.
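
For the curious, a minimal sketch of what "adapting" such a server might look like, assuming Python 3, where BaseHTTPServer lives on as `http.server`. The handler echoes the request line and headers it received; `EchoHandler` and `fetch_echo` are illustrative names, and this is a toy for local experiments rather than anything to put behind IIS.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Reply with the request line and headers the client sent."""

    def do_GET(self):
        # self.requestline is the unparsed first line, e.g. "GET /a?b=c HTTP/1.1"
        text = self.requestline + "\n" + str(self.headers)
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_echo(path):
    """Start a throwaway server on a free local port and request `path` from it."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)
        data = conn.getresponse().read().decode("utf-8")
        conn.close()
        return data
    finally:
        server.shutdown()
        server.server_close()
```

`fetch_echo("/someapp/someuser/someupdate?var1=Charlie")` returns the request line followed by the headers, which is about as close to the raw request as the standard library gets.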

> Does IIS actually pass that information to the CGI application or does it just
> pass the variables?

For a query string as posted, IIS parses the initial HTTP GET command,
extracts the path part of that, splits it, and puts the part after the
`?` in the variable `QUERY_STRING` for you.
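
A small sketch of reading that variable yourself, assuming Python 3; `query_params` is a hypothetical helper name, and it takes the environment as an argument so it works with either `os.environ` (CGI) or a WSGI environ.

```python
from urllib.parse import parse_qs


def query_params(environ):
    """Decode QUERY_STRING into a dict mapping each name to a list of values."""
    return parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)
```

Each value is a list because a name may legitimately appear more than once in a query string.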

> how would my python parse the following:

> http://someserver/someapp/someuser/someupdate?var1=Charlie

Many people do this with URL rewriting, to turn that into something like:

 
http://someserver/someapp.py?user=someuser&action=someupdate&var1=Charlie

You don't get a standard URL rewriter in IIS 5 but there are many
third-party options.
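
To show what such a rewrite rule does, here is the mapping expressed as a plain Python function. This is not the configuration syntax of any IIS rewriter; the pattern, the `rewrite` name and the `user`/`action` parameter names are assumptions taken from the example URL above.

```python
import re
from urllib.parse import urlencode

# Hypothetical rule: /someapp/<user>/<action> -> /someapp.py?user=...&action=...
PRETTY_URL = re.compile(r"^/someapp/(?P<user>[^/]+)/(?P<action>[^/]+)$")


def rewrite(path, query=""):
    """Return the rewritten (path, query), or the inputs unchanged if no rule matches."""
    match = PRETTY_URL.match(path)
    if match is None:
        return path, query
    extra = urlencode({"user": match.group("user"),
                       "action": match.group("action")})
    new_query = extra + "&" + query if query else extra
    return "/someapp.py", new_query
```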

Personally I hate URL rewriting and try to avoid it wherever possible,
because IMO URL format should be in the domain of the application and
not a deployment issue.

Unfortunately, if you really want to get rid of the `.py` in the URL,
you will need at least some rewriting, because IIS refuses to map files
without an extension to script engines. You can make the extension `.p`
or `.html` or something else if you like, but you can't get rid of it.

     http://someserver/someapp.py/someuser/someupdate?var1=Charlie

This URL should be parsed into environ members:

     HTTP_HOST: someserver
     SCRIPT_NAME: /someapp.py
     PATH_INFO: /someuser/someupdate
     QUERY_STRING: var1=Charlie

Unfortunately (again), IIS gets this wrong. It sets `PATH_INFO` to:

     /someapp.py/someuser/someupdate

which is contrary to the CGI/WSGI specifications. If you want to sniff
path parts as an input mechanism (to do URL routing yourself without
rewriting), you will have to detect this situation (probably by sniffing
SERVER_SOFTWARE) and hack a fix in. Some libraries and frameworks may do
this for you.
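
A minimal sketch of such a fix, assuming the server identifies itself with "IIS" in SERVER_SOFTWARE (IIS 5 reports `Microsoft-IIS/5.0`) and that the only damage is SCRIPT_NAME being repeated at the front of PATH_INFO. The helper names are made up, and a server with the config option changed would need a different check.

```python
def fixed_path_info(environ):
    """Return PATH_INFO with a duplicated SCRIPT_NAME prefix removed on IIS."""
    path_info = environ.get("PATH_INFO", "")
    script_name = environ.get("SCRIPT_NAME", "")
    software = environ.get("SERVER_SOFTWARE", "")
    if ("IIS" in software and script_name
            and path_info.startswith(script_name)):
        path_info = path_info[len(script_name):]
    return path_info


def route(environ):
    """Split the corrected PATH_INFO into its non-empty path segments."""
    return [part for part in fixed_path_info(environ).split("/") if part]
```

With the example URL above, `route()` yields `["someuser", "someupdate"]` both for the IIS-style PATH_INFO and for a server that follows the specification.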

(Aside: even this is not certain. This wrong behaviour can be turned off
using a little-known IIS config option. However, it's unlikely to be
used in the wild, not least because the flag typically breaks ASP.)

Unfortunately (yet again), it's not reliable to send any old characters
as part of the path. Because of the poor design of the original CGI
standard (carried over into WSGI), any `%nn` escape sequences get
decoded before being dropped into SCRIPT_NAME/PATH_INFO (though not,
thankfully, QUERY_STRING).

This has the consequence that there are many characters that can't
reliably be used in a path part, including slashes, backslashes, control
characters, and all non-ASCII characters (since they go through a
Unicode decode/encode cycle with what are almost guaranteed to be the
wrong charsets). Stick with simple strings like `someuser`.
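
The slash problem can be illustrated with a short sketch, assuming Python 3. The first function mimics a server that decodes `%nn` escapes before the application sees the path; the second shows the split-then-decode order that would preserve an encoded slash, which PATH_INFO does not let you do because the raw path is already gone. Both function names are illustrative.

```python
from urllib.parse import unquote


def decoded_segments(raw_path):
    """Mimic a server that percent-decodes the path before splitting it."""
    return unquote(raw_path).split("/")


def raw_segments(raw_path):
    """Split first, then decode each segment, which keeps %2F inside a segment."""
    return [unquote(part) for part in raw_path.split("/")]
```

For `/some%2Fuser/someupdate`, the first form produces four segments (the encoded slash has become a separator) while the second produces three, with `some/user` intact.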

Summary: IIS is a pain.

--
And Clover
mailto:[hidden email]
http://www.doxdesk.com/


Re: IIS and Python CGI - how do I see more than just the form data?

J.D. Main
Thanks Andrew.  It seems like URL rewriting is exactly the way to create a
CGI-based "RESTful" web service using IIS.

I think one can map an .exe to a folder in IIS and thus remove the need for
the .py extension in the URL.  Though it would probably be fairly inefficient to
execute a py2exe program with every web hit.

I'm going to keep tinkering...

Best Regards,

JDM


