Python XML parsers in Jython

Python XML parsers in Jython

Matt Williams-9
Dear List,

I've come up against a real problem with parsing XML in Jython. I wrote
some code in Python, but I cannot get it to run under Jython:

    from xml.dom.ext.reader import Sax2

fails, as Jython won't handle the _xmlplus package.

    doc = minidom.parse(sourceDocument).documentElement

fails, as it can't import the expatbuilder module (even though it's in
the folder!).

Even:

    from xml import sax
    sax.parse(document)

fails, as it says it needs another argument.


Please could someone explain how I've managed to get myself into such a
mess!

Thanks,

Matt

--
Dr. M. Williams MRCP(UK)
Clinical Research Fellow
Cancer Research UK
+44 (0)207 269 2953
+44 (0)7384 899570
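
On the last of the three failures: xml.sax.parse takes a content handler as its second argument, which is probably the "other argument" the error message refers to. A minimal sketch, assuming an interpreter whose xml.sax has a working parser driver; TagCounter and count_tags are hypothetical names used only for illustration:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TagCounter(ContentHandler):
    """Hypothetical handler: counts how often each element name occurs."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(source):
    """Parse `source` (a file name or file-like object) and return the counts."""
    handler = TagCounter()
    xml.sax.parse(source, handler)  # the handler is the required second argument
    return handler.counts


# A small in-memory document stands in for a real file here.
sample = io.BytesIO(b"<root><item/><item/></root>")
counts = count_tags(sample)
```

SAX reports the document as a stream of events, so whatever needs to be extracted has to be collected by the handler; parse itself returns nothing.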



-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
_______________________________________________
Jython-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/jython-users

Re: Python XML parsers in Jython

Ken Beesley
Matt,

In my limited understanding of Jython, it works with
pure Python modules but not with modules written
in C.  The expat library, which most Perl and Python
XML modules depend on, is written in C.  Is that the
problem?

If so, you might try using pxdom, an implementation
of DOM which is allegedly pure-Python.  Not as fast
as expat-based packages, but perhaps just what you
need for processing XML in Jython.

http://www.doxdesk.com/software/py/pxdom.html

Please keep us informed.

Ken
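
One way to check this hypothesis on a given interpreter is to probe which of the modules named in the thread can actually be imported. The sketch below is only a diagnostic aid; module_available is a hypothetical helper, and the results depend entirely on the interpreter and version in use:

```python
def module_available(name):
    """Return True if `name` can be imported on the running interpreter."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


# Modules mentioned in this thread; results depend on the interpreter in use.
candidates = ["xml.sax", "xml.dom.minidom", "xml.dom.expatbuilder", "pyexpat"]
report = dict((name, module_available(name)) for name in candidates)
```

In CPython's standard library, xml.dom.expatbuilder itself imports xml.parsers.expat, so it can fail to import even when expatbuilder.py is present in the folder, which would be consistent with the error Matt describes.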

Message: 5
From: Matt Williams <[hidden email]>
To: Jython users <[hidden email]>
Date: Fri, 14 Oct 2005 21:09:07 +0100
Subject: [Jython-users] Python XML parsers in Jython






Re: Python XML parsers in Jython

Gil Hoffer
In reply to this post by Matt Williams-9
Hi Matt,

Why not use the Java XML parsers that come with the JDK?

Take a look at javax.xml.*; it's quite intuitive to use (get an instance of DocumentBuilderFactory, then get your DocumentBuilder, then parse the file/stream and retrieve its Document instance), and I'm sure you could easily find a tutorial on parsing XML in Java.
Since you are using Jython, putting the Java classpath to work for you is a very simple (and smart) thing to do.

Moreover, this API also simplifies XML validation against XSDs, XPath queries, transformations and more...

Good luck!
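
The sequence described above (factory, then builder, then parse) might look roughly like this under Jython. It is a sketch rather than tested code for any particular Jython release; parse_with_jaxp and child_elements are hypothetical helper names, and the Java imports sit inside the function only so that the file can also be loaded where no JVM is available:

```python
ELEMENT_NODE = 1  # same value as org.w3c.dom.Node.ELEMENT_NODE


def parse_with_jaxp(path):
    """Parse the XML file at `path` with the JDK's DOM parser (Jython only)."""
    # Deferred imports: these packages exist only when running on a JVM.
    from java.io import File
    from javax.xml.parsers import DocumentBuilderFactory

    factory = DocumentBuilderFactory.newInstance()
    builder = factory.newDocumentBuilder()
    document = builder.parse(File(path))
    return document.getDocumentElement()


def child_elements(node):
    """Return the element children of a DOM node as a Python list."""
    children = node.getChildNodes()
    result = []
    for i in range(children.getLength()):
        child = children.item(i)
        # Skip text nodes, comments and other non-element children.
        if child.getNodeType() == ELEMENT_NODE:
            result.append(child)
    return result
```

Under Jython, parse_with_jaxp("myfile.xml") would return the root element, and child_elements walks the Java NodeList, which offers getLength() and item() rather than Python-style iteration.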



Re: Python XML parsers in Jython

Paul D. Fernhout
In reply to this post by Matt Williams-9
Matt-

I can't explain your specific issues, but I had xml woes too under Jython
so I downloaded Xerces and used that from a jar file and put it in the
classpath.
   http://xerces.apache.org/

Then how about:

    import org.xml.sax.InputSource

    parser = make_parser()  # see below for the make_parser function
    handler = MyContentHandler(someArgs)  # your own ContentHandler subclass
    parser.setContentHandler(handler)
    try:
        f = open(fileName)
        try:
            parser.parse(org.xml.sax.InputSource(f))
        finally:
            f.close()  # always close the file, even if parsing fails
    except UnicodeError, e:
        print "could not process file correctly due to encoding issues"

Needed and useful support functions and code:

from org.xml.sax.helpers import DefaultHandler as ContentHandler
from org.xml.sax.helpers import ParserFactory
import org.apache.xerces.parsers.SAXParser

def make_parser():
     return ParserFactory.makeParser("org.apache.xerces.parsers.SAXParser")

# Escape "&", "<", and ">" in a string of data.
def xml_escape(text):
    # "&" goes first, so the entities added afterwards are not escaped again
    newString = text.replace("&", "&amp;")
    newString = newString.replace("<", "&lt;")
    newString = newString.replace(">", "&gt;")
    return newString

# Escape and quote an attribute value, like the CPython quoteattr function
def xml_quoteattr(text):
    newString = xml_escape(text)
    if '"' not in newString:
        return '"' + newString + '"'
    if "'" not in newString:
        return "'" + newString + "'"
    # both quote characters occur: use double quotes and escape the inner ones
    newString = newString.replace('"', "&quot;")
    return '"' + newString + '"'
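
For comparison, the CPython functions these two helpers mirror live in xml.sax.saxutils. The short sketch below shows their behaviour on a few sample strings; whether this module is importable depends on the Jython release in use, which is presumably why hand-written versions were needed here:

```python
from xml.sax.saxutils import escape, quoteattr

# escape() replaces "&", "<" and ">" with entities.
escaped = escape("a < b & c")

# quoteattr() also wraps the value in whichever quotes are safe to use.
plain = quoteattr("plain")
with_double = quoteattr('say "hi"')
with_both = quoteattr("it's \"x\"")
```

A value with no double quotes is wrapped in double quotes, a value containing only double quotes is wrapped in single quotes, and a value containing both has its double quotes turned into &quot;.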

Good luck!

--Paul Fernhout




Re: Python XML parsers in Jython

Sean McGrath
In reply to this post by Matt Williams-9
Matt,

I do a lot of XML work with Jython and find that re-using from the pool of excellent Java-based tools for XML parsing is a very powerful and simple way to go.

I use Xerces and Xalan, which give me very high levels of standards compliance, with a full XPath 1.0 implementation etc.

In CPython some of the more comprehensive XML packages use C extensions for things like James Clark's expat, XPath grammars and so on. It is primarily these C/C++ extensions that make it hard for XML tools to work seamlessly across CPython and Jython.

Sean
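
As an illustration of XPath from Jython, here is a sketch that uses the JDK's own javax.xml.xpath API rather than calling Xalan directly, so it is a swapped-in technique and not necessarily the setup described above. It assumes a JDK that ships javax.xml.xpath; nodelist_to_list and xpath_text are hypothetical helper names, and the Java imports are deferred so the file also loads where no JVM is available:

```python
def nodelist_to_list(nodes):
    """Copy an org.w3c.dom.NodeList (getLength/item) into a Python list."""
    return [nodes.item(i) for i in range(nodes.getLength())]


def xpath_text(path, expression):
    """Return the text of every node matching `expression` (Jython only)."""
    # Deferred imports: these packages exist only when running on a JVM.
    from java.io import File
    from javax.xml.parsers import DocumentBuilderFactory
    from javax.xml.xpath import XPathConstants, XPathFactory

    builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    document = builder.parse(File(path))
    xpath = XPathFactory.newInstance().newXPath()
    # NODESET asks for all matches; the result behaves like a NodeList.
    nodes = xpath.evaluate(expression, document, XPathConstants.NODESET)
    return [node.getTextContent() for node in nodelist_to_list(nodes)]
```

Under Jython, a call such as xpath_text("myfile.xml", "//title") would return the text of every title element in the file.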






Re: Python XML parsers in Jython

Nicolas-11
Hi,

I use JDOM in Java and Jython; it's really much easier to use than the standard DOM.
All you have to do to use it is add jdom.jar to your classpath.

Ex:
from org.jdom.input import SAXBuilder  # JDOM 1.x package name

builder = SAXBuilder()
document = builder.build("myfile.xml")
root = document.getRootElement()
print root.getName()

Regards,
Nico




