Recipe 534109: XML to Python data structure


David Shi
Has anyone tried this recipe?
 
I am looking for a Python script to do the following.
 
1. Read in an XML file
2. Turn the XML into a Python data array, ready to be further manipulated and saved into a .dbf file.
 
I wonder how to use the following recipe:
http://code.activestate.com/recipes/534109/
 
Regards.
 
David

--- On Thu, 18/12/08, David Shi <[hidden email]> wrote:
From: David Shi <[hidden email]>
Subject: Looking for a generic efficient script for reading xml containing CDATA, extracting data for ready to be stored in a .dbf file
To: [hidden email]
Date: Thursday, 18 December, 2008, 6:27 PM

Looking for a generic, efficient script for reading XML containing CDATA and extracting the data, ready to be stored in a .dbf file
 
Can anyone help?
 
Regards.
 
David



_______________________________________________
XML-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/xml-sig

Re: Recipe 534109: XML to Python data structure

Luis Miguel Morillas-2
Do you know Amara? Try it: http://wiki.xml3k.org/Amara

We're now working on Amara 2 (http://wiki.xml3k.org/Amara2), but it's
not yet in production.

Saludos,

--

Luis Miguel



2009/1/6 David Shi <[hidden email]>:

> Has anyone tried this recipe?
>
> I am looking for a Python script to do the following.
>
> 1. Read in an xml
> 2. Turn xml into a Python data array, to be ready to be further manipulated
> and saved into a .dbf file.
>
> I wonder how to use the following recipe.
> http://code.activestate.com/recipes/534109/
>
> Regards.
>
> David

Re: Recipe 534109: XML to Python data structure

Stefan Behnel-3

David Shi wrote:
> I am looking for a Python script to do the following.
>  
> 1. Read in an xml
> 2. Turn xml into a Python data array, to be ready to be further manipulated and saved into a .dbf file.

Did you read my reply to your last post?

Stefan


Re: Recipe 534109: XML to Python data structure

Stefan Behnel-3
David Shi wrote:
> What I am trying to do is to have a generic script to turn xml to Python
> dataset. Then I can manipulate it as required. Then I can save
> processed data into a .dbf file.

I'd use iterparse() for the parsing; that allows you to construct the .dbf
content on the fly.

http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk

Working with the data elements returned by the iterparse iterator is quite
easy: you'll be fine using the .tag and .text properties, as well as
the .find() method to find subelements.

http://codespeak.net/lxml/tutorial.html#the-element-class
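
As a rough sketch of the iterparse() approach described above (not code from this thread): the snippet walks the "end" events, reads each record through .tag, .text and .find(), and yields one dict per record, so rows could be handed to a .dbf writer one at a time. The element names (records, record, name, city) and the helper iter_rows() are made up for illustration. It falls back to the standard library's xml.etree.ElementTree, whose iterparse() is compatible for this usage, if lxml is not installed.

```python
import io

try:
    from lxml import etree  # the library discussed in this thread
except ImportError:
    # Assumption: the stdlib ElementTree iterparse() is close enough for this sketch.
    import xml.etree.ElementTree as etree

# Hypothetical input; the element names are invented for this example.
SAMPLE = b"""<records>
  <record><name><![CDATA[Alice & Bob]]></name><city>Leeds</city></record>
  <record><name>Carol</name><city>York</city></record>
</records>"""


def iter_rows(source, record_tag="record", fields=("name", "city")):
    """Yield one dict per record element without keeping the whole tree."""
    for _event, elem in etree.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue  # only act once a complete record has been parsed
        row = {}
        for field in fields:
            child = elem.find(field)  # direct subelement lookup by tag name
            row[field] = child.text if child is not None else None
        yield row
        elem.clear()  # drop the children of the record we just processed


rows = list(iter_rows(io.BytesIO(SAMPLE)))
print(rows)
```

CDATA sections arrive as ordinary text in .text, so no special handling is needed for them here. Each yielded dict could be passed straight to whatever .dbf library is used; that step is left out because it depends on the library.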

If you can afford to load the entire XML tree into memory, you can also
try lxml.objectify, which will give you a Python-like interface to the
data.

http://codespeak.net/lxml/objectify.html
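
A minimal sketch of what the lxml.objectify interface looks like when the whole document fits in memory. The element names (records, record, place, population) are invented for this example, and it requires lxml to be installed.

```python
from lxml import objectify

# Hypothetical input; the element names are invented for this example.
SAMPLE = b"""<records>
  <record><place><![CDATA[Leeds & Bradford]]></place><population>715000</population></record>
  <record><place>York</place><population>202800</population></record>
</records>"""

root = objectify.fromstring(SAMPLE)

# Child elements are reached as attributes; iterating over root.record
# visits every sibling <record> element in document order.
rows = [(rec.place.text, rec.population.pyval) for rec in root.record]
print(rows)
```

Here .text gives the raw string content, while .pyval gives the value converted to a Python type that objectify infers from the text (an int for the population figures).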

Note that the lxml.objectify in-memory tree is most likely a lot more
memory-friendly (and the parsing is definitely faster) than what the
recipe gives you.

Stefan


Re: Recipe 534109: XML to Python data structure

Stefan Behnel-3
It seems that apart from top-posting, you forgot to reply to the list.

David Shi wrote:
> lxml looks interesting to me as it deals with CDATA.
>
> Where is the step by step guide to use lxml to do what I need to do, as
> per my previous email.

I do not know of any step-by-step guide that describes how to convert an XML
format to .dbf. I guess you'll have to figure out the mapping code
yourself to a certain extent. I gave you quite a number of references
including some tutorials and a link to a library that handles the dbf
format. If you want someone else to write the program for you for free,
you should say so.

Stefan


> --- On Wed, 7/1/09, Stefan Behnel wrote:
>
> From: Stefan Behnel <[hidden email]>
> Subject: Re: [XML-SIG] Recipe 534109: XML to Python data structure
> To: "David Shi" <[hidden email]>
> Cc: [hidden email]
> Date: Wednesday, 7 January, 2009, 12:42 PM
