parsing XML with minidom

parsing XML with minidom

kimmyaf
Hello, I am not really sure whether my question belongs here, but this is the best place I could find.

I am a Python beginner trying to teach myself how to parse some XML with minidom.

This is the code excerpt I am struggling with:

********************************************************
    dom = minidom.parseString(xml_response)
    handler.close()

    route_list = []
    tag = ['route']

    tmp_route=[]
    for route in dom.getElementsByTagName('body'):
        print 'in'
        tmp_route[route] = dom.getElementsByTagName(tag)[0].getAttribute('tag')
        route_list.append(tmp_route)

*******************************************************************
Here is the XML I am getting back when I call...

'<?xml version="1.0" encoding="utf-8" ?> \r\n<body copyright="All data copyright MBTA 2010.">\r\n<route tag="39" title="39"/>\r\n<route tag="111" title="111"/>\r\n<route tag="114" title="114"/>\r\n<route tag="116" title="116"/>\r\n<route tag="117" title="117"/>\r\n</body>\r\n'

You can see this formatted better at this URL:
    http://webservices.nextbus.com/service/publicXMLFeed?command=routeList&a=mbta


I am getting the following error:

  File "C:/Users/Kim/Grad School/Python/bus python.py", line 54, in <module>
    get_available_routes()
  File "C:/Users/Kim/Grad School/Python/bus python.py", line 43, in get_available_routes
    tmp_route[route] = dom.getElementsByTagName(tag)[0].getAttribute('tag')
IndexError: list index out of range



I'm sure there is something obvious that I am doing wrong. All I want to do is grab the tag attribute of every <route> element and put the values into a list. I am new to parsing XML. I'm working from an example, but the XML in the example code is much more deeply nested, so I can't really relate it to mine. I would also appreciate any references on how to parse with minidom.

Help! Thank you!

Re: parsing XML with minidom

Rajanikanth Jammalamadaka
Try this:

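from xml.etree.ElementTree import ElementTree

# Parse the XML file into an element tree.
doc = ElementTree(file="t.xml")

listOfTags = []

# Find every <route> element and collect its 'tag' attribute.
for item in doc.findall(".//route"):
    listOfTags.append(item.get('tag'))

print(listOfTags)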


where t.xml is your XML file.
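
Since the question has the XML in a string rather than in a file, the same idea can probably be applied with `xml.etree.ElementTree.fromstring`. This is only a minimal sketch: `xml_response` is an assumed stand-in for the text returned by the web service (shortened here to two routes), `route_tags` is an illustrative name, and Python 3 `print()` syntax is used.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for the service response, shortened from the post.
xml_response = (
    b'<?xml version="1.0" encoding="utf-8" ?>\r\n'
    b'<body copyright="All data copyright MBTA 2010.">\r\n'
    b'<route tag="39" title="39"/>\r\n'
    b'<route tag="111" title="111"/>\r\n'
    b'</body>\r\n'
)

# fromstring() returns the root element, which is <body> here.
root = ET.fromstring(xml_response)

# <route> elements are direct children of <body>, so a plain
# 'route' path is enough; './/route' would also search deeper levels.
route_tags = [route.get('tag') for route in root.findall('route')]

print(route_tags)  # ['39', '111']
```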

Thanks,
Raj


On Wed, Feb 3, 2010 at 8:04 PM, kimmyaf <[hidden email]> wrote:

--
View this message in context: http://old.nabble.com/parsing-XML-with-minidom-tp27447458p27447458.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.

_______________________________________________
XML-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/xml-sig



--
Rajanikanth


Re: parsing XML with minidom

Peter Bigot-4
In reply to this post by kimmyaf
The variable tag is a list of strings, but the method getElementsByTagName
takes a single string as its first parameter.  Since a list can never
match a tag name, the second call to getElementsByTagName returns an
empty list, and indexing that empty list with [0] raises the IndexError.

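# <body> is the document element; take the first (only) match.
body = dom.getElementsByTagName('body')[0]
# Pass the tag name as a plain string, not a list.
for route in body.getElementsByTagName('route'):
    print(route.getAttribute('tag'))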
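
To collect the values into a list, as the question asks, something like the following self-contained sketch may help. It assumes Python 3, and `xml_response` is a shortened stand-in for the service response quoted in the question; `no_matches` and `route_list` are illustrative names. It also shows the list-versus-string difference described above.

```python
from xml.dom import minidom

# Assumed stand-in for the service response, shortened from the post.
xml_response = (
    b'<?xml version="1.0" encoding="utf-8" ?>\r\n'
    b'<body copyright="All data copyright MBTA 2010.">\r\n'
    b'<route tag="39" title="39"/>\r\n'
    b'<route tag="111" title="111"/>\r\n'
    b'<route tag="114" title="114"/>\r\n'
    b'</body>\r\n'
)

dom = minidom.parseString(xml_response)

# A list never compares equal to an element's tag name, so nothing
# matches and indexing the result with [0] would raise IndexError.
no_matches = dom.getElementsByTagName(['route'])
print(len(no_matches))  # 0

# A plain string matches every <route> element in document order.
route_list = [route.getAttribute('tag')
              for route in dom.getElementsByTagName('route')]
print(route_list)  # ['39', '111', '114']
```

Searching from `dom` directly is enough here because every <route> sits under the single <body> element; starting from `body` as in the snippet above gives the same result for this document.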

Peter

