Parsing An Xml File For Unknown Elements Using Python Elementtree
I wish to extract all the tag names and their corresponding data from a multi-purpose xml file. Then save that information into a python dictionary (e.g tag = key, data = value). T
Solution 1:
from lxml import etree as ET
xmlString = """
<some_root_name>
<tag_x>bubbles</tag_x>
<tag_y>car</tag_y>
<tag...>42</tag...>
</some_root_name> """
document = ET.fromstring(xmlString)
for elementtag in document.getiterator():
print"elementtag name:", elementtag.tag
EDIT: To read from file instead of from string
document = ET.parse("myxmlfile.xml")
Solution 2:
>>>import xml.etree.cElementTree as et>>>xml = """... <some_root_name>... <tag_x>bubbles</tag_x>... <tag_y>car</tag_y>... <tag...>42</tag...>... </some_root_name>...""">>>doc = et.fromstring(xml)>>>printdict((el.tag, el.text) for el in doc)
{'tag_x': 'bubbles', 'tag_y': 'car', 'tag...': '42'}
If you really want 42
instead of '42'
, you'll need to work a little harder and less elegantly.
Solution 3:
You could use xml.sax.handler to parse the XML:
import xml.sax as sax
import xml.sax.handler as saxhandler
import pprint
classTagParser(saxhandler.ContentHandler):
# http://docs.python.org/library/xml.sax.handler.html#contenthandler-objectsdef__init__(self):
self.tags = {}
defstartElement(self, name, attrs):
self.tag = name
defendElement(self, name):
if self.tag:
self.tags[self.tag] = self.data
self.tag = None
self.data = Nonedefcharacters(self, content):
self.data = content
parser = TagParser()
src = '''\
<some_root_name>
<tag_x>bubbles</tag_x>
<tag_y>car</tag_y>
<tag...>42</tag...>
</some_root_name>'''
sax.parseString(src, parser)
pprint.pprint(parser.tags)
yields
{u'tag...': u'42', u'tag_x': u'bubbles', u'tag_y': u'car'}
Solution 4:
This could be done using lxml in python
from lxml import etree
myxml = """
<root>
value
</root> """
doc = etree.XML(myxml)
d = {}
for element in doc.iter():
key = element.tag
value = element.text
d[key] = value
print d
Post a Comment for "Parsing An Xml File For Unknown Elements Using Python Elementtree"