python - Writing with lxml emitting no whitespace even when pretty_print=True


I'm using the lxml library to read an XML template, insert/change some elements, and save the resulting XML. One of the elements I'm creating on the fly, using the etree.Element and etree.SubElement methods:

from lxml import etree

# themekeys is a list of keyword strings defined elsewhere
tree = etree.parse(r'xml_archive\templates\metadata_template_pts.xml')
root = tree.getroot()

stream = []
for element in root.iter():
    # Skip comments and processing instructions, whose tag is not a string
    # (basestring is Python 2; use str on Python 3)
    if isinstance(element.tag, basestring):
        stream.append(element.tag)

        # Find the "keywords" element and insert a new "theme" element
        if element.tag == 'keywords' and 'theme' not in stream:
            theme = etree.Element('theme')
            etree.SubElement(theme, 'themekt').text = 'none'
            for tk in themekeys:
                etree.SubElement(theme, 'themekey').text = tk
            element.insert(0, theme)

This prints to the screen nicely with print etree.tostring(theme, pretty_print=True):

<theme>
  <themekt>none</themekt>
  <themekey>hydrogeology</themekey>
  <themekey>stratigraphy</themekey>
  <themekey>floridan aquifer system</themekey>
  <themekey>geology</themekey>
  <themekey>regional groundwater availability study</themekey>
  <themekey>usgs</themekey>
  <themekey>united states geological survey</themekey>
  <themekey>thickness</themekey>
  <themekey>altitude</themekey>
  <themekey>extent</themekey>
  <themekey>regions</themekey>
  <themekey>upper confining unit</themekey>
  <themekey>fas</themekey>
  <themekey>base</themekey>
  <themekey>geologic units</themekey>
  <themekey>geology</themekey>
  <themekey>extent</themekey>
  <themekey>inlandwaters</themekey>
</theme>

However, when using etree.ElementTree(root).write(out_xml_file, method='xml', pretty_print=True) to write out the XML, this element gets flattened in the output file:

<theme><themekt>none</themekt><themekey>hydrogeology</themekey><themekey>stratigraphy</themekey><themekey>floridan aquifer system</themekey><themekey>geology</themekey><themekey>regional groundwater availability study</themekey><themekey>usgs</themekey><themekey>united states geological survey</themekey><themekey>thickness</themekey><themekey>altitude</themekey><themekey>extent</themekey><themekey>regions</themekey><themekey>upper confining unit</themekey><themekey>fas</themekey><themekey>base</themekey><themekey>geologic units</themekey><themekey>geology</themekey><themekey>extent</themekey><themekey>inlandwaters</themekey></theme> 

The rest of the file is written nicely, but this particular element is causing (purely aesthetic) trouble. Any ideas what I'm doing wrong?


Below is a snippet of markup from the template XML file (save it as "template.xml" to run it with the code snippet at the bottom). The flattening of tags only occurs when I parse an existing file and insert a new element, not when the XML is created from scratch using lxml.

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="fgdc_classic.xsl"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://water.usgs.gov/gis/metadata/usgswrd/fgdc-std-001-1998.xsd">
    <keywords>
        <theme>
            <themekt>iso 19115 topic categories</themekt>
            <themekey>environment</themekey>
            <themekey>geoscientificinformation</themekey>
            <themekey>inlandwaters</themekey>
        </theme>
        <place>
            <placekt>none</placekt>
            <placekey>florida</placekey>
            <placekey>georgia</placekey>
            <placekey>alabama</placekey>
            <placekey>south carolina</placekey>
        </place>
    </keywords>
</metadata>

Below is a snippet of code to be used with the snippet of markup (above):

from lxml import etree

# Create the new theme element to insert into root
themekeys = ['hydrogeology', 'stratigraphy', 'inlandwaters']

tree = etree.parse(r'template.xml')
root = tree.getroot()

stream = []
for element in root.iter():
    # Skip comments and processing instructions, whose tag is not a string
    # (basestring is Python 2; use str on Python 3)
    if isinstance(element.tag, basestring):
        stream.append(element.tag)

        # Edit theme keywords
        if element.tag == 'keywords':
            theme = etree.Element('theme')
            etree.SubElement(theme, 'themekt').text = 'none'
            for tk in themekeys:
                etree.SubElement(theme, 'themekey').text = tk
            element.insert(0, theme)

# Write the XML to a new file
out_xml_file = 'test.xml'
etree.ElementTree(root).write(out_xml_file, method='xml', pretty_print=True)

# Re-read the output and prepend the XML declaration
with open(out_xml_file, 'r') as f:
    lines = f.readlines()

with open(out_xml_file, 'w') as f:
    f.write('<?xml version="1.0" encoding="utf-8"?>\n')
    for line in lines:
        f.write(line)

If you replace this line:

tree = etree.parse(r'template.xml') 

with these lines:

parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse(r'template.xml', parser)

then it will work as expected. The trick is to use an XMLParser that has the remove_blank_text option set to True. Any existing ignorable whitespace is removed when the file is parsed, so it no longer disrupts the subsequent pretty-printing.
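To see the difference side by side, here is a minimal sketch that parses the same markup with and without remove_blank_text and then inserts a new theme element. It assumes Python 3 and uses a shortened in-memory template instead of template.xml; the TEMPLATE constant and the insert_theme and render helpers are illustrative names, not part of the original code.

```python
from io import BytesIO

from lxml import etree

# Shortened stand-in for template.xml (assumption: same nesting, fewer keys)
TEMPLATE = b"""<metadata>
    <keywords>
        <place>
            <placekt>none</placekt>
            <placekey>florida</placekey>
        </place>
    </keywords>
</metadata>
"""


def insert_theme(root, themekeys):
    """Insert a new <theme> element as the first child of <keywords>."""
    keywords = root.find('keywords')
    theme = etree.Element('theme')
    etree.SubElement(theme, 'themekt').text = 'none'
    for tk in themekeys:
        etree.SubElement(theme, 'themekey').text = tk
    keywords.insert(0, theme)


def render(parser=None):
    """Parse the template, insert the theme, and return pretty-printed XML."""
    tree = etree.parse(BytesIO(TEMPLATE), parser)
    insert_theme(tree.getroot(), ['hydrogeology', 'stratigraphy'])
    return etree.tostring(tree, pretty_print=True, encoding='unicode')


# Default parser: the template's indentation is kept as whitespace text
# nodes, and the serializer leaves elements with text content untouched.
flat = render()

# Blank text stripped at parse time: the serializer is free to re-indent.
pretty = render(etree.XMLParser(remove_blank_text=True))

print(flat)
print(pretty)
```

With the default parser, the new theme element should come out on a single line inside keywords, while the remove_blank_text version should be indented throughout. Note that this option also discards the template's original indentation, so the whole file is re-indented in lxml's own style.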

