Append XML metadata to HTML code using Python

06-25-2013 08:20 PM
CPoynter
Occasional Contributor III
Hi All,

I have a script which pulls information from a number of XML files to populate a table within an HTML document.

Although I can pull the information, each new XML file overwrites the table instead of adding to it. What I would like to do is append a new table to the body of the HTML document for each XML file, so that every XML file gets a thumbnail and its metadata details, building up one large HTML listing of data for quick reference.

import arcpy, sys
import xml.etree.ElementTree as ET

html_head = """

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>"""

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')
fh.write(html_head)

XML_List = [r'D:\Temp\CSD_HTML\test.xml', r'D:\Temp\CSD_HTML\contours.sdc.xml']
for xml in XML_List:
    print xml + '\n'
    path = xml
    tree = ET.parse(path)

    for node in tree.findall('.//title'):
        title = node.text
        print 'Title: ' + node.text

    for node in tree.findall('.//westbc'):
        westbc = node.text
        print 'West: ' + node.text

    for node in tree.findall('.//eastbc'):
        eastbc = node.text
        print 'East: ' + node.text

    for node in tree.findall('.//northbc'):
        northbc = node.text
        print 'North: ' + node.text

    for node in tree.findall('.//southbc'):
        southbc = node.text
        print 'South: ' + node.text

    for node in tree.findall('.//geogunit'):
        geogunit = node.text
        print 'Geographic Units: ' + node.text

    for node in tree.findall('.//horizdn'):
        horizdn = node.text
        print 'Projection: ' + node.text

    for node in tree.findall('.//ellips'):
        ellips = node.text
        print 'Ellipsoid: ' + node.text

    html_body = """

    <body>
    <p> </p>
    <table width="800" border="0">
      <tr>
        <td width="309" rowspan="5"><img src="Thumbs/MitchellD_2013-06-25.jpg" alt="" width="300" height="300" align="left"></td>
        <td width="4" rowspan="5"> </td>
        <td height="50" colspan="3">Title: """ + title + """</td>
      </tr>
      <tr>
        <td width="150" height="50"> </td>
        <td width="165" height="50">North: """ + northbc + """</td>
        <td width="150" height="50"> </td>
      </tr>
      <tr>
        <td height="50">West: """ + westbc + """</td>
        <td height="50"> </td>
        <td height="50">East: """ + eastbc + """</td>
      </tr>
      <tr>
        <td height="50"> </td>
        <td height="50">South: """ + southbc + """</td>
        <td height="50"> </td>
      </tr>
      <tr>
        <td height="150" colspan="3"><p>Geographic Units: """ + geogunit + """</p>
        <p>Projection: """ + horizdn + """</p>
        <p>Ellipsoid: """ + ellips + """</p></td>
      </tr>
    </table>
    <p> </p>
    </body>"""

    fh = open(r'D:\Temp\CSD_HTML\file.html', 'a+')
    fh.write(html_body)

html_tail = """

</html>"""

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')
fh.write(html_tail)
fh.close()

del tree


I'm having trouble getting the subsequent tables, one per XML file with its metadata details, added to the HTML.

Regards,

Craig

Accepted Solutions
ShaunWalbridge
Esri Regular Contributor

I'm having trouble getting the subsequent tables, one per XML file with its metadata details, added to the HTML.


Just open a single file handle, and use that to write all of your data:

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')


Don't reopen the handle throughout the script; it isn't necessary here. In the last block, opening the file with 'wb' truncates everything written so far, so only the closing tag is left when you close the file. Use the single 'fh' object throughout your script and you'll be in fine shape.
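
If it helps to see the difference, here is a minimal sketch of the two patterns. It assumes Python 3 and writes to a temporary scratch file rather than the path above; the function names are made up for illustration and are not part of the original script.

```python
import os
import tempfile


def write_with_reopen(path):
    """Reopen the file for every write, ending with mode 'w' (the failing pattern)."""
    with open(path, 'w') as fh:
        fh.write('<html>')
    with open(path, 'a') as fh:
        fh.write('<table>one</table>')
    # Opening with 'w' (or 'wb') truncates the file, discarding everything above.
    with open(path, 'w') as fh:
        fh.write('</html>')


def write_with_single_handle(path):
    """Open once and reuse the same handle for every write."""
    with open(path, 'w') as fh:
        fh.write('<html>')
        fh.write('<table>one</table>')
        fh.write('</html>')


def read_text(path):
    """Return the full contents of the file as a string."""
    with open(path) as fh:
        return fh.read()


def demo():
    """Run both patterns on a scratch file and return what each leaves on disk."""
    fd, path = tempfile.mkstemp(suffix='.html')
    os.close(fd)
    try:
        write_with_reopen(path)
        reopened = read_text(path)
        write_with_single_handle(path)
        single = read_text(path)
    finally:
        os.remove(path)
    return reopened, single


if __name__ == '__main__':
    reopened, single = demo()
    print(reopened)  # only the closing tag survives
    print(single)    # head, table and tail are all present
```

The first pattern leaves only the tail in the file, which matches the symptom described in the question.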

cheers,
Shaun

CPoynter
Occasional Contributor III
Shaun's advice solved this for me.

import arcpy, sys
import xml.etree.ElementTree as ET

fh = open(r'D:\Temp\CSD_HTML\file.html', 'wb')

html_head = """

<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title></title>
</head>"""

fh.write(html_head)

XML_List = [r'D:\Temp\test1.xml', r'D:\Temp\test2.xml']
for xml in XML_List:
    print xml + '\n'
    path = xml
    tree = ET.parse(path)

    for node in tree.findall('.//title'):
        title = node.text
        print 'Title: ' + node.text

    for node in tree.findall('.//westbc'):
        westbc = node.text
        print 'West: ' + node.text

    for node in tree.findall('.//eastbc'):
        eastbc = node.text
        print 'East: ' + node.text

    for node in tree.findall('.//northbc'):
        northbc = node.text
        print 'North: ' + node.text

    for node in tree.findall('.//southbc'):
        southbc = node.text
        print 'South: ' + node.text

    for node in tree.findall('.//geogunit'):
        geogunit = node.text
        print 'Geographic Units: ' + node.text

    for node in tree.findall('.//horizdn'):
        horizdn = node.text
        print 'Projection: ' + node.text

    for node in tree.findall('.//ellips'):
        ellips = node.text
        print 'Ellipsoid: ' + node.text

    # Build and write one table per XML file. This sits at the level of the
    # outer loop, not inside the 'ellips' loop, so it runs exactly once per file.
    html_body = """

    <body>
    <p> </p>
    <table width="800" border="0">
      <tr>
        <td width="309" rowspan="5"><img src="Thumbs/img.jpg" alt="" width="300" height="300" align="left"></td>
        <td width="4" rowspan="5"> </td>
        <td height="50" colspan="3">Title: """ + title + """</td>
      </tr>
      <tr>
        <td width="150" height="50"> </td>
        <td width="165" height="50">North: """ + northbc + """</td>
        <td width="150" height="50"> </td>
      </tr>
      <tr>
        <td height="50">West: """ + westbc + """</td>
        <td height="50"> </td>
        <td height="50">East: """ + eastbc + """</td>
      </tr>
      <tr>
        <td height="50"> </td>
        <td height="50">South: """ + southbc + """</td>
        <td height="50"> </td>
      </tr>
      <tr>
        <td height="150" colspan="3"><p>Geographic Units: """ + geogunit + """</p>
        <p>Projection: """ + horizdn + """</p>
        <p>Ellipsoid: """ + ellips + """</p></td>
      </tr>
    </table>
    <p> </p>
    </body>"""

    fh.write(html_body)

html_tail = """
</html>"""

fh.write(html_tail)
fh.close()

del tree
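
For anyone adapting this, a tidier variant might look like the sketch below. It is only a sketch under a few assumptions: it targets Python 3 (the script above is Python 2), it uses a simplified one-row-per-field table instead of the compass-style layout and leaves out the thumbnail, and the names FIELDS, read_metadata, table_html, build_page and write_page are hypothetical helpers of mine. The tag names are the ones used in the script above. It writes a single body element containing one table per XML file, and escapes the metadata text so characters such as & do not break the HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

# (tag, label) pairs: tags as searched for in the script above, labels as printed there.
FIELDS = [
    ('title', 'Title'),
    ('northbc', 'North'),
    ('westbc', 'West'),
    ('eastbc', 'East'),
    ('southbc', 'South'),
    ('geogunit', 'Geographic Units'),
    ('horizdn', 'Projection'),
    ('ellips', 'Ellipsoid'),
]


def read_metadata(root):
    """Return {tag: text} for each field, using '' when a tag is missing or empty."""
    return {tag: (root.findtext('.//' + tag) or '').strip() for tag, _ in FIELDS}


def table_html(meta):
    """Build one simplified table (one row per field) for a single XML file."""
    rows = ''.join(
        '  <tr><td>{}: {}</td></tr>\n'.format(label, escape(meta[tag]))
        for tag, label in FIELDS
    )
    return '<table width="800" border="0">\n' + rows + '</table>\n'


def build_page(roots):
    """Return a complete HTML page with one table per parsed XML root element."""
    head = (
        '<!DOCTYPE HTML>\n<html>\n<head>\n'
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        '<title></title>\n</head>\n<body>\n'
    )
    tail = '</body>\n</html>\n'
    return head + ''.join(table_html(read_metadata(r)) for r in roots) + tail


def write_page(xml_paths, out_path):
    """Parse every XML file, then write the page through a single file handle."""
    roots = [ET.parse(p).getroot() for p in xml_paths]
    with open(out_path, 'w', encoding='utf-8') as fh:
        fh.write(build_page(roots))
```

Calling write_page(XML_List, r'D:\Temp\CSD_HTML\file.html') would then produce the listing in one pass, and the with block closes the file even if one of the XML files fails to parse.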



Regards,

Craig