Python script to read LANDSAT XML metafile

Question

I need to obtain the scene ID for several Landsat images. The metadata for the images is contained in an XML file:

<?xml version="1.0"?>
<searchResponse xsi:schemaLocation="http://upe.ldcm.usgs.gov/schema/metadata
http://earthexplorer.usgs.gov/EE/metadata.xsd" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"    
xmlns="http://upe.ldcm.usgs.gov/schema/metadata">
    <metaData>
        <browseAvailable>Y</browseAvailable>
        <browseURL>http://earthexplorer.usgs.gov/browse/tm/146/41/1998/LT51460411998300XXX01.jpg</browseURL> 
        <sceneID>LT51460411998300XXX01</sceneID> 
        <sensor>LANDSAT_TM</sensor> 
        <acquisitionDate>1998-10-27</acquisitionDate> 
        <dateUpdated>2012-07-31</dateUpdated> 
        <path>146</path> 
        <row>41</row> 
        <...>
     </metaData>
     <metaData>
        <sceneID>LT51460411998300XXX01</sceneID> 
        <sensor>LANDSAT_TM</sensor> 
        <acquisitionDate>1998-10-27</acquisitionDate> 
        <dateUpdated>2012-07-31</dateUpdated> 
        <path>146</path> 
        <row>41</row> 
        <...>
    <etc etc>

The following code lists the text of every node in metaData (browseAvailable, browseURL, sceneID), but what I really want is just a list of the sceneID values.

#!/usr/bin/python

import xml.etree.ElementTree as ET
import grass.script as grass

tree = ET.parse('C:/Users/Simon/Documents/tif/Metadata/metadata_test1.xml')
root = tree.getroot()

# Walk every <metaData> element and print the text of each of its children
for metadata in root:
    for data in metadata:
        sceneID = data.text
        grass.message('data -> %s' % (sceneID))

I've tried the tools described in the Python xml.etree.ElementTree documentation with no success. Could anyone help me adapt the code above to return only the sceneID values?

Possible duplicate of http://stackoverflow.com/questions/1412004/reading-xml-using-python-minidom-and-iterating-over-each-node — LSerni, Nov 06 '12 at 18:26

score 1 · Accepted Answer · answered Nov 06 '12 at 18:31

Try this one:

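#!/usr/bin/python

from xml.dom.minidom import parse

dom = parse('C:/Users/Simon/Documents/tif/Metadata/metadata_test1.xml')

# getElementsByTagName searches the whole document, so no nested loops are needed
ids = dom.getElementsByTagName('sceneID')
for node in ids:
    # The scene ID is the text node inside each <sceneID> element
    print(node.childNodes[0].nodeValue)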
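
For comparison, here is a rough sketch of the same lookup with xml.etree.ElementTree, which the question started from. This is not part of the original answer. The likely reason a plain search for 'sceneID' finds nothing there is the default namespace declared on searchResponse: ElementTree stores each tag as '{namespace}sceneID', so the namespace has to be supplied in the search. The inline SAMPLE string is a trimmed, hypothetical stand-in for the real file, and the 'md' prefix and the scene_ids helper are names made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for the metadata file in the question.
SAMPLE = """<?xml version="1.0"?>
<searchResponse xmlns="http://upe.ldcm.usgs.gov/schema/metadata">
    <metaData>
        <browseAvailable>Y</browseAvailable>
        <sceneID>LT51460411998300XXX01</sceneID>
        <sensor>LANDSAT_TM</sensor>
    </metaData>
    <metaData>
        <sceneID>LT51460411998300XXX01</sceneID>
        <sensor>LANDSAT_TM</sensor>
    </metaData>
</searchResponse>"""

# Map an arbitrary prefix ('md' is our own choice) to the document's default namespace.
NS = {"md": "http://upe.ldcm.usgs.gov/schema/metadata"}


def scene_ids(root):
    """Return the text of every sceneID element anywhere below root."""
    return [el.text for el in root.iterfind(".//md:sceneID", NS)]


root = ET.fromstring(SAMPLE)
ids = scene_ids(root)
for scene_id in ids:
    print(scene_id)
```

To run it against the real file, replace ET.fromstring(SAMPLE) with ET.parse(path).getroot(), where path is the location of the metadata file.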

LSerni
