
Hi all, I have loaded an XML file into R and want to extract the value of an element selected by its attribute.

<espa_metadata version="2.0" xsi:schemaLocation="http://espa.cr.usgs.gov/v2 http://espa.cr.usgs.gov/schema/espa_internal_metadata_v2_0.xsd">
<global_metadata></global_metadata>
<bands>
<band product="cfmask" source="toa_refl" name="cfmask" category="qa" data_type="UINT8" nlines="7801" nsamps="7651" fill_value="255">
<percent_coverage>
<cover type="clear">40.35</cover>
<cover type="cloud">39.99</cover>
</percent_coverage>
</band>
</bands>
</espa_metadata>

I want to extract the value 39.99 for cover type="cloud". I used the following approach, but I only get "NULL":

library(XML)
data <- xmlParse("LC82030342015346LGN00.xml")
xpathApply(data,"//percent_coverage/cover[@type='cloud']" , xmlValue)

Any ideas? Thanks in advance!

Piotr De
  • Can't reproduce here! **39.99** does result. Check that you are pointing to the correct document. – Parfait Sep 26 '16 at 20:18
  • I did not have any problems extracting `39.99`. Are you sure `xmlParse` actually parsed the file (i.e., the file name is correct and the file is in the working directory)? Check the value of `data`. – aichao Sep 26 '16 at 20:19
  • I checked it and also with other files and I know that it should work, but it doesn't. – Piotr De Sep 26 '16 at 20:23
  • @aichao Yes I'm sure! The output for `data` is also correct – Piotr De Sep 26 '16 at 20:28
  • I usually discourage posting links to files on file sharing services, but there may be a BOM or other encoding issue that was masked by pasting the snippet into SO. Can you link to the raw file directly? – hrbrmstr Sep 26 '16 at 20:41
  • Here is the original file: https://www.dropbox.com/s/e1fah8iot5bnuaz/LC82030342015346LGN00.xml?dl=0 – Piotr De Sep 26 '16 at 20:46
  • I'm not sure why, but treating it as HTML instead of XML makes it work: `library(xml2) ; xml %>% read_html() %>% xml_find_all('//cover[@type="cloud"]') %>% xml_double()` – alistaire Sep 26 '16 at 21:42
  • @PiotrDe Your XML has a *default namespace* declared at the root element: `xmlns="http://espa.cr.usgs.gov/v2"`. – har07 Sep 27 '16 at 01:57
  • Here are some similar questions involving `r` that I have encountered in the past: [Q1](http://stackoverflow.com/questions/24954792/xpath-and-namespace-specification-for-xml-documents-with-an-explicit-default-nam/24955051#24955051), [Q2](http://stackoverflow.com/questions/34049887/xml-r-how-to-retrieve-values-could-this-be-a-namespace-issue/34313931#34313931) – har07 Sep 27 '16 at 01:58
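
Following up on the default namespace pointed out in the comments, here is a minimal sketch of how the query could be made namespace-aware with the `XML` package. It assumes the real file declares `xmlns="http://espa.cr.usgs.gov/v2"` on the root element, as har07 reports; the inline document is a trimmed stand-in for the original file, and the prefix `d` is an arbitrary name chosen for this example, not something defined in the file.

```r
library(XML)

# Trimmed stand-in for the real file, with the default namespace
# that the comments report on the root element (assumption).
xml_text <- '<espa_metadata version="2.0" xmlns="http://espa.cr.usgs.gov/v2">
  <bands>
    <band product="cfmask" name="cfmask">
      <percent_coverage>
        <cover type="clear">40.35</cover>
        <cover type="cloud">39.99</cover>
      </percent_coverage>
    </band>
  </bands>
</espa_metadata>'

doc <- xmlParse(xml_text, asText = TRUE)

# XPath 1.0 has no notion of a default namespace, so bind the
# namespace URI to a prefix ("d" is an arbitrary, hypothetical choice)
# and use that prefix on every element name in the path.
ns <- c(d = "http://espa.cr.usgs.gov/v2")

cloud <- xpathSApply(
  doc,
  "//d:percent_coverage/d:cover[@type='cloud']",
  xmlValue,
  namespaces = ns
)

# xmlValue returns text, so convert to a number if needed.
cloud_pct <- as.numeric(cloud)
print(cloud_pct)
```

With the real file, replacing the inline text with `xmlParse("LC82030342015346LGN00.xml")` should behave the same way, provided the namespace URI matches the one declared in that file.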
