My energy company provides my usage data in XML format, as blocks of usage per 30-minute interval. I have no experience reading XML data. How can I extract espi:timePeriod and espi:value from this kind of file in R?

Also of interest, though much less important: espi:secondsPerInterval and ns3:updated.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:entry xmlns:espi="http://naesb.org/espi" xmlns:ns3="http://www.w3.org/2005/Atom">
    <ns3:link href="https://cust-api.duke-energy.com/cea/v1/usage" rel="self"/>
    <ns3:content>
        <ns3:id>urn:uuid:XXXXXXXXXXXXXXXXXXXXXX</ns3:id>
        <espi:IntervalBlock>
            <espi:interval>
                <espi:servicePointId>6XXXXXXXXXXX3</espi:servicePointId>
                <espi:serviceType>ELECTRIC</espi:serviceType>
                <espi:unitOfMeasure>kWH</espi:unitOfMeasure>
                <espi:secondsPerInterval>1800</espi:secondsPerInterval>
                <espi:duration>65750400</espi:duration>
                <espi:start>1560556800</espi:start>
            </espi:interval>
            <espi:IntervalReading>
                <espi:timePeriod>
                    <espi:start>1560556800</espi:start>
                </espi:timePeriod>
                <espi:value>0.09</espi:value>
            </espi:IntervalReading>

...lots of data in this format...

            <espi:IntervalReading>
                <espi:timePeriod>
                    <espi:start>1626391800</espi:start>
                </espi:timePeriod>
                <espi:value>0.12</espi:value>
            </espi:IntervalReading>
        </espi:IntervalBlock>
    </ns3:content>
    <ns3:published>2021-07-16T17:15:33.314</ns3:published>
    <ns3:updated>2021-07-16T17:15:33.314</ns3:updated>
</ns3:entry>

Rob Hanssen
    Several approaches here [xml parse to data.frame](https://stackoverflow.com/questions/17198658/how-to-parse-xml-to-r-data-frame) to get you started over a range of different packages. – Chris Jul 26 '21 at 17:06

1 Answer

Based on the link Chris posted in the comments, I found that this solution worked:

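library(xml2)
library(magrittr)   # provides %>%
library(tibble)     # provides tibble()
library(lubridate)  # provides as_datetime()

datafile <- "source/energyusage.txt"

data <- read_xml(datafile)

# Reading start times, in seconds since the Unix epoch.
# Matching only the espi:start nodes inside espi:timePeriod skips the extra
# espi:start under espi:interval, so no value has to be dropped afterwards.
time <- xml_find_all(data, "//espi:timePeriod/espi:start") %>%
                xml_double()

# Usage per interval (kWH according to espi:unitOfMeasure)
value <- xml_find_all(data, "//espi:value") %>%
                xml_double()

# as_datetime() interprets the numeric timestamps as UTC
energy <- tibble(datetime = as_datetime(time), energy = value)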
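
The question also asks about espi:secondsPerInterval and ns3:updated, which the code above does not extract. A minimal sketch of one way to read them with xml2 follows. It inlines a trimmed copy of the XML from the question so that it runs on its own. The variable names are illustrative, and treating the ns3:updated timestamp as UTC is an assumption, because the file does not state a time zone.

```r
library(xml2)

# Trimmed copy of the XML from the question, inlined so the sketch is self-contained
xml_string <- '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:entry xmlns:espi="http://naesb.org/espi" xmlns:ns3="http://www.w3.org/2005/Atom">
    <ns3:content>
        <espi:IntervalBlock>
            <espi:interval>
                <espi:secondsPerInterval>1800</espi:secondsPerInterval>
                <espi:start>1560556800</espi:start>
            </espi:interval>
            <espi:IntervalReading>
                <espi:timePeriod>
                    <espi:start>1560556800</espi:start>
                </espi:timePeriod>
                <espi:value>0.09</espi:value>
            </espi:IntervalReading>
        </espi:IntervalBlock>
    </ns3:content>
    <ns3:updated>2021-07-16T17:15:33.314</ns3:updated>
</ns3:entry>'

doc <- read_xml(xml_string)

# Each of these elements occurs once, so xml_find_first() is enough
seconds_per_interval <- xml_integer(xml_find_first(doc, "//espi:secondsPerInterval"))

# Parse the ISO-style timestamp; %OS accepts fractional seconds.
# ASSUMPTION: the timestamp is in UTC (the file gives no zone).
updated_text <- xml_text(xml_find_first(doc, "//ns3:updated"))
updated <- as.POSIXct(updated_text, format = "%Y-%m-%dT%H:%M:%OS", tz = "UTC")

print(seconds_per_interval)
print(updated)
```

With the real file, replace the inlined string with read_xml(datafile). The namespace prefixes work in the XPath expressions because xml_find_first() and xml_find_all() default to the namespaces declared in the document.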
Rob Hanssen