I'm trying to read two values from an XML file and plot them as the x and y of a scatter plot. I can extract both sets of values, but when I plot them the dates don't line up with the other values. I'm using the code below to fetch a weather forecast and save it as XML.

    import bs4
    import requests

    url = "http://api.met.no/weatherapi/locationforecast/1.9/?lat=52.41616;lon=-4.064598"
    response = requests.get(url)
    xml_text = response.text
    weather = bs4.BeautifulSoup(xml_text, "xml")
    # Save a pretty-printed copy of the forecast for later parsing
    with open("file.xml", "w") as f:
        f.write(weather.prettify())

I then try to read the `from` attribute of each `time` element and the `mps` attribute of each `windSpeed` element, and plot them as the x and y of a scatter plot.

    import datetime

    import bs4
    import matplotlib.pyplot as plt

    with open("file.xml") as file:
        soup = bs4.BeautifulSoup(file, "xml")
        times = soup.find_all("time")
        windspeed = soup.select("windSpeed")
        form = "%Y-%m-%dT%H:%M:%SZ"
        x = []
        y = []
        # One entry per <time> element
        for element in times:
            time = element.get("from")
            t = datetime.datetime.strptime(time, form)
            x.append(t)
        # One entry per <windSpeed> element
        for mps in windspeed:
            speed = mps.get("mps")
            y.append(speed)
        plt.scatter(x, y)
        plt.show()

I'm building two lists from two loops and then passing them as the x and y, but when I run it I get the error `ValueError: x and y must be the same size`, raised by `raise ValueError("x and y must be the same size")`.

I assume it's because the list prints each entry as `datetime.datetime(2016, 12, 22, 21, 0)`. How do I remove the `datetime.datetime` from the list?

I know there's probably a simple way of fixing it, so any ideas would be great. You people here on Stack are helping me a lot with learning to code. Thanks.

2 Answers

Simply make two lists, one containing the x-axis values and the other the y-axis values, and pass them to the `scatter` function:

    plt.scatter(list1, list2)
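A minimal sketch of what the two lists might look like, assuming the values have already been pulled out of the XML as strings. The two sample pairs are taken from the output in the other answer; the names `raw_times`, `raw_speeds` and `FORM` are only illustrative. It also shows that the `datetime.datetime(...)` text is just how Python displays the objects in a list, and that the thing to check before plotting is that both lists have the same length.

```python
import datetime

FORM = "%Y-%m-%dT%H:%M:%SZ"

# Illustrative input: strings as they appear in the XML attributes
raw_times = ["2016-12-29T18:00:00Z", "2016-12-29T21:00:00Z"]
raw_speeds = ["3.9", "4.8"]

# x-axis values: datetime objects (matplotlib can plot these directly)
list1 = [datetime.datetime.strptime(t, FORM) for t in raw_times]
# y-axis values: convert the attribute strings to numbers
list2 = [float(s) for s in raw_speeds]

# Printing a list shows each element's repr, hence "datetime.datetime(...)";
# the objects themselves are fine and do not need to be converted to strings.
print(list1)

# scatter() needs one y for every x, so check the lengths first
if len(list1) != len(list2):
    raise ValueError(
        "got %d x values but %d y values" % (len(list1), len(list2))
    )
```

With the check passing, `plt.scatter(list1, list2)` followed by `plt.show()` can be called once, after both lists are complete.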

jack jay
  • I attempted to use 2 lists but it's throwing the error I've put in the question, any ideas? –  Dec 29 '16 at 17:48
  • Just don't put `plt.scatter(x, y)` and `plt.show()` inside the for loop. – jack jay Dec 29 '16 at 18:23
  • @M.Man What is `t` in `x.append(t)`? It might also be causing your error. – jack jay Dec 29 '16 at 18:26
  • It should read `t = datetime.datetime.strptime(time, form)`. The error at the end is `ValueError("x and y must be the same size")`. –  Dec 29 '16 at 18:35
  • Please move these two statements, `plt.scatter(x, y)` and `plt.show()`, outside the for loop. Inside the loop, the two lists are not the same size on the first iteration, and hence it throws the error. – jack jay Dec 29 '16 at 18:38
  • Even outside the loop it still throws the error. The x list is being printed with datetime.datetime in front of every item. –  Dec 29 '16 at 18:45
  • I think either `t` contains a list or the number of elements in the x and y lists is different. Check these two issues. – jack jay Dec 29 '16 at 18:54
  • Refer to this [link](http://stackoverflow.com/questions/10624937/convert-datetime-object-to-a-string-of-date-only-in-python) and convert `t` before appending it to list x. – jack jay Dec 29 '16 at 19:20
  • I've realised it's not the format, it's the way I've got the data that is the problem: x has 155 items and y only has 50, so I need to go back and figure out how to select only the "from" values that have "windSpeed" elements. I appreciate the help anyway. –  Dec 29 '16 at 20:04
  • Thanks @M.Man, and make sure you always get a pair of data, i.e. the time corresponding to each windspeed. – jack jay Dec 29 '16 at 21:01
I suggest that you use lxml for analysing XML, because it lets you use XPath expressions, which can make life much easier. In this case, not every `time` entry contains a `windSpeed` entry, so it's essential to identify the `windSpeed` entries first and then get the time associated with each one. The code below does that. There are two little problems I usually encounter: (1) I still need to 'play' with the XPath to get it right; (2) sometimes I get a list when I expect a single element, which is why there's a `[0]` in the code. I find it's better to build the code interactively.

>>> from lxml import etree
>>> XML = open('file.xml')
>>> tree = etree.parse(XML)
>>> for count, windSpeeds in enumerate(tree.xpath('//windSpeed')):
...     windSpeeds.attrib['mps'], windSpeeds.xpath('../..')[0].attrib['from']
...     if count>5:
...         break
...     
('3.9', '2016-12-29T18:00:00Z')
('4.8', '2016-12-29T21:00:00Z')
('5.0', '2016-12-30T00:00:00Z')
('4.5', '2016-12-30T03:00:00Z')
('4.1', '2016-12-30T06:00:00Z')
('3.8', '2016-12-30T09:00:00Z')
('4.4', '2016-12-30T12:00:00Z')
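The same pairing idea can be sketched as a function that returns the two lists ready for plotting. This version swaps lxml for the standard library's `xml.etree.ElementTree` so that it runs without extra packages, and it walks the `time` elements and skips those without a `windSpeed`, rather than starting from `windSpeed` and going up. The `SAMPLE` document is a hypothetical, heavily trimmed stand-in for `file.xml`; it assumes `windSpeed` sits two levels below `time` (as the `'../..'` above implies) and that the file has no default namespace. The names `paired_wind_data` and `FORM` are made up for this example.

```python
import datetime
import xml.etree.ElementTree as ET

FORM = "%Y-%m-%dT%H:%M:%SZ"

# Hypothetical, trimmed sample shaped like the forecast file:
# the second <time> entry has no <windSpeed>, as in the real data.
SAMPLE = """<weatherdata>
  <product>
    <time from="2016-12-29T18:00:00Z" to="2016-12-29T18:00:00Z">
      <location><windSpeed mps="3.9"/></location>
    </time>
    <time from="2016-12-29T12:00:00Z" to="2016-12-29T18:00:00Z">
      <location><precipitation value="0.0"/></location>
    </time>
    <time from="2016-12-29T21:00:00Z" to="2016-12-29T21:00:00Z">
      <location><windSpeed mps="4.8"/></location>
    </time>
  </product>
</weatherdata>"""


def paired_wind_data(root):
    """Return (times, speeds) using only <time> entries that have a windSpeed."""
    x, y = [], []
    for time_el in root.iter("time"):
        wind = time_el.find(".//windSpeed")
        if wind is None:
            continue  # no wind speed for this interval, so skip its time too
        x.append(datetime.datetime.strptime(time_el.get("from"), FORM))
        y.append(float(wind.get("mps")))
    return x, y


root = ET.fromstring(SAMPLE)
x, y = paired_wind_data(root)
print(len(x), len(y))
```

For the real file, `ET.parse('file.xml').getroot()` would take the place of `ET.fromstring(SAMPLE)`. Because each time is appended only together with its wind speed, the two lists always have the same length and can be passed straight to `plt.scatter(x, y)`.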
Bill Bell