I am trying to isolate the first row of a table using Python and bs4. From that first row, I would like to pull the data and write it to a CSV file along with the associated time and date.

from urllib.request import urlopen as u_req
from bs4 import BeautifulSoup as soup
import csv

my_url = 'http://mis.ercot.com/misapp/GetReports.do?reportTypeId=11485&reportTitle=LMPs%20by%20Electrical%20Bus&showHTMLView=&mimicKey/'

# open the connection and grab the page
uClient = u_req(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

# find rows in the ERCOT 5-minute historical data
for record in page_soup.find_all('tr'):
    print(record.text)

A new link is added to the site every five minutes. Ultimately, I would like the program to run every five minutes and capture the data in the CSV file. The code above gets me all of the data as text. Any help would be appreciated.

tholder
  • Hi, welcome to stack overflow. Explain your question in more detail. – Jeroen Heier Aug 09 '18 at 17:27
  • The CSV link is to a price point for one megawatt at any given electrical bus in Texas. The prices update every five minutes, and every time they update they are saved and loaded to the site above. I am trying to write a program that will go in and pull the first CSV link every five minutes. Since the site updates every five minutes, this will pull the new five-minute data. The program will pull the CSV link, save it to my hard drive, clear out the data, and then run again. I'm not really looking to have all of that answered, just a few suggestions or a nudge in the right direction. – tholder Aug 09 '18 at 19:34
  • Seems the URL in example isn't working. Is it live? – Andrej Kesely Aug 10 '18 at 06:31

1 Answer

I simplified it to find the first link on the page with XPath:

import requests
from lxml import etree
from io import StringIO

r = requests.get('http://mis.ercot.com/misapp/GetReports.do?reportTypeId=11485&reportTitle=LMPs%20by%20Electrical%20Bus&showHTMLView=&mimicKey/')
htmlparser = etree.HTMLParser()
tree = etree.parse(StringIO(r.text), htmlparser)
# href of the first link on the page
first_link = tree.xpath("//a/@href")[0]
print(first_link)
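The href returned by the XPath query may be relative to the site root rather than a full URL. A minimal sketch, assuming that is the case, resolves it against the page URL with the standard library's urljoin. The helper name absolute_link and the sample href are hypothetical and only illustrate the shape; I have not checked what the live page returns.

```python
from urllib.parse import urljoin

# Page the link was scraped from (query string shortened for readability)
PAGE_URL = "http://mis.ercot.com/misapp/GetReports.do?reportTypeId=11485"


def absolute_link(href, base=PAGE_URL):
    """Resolve an href taken from the page against the page URL.

    Relative hrefs are joined onto the base; hrefs that are already
    absolute are returned unchanged.
    """
    return urljoin(base, href)


# Hypothetical href, for illustration only
print(absolute_link("/misdownload/example?id=1"))
```

Using urljoin instead of string concatenation means the same call works whether the site emits relative or absolute links.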

Then see the question "Downloading and unzipping a .zip file without writing to disk" for how to handle the download.
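To connect that with the goal in the question, here is a minimal sketch that opens a zip archive held in memory, parses the first file inside it as CSV, and appends the rows to a local file with a capture timestamp. It assumes the download is a zip containing a single UTF-8 CSV, which I have not confirmed. The function names and the output layout (timestamp as the first column) are hypothetical.

```python
import csv
import io
import zipfile
from datetime import datetime


def rows_from_zip(zip_bytes):
    """Parse the first file in an in-memory zip archive as CSV rows."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        first_name = archive.namelist()[0]  # assumption: one CSV per archive
        with archive.open(first_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))


def append_with_timestamp(rows, out_path, captured_at=None):
    """Append rows to out_path, prefixing each with the capture time."""
    captured_at = captured_at or datetime.now()
    stamp = captured_at.strftime("%Y-%m-%d %H:%M:%S")
    with open(out_path, "a", newline="") as out:
        writer = csv.writer(out)
        for row in rows:
            writer.writerow([stamp] + row)
```

With requests, the archive bytes would come from requests.get(link).content, so a call might look like append_with_timestamp(rows_from_zip(requests.get(link).content), "lmp.csv"), where link and lmp.csv are placeholders. Opening the output in append mode lets repeated five-minute runs accumulate in one file.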

lojza