I'm trying to retrieve stock information from Yahoo Finance. I have figured out how to use re.findall to get the prices into a list. If a stock symbol has no price, I have found a way to retrieve the message for it, which gives ['No such ticker symbol']. My issue is that I need the prices and the "No such ticker symbol" entries in the same list, in their original order. This is my code so far. Is it possible to use two patterns in findall() so that both kinds of match end up in one list?
import urllib.request
import re
li = [i.strip().split() for i in open("Portfolio.txt").readlines()]
li[0:26] =[]
li = [x for x in li if x]
li.sort()
def retrieve_page(url):
    my_socket = urllib.request.urlopen(url)
    dta = str(my_socket.readall())
    my_socket.close()
    price = re.findall((r'<td class="col-price cell-raw:(.*?)"><span'), dta)
    noprice = re.findall(r'<span class ="no-symbol">(.*?):<strong>', dta)
    print(price)
    print(noprice)
retrieve_page("http://finance.yahoo.com/quotes/AAPL,GOOG,HWP,IBM,MSFT")
My output is as follows:
['107.120003', '552.25', '164.478699', '46.0938']
['No such ticker symbol']
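One direction that might work is joining the two patterns with `|` in a single regular expression, so that findall() walks the page once and returns the matches in page order. The sketch below is only an illustration: the `sample` string is made-up HTML shaped like what the two patterns above expect (not the real Yahoo page), and `combined` and `prices_in_order` are hypothetical names. With two capture groups, findall() returns a tuple per match, and the group that did not take part comes back as an empty string.

```python
import re

# Made-up HTML fragment (an assumption, not real Yahoo output), shaped to
# match the two patterns used in retrieve_page above.
sample = (
    '<td class="col-price cell-raw:107.120003"><span>107.12</span></td>'
    '<td class="col-price cell-raw:552.25"><span>552.25</span></td>'
    '<span class ="no-symbol">No such ticker symbol:<strong>HWP</strong></span>'
    '<td class="col-price cell-raw:164.478699"><span>164.48</span></td>'
    '<td class="col-price cell-raw:46.0938"><span>46.09</span></td>'
)

# Both patterns joined with "|" so a single scan finds either kind of match.
combined = re.compile(
    r'<td class="col-price cell-raw:(.*?)"><span'
    r'|<span class ="no-symbol">(.*?):<strong>'
)

def prices_in_order(html):
    """Return prices and 'no symbol' messages in the order they appear."""
    # Each match is a (price, missing) tuple; the group that did not
    # participate is '', so "or" picks whichever one actually matched.
    return [price or missing for price, missing in combined.findall(html)]

results = prices_in_order(sample)
print(results)
# ['107.120003', '552.25', 'No such ticker symbol', '164.478699', '46.0938']
```

If this holds up against the real page, the same `combined` pattern could replace the two separate findall() calls inside retrieve_page.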