I'm trying to scrape data from two pages (a main URL and a second URL) and export it to CSV. I have imported the following:
from bs4 import BeautifulSoup
import urllib.request
from urllib.parse import urlparse, parse_qs
import csv
def get_page(url):
    # Fetch the page and return its HTML as a UTF-8 string
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    mainpage = response.read().decode('utf-8')
    return mainpage
mainpage = get_page('www.website1.com')
mainpage_parser = BeautifulSoup(mainpage,'html.parser')
secondpage = get_page('www.website2.com')
secondpage_parser = BeautifulSoup(secondpage,'html.parser')
Both pages follow the same pattern for fields such as Title and Address, so I use "find" or "find_all" on each class. For example:
try:
    name = mainpage_parser.find("h1", {"class": "xxx"}).find("a").get_text()
    print(name)
except AttributeError:
    # find() returned None because the element is missing
    name = None
    print(name)
This worked. However, I couldn't get the "lat" and "lon" values from the URL in the src attribute of this img tag:
<img class="aaa" alt="map" data-track-id="static-map" width="97" height="142" src="https://www.website.com/aaaaaaa;height=284&lat=18.111&lon=98.111&level=15&returnImage=true">
The code I'm using to try to get the latitude and longitude is:
for gps in secondpage_parser.find_all('img', {"class": "aaa"}, src=True):
    parsed_url = urlparse(gps['src'])
    mykeys = ['lat', 'lon']
    gpslocation = [parse_qs(parsed_url.query)[k][0] for k in mykeys]
    print(gpslocation)
But it raises a KeyError on the "gpslocation = [parse_qs(parsed_url.query)[k][0] for k in mykeys]" line, with the message "KeyError: 'lat'".
I would like to know where my mistake is and how I should fix it. Please help.
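One possible cause, offered as a sketch rather than a confirmed diagnosis: the src URL above contains no "?", so urlparse appears to place everything after the ";" in the params field and leave query empty, which would make parse_qs return an empty dict and raise the KeyError. The snippet below assumes that is what is happening; extract_lat_lon is a hypothetical helper name, not part of any library.

```python
from urllib.parse import urlparse, parse_qs

def extract_lat_lon(src):
    """Hypothetical helper: return (lat, lon) as strings, or None if absent."""
    parsed = urlparse(src)
    # Merge key/value pairs from the ";params" segment and the "?query" string,
    # so the lookup works whichever separator the site uses.
    fields = {**parse_qs(parsed.params), **parse_qs(parsed.query)}
    try:
        return fields['lat'][0], fields['lon'][0]
    except KeyError:
        return None

# The src value copied from the img tag in the question
src = ("https://www.website.com/aaaaaaa;height=284&lat=18.111"
       "&lon=98.111&level=15&returnImage=true")

print(repr(urlparse(src).query))  # '' (nothing follows a "?")
print(extract_lat_lon(src))       # ('18.111', '98.111')
```

In the loop from the question, this could be called as extract_lat_lon(gps['src']), skipping any image for which it returns None.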