I got a python script that scrapes domain names. I just parsed the json response because it renders html code. I used pandas to read the html and get the body
which is the html content. I tried to print it and then i got the correct values. Now that i got it, I want to save every result in a mysql database. How could I achieve it?
Here is my script
mydb = mysql.connector.connect(
host="localhost",
user="root",
passwd="",
database='domainscrape'
)
mycursor = mydb.cursor()
print(mydb)
pageNumber = 0
while True:
driver.implicitly_wait(3)
driver.get('https://reversewhois.domaintools.com/?ajax=mReverseWhois&call=ajaxGetPreviewPage&q=%5B%5B%5B%22whois%22%2C%222%22%2C%22VerifiedID%40SG-Mandatory%22%5D%5D%5D&o='+str(pageNumber))
time.sleep(3)
pre = driver.find_element_by_tag_name("pre").text
data = json.loads(pre)
if data['body']:
table = data['body']
tables = pd.read_html(table,skiprows=1)
df = tables[-1]
print(df.to_string(index=False))
pageNumber += 1
continue
else:
break
I got a result like this
0vh-cl0ud.sg 2017-10-12 KEY-SYSTEMS GMBH
0vh-cloud.sg 2017-10-12 KEY-SYSTEMS GMBH
0vhcloud.sg 2017-10-12 KEY-SYSTEMS GMB
Tried saving it to an csv file and i got good results
df.to_csv('Domains.csv', mode='a', sep=',',index=False)
but i dont want to import csv to mysql. I just want to directly insert the rows in an existing mysql table in it.
How could i format it like 0vh-cl0ud.sg
is the domain 2017-10-12
is the date and KEY-SYSTEMS GMBH
is the company? I dit not include the header since in every iteration it prints the header and i dont want it.