I want to store heading tags in MySQL, and the headings can be in different languages (e.g. English, Persian, Arabic, etc.). For example, my string should look like this:
{"h1": "زبان فارس - english"}
But when I store it in my db, the Unicode characters are changed to something like this:
{"h1": "\u0628\u0631\u062e\u0648\u0631\u062f"}
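For reference, the `\uXXXX` sequences in that output appear to be ordinary JSON escapes rather than corrupted text. A minimal check, assuming only the standard `json` module (the variable names are just for illustration), decodes the escaped string back into the original characters:

```python
import json

# The escaped form, exactly as it ends up in the database column.
escaped = '{"h1": "\\u0628\\u0631\\u062e\\u0648\\u0631\\u062f"}'

# Parsing the JSON turns each \uXXXX escape back into its character.
decoded = json.loads(escaped)

print(decoded["h1"])  # prints the original Persian word
```

So the stored value is still valid JSON and nothing is lost; it is only harder to read when looking at the raw column.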
My Python 3 code is:
data = {}
if not soup.find('h1'):
    h1 = ""
else:
    heading_flag = 1
    h1 = (soup.find('h1').text).strip()
    " \n \t".join(h1.split())
data['h1'] = "{}".format(h1)
if not soup.find('h2'):
    h2 = ""
else:
    h2 = (soup.find('h2').text).strip()
    " \n \t".join(h2.split())
data['h2'] = "{}".format(h2)
if not soup.find('h3'):
    h3 = ""
else:
    heading_flag = 1
    h3 = (soup.find('h3').text).strip()
    " \n \t".join(h3.split())
data['h3'] = "{}".format(h3)
if not soup.find('h4'):
    h4 = ""
else:
    heading_flag = 1
    h4 = (soup.find('h4').text).strip()
    " \n \t".join(h4.split())
data['h4'] = "{}".format(h4)
if not soup.find('h5'):
    h5 = ""
else:
    heading_flag = 1
    h5 = (soup.find('h5').text).strip()
    " \n \t".join(h5.split())
data['h5'] = "{}".format(h5)
if not soup.find('h6'):
    h6 = ""
else:
    heading_flag = 1
    h6 = (soup.find('h6').text).strip()
    " \n \t".join(h6.split())
data['h6'] = "{}".format(h6)
if heading_flag == 1:
    page_heading = json.dumps(data)
else:
    page_heading = ""
page_content(initUrl[0], page_title, page_desc, page_heading)
My problem seems to be related to the data variable: when I pass soup.find('h6').text directly as the page_heading variable, it is stored with the correct encoding, and the string in the MySQL db looks like (زبان فارس - english), not like (\u0628\u0631\u062e\u0648\u0631\u062f). I tried encode('utf8'), but it didn't help. I would appreciate any help.
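Since the plain text stores correctly and only the `json.dumps(data)` result is escaped, the difference most likely comes from `json.dumps` itself, which escapes non-ASCII characters by default (`ensure_ascii=True`). A minimal sketch of the two behaviours, using the example string from above and nothing beyond the standard library:

```python
import json

data = {"h1": "زبان فارس - english"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(data)

# With ensure_ascii=False the characters are kept as they are.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)  # {"h1": "زبان فارس - english"}

# Both strings describe the same data once parsed again.
assert json.loads(escaped) == json.loads(readable) == data
```

In the code above that would mean `page_heading = json.dumps(data, ensure_ascii=False)`, assuming the table column and the connection already handle UTF-8, which the correctly stored plain text suggests they do.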
Update: my function for saving into the db:
def page_content(link_id, page_title, page_desc, page_heading):
    insQuery = "INSERT IGNORE INTO ex_ctnt(cw_id, c_title, c_meta_desc, c_heading) VALUES(%s, %s, %s, %s)"
    if len(page_title) > 0:
        connection = pymysql.connect(host="localhost", user="root", passwd="kiuhddh87d83gfgfg", db="hiihh8y929g2")
        myquery = connection.cursor()
        myquery.execute(insQuery, (link_id, page_title, page_desc, page_heading))
        connection.commit()
        connection.close()
    else:
        print("problem with the length of page title or description (Not Inserted !)")