The source code of the HTML page is shown below:
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=gb2312">
<script>
document.domain = "xxxx.com";
var jsonObj = {
list: [
{ip: "166.255.255.25", port: 1080, path: "/data/pps.jpeg"}
]
}
var jsParObj = {param1: 25532, param2: 54463}
</script>
</head>
<body>
</body>
</html>
I am trying to extract the data from that HTML page and store it in JSON format.
from bs4 import BeautifulSoup
# html_doc holds the HTML source shown above
soup = BeautifulSoup(html_doc, 'html.parser')
script_text = soup.find('script')
Using the Python library BeautifulSoup4, I get this:
<script>
document.domain = "xxxx.com";
var jsonObj = {
list: [
{ip: "166.255.255.25", port: 1080, path: "/data/pps.jpeg"}
]
}
var jsParObj = {param1: 25532, param2: 54463}
</script>
How can I remove the <script> tag and convert that data into JSON format?
I am using Python.
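One possible direction, as a rough sketch rather than a definitive solution: with BeautifulSoup, the `.string` attribute of the tag returned by `soup.find('script')` should give the script contents without the surrounding tag. The remaining text is JavaScript, not JSON, because the keys are unquoted, so the sketch below quotes bare keys with a regular expression and then parses each `var name = {...}` assignment with the standard `json` module. The helper name `extract_js_objects` is hypothetical, the script text is pasted in as a string so the example runs without bs4, and the key-quoting regex is an assumption that only holds for simple literals like the ones on this page (it could misfire on string values that themselves contain something like `, word:`).

```python
import json
import re

# Text inside the <script> tag. With BeautifulSoup this would be
# soup.find('script').string (the tag's contents without the tag itself).
script_text = '''
document.domain = "xxxx.com";
var jsonObj = {
list: [
{ip: "166.255.255.25", port: 1080, path: "/data/pps.jpeg"}
]
}
var jsParObj = {param1: 25532, param2: 54463}
'''


def extract_js_objects(text):
    """Return {variable_name: parsed_value} for each `var name = {...}`.

    Hypothetical helper: assumes simple object literals with bare keys.
    """
    decoder = json.JSONDecoder()
    result = {}
    for match in re.finditer(r'var\s+(\w+)\s*=\s*', text):
        # Quote bare keys: an identifier followed by ':' right after '{' or ','.
        rest = re.sub(r'([{,]\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":',
                      text[match.end():])
        # raw_decode parses one JSON value and ignores whatever follows it.
        value, _ = decoder.raw_decode(rest)
        result[match.group(1)] = value
    return result


data = extract_js_objects(script_text)
print(json.dumps(data, indent=2))
```

Here `data` is an ordinary Python dict keyed by the JavaScript variable names, so `json.dumps(data)` or `json.dump(data, file)` can be used to store it. For JavaScript literals more complex than these, a dedicated parser would likely be more robust than the regex.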