
How to convert a JavaScript variable to JSON or a Python dict

HTML code:

<script type="text/javascript">
        var _admin_pv_props = {
            from_page: 'post',
            is_block_editor: 'true',
            source: 'wp-admin',
            blog_id: '74229154',
            user_type: ''
        };
        _tkq = window._tkq || [];
        _tkq.push( [ 'identifyUser', 70966694, 'dgkug' ] );
        _tkq.push( [ 'recordEvent', 'wpcom_admin_page_view', _admin_pv_props ] );
    </script>

I want to get the value of var _admin_pv_props. My code:

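import re

from bs4 import BeautifulSoup

# html_content holds the page source shown above
soup = BeautifulSoup(html_content, 'lxml')

# Match the whole assignment, up to the first ';' that ends a line
pattern = re.compile(r'var _admin_pv_props = .*?;$', re.MULTILINE | re.DOTALL)
# string= is the current name for the deprecated text= argument
script = soup.find("script", string=pattern)
blog_str = pattern.search(script.text).group(0)
# Strip the assignment prefix and the trailing semicolon
blog_str = blog_str.replace('var _admin_pv_props = ', '').replace(';', '')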

print(blog_str)

{
            from_page: 'post',
            is_block_editor: 'true',
            source: 'wp-admin',
            blog_id: '74229154',
            user_type: ''
        }

But blog_str is not standard JSON: the keys are unquoted and the strings use single quotes, so json.loads rejects it.
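To make the problem concrete, here is a minimal sketch of what happens when a string shaped like blog_str is passed straight to json.loads. The shortened sample string is only an illustration, not the full page content.

```python
import json

# Shortened sample with the same shape as blog_str (illustrative only)
blog_str = "{ from_page: 'post', blog_id: '74229154' }"

error = None
try:
    json.loads(blog_str)
except json.JSONDecodeError as err:
    # JSON requires double-quoted keys and strings, so parsing fails
    error = err

print("parse failed:", error)
```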

xin.chen
Does this answer your question? [How to convert raw javascript object to a dictionary?](https://stackoverflow.com/questions/24027589/how-to-convert-raw-javascript-object-to-a-dictionary) – HedgeHog Mar 23 '22 at 15:47

1 Answer


Try:

import re
from ast import literal_eval

txt = """
<script type="text/javascript">
        var _admin_pv_props = {
            from_page: 'post',
            is_block_editor: 'true',
            source: 'wp-admin',
            blog_id: '74229154',
            user_type: ''
        };
        _tkq = window._tkq || [];
        _tkq.push( [ 'identifyUser', 70966694, 'dgkug' ] );
        _tkq.push( [ 'recordEvent', 'wpcom_admin_page_view', _admin_pv_props ] );
    </script>
"""

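# Capture just the {...} object literal assigned to _admin_pv_props
data = re.search(r"_admin_pv_props = ({.*?});", txt, flags=re.S).group(1)
# Quote the bare keys so the text becomes a valid Python dict literal
data = re.sub(r"([^\s]+): ", r"'\1': ", data)

# Safely evaluate the literal into a dict (no arbitrary code is run)
data = literal_eval(data)
print(data)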

Prints:

{'from_page': 'post', 'is_block_editor': 'true', 'source': 'wp-admin', 'blog_id': '74229154', 'user_type': ''}
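If you need actual JSON rather than a Python literal, a variation is to rewrite the text and hand it to json.loads. The sketch below is one possible approach, not a general JavaScript parser. The helper name js_object_to_dict is made up for illustration, and it assumes a flat object with identifier keys and single-quoted string values that contain no quotes, backslashes, or "key:"-like text.

```python
import json
import re


def js_object_to_dict(js_literal: str) -> dict:
    """Convert a simple, flat JS object literal into a dict (hypothetical helper)."""
    # Quote bare identifier keys, but only where they follow "{" or ","
    quoted_keys = re.sub(
        r'([{,]\s*)([A-Za-z_$][\w$]*)\s*:', r'\1"\2":', js_literal
    )
    # Turn single-quoted string values into double-quoted ones
    as_json = re.sub(r"'([^'\\]*)'", r'"\1"', quoted_keys)
    return json.loads(as_json)


js_literal = """{
    from_page: 'post',
    is_block_editor: 'true',
    source: 'wp-admin',
    blog_id: '74229154',
    user_type: ''
}"""

props = js_object_to_dict(js_literal)
print(props["blog_id"])
```

Anchoring the key pattern on "{" or "," means a value such as a URL containing ": " is less likely to be mistaken for a key. For anything nested or less regular, a dedicated parser is the safer choice.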
Andrej Kesely