I'm trying to parse HTML in Python that has an inline script in it. I need to find a string inside of the script, then extract the value. I've been trying to do this in regex for the past few hours, but I'm still not convinced this is the correct approach.
Here is a sample:
['key_to_search_for']['post_date'] = '10 days ago';
The result I want to extract is: 10 days ago
This regex gets me part of the way, but I can't figure out the full match:
^\[\'key_to_search_for\'\]\[\'post_date\'\] = '(\d{1,2})+( \w)
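To show where it stops, here is a minimal reproduction, assuming the sample line has already been pulled out of the HTML into a string (the names sample and pattern are just for illustration):

```python
import re

# The sample line from the inline script (assumed already isolated from the HTML).
sample = "['key_to_search_for']['post_date'] = '10 days ago';"

# The pattern exactly as I have it now.
pattern = r"^\[\'key_to_search_for\'\]\[\'post_date\'\] = '(\d{1,2})+( \w)"

match = re.search(pattern, sample)
print(match.group(0))  # ['key_to_search_for']['post_date'] = '10 d
print(match.groups())  # ('10', ' d')
```

So the match ends after the first letter of "days", and the groups hold only pieces of the value rather than the whole "10 days ago".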
However, even once I can match with regex, I'm not sure of the best way to get only the value. I was thinking of just replacing the keys with blanks, like .replace("['key_to_search_for']['post_date'] = '", ''), but that seems inefficient.
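Roughly, the replace idea would look like the sketch below, again assuming the line is already isolated in a string (line and prefix are illustrative names). It also leaves the closing quote and semicolon behind, so it would need yet another step:

```python
# The sample line from the inline script (assumed already isolated from the HTML).
line = "['key_to_search_for']['post_date'] = '10 days ago';"

# Blank out the key portion, up to and including the opening quote of the value.
prefix = "['key_to_search_for']['post_date'] = '"
value = line.replace(prefix, "")

print(value)  # 10 days ago';
```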
Should I be matching with the regex and then replacing? Is there a better way to handle this?