I have a regular expression that should remove everything in a file before <div id="content", as well as everything from <div id="footer" onward:
([\s\S]*)(?=<div id="content")|(?=<div id="footer)([\s\S]*)
I am using Python's re module to apply the regex. My code:
import re

# file_dir holds the path to the HTML file
with open(file_dir) as file:
    content = file.read()
result = re.search(r'([\s\S]*)(?=<div id="content")|(?=<div id="footer)([\s\S]*)', content)
I have tried re.match as well, but I cannot get the result I want. Right now re.search only returns everything before div#content; the part from div#footer onward is never matched.
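
To make the intended result concrete, here is a minimal sketch of the behaviour I am after. It relies on the fact that re.search returns only the first match of the alternation, while re.sub replaces every match. The helper name strip_outside_content and the sample HTML are made up for illustration, and the sketch assumes each id appears only once in the file.

```python
import re

# Pattern from the question, written as a raw string.
PATTERN = re.compile(r'([\s\S]*)(?=<div id="content")|(?=<div id="footer)([\s\S]*)')


def strip_outside_content(html):
    """Hypothetical helper: drop the text before div#content and from div#footer on.

    re.sub replaces every match of the alternation, so both the leading
    and the trailing part are removed in one call.
    """
    return PATTERN.sub('', html)


# Made-up sample input, not taken from a real file.
sample = (
    '<html><body><div id="header">top</div>'
    '<div id="content"><p>Hello</p></div>'
    '<div id="footer">bye</div></body></html>'
)

kept = strip_outside_content(sample)
print(kept)
```

With this sample, only the div#content element should remain. If the ids could appear more than once, or the attribute order or quoting varies, the pattern would likely need adjusting.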