I'm getting information from a website that has no API or anything. I've got the login and HTML retrieval part working, and I've got a system that finds the right <div> that will contain the information I need. But I need to remove every part of this string that isn't in the format "DD/MM/YYYY". Here's an example of the returned <div>:
<div id="wkDrop">
<div name="weekstarts" id="2018_29">Week 29-16/07/2018</div>
<div style="display:none" name="weekstarts" id="2018_30">Week 30-23/07/2018</div>
</div>
The parts that will change each week are the id="YYYY_WW" and Week WW-DD/MM/YYYY. So from the above example, I'm after two dates: 16/07/2018 and 23/07/2018.
Please bear in mind that there could be between 1 and 4 dates within this <div>, so it won't always be two weeks that I need to extract.
Ideally, I'd also like each retrieved date printed on a new line.
Any ideas how I'd go about this?
Thanks in advance for any replies :)
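
For illustration, here is a minimal sketch of this kind of extraction. It assumes Python and its standard `re` module, since the question doesn't name a language. The names `DATE_PATTERN`, `extract_dates` and `html` are made up for the example, and the pattern only checks the shape DD/MM/YYYY (two digits, two digits, four digits), not whether the date is a valid calendar date.

```python
import re
from typing import List

# Shape-only match for DD/MM/YYYY: 2 digits, "/", 2 digits, "/", 4 digits.
# The \b anchors stop the pattern from matching inside longer digit runs.
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def extract_dates(html: str) -> List[str]:
    """Return every DD/MM/YYYY-shaped substring, in order of appearance."""
    return DATE_PATTERN.findall(html)


# Sample input copied from the question; real input would be the fetched <div>.
html = """<div id="wkDrop">
<div name="weekstarts" id="2018_29">Week 29-16/07/2018</div>
<div style="display:none" name="weekstarts" id="2018_30">Week 30-23/07/2018</div>
</div>"""

# One date per line, however many (1 to 4) the <div> contains.
print("\n".join(extract_dates(html)))
```

Because `findall` returns however many matches exist, the same call should cover the case of anywhere between 1 and 4 dates without any changes.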