I would like to parse a file containing exported Google Chrome bookmarks. This is an .html file. For each bookmark, I am interested in the URL (the HREF value), the ADD_DATE, and the title, which is the link text inside the hyperlink tag.
Here is a snippet of a Chrome bookmarks HTML file:
<!DOCTYPE NETSCAPE-Bookmark-file-1>
<!-- This is an automatically generated file.
It will be read and overwritten.
DO NOT EDIT! -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
<DT><A HREF="https://www.programcreek.com/2011/03/java-write-to-a-file-code-example/" ADD_DATE="1508652899" ICON="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAABVUlEQVQ4jcWSwYoTURBFz30vDTG8xcTgIhsHYdwFI7jRf/G3/AT/pA0mkF9woSLBTkyMbfKa6a5yIQmIDCIRvLu6VB2qLiV3F/9TcverSwGbvx1y4HS33N0v2aB3F+BkuzsxxrPf3LZkhxQDkhMkIemXIeDsxRi1Wq0o38z4vvvCxzbw6t2RGCUjEMxMXddJkkIIAmRmMjOFELTZbFgul6qqzyrfLnT/9ps+NdLr95mISW3beoyR7XZLXdeMRiMGgwHuTs6Z2WzG8XikKApy0/BgeMWz5y/40IhH9/QzxPV67YvFgq7r6Pf7TCYTxuMx8/mcqqooigJ3J4RAzpnr64dMp09lZq66rq0sSw6HA0VR0LYtIQRSSuz3e3q93m/ZNE3Dk+mUxzc39KqqspQSw+HwHKQkuq4jpXSuTxBJmBlfdztyzv/gD4CXlwAultw9/rntbv0A1ZC8BgHlLSQAAAAASUVORK5CYII=">How to Write a File Line by Line in Java?</A>
<DT><A HREF="https://stackoverflow.com/questions/2885173/how-do-i-create-a-file-and-write-to-it-in-java" ADD_DATE="1508652914">How do I create a file and write to it in Java? - Stack Overflow</A>
<DT><A HREF="https://www.javacodegeeks.com/2010/05/getting-started-with-youtube-java-api.html" ADD_DATE="1508996959" ICON="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAABvUlEQVQ4jZWTP2sUURTFf3dmxcUUCVgFHZwU1m66dBmbLOkmXRy2WBtBDMh+gjCfIEg2hAiCi8tgY1axkKTZEQsTm52tLTISUoo2gQ1m3rXYP4TdjCavu+8ezj2cey5c8wV151ew5bSGdeG6BAoJSjqs5V/gqWC/lFnqKzwoGAlPo6VkHHOpgmDLqQLsflUXWBcgs7QLXI0AZQMktSA0oz+9d6uy5xsoqTB99qZcmyAoVj+5nOMefu8kzu1vDVMwyZ3pTnzycz7kBqk5lyMAUQBqEx7crOy1BPx+ZR6uLDxZF/CitWMZ9DsCpb52M9d7vZwWKpt3PSPSRiXcPaTLiMDyRGhg5PPqtut+/PJyJhNeiDEzKpoUKf7uDRU8qjttC/nw7mAnUaznAl3byPvTaCm5OLXXLE9sLXeNQd05UkhbB68A9QBsI/Pjq8wNkkIqkKIGRACJL8PlKrgYorNmeQUG+VAWo7Xjx0OclUeQWWzQD5E/FeyXBrIWgerqtuv+lwA1o7xndt9EY9uhse25t0/TUS//mEQbID/AxAv3d9zZutMW86cWPTu5mom95nIMxACzm44v4InaHmP38Bf/laoOI/FjiQAAAABJRU5ErkJggg==">Getting Started with YouTube Java API | Java Code Geeks - 2017</A>
</DL><p>
Note that some bookmarks have an ICON attribute and some do not.
I would like to retrieve everything except the ICON value. My objective is to read this information from the file and store it in a database, so that I can organize and use the data in another application.
I looked into regular expressions for this, but I don't have enough experience with them to get it fully working. My preferred language is Java, but if Python works better I could use that instead.
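To make the goal concrete, here is a minimal sketch in Java of the kind of extraction I have in mind, using only the standard library. It assumes every link looks like the ones in the snippet above, that is, HREF comes first, ADD_DATE comes second, and any other attribute such as ICON follows. It also assumes ADD_DATE is a Unix timestamp in seconds. The names `BookmarkParser`, `Bookmark` and `parse` are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BookmarkParser {

    /** Plain value holder for one bookmark (hypothetical type for this sketch). */
    public static final class Bookmark {
        public final String url;
        public final Instant addDate;
        public final String title;

        Bookmark(String url, Instant addDate, String title) {
            this.url = url;
            this.addDate = addDate;
            this.title = title;
        }
    }

    // Group 1: HREF value, group 2: ADD_DATE digits, group 3: link text.
    // "[^>]*" skips any remaining attributes (for example ICON) up to the end of the tag.
    // Assumption: HREF is the first attribute and ADD_DATE the second, as in the snippet.
    private static final Pattern LINK = Pattern.compile(
            "<A\\s+HREF=\"([^\"]*)\"\\s+ADD_DATE=\"(\\d+)\"[^>]*>(.*?)</A>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns one Bookmark per matching link found in the given HTML text. */
    public static List<Bookmark> parse(String html) {
        List<Bookmark> result = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            String url = m.group(1);
            // Assumption: ADD_DATE counts seconds since the Unix epoch.
            Instant added = Instant.ofEpochSecond(Long.parseLong(m.group(2)));
            // Note: HTML entities in the title (such as &amp;) are not decoded here.
            String title = m.group(3).trim();
            result.add(new Bookmark(url, added, title));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Read the file named on the command line, or fall back to a tiny inline sample.
        String html = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : "<DT><A HREF=\"https://example.com/\" ADD_DATE=\"1508652914\">Example</A>";
        for (Bookmark b : parse(html)) {
            System.out.println(b.url + " | " + b.addDate + " | " + b.title);
        }
    }
}
```

This works on the snippet above but is fragile: a link with a different attribute order would be skipped silently. A real HTML parser (for Java, a library such as jsoup) would probably be more robust than a regular expression, at the cost of an extra dependency.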