We need a DOM parser that can run a set of patterns against a page and store the results. We are looking for open-source libraries that we can build on. The library should:
- be able to select elements by regexp (for example, grab all elements that contain "price" in their class, id, or other attributes, such as meta attributes),
- have plenty of helpers, such as removing comments, iframes, etc.,
- be pretty fast,
- be runnable from browser extensions.
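
To make the regexp requirement concrete, here is a minimal sketch of the kind of selection we have in mind. It is not taken from any existing library: `selectByAttributePattern`, `ElementLike`, and `AttrLike` are hypothetical names introduced only for illustration. The sketch assumes it is enough to test attribute values (class, id, itemprop, and so on) and walks the tree without touching any DOM globals, so the same function should accept real DOM elements in an extension as well as plain objects.

```typescript
// Hypothetical structural types: a real DOM Element is expected to fit this
// shape (attributes and children are array-like), and so do plain objects.
interface AttrLike {
  name: string;
  value: string;
}

interface ElementLike {
  tagName: string;
  attributes: ArrayLike<AttrLike>;
  children: ArrayLike<ElementLike>;
}

// Hypothetical helper: returns, in document order, every element in the
// subtree (root included) that has at least one attribute value matching
// the pattern.
function selectByAttributePattern(root: ElementLike, pattern: RegExp): ElementLike[] {
  const matches: ElementLike[] = [];
  const stack: ElementLike[] = [root];

  while (stack.length > 0) {
    const el = stack.pop() as ElementLike;

    for (let i = 0; i < el.attributes.length; i++) {
      // Reset lastIndex so patterns with the "g" or "y" flag behave
      // the same on every call to test().
      pattern.lastIndex = 0;
      if (pattern.test(el.attributes[i].value)) {
        matches.push(el);
        break; // one matching attribute is enough
      }
    }

    // Push children in reverse so they are popped in document order.
    for (let i = el.children.length - 1; i >= 0; i--) {
      stack.push(el.children[i]);
    }
  }

  return matches;
}
```

In an extension content script this would presumably be called as `selectByAttributePattern(document.documentElement, /price/i)`. An explicit stack is used instead of recursion so that deeply nested pages do not hit the call-stack limit.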