How can I use the command line to search through a folder of html and css files identifying html files that:

  • Have divs with class .highlight
  • Have img tags
  • Do not have divs with class .main
Craig
user3060126
  • Analogous to [this answer on another question](http://stackoverflow.com/a/7549515/1470607), you can use [PhantomJS](http://phantomjs.org/) to open the files and then check if they fulfill all the requirements you have. You can then, for example, return a list of the files that do fulfill your requirements. PhantomJS (headless Webkit) has the advantage that it won't simply die on ill-formed XML (as opposed to some parsers). – Etheryte Jun 05 '14 at 20:29

1 Answer

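For simple queries you can use grep (usually available on *nix platforms, and installable on Windows as well), which uses regular expressions, but that would only work reliably in one case here. For your image tags (the pattern is quoted so the shell does not treat `<` as a redirection, and `-l` prints only the names of the matching files):

    grep -l '<img' *.html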

Otherwise you would actually need a parser, because what you are describing requires parsing HTML tags and examining their attributes. There are many libraries out there for this, but it's not something built into the command line.
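As a rough illustration of the parser route, here is a minimal sketch using Python's standard-library `html.parser`, which can be run from the command line. The names `TagScanner` and `matches` are made up for this example, and it assumes "class .highlight" means the word `highlight` appears in a div's `class` attribute. Treat it as a starting point rather than a definitive solution.

```python
import sys
from html.parser import HTMLParser
from pathlib import Path


class TagScanner(HTMLParser):
    """Hypothetical helper: records which of the three conditions were seen."""

    def __init__(self):
        super().__init__()
        self.has_highlight_div = False
        self.has_img = False
        self.has_main_div = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "img":
            self.has_img = True
        elif tag == "div":
            # The class attribute may be missing or valueless (None).
            classes = (dict(attrs).get("class") or "").split()
            if "highlight" in classes:
                self.has_highlight_div = True
            if "main" in classes:
                self.has_main_div = True


def matches(html_text):
    """True if the markup has a div.highlight and an img, but no div.main."""
    scanner = TagScanner()
    scanner.feed(html_text)
    scanner.close()
    return (scanner.has_highlight_div
            and scanner.has_img
            and not scanner.has_main_div)


if __name__ == "__main__":
    # Each argument is a folder to search recursively for .html files.
    for root in sys.argv[1:]:
        for path in sorted(Path(root).rglob("*.html")):
            if matches(path.read_text(encoding="utf-8", errors="replace")):
                print(path)
```

Saved under a name of your choosing (say `find_pages.py`, a placeholder), it could be run as `python find_pages.py path/to/folder` to print the matching files.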

Using regular expressions to parse HTML: why not?

Community
Craig