
Some programmers say they are "... sick of using regex to parse things that really shouldn't be parsed with regex" (see this popular @nickf comment). Others, like me, prefer to do more with plain PHP (and regular expressions) and avoid another framework (like Lex/Yacc)... but only up to a point, and that is the first question:

When should we stop writing a (complex) parser by hand in PHP and migrate to a real "parser generator engine"?

The second question, which complements the first, is: "What is the best PHP toolkit" for parsing complex things? Today, in 2013, there is a standard interoperable format for parsed content, XML (or SimpleXML arrays, etc.), and there are "standard parsers", like the DOM API, XPath and XSLT.

As I sketched, perhaps there is no "best solution", only a good-practice recipe for choosing one solution in some cases and another in others.

Summarizing: 1) "When should we leave pure PHP or PHP+RegEx and use parser generators?"; 2) "What are the best parser generators for PHP, or what is the recipe/context/conditions for selecting the best ones?"


(added in edited version)

I think readers will appreciate a generic discussion but, to give a guideline, here is some scope:

  • (answering @HugoDelsing) In general I "dont care how it works but want to get quick results". In a few cases I need optimization, when I "want full control over everything".

  • (answering @bizzehdee) In recent years I have been parsing many kinds of text strings: raw text of controlled vocabularies; Lex URNs; raw text of references/bibliographies and other styled text, like Vancouver Style; CSS strings; dates; e-mail text; units and equations (to recognize, normalize, and eventually convert). At other times, I have developed simple command-line tools.


Curiosity (this was my motivation for posting this question): my answer about the use of "PHP alternatives for Lex/Yacc approach" keeps oscillating every month, with positive and negative "useful votes"... Perhaps this only indicates "hate and love" behaviour but, on the other hand, it may mean this is a good question!

Peter Krauss
  • So, what are you parsing? Are you trying to parse PHP itself, or are you trying to parse "something" using PHP? – bizzehdee Mar 12 '13 at 10:43
  • The common interpretation from your intro is based on confusing matching and parsing. Not sure there is a generic answer for your actual question, as there aren't generic tools that apply to everything. Depends on your input lingo or representation to be parsed with. Surely `unpack()` is more suited for (please note the quotes) "parsing" binary data than a lexer or a regex (not implying it's infeasible though). – mario Mar 12 '13 at 11:16
  • If you want to learn how something works and want full control over everything you do: `Do it yourself`. If you dont care how it works but want to get quick results > Use something somebody else wrote for you. At last > If you are coding above your knowledge, there is a good chance somebody else wrote better/more efficient code and it might be wise to use it – Hugo Delsing Mar 12 '13 at 11:25
  • PHP does not have a decent parser generator. By "decent" I mean one that is actively maintained/developed, and has decent documentation. – Bart Kiers Mar 12 '13 at 11:25
  • for the record: question has been re-posted to Programmers: http://programmers.stackexchange.com/questions/190557/how-to-choose-a-proper-parser-generator-for-php – gnat Mar 14 '13 at 15:09

0 Answers