5

I'm aware regular expressions, as a concept, represent a single Regular Language, which can be processed via a DFA/NFA with O(n) ~O(2^m) complexity, being n the size of the string and m the size of the regex. Most stack-overflow discussions about the subject quote this awesome article that proves it.

However, regex implemented in modern languages deal with more than regular languages. For instance, it's possible to recognize palindromes with regex, a context-free grammar problem that, when solved via a push-down automata(PDA), is known to have a O(n³) complexity.

I would like to know were exactly in the Extended Chomsky Hierarchy modern (perl or python re, for example) regex implementations fit, or, at least, their worst possible complexity.

nluizsoliveira
  • 355
  • 1
  • 9
  • 2
    This really depends on the implementation details of the programming language/library. 'Modern' is not a fixed term. Please define the functionalities and maybe someone will answer your question (which in my opinion should be posted on [cstheory](https://cstheory.stackexchange.com/)). – cronoik Jul 23 '20 at 12:18
  • Edited, i'll take perl for 'modern' since many regex implementations are perl-derived. I did not know cstheory stackexchange exist, I'll be posting there too, thank you @cronoik! – nluizsoliveira Jul 23 '20 at 12:20
  • I did not post the question in cstheory yet, since it involves language implementations (not pure cs theory) and it just started to receive visibility here @tripleee. – nluizsoliveira Jul 23 '20 at 13:04
  • 1
    While clearly interesting, I don't think this is suitable for Stack Overflow. You are basically hoping someone will write an academic article about this, and until they bite and one exists, this will only be landfill collecting "good question, but" upvotes and stray comments or, in the worst case, speculative or bogus non-answers. – tripleee Jul 23 '20 at 13:14
  • As I agree that an academic article would be interesting, I don't think one is needed. Whoever implemented or is extremely familiar with pearl/python regex is aware of its limitations and capabilities. It's not an unsolved problem that needs to be researched, it's a closed problem, scoped by implementation choices of python/perl creators, that a few expertise programmers should know the answer. – nluizsoliveira Jul 23 '20 at 13:27
  • Even if there are some people on SO who can provide a proper answer, it is still not a question in [scope of SO](https://stackoverflow.com/help/on-topic). I would recommend to post this question on [cstheory](https://cstheory.stackexchange.com/). But also there you should limit your question to python or perl, not both because the have different [features](https://en.wikipedia.org/wiki/Comparison_of_regular-expression_engines). – cronoik Jul 23 '20 at 14:02
  • @cronoik. [cstheory.se] is an academic site intended for discussion about "*research-level* questions in *theoretical computer science*", where "research-level" means that it is the sort of question which might come up in the course of writing a Ph.D. thesis or doing post-doctoral research. This question is not really at that level, and suggesting that it be migrated there is not doing anyone any favours. In most cases, CS questions asked on SO could be asked on [cs.se], if the questioner has made enough of an effort to investigate the themes. But that site definitely discourages... – rici Jul 23 '20 at 20:59
  • ... questions about specific programming languages, so your suggesting that the question be limited to python or perl would be counter-productive; the question would be more likely to be rejected. A CS-oriented discussion would be about a particular regex feature, independent of the language(s) it happens to be implemented in. If a question is about a specific language, it is closer to be in-scope for SO, in my opinion. – rici Jul 23 '20 at 21:01

0 Answers0