0

Can some explain to me why you can not use regex to describe a recursive structure.

Eg

  • A = *A?B
shua
  • 1
  • 3
    That's a consequence of the formal definition of a regular language. See http://en.wikipedia.org/wiki/Regular_language – p.s.w.g Jan 19 '14 at 10:20
  • @p.s.w.g Careful! Not everything running under the label "regex" is equivalent to regular languages, many are strictly more powerful. –  Jan 19 '14 at 10:27
  • 1
    @delnann while the question is tagged "regex" and has no language-specific question, `p.s.w.g`'s statement is applicable (and in any case is perfectly accurate). – David-SkyMesh Jan 19 '14 at 10:30

2 Answers2

3

Because a regex (in the sense of regular languages, at least) corresponds to a finite state machine. You'd need an infinite number of states to track arbitrary levels of nesting.

Oliver Charlesworth
  • 267,707
  • 33
  • 569
  • 680
  • "regex" no longer means "finite state machine", unless you want to get into a long and fruitless terminology flamewar with half the programming population. Even backtracking breaks that isomorphism, not to mention other extensions. –  Jan 19 '14 at 10:29
  • 1
    @delnan: I realise that ;) but unless qualified further, I assume "regex" means "regular language"... – Oliver Charlesworth Jan 19 '14 at 10:31
  • It is this assumption I have a problem with. In my experience, the majority uses "regex" to mean "whatever my language(s) support" which in most cases is at least backtracking. –  Jan 19 '14 at 10:36
2

Even though regular expressions cannot express recursion by formal definition, for some languages like Perl and Ruby there are 'regex' implementations that support recursion.

Also for python there is an alternative regex implementation supporting recursion that's not in the standard lib.

But again, regular languages don't have recursive structures, so formally regular expressions cannot express recursive structures by definition.

Community
  • 1
  • 1
chtenb
  • 14,924
  • 14
  • 78
  • 116