10

This question comes in an attempt to understand one of the answer in : How to check that a string is a palindrome using regular expressions?

Answer given by Markus Jarderot is :

/^((.)(?1)\2|.?)$/

Can someone please explain, whats exactly happening here....i need to do similar in Perl, but not able to understand this solution!!!

PS : I am not very good in perl so please go easy ....and also "this can't be considered a regular expression if you want to be strict" - i read this line, so i am aware that this not regex strictly

Community
  • 1
  • 1
NoobEditor
  • 15,563
  • 19
  • 81
  • 112
  • 9
    Re "i need to do similar in perl", That is Perl. – ikegami Mar 12 '14 at 11:09
  • 1
    Perl regex grammar is described in [perlre](http://perldoc.perl.org/perlre.html). – ikegami Mar 12 '14 at 11:10
  • See this explanation of Perl recursive regular expressions: http://www.rexegg.com/regex-recursion.html – Barmar Mar 12 '14 at 11:25
  • [`perldoc perlretut` - Recursive Patterns](http://perldoc.perl.org/perlretut.html#Recursive-patterns) – Zaid Mar 12 '14 at 11:41
  • 3
    Just FYI, you don't have to use a regex to check if a string is palindromic. The simple case is a one-liner that can be readily seasoned to taste: `sub is_palindromic { $_[0] eq reverse $_[0] }` – Zaid Mar 12 '14 at 11:44
  • Not sure this regex works as a palindrome checker. If I'm reading it right, "hih" will match "hiih" will not. – Paul Hutchinson Mar 12 '14 at 11:54
  • @Zaid : i agree, even `reverse` would be a better option,....but m not the one taking decisions!! :) – NoobEditor Mar 12 '14 at 11:57
  • @ikegami : right.. i do mess up PERL,perl and Perl!! :D – NoobEditor Mar 12 '14 at 11:57
  • 2
    @NoobEditor, It wasn't a comment on the spelling. I meant that the posted code is Perl code, so the request for Perl code is ...odd. (`perl` the executable or Perl the language would both make sense.) – ikegami Mar 12 '14 at 12:36

3 Answers3

14
  • ^ - matches beginning of string
  • ( - starts capture group #1
  • (.) - matches any single character except a newline, save it in capture group #2
  • (?1) - recurse = replace this group with the entire regexp capture group #1
  • \2 - matches the same thing as capture group #2. This requires the first and last characters of the string to match each other
  • | - creates an alternative
  • .? - optionally matches any one character that isn't a newline - This handles the end of the recursion, by matching an empty string (when the whole string is an even length) or a single character (when it's an odd length)
  • ) - ends capture group #1
  • $ - matches end of string or before a newline at the end of the string.

The recursion (?1) is the key. A palindrome is an empty string, a 1-character string, or a string whose first and last characters are the same and the substring between them is also a palindrome.

ikegami
  • 367,544
  • 15
  • 269
  • 518
Barmar
  • 741,623
  • 53
  • 500
  • 612
3

It might be easier to understand with this analogous function, that does the same thing for arrays:

sub palindrome {
  if (scalar(@_) >= 2) {
    my $first_dot = shift;
    my $slash_two = pop;
    return $first_dot eq $slash_two && palindrome(@_);
  } else {
    # zero or one items
    return 1;
  }
}

print "yes!\n" if palindrome(qw(one two three two one));
print "really?\n" if palindrome(qw(one two three two two one));

The (?1) notation is a recursive reference to the start of the first parenthesis in the regex, the \2 is a backreference in the current recursion to the (.). Those two are anchored at the start and end of 'whatever is matching at the current recursion depth', so everything else is matched at the next depth down.


ikegami suspects this is faster:

sub palindrome {
   my $next = 0;
   my %symbols;
   my $s = join '', map chr( $symbols{$_} ||= $next++ ), @_;
   return $s =~ /^((.)(?1)\2|.?)\z/s;
}
ikegami
  • 367,544
  • 15
  • 269
  • 518
bazzargh
  • 1,792
  • 10
  • 16
  • @ikegami, the point of this answer was not speed, but clarity, for someone who doesn't understand that last line. – bazzargh Mar 12 '14 at 16:46
0

I made this regEx few days ago. If you use it like this it will give you an array of all palindromes in a certain text. The example is for #JavaScript but you can use the regEx itself in any language to do the job. Works perfect for words to 21 chars or numbers to 21 digits. You can make it more accurate if you need to.


const palindromeFinder = /\b(\w?)(\w?)(\w?)(\w?)(\w?)(\w?)(\w?)(\w?)(\w?)(\w)\S?\10\9\8\7\6\5\4\3\2\1\b/g;

console.log(inputString.match(palindromeFinder));
  • 1
    That's really inefficient to match anything which isn't a palindrome, as the regex engine will try all combination of matching/non-matching a char for each `\w?`. That's on top of its limited usefulness due to its 21 character restriction. – Abigail Jun 21 '21 at 11:08