4

I am learning regex. After reading this post, I started doing some exercises and got stuck on this one. Here are the two lists of words: those that should match and those that should not:

[image: two lists of words, those to match and those not to match]

I started with

^(.).*\1$

but was bothered that it also matches sporous, which should not match. So I found

^(.)(?!p).*\1$

that did the trick.

The best solution given here (one character shorter than mine) is

^(.)[^p].*\1$

but I don't really understand this pattern. I think I am confused by seeing the ^ anchor in a group [], and by seeing it anywhere other than at the beginning of the regex.

Can you help me understand what this regex is doing?

Community
Remi.b
  • For what it's worth, all of the words on the left are palindromes. There are more elegant solutions than excluding `p` as the second letter. – John Kugelman Apr 15 '15 at 17:38
  • I thought about that at first, but then I stumbled upon [this post](http://stackoverflow.com/questions/233243/how-to-check-that-a-string-is-a-palindrome-using-regular-expressions), which says that there is no way to match palindromes in regex – Remi.b Apr 15 '15 at 17:42
  • "No easy way" in this context probably means you need to figure out the hard way. – tripleee Apr 15 '15 at 17:44
  • Yes, that's true, you can't match arbitrary palindromes. In this case though, you could match the first two letters against the last two. It's not a full match but it's better than `[^p]`, at least. – John Kugelman Apr 15 '15 at 17:44
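The idea in the last comment, matching the first two letters against the last two, could be sketched as follows in Python. The pattern and the helper name `is_mirrored` are illustrative assumptions, not something given in the thread, and every test word except sporous is hypothetical because the original word lists are not available here.

```python
import re

# Hypothetical pattern: capture the first two characters, then require
# them in reverse order at the very end of the string.
mirror = re.compile(r"^(.)(.).*\2\1$")

def is_mirrored(word):
    """Return True if the first two characters mirror the last two."""
    return mirror.search(word) is not None

for word in ["civic", "level", "sporous", "eve"]:
    print(word, is_mirrored(word))
```

This is only a partial palindrome check: it needs at least four characters, so a three-letter palindrome such as eve is rejected, and it accepts non-palindromes such as abxyba whose middle is not symmetric.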

5 Answers

2

Anything in square brackets is a character class. It has its own mini-syntax: it lists the allowed characters, as in [abc], or a range of allowed characters, as in [a-z]. Adding a caret as the very first character of the class, as in [^a-z], makes it list the disallowed characters instead.
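A small sketch of these three forms, using Python's `re` module as an assumed test bed (the sample strings are made up for illustration):

```python
import re

# [abc] matches one character out of the listed set.
listed = re.findall(r"[abc]", "cabbage")
# [a-z] matches one character in the range a through z.
ranged = re.findall(r"[a-z]", "R2-d2")
# [^a-z] matches one character that is NOT in the range.
negated = re.findall(r"[^a-z]", "R2-d2")
# A caret that is not first in the class is just a literal caret.
literal = re.findall(r"[a^]", "a^b")

print(listed)   # ['c', 'a', 'b', 'b', 'a']
print(ranged)   # ['d']
print(negated)  # ['R', '2', '-', '2']
print(literal)  # ['a', '^']
```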

tripleee
  • Does it mean that `^` means "beginning of the line" when it is in the beginning of a regex and means `NOT` when it is in a group `[]`? – Remi.b Apr 15 '15 at 17:37
  • Yes, exactly. But only when it is the first character in the character class (not "group"; that's something else). – tripleee Apr 15 '15 at 17:44
1

[^p] simply matches any single character that is not p.

Here is the regex explained step by step:

^      start of the string
(.)    matches any single character and captures it as group 1
[^p]   matches any single character that is not p
.*     matches any character, zero or more times
\1     matches the same text that group 1 captured
$      end of the string
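The same breakdown can be written as a commented pattern. This is a sketch that assumes Python's `re.VERBOSE` flag; the helper name `matches` and every test word other than sporous are illustrative.

```python
import re

# With re.VERBOSE, whitespace and "#" comments inside the pattern are ignored.
pattern = re.compile(r"""
    ^       # start of the string
    (.)     # any single character, captured as group 1
    [^p]    # any single character that is not p
    .*      # any characters, zero or more times
    \1      # the same text that group 1 captured
    $       # end of the string
""", re.VERBOSE)

def matches(word):
    """Return True if the whole word fits the pattern."""
    return pattern.search(word) is not None

for word in ["civic", "aba", "sporous", "aa"]:
    print(word, matches(word))
```

Note that aa does not match: the pattern consumes three separate characters (group 1, the non-p character, and the backreference), so the shortest possible match is three characters long.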

A good source for learning regex is regex101.

NaCl
1

Your solution uses a negative look-ahead (?!p) that does not consume characters, and just checks if the next character is not p.

The other solution uses a negated character class [^p] that will consume a character other than p.

So, the final solution depends on what you need to match/capture.
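One place where consuming versus not consuming a character makes a visible difference is very short input. A minimal sketch in Python (the helper name `compare` and the words aba and aa are illustrative assumptions):

```python
import re

# The look-ahead only inspects the next character; the negated class consumes it.
lookahead = re.compile(r"^(.)(?!p).*\1$")
negated = re.compile(r"^(.)[^p].*\1$")

def compare(word):
    """Return (look-ahead result, negated-class result) for a word."""
    return (bool(lookahead.search(word)), bool(negated.search(word)))

for word in ["aba", "aa", "sporous"]:
    print(word, compare(word))
```

Both patterns accept aba and reject sporous, but only the look-ahead version accepts aa: since (?!p) consumes nothing, the second a is still available to satisfy \1, whereas [^p] uses it up.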

Wiktor Stribiżew
1

Here is the pattern explanation of ^(.)[^p].*\1$

^     start of the string/line
(.)   group first character
[^p]  any character except p
.*    zero or more characters
\1    first matched group again
$     end of the string/line

The above regex matches any string that starts and ends with the same character and whose second character is not p.

For a detailed explanation, see regex101.

Read more about Negated Character Classes.

Braj
1

^ asserts the position at the start of the line; however, as the first character inside a character class [ ] it means "match any character other than ...".

Example:

^test-[^p]-1234

Result:

test-q-1234 // match
test-p-1234 // no match
test-o-1234 // match

https://regex101.com/r/wN4zF9/1
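The example above can also be checked locally. This sketch assumes Python's `re` module in place of regex101, and the variable names are illustrative:

```python
import re

# ^ anchors at the start; [^p] accepts any single character except p.
pattern = re.compile(r"^test-[^p]-1234")

samples = ["test-q-1234", "test-p-1234", "test-o-1234"]
results = {s: bool(pattern.search(s)) for s in samples}

for sample, matched in results.items():
    print(sample, "match" if matched else "no match")
```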

l'L'l