I am looking for a regex that matches words whose first two letters are the same as their last two letters. An example should clarify the requirement.
Given the following text:
The dodo was one of the sturdiest birds. An educated termite may learn how to operate a phonograph, but it's unlikely. I sense that an amalgam that includes magma will enlighten Papa.
How can I get this output?
answer = [('dodo', 'do'), ('sturdiest', 'st'), ('educated', 'ed'),
('termite', 'te'), ('phonograph', 'ph'),
('sense', 'se'), ('amalgam', 'am'), ('magma', 'ma'),
('enlighten', 'en')]
As you can see, the first two characters of each word are the same as its last two.
My thought is to keep only words that are 4 or more characters long and whose first two characters match their last two.
So far I have only managed to match words of 4 or more characters:
[A-Za-z]{4,}
I don't need a complete program; I only need the regex.
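One possible approach, sketched here in Python on the assumption that the tuple output above comes from `re.findall`: capture the first two letters in a group, then require the same two letters again at the end of the word with a backreference. The names `text`, `pattern` and `matches` are illustrative only, and the pattern treats a "word" as a run of ASCII letters, as in the `[A-Za-z]{4,}` attempt.

```python
import re

text = ("The dodo was one of the sturdiest birds. An educated termite may "
        "learn how to operate a phonograph, but it's unlikely. I sense that "
        "an amalgam that includes magma will enlighten Papa.")

# \b             word boundary at the start of the word
# ([A-Za-z]{2})  group 2: the first two letters
# [A-Za-z]*      any letters in between (possibly none, as in "dodo")
# \2             the same two letters again, compared case-sensitively
# \b             word boundary at the end of the word
# The outer parentheses (group 1) capture the whole word.
pattern = re.compile(r"\b(([A-Za-z]{2})[A-Za-z]*\2)\b")

# With two capture groups, findall returns (whole word, first two letters) tuples.
matches = pattern.findall(text)
print(matches)
```

Because the two-letter group and its backreference cannot overlap, the pattern only matches words of 4 or more letters, so no separate length check is needed. The comparison is case-sensitive, so "Papa" is skipped ("Pa" differs from "pa"), which agrees with the expected output listed above.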