I have a text file with values that look like this:

#3#6796#
#8226#16#
#8227#16#
#8256#8033#
#8254#8256#8033#
#8270#8256#8033#
#8272#8256#8033#
#8242#8081#
#8241#8242#8081#
#8243#8242#8081#
#8254#8242#8081#
#8265#8242#8081#

Numbers can be of any length, but typically they have 1 to 5 digits. I need to find duplicate numbers within a single string (i.e. not across the whole file).

So, for example, I need to find strings like:

#8241#8241#8081#
#8243#8242#8243#
#8254#8242#8254#
#8081#8242#8081#

(You can see a repeating number in each string above; these are the ones of interest.) I cannot figure out a regex for this. So far I have only been able to find duplicates across the whole file, but that's not what I need.

SGM

1 Answer

Try this:

\b(\d+)\b.*\b\1\b

It finds a number (thanks to the word boundaries, \b, only whole numbers) and then matches anything (.*) until the same number is found again (the \1 backreference). If the number isn't repeated on that line, there is no match.
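As a rough illustration of how the pattern could be applied line by line, here is a minimal Python sketch. The sample lines come from the question; the names `DUPLICATE` and `lines_with_duplicates` are made up for this example, and reading the lines from a file is left out.

```python
import re

# Same pattern as above: a whole number, anything, then the same whole number.
DUPLICATE = re.compile(r"\b(\d+)\b.*\b\1\b")

# Sample lines taken from the question (two without and two with a repeat).
lines = [
    "#8254#8256#8033#",
    "#8241#8241#8081#",
    "#8243#8242#8243#",
    "#3#6796#",
]

def lines_with_duplicates(lines):
    """Return only the lines in which some number occurs more than once."""
    return [line for line in lines if DUPLICATE.search(line)]

for line in lines_with_duplicates(lines):
    print(line)
```

Because `.` does not match a newline by default, each match stays within a single line, which is what the question asks for.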

See it here at regex101.

Regards

SamWhan
  • Better use `\b(\d++)\b.*\b\1\b`, otherwise `3#23` will match. – Sebastian Proske May 11 '16 at 08:54
  • @SebastianProske I won't argue that without testing, but my initial thought was that since the boundary is inside the capture it'll be used in the back reference match as well. – SamWhan May 11 '16 at 08:57
  • @SebastianProske And I tested... And you're right. I must say I'm surprised though... Changing the answer. – SamWhan May 11 '16 at 09:00
  • Backreferences do not take into account zero-length assertions that were made for the initial match; they just try to match the content of the group they reference. If those assertions are needed, they have to be made again for the backreference. This is consistent across all regex engines I know. – Sebastian Proske May 11 '16 at 09:06
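
The point made in the comments can be checked with a small Python sketch. The variant without boundaries around `\1` is only an assumption about what a less strict pattern might look like, not necessarily the original wording of the answer; the sample string is built from the `3#23` case mentioned above.

```python
import re

# Hypothetical variant: boundaries around the group only, none around \1.
unbounded = re.compile(r"\b(\d+)\b.*\1")
# Pattern from the answer: boundaries repeated around the backreference.
bounded = re.compile(r"\b(\d+)\b.*\b\1\b")

sample = "#3#23#"  # "3" recurs only as the last digit of "23"

unbounded_hit = bool(unbounded.search(sample))
bounded_hit = bool(bounded.search(sample))

print(unbounded_hit)  # True: \1 matches the "3" inside "23"
print(bounded_hit)    # False: \b\1\b requires a whole number
```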