I am trying to work out a regex that matches a string containing a required count of each letter, where the groups of letters can appear in any order.

For example, all of the following should match:

AAABBBCCCDDD
BBBAAADDDCCC
CCCAAABBBDDD

So far I have A{3}B{3}C{3}D{3}, which matches the first line, but the other lines would each need the groups in a different order.

Is there a good solution that handles every ordering?

Elo

1 Answer

You can match and capture a letter, then backreference that captured character. Repeat the whole thing as many times as needed, which looks to be 4 here:

(?:([A-Z])\1{2}){4}

https://regex101.com/r/vrQVgD/1
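As a rough illustration, here is how that pattern could be tried out in Python. The language, the helper name `is_four_runs_of_three`, and the use of `re.fullmatch` (so the whole string has to match) are my assumptions; the question doesn't say which language or anchoring is wanted.

```python
import re

# Four consecutive runs, each one a capital letter followed by
# two more copies of that same letter (via the \1 backreference).
RUNS_OF_THREE = re.compile(r"(?:([A-Z])\1{2}){4}")


def is_four_runs_of_three(text: str) -> bool:
    """Hypothetical helper: True if the whole string is four runs of three."""
    # fullmatch is an assumption; use search() to find the pattern inside a longer string.
    return RUNS_OF_THREE.fullmatch(text) is not None


for sample in ("AAABBBCCCDDD", "BBBAAADDDCCC", "CCCAAABBBDDD", "AABBBCCCDDDD"):
    print(sample, is_four_runs_of_three(sample))
```

Note that this version doesn't care whether a letter repeats, so a string like AAAAAAAAAAAA also passes.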

If the same character can't appear as a sequence more than once, I don't think this can be done in such a DRY manner. You'll need separate capture groups:

([A-Z])\1{2}(?!\1)([A-Z])\2{2}(?!\1|\2)([A-Z])\3{2}(?!\1|\2|\3)([A-Z])\4{2}

https://regex101.com/r/vrQVgD/2

That pattern is essentially four variations of the piece below put together:

(?!\1|\2|\3)([A-Z])\4{2}

The (?!\1|\2|\3) checks that the next character hasn't occurred in any of the previously matched capture groups.
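A small sketch of the distinct-letter version in Python, to show the lookaheads rejecting a repeated letter. Again, Python, the helper name `is_four_distinct_runs`, and whole-string matching with `re.fullmatch` are assumptions on my part.

```python
import re

# Each run captures its letter; the negative lookahead before the next run
# refuses a letter that an earlier group has already captured.
DISTINCT_RUNS = re.compile(
    r"([A-Z])\1{2}"
    r"(?!\1)([A-Z])\2{2}"
    r"(?!\1|\2)([A-Z])\3{2}"
    r"(?!\1|\2|\3)([A-Z])\4{2}"
)


def is_four_distinct_runs(text: str) -> bool:
    """Hypothetical helper: four runs of three, each using a different letter."""
    return DISTINCT_RUNS.fullmatch(text) is not None


for sample in ("CCCAAABBBDDD", "AAABBBAAADDD", "AAAAAAAAAAAA"):
    print(sample, is_four_distinct_runs(sample))
```

Here AAABBBAAADDD fails because the third run starts with A, which group 1 already holds.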

CertainPerformance
  • I may have to search for letters with different counts, rather than the same count for all of them. I may also have to find a string that consists of more than 40 different chars, each with a different count. – Elo Jan 21 '21 at 21:24
  • If they're that dynamic, and you want DRY code, I think a regex alone probably wouldn't be the right approach. You could match (any) sequential character sequences with `findAll` and then iterate through the matches programmatically. – CertainPerformance Jan 21 '21 at 21:33
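
One possible shape for that programmatic approach, sketched in Python. Everything here is an assumption for illustration: the language, the function name `matches_run_counts`, the `required` mapping of character to run length, the rule that each character forms exactly one run, and `re.finditer` standing in for the `findAll` mentioned above.

```python
import re

# One run = any character followed by zero or more copies of itself.
RUN = re.compile(r"(.)\1*", re.DOTALL)


def matches_run_counts(text: str, required: dict[str, int]) -> bool:
    """Hypothetical helper: each character in `required` must appear as exactly
    one run of the given length, in any order, with nothing else in the string."""
    runs = [(m.group(1), len(m.group(0))) for m in RUN.finditer(text)]
    letters = [ch for ch, _ in runs]
    if len(set(letters)) != len(letters):
        return False  # some character is split across more than one run
    return dict(runs) == required


print(matches_run_counts("BBDDDDA", {"A": 1, "B": 2, "D": 4}))
print(matches_run_counts("AABAA", {"A": 4, "B": 1}))
```

The counts live in ordinary data rather than in the pattern, so 40 or more characters with different counts only means a bigger dictionary.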