
What regular expression would find a duplicated set of digits in a numeric string?

Suppose

String s="0.1234523452345234";

From this string I need to obtain "2345". I tried the following regex:

String s="0.1234523452345234";
String regex="(\\d+)\\1+\\b";
Pattern p=Pattern.compile(regex);
Matcher m=p.matcher(s);
if(m.find())
{
    System.out.println(m.group(0));
}

But the output is

523452345234

while I need to print

2345

Shubham

2 Answers

"(\\d+)\\1+\\b" matches any sequence of digits that is immediately followed by the same sequence at least once. It can be followed by multiple occurrences of the sequence (the + quantifier). The regex also requires a word boundary after the last repetition; in your input the only such boundary is the end of the string, so the match has to end there.
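To see the effect of the trailing \\b, here is a small sketch that runs your regex with and without it on your input. The class name WordBoundaryDemo and the helper firstMatch are made up for this illustration, not part of your code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: returns group(0) of the first match, or null if nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        String s = "0.1234523452345234";
        // With \b the match must end at the end of the string,
        // so the repeated unit becomes 5234 instead of 2345.
        System.out.println(firstMatch("(\\d+)\\1+\\b", s)); // 523452345234
        // Without \b the match starts at the first 2345 and stops before the trailing 234.
        System.out.println(firstMatch("(\\d+)\\1+", s));    // 234523452345
    }
}
```

In both cases group(0) is the whole repeated run, not a single copy of it.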

I think what you are looking for is the following regex:

"(\\d+).*\\1" (no word boundary, anything allowed between the two sequences, and only one repetition of the sequence). Example:

0.1234789897897123499
  ^^^^         ^^^^----  (\\d+) and \\1
      ^^^^^^^^^--------  .*

If the run must be followed immediately by its duplicate (no filler in between), then drop the .* from the regex.

group(0) will return the full match (e.g. 12347898978971234), group(1) will contain the first capturing group (e.g. 1234).
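A minimal sketch of the difference between group(0) and group(1), using the example string from above and, for the variant without .*, the string from the question. The class GroupDemo and its helper firstMatch are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Hypothetical helper: returns {group(0), group(1)} of the first match, or null if none.
    static String[] firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        // Filler allowed between the run and its duplicate.
        String[] a = firstMatch("(\\d+).*\\1", "0.1234789897897123499");
        System.out.println(a[0]); // 12347898978971234 (full match)
        System.out.println(a[1]); // 1234 (first capturing group)

        // No filler: the duplicate must follow immediately.
        String[] b = firstMatch("(\\d+)\\1", "0.1234523452345234");
        System.out.println(b[0]); // 23452345
        System.out.println(b[1]); // 2345
    }
}
```

So if you only want one copy of the repeated digits, print group(1) rather than group(0).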

knittl

I tried this regular expression, which finds a number that is repeated once; m.group(1) returns the first occurrence:

    String s="0.1234523452345234";
    String regex="([0-9]+)\\1";
    Pattern p=Pattern.compile(regex);
    Matcher m=p.matcher(s);
    if(m.find())
    {
        System.out.println(m.group(1));
    }

Output :

2345
Karam Mohamed
  • I won't accept your answer because of the repeated occurrence, as my code had the same output, so... I'd wait to see if someone could get the exact desired output. Meanwhile I'd solve my question without regex. Thanks – Shubham May 21 '20 at 15:03
  • Well, it's just a try anyway. I'm already trying to see if I can get the exact output that you want; in that case I'll update my answer and inform you @Shubham – Karam Mohamed May 21 '20 at 15:05
  • @MohamedKaram: Hint: "lookahead assertions". What you want to match isn't 123123, it's 123 *that has 123 after it*. – cHao May 21 '20 at 15:30
  • Yes exactly, I just have to print m.group(0) – Karam Mohamed May 21 '20 at 15:36
  • ...or, just match `"([0-9]+)(?=\\1)"`. – cHao May 21 '20 at 15:37
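
The lookahead suggested in the last comment can be tried out with a short sketch like the one below. The class LookaheadDemo and the method firstRepeatedRun are hypothetical names; only the regex itself comes from the comment. Because (?=\\1) checks for the duplicate without consuming it, group(0) is already the single run.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // A run of digits that is immediately followed by a copy of itself;
    // the copy is only asserted by the lookahead, not included in the match.
    private static final Pattern REPEATED = Pattern.compile("([0-9]+)(?=\\1)");

    // Hypothetical helper: returns the first such run, or null if there is none.
    static String firstRepeatedRun(String input) {
        Matcher m = REPEATED.matcher(input);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedRun("0.1234523452345234")); // 2345
    }
}
```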