0

I would like to extract all substrings of the form B#####, M#####, CB#####, CM#####, LB##### or LM##### (where each # is a digit) from a string. Each string can contain one or more of these substrings.

For a string like "LB03452 - Test, name of the file B12345, test2 - name of second file", the result should be the String list {LB03452, B12345}.

Mathias
  • 177
  • 2
  • 10
  • Possible duplicate of [Getting every possible permutation of a string or combination including repeated characters in Java](https://stackoverflow.com/questions/5113707/getting-every-possible-permutation-of-a-string-or-combination-including-repeated) – bembas Oct 05 '18 at 19:53

2 Answers

5

You can solve this using the Pattern and Matcher classes. Here's a tiny example that you could adapt as needed:

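import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "LB03452 - Test, name of the file B12345, test2 - name of second file";
List<String> output = new ArrayList<>();
// One of the six prefixes followed by five digits, as described in the question
Pattern p = Pattern.compile("(B|M|CB|CM|LB|LM)[0-9]{5}");
Matcher m = p.matcher(input);

// find() advances to the next match; group() returns the matched text
while (m.find()) {
    output.add(m.group());
}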

If I print the output

System.out.println(output);

I get:

[LB03452, B12345]
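
If the references always have exactly five digits, as the question describes, a stricter pattern may be worth considering. The following is only a sketch: the class name `ReferenceExtractor`, the method name `extract`, and the use of `\b` word boundaries are assumptions added for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here for illustration only.
final class ReferenceExtractor {
    // Optional C or L, then B or M, then five digits.
    // The \b anchors (an assumption) reject longer tokens such as XB12345 or B1234567.
    private static final Pattern REFERENCE = Pattern.compile("\\b[CL]?[BM][0-9]{5}\\b");

    static List<String> extract(String input) {
        List<String> output = new ArrayList<>();
        Matcher m = REFERENCE.matcher(input);
        while (m.find()) {
            output.add(m.group());
        }
        return output;
    }

    public static void main(String[] args) {
        String input = "LB03452 - Test, name of the file B12345, test2 - name of second file";
        System.out.println(extract(input)); // prints [LB03452, B12345]
    }
}
```

Whether the word boundaries are desirable depends on the data: they skip a reference that is glued to other letters or digits, which may or may not be what you want.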
Ben P.
  • 52,661
  • 6
  • 95
  • 123
0

In the end I used the following inelegant version, because the input string comes from text recognition and contains some recognition errors. This method sometimes gives better results.

import java.util.ArrayList;
import java.util.List;

public List<String> extract_file_references(String string) {

    List<String> output = new ArrayList<>();

    for (int i = 0; i < string.length(); i++) {
        char c = string.charAt(i);
        // Length of the letter prefix starting at i: 2 for CB/CM/LB/LM, 1 for B/M, 0 if none
        int prefix = 0;
        if ((c == 'C' || c == 'L') && i + 1 < string.length()
                && (string.charAt(i + 1) == 'B' || string.charAt(i + 1) == 'M')) {
            prefix = 2;
        } else if (c == 'B' || c == 'M') {
            prefix = 1;
        }
        int end = i + prefix + 5;
        // Skip positions without a prefix, or too close to the end of the string
        if (prefix == 0 || end > string.length()) {
            continue;
        }
        // The prefix must be followed by five digits
        boolean allDigits = true;
        for (int j = i + prefix; j < end; j++) {
            allDigits &= Character.isDigit(string.charAt(j));
        }
        if (allDigits) {
            output.add(string.substring(i, end));
            i = end - 1; // skip the match so LB03452 is not also reported as B03452
        }
    } // end of for

    return output;

} // end of extract_file_references
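
Since the input comes from text recognition, another option might be to keep a regex but let the digit positions accept letters that are easily misread for digits, then normalize them. This is a rough sketch only: the confusion set (O, I, l, S), the class name `OcrTolerantExtractor` and the method name `extract` are assumptions made up for illustration, not something derived from the actual recognition errors in the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here for illustration only.
final class OcrTolerantExtractor {
    // Group 1: the letter prefix. Group 2: five characters that are digits
    // or letters from an assumed confusion set (O for 0, I and l for 1, S for 5).
    private static final Pattern REFERENCE =
            Pattern.compile("\\b([CL]?[BM])([0-9OIlS]{5})\\b");

    static List<String> extract(String input) {
        List<String> output = new ArrayList<>();
        Matcher m = REFERENCE.matcher(input);
        while (m.find()) {
            // Map each assumed misread letter back to the digit it resembles
            String digits = m.group(2)
                    .replace('O', '0')
                    .replace('I', '1')
                    .replace('l', '1')
                    .replace('S', '5');
            output.add(m.group(1) + digits);
        }
        return output;
    }

    public static void main(String[] args) {
        String input = "LBO3452 - Test, name of the file B1234S, test2";
        System.out.println(extract(input)); // prints [LB03452, B12345]
    }
}
```

The trade-off is that a tolerant pattern can also match ordinary words made of the same letters, so the confusion set should be tuned against real samples.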
Mathias
  • 177
  • 2
  • 10