I want to check whether one long string contains several other strings.

I am trying to use the code below.

          String[] words = {"GAGGAG", "AGGAC"};
          Pattern pattern = Pattern.compile("GAGGAG|AGGAC");
          if(pattern.matcher("GAGGAGGTC").find()){
                 System.out.println("find");
          }else{
                 System.out.println("Not find");
          }

The result is supposed to be "Not find", because "GAGGAGGTC" contains "GAGGAG" but does not contain "AGGAC".

How can I change the "or" into an "and"?

There is one more requirement.

          String[] words = {"GAGGAG", "AGGAC"};
          Pattern pattern = Pattern.compile("GAGGAG|AGGAC");
          if(pattern.matcher("GAGGAGGAC").find()){
                 System.out.println("find");
          }else{
                 System.out.println("Not find");
          }        

This should also print "Not find", because overlapping matches are not allowed: "GAGGAG" and "AGGAC" overlap on the "AG" part in "GAGGAGGAC".

VLAZ
clear.choi
  • Do you have to use regular expressions? Using `contains` is much simpler. – August Jan 16 '15 at 22:44
  • This might help you: http://stackoverflow.com/questions/469913/regular-expressions-is-there-an-and-operator – Upio Jan 16 '15 at 22:44

3 Answers

You need to put both orderings into the pattern, joined by the alternation operator |, like below.

Pattern.compile("GAGGAG.*AGGAC|AGGAC.*GAGGAG");

Explanation:

  • GAGGAG.*AGGAC matches GAGGAG, then .* (zero or more of any character), then AGGAC. Because AGGAC must start after GAGGAG has ended, the two matches cannot overlap.

  • | is the OR operator, so the words may appear in either order.

  • AGGAC.*GAGGAG matches the reverse order: AGGAC, then zero or more characters, then GAGGAG.

Example 1:

  Pattern pattern = Pattern.compile("GAGGAG.*AGGAC|AGGAC.*GAGGAG");
  if(pattern.matcher("GAGGAGGAC").find()){
         System.out.println("find");
  }else{
         System.out.println("Not find");
  }   // Output: Not find

Example 2:

  Pattern pattern = Pattern.compile("GAGGAG.*AGGAC|AGGAC.*GAGGAG");
  if(pattern.matcher("GAGGAGAGGAC").find()){
         System.out.println("find");
  }else{
         System.out.println("Not find");
  }   // Output: find

Example 3:

  Pattern pattern = Pattern.compile("GAGGAG.*AGGAC|AGGAC.*GAGGAG");
  if(pattern.matcher("AGGACFOOGAGGAG").find()){
         System.out.println("find");
  }else{
         System.out.println("Not find");
  }  // Output: find
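
The same idea can be wrapped in a helper so the pattern does not have to be typed by hand. This is only a sketch: the class and method names are made up for illustration, Pattern.quote is used on the assumption that the words should be matched literally, and DOTALL is added so that .* can also cross line breaks. It handles exactly two words; with more words the number of orderings grows quickly.

```java
import java.util.regex.Pattern;

class BothWordsRegex {
    // Builds "a.*b|b.*a" with both words quoted as literals.
    // The second word must start after the first one ends, so overlaps are rejected.
    static boolean containsBothNoOverlap(String text, String a, String b) {
        String qa = Pattern.quote(a);
        String qb = Pattern.quote(b);
        Pattern p = Pattern.compile(qa + ".*" + qb + "|" + qb + ".*" + qa, Pattern.DOTALL);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBothNoOverlap("GAGGAGGAC", "GAGGAG", "AGGAC"));      // false
        System.out.println(containsBothNoOverlap("GAGGAGAGGAC", "GAGGAG", "AGGAC"));    // true
        System.out.println(containsBothNoOverlap("AGGACFOOGAGGAG", "GAGGAG", "AGGAC")); // true
    }
}
```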
Avinash Raj

You don't need a regex for that purpose.

Use String#contains:

public boolean checkContainsAll(String sentence, String[] words) {
    for(String word : words) {
        if(!sentence.contains(word)) {
            return false;
        }
    }
    return true;
}

In your example:

String[] words = {"GAGGAG", "AGGAC"};
String sentence = "GAGGAGGTC";
if(checkContainsAll(sentence, words)) {
    System.out.println("The sentence " + sentence + " contains all words");
} else {
    System.out.println("The sentence " + sentence +" does not contain all words.");
}

UPDATE

To rule out overlaps, the simplest change to my example is to blank out each word once it has been found in the sentence, so that its characters are no longer available to later checks:

public boolean checkContainsAll(String sentence, String[] words) {
    for(String word : words) {
        if(!sentence.contains(word)) {
            return false;
        }
        // Blank out the match with a space rather than deleting it, so the text on
        // either side cannot join up into a new, false match.
        // Assumes none of the words contains a space.
        sentence = sentence.replace(word, " ");
    }
    return true;
}
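
Taking words out of the sentence one after another is greedy: the outcome can depend on the order of the words and on which occurrence gets consumed first. If that matters, one alternative is to search for a placement of every word such that no two placements share a character. The following is a sketch of that idea using simple backtracking; the class and method names are hypothetical, and it assumes the word list is short, since the search can become slow for many words with many occurrences.

```java
import java.util.ArrayList;
import java.util.List;

class NonOverlappingContains {
    // True if every word can be located in the sentence so that no two
    // occurrences share a character position.
    static boolean containsAllWithoutOverlap(String sentence, String[] words) {
        return place(sentence, words, 0, new ArrayList<int[]>());
    }

    // Tries each occurrence of words[i]; backtracks if the remaining words cannot be placed.
    private static boolean place(String s, String[] words, int i, List<int[]> used) {
        if (i == words.length) {
            return true;
        }
        String w = words[i];
        if (w.isEmpty()) {
            // An empty word occupies no characters, so it is trivially contained.
            return place(s, words, i + 1, used);
        }
        for (int start = s.indexOf(w); start >= 0; start = s.indexOf(w, start + 1)) {
            int end = start + w.length();
            if (isFree(start, end, used)) {
                used.add(new int[] {start, end});
                if (place(s, words, i + 1, used)) {
                    return true;
                }
                used.remove(used.size() - 1); // undo this choice and try the next occurrence
            }
        }
        return false;
    }

    // Half-open ranges [start, end) overlap when each one starts before the other ends.
    private static boolean isFree(int start, int end, List<int[]> used) {
        for (int[] r : used) {
            if (start < r[1] && r[0] < end) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] words = {"GAGGAG", "AGGAC"};
        System.out.println(containsAllWithoutOverlap("GAGGAGGTC", words));      // false
        System.out.println(containsAllWithoutOverlap("GAGGAGGAC", words));      // false
        System.out.println(containsAllWithoutOverlap("GAGGAGNNNAGGAC", words)); // true
    }
}
```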

BackSlash
  • I am really sorry, this is great, but I forgot to mention from the beginning that overlapping parts are not allowed. – clear.choi Jan 16 '15 at 23:13
  • I mean the sentence "GAGGAGGTC" is not allowed because there is an overlapping part, but "GAGGAGNNNAGGAC" is allowed because nothing overlaps. – clear.choi Jan 16 '15 at 23:14

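For an "and" operator, use one lookahead per word, each with its own .* so that it can scan ahead from the same starting position:

(?=.*GAGGAG)(?=.*AGGAC)

Without the .*, both lookaheads would have to match at the very same position, which is impossible for two different words. Note that lookaheads do not consume characters, so this pattern still accepts overlapping matches such as "GAGGAGGAC".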
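
As a rough illustration of how lookaheads behave as an "and", the sketch below builds one lookahead per word, each scanning ahead with .* from the start of the input. The class and method names are invented for this example, and quoting the words with Pattern.quote is an assumption about wanting literal matches. Since lookaheads consume nothing, this checks only that every word is present; it does not reject overlaps.

```java
import java.util.regex.Pattern;

class LookaheadAnd {
    // Builds "^(?=.*w1)(?=.*w2)..." so that every word must occur somewhere in the text.
    static boolean containsAll(String text, String... words) {
        StringBuilder regex = new StringBuilder("^");
        for (String w : words) {
            regex.append("(?=.*").append(Pattern.quote(w)).append(")");
        }
        // DOTALL lets .* cross line terminators as well.
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAll("GAGGAGGTC", "GAGGAG", "AGGAC")); // false
        System.out.println(containsAll("GAGGAGGAC", "GAGGAG", "AGGAC")); // true (overlap is accepted)
    }
}
```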
Chris Stillwell