
I found the following regex in one of the Android source files:

String regex = "\\s+(?i)src=\"cid(?-i):\\Q" + attachment.mContentId + "\\E\"";
if (string.matches(regex)) {
    System.out.println("Matched");
} else {
    System.out.println("Not Found");
}

NOTE: attachment.mContentId will basically have values like C4EA83841E79F643970AF3F20725CB04@gmail.com

I wrote the following sample code:

String content = "Hello src=\"cid:something@gmail.com\" is present";
String contentId = "something@gmail.com";
String regex = "\\s+(?i)src=\"cid(?-i):\\Q" + contentId + "\\E\"";
if (content.matches(regex))
    System.out.println("Present");
else
    System.out.println("Not Present");

This always gives "Not Present" as output.

But when I do the following:

System.out.println(content.replaceAll(regex, " Replaced Value"));

the matched part of the output is replaced with the new value. If the pattern is "Not Present", how can replaceAll find it and replace it? Please clear up my confusion.

Can anybody tell me what kind of string content will make control reach the if branch?

Chandra Sekhar

2 Answers


String regex = "\\s+(?i)src=\"cid(?-i):\\Q" + attachment.mContentId + "\\E\"";

Break it down:

\\s+ - Match 1 or more whitespace characters (spaces, tabs, newlines)

(?i) - Turn on case-insensitive matching for the subsequent string

src=\"cid - match src="cid

(?-i) - Turn off case-insensitive matching

: - Obviously a colon

\\Q - Treat all following stuff before \\E as literal characters, 
      and not control characters. Special regex characters are disabled until \\E

attachment.mContentId - whatever your string is

\\E - End the literal quoting sandwich started by \\Q

\" - End quote

So it will match a string like src="cid:YOUR-STRING-LITERAL"

Or, to use your own example, something like this string will match (there are leading white space characters):

            src="cid:C4EA83841E79F643970AF3F20725CB04@gmail.com"
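As a minimal sketch of the breakdown above, the following checks one string that matches in full and one that does not. The class name CidRegexDemo and the helper buildRegex are made up for illustration; the content ID is the sample value from the question.

```java
public class CidRegexDemo {
    // Hypothetical helper: builds the same regex as in the question for a given content ID.
    static String buildRegex(String contentId) {
        return "\\s+(?i)src=\"cid(?-i):\\Q" + contentId + "\\E\"";
    }

    public static void main(String[] args) {
        String id = "C4EA83841E79F643970AF3F20725CB04@gmail.com";
        String regex = buildRegex(id);

        // Leading whitespace followed by the attribute and nothing else:
        // the whole string matches the regex.
        String withSpaces = "   src=\"cid:" + id + "\"";
        System.out.println(withSpaces.matches(regex)); // true

        // No leading whitespace: \s+ needs at least one whitespace character.
        String withoutSpaces = "src=\"cid:" + id + "\"";
        System.out.println(withoutSpaces.matches(regex)); // false
    }
}
```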

For your update

The problem you're running into is that java.lang.String.matches() does not do what you expect it to.

String.matches() (like Matcher.matches()) only succeeds if the entire string matches the regular expression; it does not search for a matching substring.

If you use this regex:

String regex = "\\s+(?i)src=\"cid(?-i):\\Q" + attachment.mContentId + "\\E\"";

And this input:

String content = "Hello src=\"cid:something@gmail.com\" is present";

content.matches(regex) will always return false, because the regex would have to match the entire string, including the leading "Hello" and the trailing " is present".

What you want to do is use Matcher.find(), which looks for the pattern anywhere in the input. This should work for you:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String content = "Hello src=\"cid:something@gmail.com\" is present";
String contentId = "something@gmail.com";
Pattern pattern = Pattern.compile("\\s+(?i)src=\"cid(?-i):\\Q" + contentId + "\\E\"");

Matcher m = pattern.matcher(content);

// find() succeeds if any substring of content matches the pattern
if (m.find())
    System.out.println("Present");
else
    System.out.println("Not Present");

IDEone example: https://ideone.com/8RTf0e
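To make the difference concrete, here is a small sketch that runs whole-string matching, substring search, and replaceAll on the sample input from the question. The class and method names (MatchesVsFind, buildRegex, matchesWhole, containsMatch) are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Hypothetical helper: same regex as in the question.
    static String buildRegex(String contentId) {
        return "\\s+(?i)src=\"cid(?-i):\\Q" + contentId + "\\E\"";
    }

    // True only if the regex matches the entire string.
    static boolean matchesWhole(String content, String regex) {
        return content.matches(regex);
    }

    // True if the regex matches any substring.
    static boolean containsMatch(String content, String regex) {
        return Pattern.compile(regex).matcher(content).find();
    }

    public static void main(String[] args) {
        String content = "Hello src=\"cid:something@gmail.com\" is present";
        String regex = buildRegex("something@gmail.com");

        System.out.println(matchesWhole(content, regex));  // false
        System.out.println(containsMatch(content, regex)); // true

        // replaceAll searches for matching substrings, much like find(),
        // which is why it replaces text even though matches() returns false.
        System.out.println(content.replaceAll(regex, " Replaced Value"));
        // Hello Replaced Value is present
    }
}
```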

wkl
  • @ChandraSekhar I added a section at the end explaining why your code doesn't work. – wkl Mar 20 '14 at 14:12
  • Wow, thanks a lot. A small question: I tried content.matches("(.*)" + regex + "(.*)") and it works fine. Can you suggest which one I should use? – Chandra Sekhar Mar 20 '14 at 14:23
  • If you only need to verify that your string matches some regex, surrounding the regex with .* is a valid option. If instead you need to operate on every single match (e.g. replacing some portion of it or extracting some value), then using Matcher.find is a better approach. – nivox Mar 20 '14 at 14:56

That regex will match one or more whitespace characters followed by any

src="cid:contentId"

where only contentId needs to match case-sensitively. For instance, given your example contentId (C4EA83841E79F643970AF3F20725CB04@gmail.com), these strings will match:

SrC="CiD:C4EA83841E79F643970AF3F20725CB04@gmail.com"
src="cid:C4EA83841E79F643970AF3F20725CB04@gmail.com"
SRC="CID:C4EA83841E79F643970AF3F20725CB04@gmail.com"

while these will not match:

src="cid:c4Ea83841e79F643970aF3f20725Cb04@GmaiL.com"
src="cid:C4EA83841E79F643970AF3F20725CB04@GMAIL.COM"

Also, the contentId part is quoted (\Q ... \E), so the regex engine treats any special characters inside it (such as the dot in gmail.com) as literal characters.
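The case rules and the quoting can be checked with a short sketch. The names CaseSensitivityDemo, buildPattern and found are hypothetical. As an assumption for the illustration, it uses Pattern.quote in place of the hand-written \Q ... \E; it quotes the string in the same way and also copes with an ID that itself contains \E. It uses find() because the sample strings need leading whitespace to match.

```java
import java.util.regex.Pattern;

public class CaseSensitivityDemo {
    // Hypothetical helper: same pattern as in the question, with the
    // content ID quoted via Pattern.quote instead of a manual \Q...\E.
    static Pattern buildPattern(String contentId) {
        return Pattern.compile("\\s+(?i)src=\"cid(?-i):" + Pattern.quote(contentId) + "\"");
    }

    // True if the pattern occurs anywhere in the text.
    static boolean found(String text, String contentId) {
        return buildPattern(contentId).matcher(text).find();
    }

    public static void main(String[] args) {
        String id = "C4EA83841E79F643970AF3F20725CB04@gmail.com";

        // src="cid is matched case-insensitively.
        System.out.println(found(" SrC=\"CiD:" + id + "\"", id)); // true
        System.out.println(found(" SRC=\"CID:" + id + "\"", id)); // true

        // The content ID itself is matched case-sensitively.
        System.out.println(found(" src=\"cid:" + id.toUpperCase() + "\"", id)); // false

        // The dot in the ID is literal, so it does not match an arbitrary character.
        System.out.println(found(" src=\"cid:a.b@gmail.com\"", "a.b@gmail.com")); // true
        System.out.println(found(" src=\"cid:aXb@gmail.com\"", "a.b@gmail.com")); // false
    }
}
```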

nivox
  • The reason is that the String.matches method requires the regex to match the full string. Your regex requires a string that starts with whitespace, followed by `src="cid:contentId"`, and nothing else. If you try to match a string that doesn't fit this specification exactly, you will get false. The String.replaceAll method instead searches the string for any substring that matches and replaces it. You can make your example print "Present" just by doing the following: `regex = ".*" + regex + ".*"`, which tells the regex engine to match any string that contains a substring matching the original regex. – nivox Mar 20 '14 at 14:53