I have some SQL that starts as follows:

String sql = "SELECT "+
        "    SI.SITE_ID "; ....

Eventually I want to write a regular expression that, given the literal column name "SITE_ID", finds the fully qualified column name (with the "SI." prefix). I first wrote what I thought would work for that purpose, Pattern.compile("\\s+\\w+\\." + "SITE_ID" + "\\s+"), intending to extract a capture group later, but it did not return the result I expected, so I decided to simplify.

Now I have simplified as much as I can think of, down to searching for the string literal "SITE_ID" in the sql variable, and matches() still returns false. Yet sql.indexOf() returns a value greater than -1, so sql does contain the string:

boolean foundSiteId = Pattern.compile("SITE_ID").matcher(sql).matches(); // false
int siteIdPos = sql.indexOf("SITE_ID"); // 12

I find this surprising; it's not as though I'm trying to anchor "SITE_ID" to the front with ^ or to the end with $. I also tried https://www.freeformatter.com/java-regex-tester.html (because recompiling code over and over is time-consuming). If I enter "SITE_ID" (without the quotes) as both the "Java Regular Expression" and the "Entry to test against", it returns true. However, if I provide " SITE_ID " with a leading and trailing space to test against, it returns false.

I guess I must have some fundamental misunderstanding of Java regular expressions, though I am reasonably well versed in regular expressions from other languages. What am I doing wrong? Thanks.

Dexygen

1 Answer

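matches() returns true only if the entire input matches the pattern, as if the pattern were anchored at both ends. Call find() instead, which looks for a match anywhere in the input, and you would get true, like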

boolean foundSiteId = Pattern.compile("SITE_ID").matcher(sql).find();
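
To make the difference concrete, here is a minimal sketch that runs both methods against the sql string as far as the question shows it. The class name MatchVsFind and its two helper methods are made up for illustration.

```java
import java.util.regex.Pattern;

public class MatchVsFind {
    // The query fragment from the question, as far as it is shown there.
    static final String SQL = "SELECT " + "    SI.SITE_ID ";

    // matches() succeeds only if the pattern consumes the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // find() succeeds if the pattern matches any substring of the input.
    static boolean findsAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("SITE_ID", SQL));          // false
        System.out.println(findsAnywhere("SITE_ID", SQL));         // true
        // Padding the pattern so it can consume the whole input makes matches() succeed.
        System.out.println(matchesWhole("(?s).*SITE_ID.*", SQL));  // true
    }
}
```

This is also why a test string with a leading and trailing space fails under matches(): the spaces are part of the input, and the bare pattern does not cover them.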

As for your original goal, you can do something like

String sql = "SELECT " 
        + "    SI.SITE_ID ";
Pattern p = Pattern.compile("\\b(\\w+\\.SITE_ID)");
Matcher m = p.matcher(sql);
if (m.find()) {
    System.out.println(m.group(1));
}

I get (as I think you wanted)

SI.SITE_ID
Elliott Frisch
  • Hmm my first stab at it didn't work but then again a) I'm probably still trying to put too much on one line and b) I did add "\\W" (a non-word character, right?) at the end of the pattern (but outside the capturing parens) because for instance I might need to work against both "foo" and "foo_bar", but I would want to stop at a comma or white space for instance. Once I get it going probably early tomorrow I'll upvote -- just wish I could accept too but the question's been closed :( – Dexygen May 17 '18 at 22:56
  • @GeorgeJempty Add another `\\b` (that indicates word boundary). `Pattern p = Pattern.compile("\\b(\\w+\\.SITE_ID)\\b");` – Elliott Frisch May 17 '18 at 23:00
  • That worked thanks – Dexygen May 18 '18 at 12:29
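
For later readers, the pattern from the comments can be wrapped in a small helper that takes the column name as a parameter. This is only a sketch: the class QualifiedColumn and the method qualify are hypothetical names, and the use of Pattern.quote is an addition that the thread does not mention. It makes the column name a literal in case it ever contains regex metacharacters.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QualifiedColumn {
    // Hypothetical helper: returns the first "ALIAS.COLUMN" occurrence of the given column, if any.
    static Optional<String> qualify(String sql, String column) {
        // \b on both sides keeps SITE_ID from matching inside a longer name such as SITE_ID_OLD.
        Pattern p = Pattern.compile("\\b(\\w+\\." + Pattern.quote(column) + ")\\b");
        Matcher m = p.matcher(sql);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String sql = "SELECT " + "    SI.SITE_ID ";
        System.out.println(qualify(sql, "SITE_ID").orElse("(not found)")); // SI.SITE_ID
    }
}
```

Because the trailing \b only requires a word boundary, the match stops at a comma or whitespace without consuming it, which a trailing \\W would do.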