I have a list of strings that I need to filter using a regex. Some of the strings may contain URLs of the form '(random_chars).(random_chars).(random_chars).(random_chars)...', i.e. runs of characters separated by dots.
I am trying to create a regex that will find such URLs but ignore any URL whose first set of (random_chars) is 'java'.
For example, given the string below:
"test string (test.url.com) abcdef java.lang.Assertion uvwxyz www.google.com abcdef"
I'd expect it to match test.url.com and www.google.com, but not java.lang.Assertion. And given:
"another test string /abc/xyz/lib/def/GH.tr test 200."
I wouldn't want it to match GH.tr.
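The expectations above can be written down as a small check, so that any candidate pattern can be tested against both sample strings. This is only a sketch, assuming Python's `re` module (the regex engine actually in use is not stated); `CASES` and `check` are hypothetical names introduced for illustration.

```python
import re

# Expected matches per sample string, taken from the examples above.
CASES = {
    "test string (test.url.com) abcdef java.lang.Assertion uvwxyz www.google.com abcdef":
        ["test.url.com", "www.google.com"],
    "another test string /abc/xyz/lib/def/GH.tr test 200.": [],
}

def check(pattern):
    """Return {text: (expected, actual)} for every sample the pattern gets wrong."""
    compiled = re.compile(pattern)
    failures = {}
    for text, expected in CASES.items():
        # Collect the full text of each non-overlapping match, left to right.
        actual = [m.group(0) for m in compiled.finditer(text)]
        if actual != expected:
            failures[text] = (expected, actual)
    return failures
```

An empty dictionary from `check(...)` would mean the pattern behaves as described for both strings.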
My current regex matches all of the following:
- test.url.com
- java.lang.Assertion
- www.google.com
- GH.tr
This is my current regex, in which I have attempted to use a negative lookahead:
(?!java)(?:(?:\w+\.)+[\w]+)
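To make the behaviour easy to reproduce, here is a minimal sketch that runs the pattern exactly as written over both sample strings. It assumes Python's `re` module, which may not be the engine in use here; `CURRENT`, `samples`, and `results` are names chosen for illustration.

```python
import re

# The pattern from the question, unchanged.
CURRENT = re.compile(r"(?!java)(?:(?:\w+\.)+[\w]+)")

samples = [
    "test string (test.url.com) abcdef java.lang.Assertion uvwxyz www.google.com abcdef",
    "another test string /abc/xyz/lib/def/GH.tr test 200.",
]

# findall returns every non-overlapping match, scanning left to right.
# The pattern has no capturing groups, so each item is the full matched text.
results = [CURRENT.findall(s) for s in samples]

for found in results:
    print(found)
```

Under Python's `re`, the second item found in the first string comes out as `ava.lang.Assertion` rather than the full `java.lang.Assertion`: the lookahead is only tested at the position where a match attempt begins, so once it rejects the attempt at `j`, the scan can resume at the following `a`. Other engines or tools may report this differently.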
What have I missed with my regex?