
Problem: I am trying to match a link and replace it with an empty string. However, I do not want to match links ending in .png, only all other links.

So far I have come up with:

    (https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*

But when I added a negative lookbehind for .png, it did not work.

So basically, what is the correct regular expression to match an http link but not an http link ending in .png?

I would prefer a greedy approach.

Please find the expected input/output below.

Expected Results

The Text Input 1:

    <img id="segmentForm:wfib" style="border: 0px;"            
    src="http://localhost:8080/example/emg/WidgetFillInTheBlankRed.com" alt=""    />
    <img src="../../img/WidgetFillInTheBlankGreen.png" alt="POB1" />

The Text Output 1 [Not Same as Input, Link Matched and Replaced with Empty String]

    <img id="segmentForm:wfib" style="border: 0px;" src="" alt="" />
    <img src="../../img/WidgetFillInTheBlankGreen.png" alt="POB1" />

The Text Input 2:

    <img id="segmentForm:wfib" style="border: 0px;"         

    src="http://localhost:8080/example/emg/WidgetFillInTheBlankRed.png" alt="" />
    <img src="../../img/WidgetFillInTheBlankGreen.png" alt="FIB1" />

The Text Output 2 [Same as Input]

    <img id="segmentForm:wfib" style="border: 0px;"     

    src="http://localhost:8080/example/emg/WidgetFillInTheBlankRed.png" alt="" />
    <img src="../../img/WidgetFillInTheBlankGreen.png" alt="FIB1" />
RealSkeptic
ripher

1 Answer

    (https?|ftp|gopher|telnet|file|Unsure|http):((\/\/)|(\\))+[\w\d:#@%\/;$()~_?\+-=\\\.&]*(?<!png)$

Tested using https://regex101.com/

http://server/page.com -> 1 Match
http://server/page.png -> No Match
Smutje
  • When I test with (https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*(?<!png)$ all links are accepted. When I remove the $, I get src="g". Can you test with the whole text that I provided in the question? – ripher Feb 22 '15 at 21:34
  • Basically, it is not really working: when I input https://localhost:8080/example/ex/Just.png, I get just "g". – ripher Feb 22 '15 at 22:00
  • Just to add to this, I see that your solution works in regex101. In the debugger I can see it backtracking when the link does not end in png. The same solution does not work in Java. Do you think the Java library behaves differently for negative lookbehind? – ripher Feb 23 '15 at 03:03
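
On the Java question: as far as I can tell, the lookbehind itself is not the difference. Without the $, the greedy character class is free to backtrack, so the engine gives back the final "g", sees that the text before the new end position is ".pn" rather than "png", and reports a match that stops one character short. That would explain the leftover "g". One possible fix is to make the quantifier possessive (*+) so the class cannot give characters back, and to drop the $ so the pattern can be used inside a larger text. The sketch below shows that idea. The class name LinkStripper and the method stripLinks are made up for illustration. The character class is also simplified (the +-= range is replaced by literal +, = and -), and the Unsure alternative is left out; both are assumptions about what was intended.

```java
import java.util.regex.Pattern;

class LinkStripper {
    // *+ is a possessive quantifier: once the class has consumed the URL it
    // never backtracks, so (?<!\.png) is only tested at the real end of the link.
    private static final Pattern NON_PNG_LINK = Pattern.compile(
        "(?:https?|ftp|gopher|telnet|file):(?://|\\\\)+"
        + "[\\w:#@%/;$()~?+=\\\\.&-]*+(?<!\\.png)");

    // Hypothetical helper: removes every matched link that does not end in .png.
    static String stripLinks(String html) {
        return NON_PNG_LINK.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String com = "<img src=\"http://localhost:8080/example/emg/WidgetFillInTheBlankRed.com\" alt=\"\" />";
        String png = "<img src=\"http://localhost:8080/example/emg/WidgetFillInTheBlankRed.png\" alt=\"\" />";

        System.out.println(stripLinks(com)); // <img src="" alt="" />
        System.out.println(stripLinks(png)); // unchanged
    }
}
```

Relative paths such as ../../img/WidgetFillInTheBlankGreen.png are left alone either way, because they do not start with one of the listed schemes.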