Replace Regex Variable

Question

Hi, I’m trying to use replaceAll in Java to remove the HTML of an image:

This is my input:

String html = "&nbsp;asd<i>&nbsp;qwe qwe<u>qweqwe</u></i><u>wqeqwesd.<img alt=\"vechile\" src=\"urldirectionstring\" style=\"float:left; height:190px; width:400px\" /></u>";

What I’m trying to do is replace the whole <img ...> tag and put just this in its place:

"Image Url: urldirectionstring";

Only the img tag should be replaced; everything else should stay as it is. This is what I have so far, but it’s not enough:

String replaceImg = html.replaceAll("<img[^>]*/>","Image Url: "+$srcImgdirection);

As you can see, I don’t know how to get the urldirectionstring as a variable in the replacement.

----------- LAST EDIT -----------

I found this regex to capture the urldirectionstring, but now I don’t know how to replace only the tag and add the text:

String replaceImg = html.replaceAll("<img.*src=\"(.*)\"[^>]*/?>","Image Url: "+$srcImgdirection);

Are you aware that there are libraries for parsing HTML properly, and that regexes are not well suited to the task? — Patrick Parker, Feb 24 '17 at 13:38
I agree with Patrick, but for future applications of `replaceAll()`: you can access the capturing groups in the replacement string via `$group_number`, e.g. `replaceAll("src=\"([^\"]*)\"","src=\"prefix$1suffix\"")` to surround the attribute content with `"prefix"` and `"suffix"`. — Thomas, Feb 24 '17 at 13:41
However, as Patrick already pointed out, regular expressions are not a good fit for irregular languages such as HTML (e.g. what happens with nested tags?) unless you _really_ know _everything_ that is to be expected. As an example, your expression ` — Thomas, Feb 24 '17 at 13:46
I’m using a library that generates the image element, so the tags always have the same style. @Thomas thanks! — Alberto Acuña, Feb 24 '17 at 14:48
And if you should ever upgrade to a newer version of that library, are you certain it will continue to generate HTML that can be parsed that way? See also http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg — VGR, Feb 24 '17 at 15:24

score 2 · Accepted Answer · 2017-02-24T14:19:50.930

You could use:

String replaceImg = html.replaceAll(".*<img.*src=\"(.*?)\".*", "Image Url: $1");

This replaces the entire string, so the output is only Image Url: urldirectionstring. (Note that $1 contains only the part of the match inside the parentheses: each pair of parentheses creates a "group" that can be referenced later. As the regex contains only one pair, that is the first group, so you can reference it with $1.)
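
To make the group numbering concrete, here is a small sketch with two groups instead of one. The class name GroupReferenceDemo and the helper describeImage are made up for illustration, and the pattern assumes the attributes appear in exactly the order alt, src, as in the question's input.

```java
public class GroupReferenceDemo {
    // Hypothetical helper: rewrites an <img> tag as "alt (src)".
    // Assumes the tag starts with alt="..." followed by src="...".
    static String describeImage(String tag) {
        // Group 1 captures the alt text, group 2 captures the src value.
        // In the replacement, $1 and $2 refer to those groups by position.
        return tag.replaceAll(
                "<img alt=\"([^\"]*)\" src=\"([^\"]*)\"[^>]*>",
                "$1 ($2)");
    }

    public static void main(String[] args) {
        String tag = "<img alt=\"vechile\" src=\"urldirectionstring\" style=\"float:left\" />";
        System.out.println(describeImage(tag)); // prints: vechile (urldirectionstring)
    }
}
```

Groups are numbered by the position of their opening parenthesis, left to right. If the replacement text needs a literal dollar sign, escape it as \\$ in the Java string, since $ is otherwise read as a group reference.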

If you want to replace only the img tag and keep the other tags intact, you could use:

String replaceImg = html.replaceAll("<img.*src=\"(.*?)\"[^>]*/?>", "Image Url: $1");

In this case, the output will be: &nbsp;asd<i>&nbsp;qwe qwe<u>qweqwe</u></i><u>wqeqwesd.Image Url: urldirectionstring</u>
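
One caveat with the pattern above: the greedy .* after <img can run past the end of the tag, so if a line contains more than one img tag, everything from the first <img to the last tag is swallowed as a single match. A possible variant, sketched below, uses [^>]* so each match stays inside one tag. The names ImgTagReplace, replaceGreedy and replacePerTag are invented for this example, and it still assumes double-quoted src values and no > inside attribute values.

```java
public class ImgTagReplace {
    // The pattern from the answer: ".*" may span several tags on one line.
    static String replaceGreedy(String html) {
        return html.replaceAll("<img.*src=\"(.*?)\"[^>]*/?>", "Image Url: $1");
    }

    // Sketch of a tighter variant: [^>] cannot cross the closing '>' of a tag,
    // so every <img ...> is matched and replaced on its own.
    static String replacePerTag(String html) {
        return html.replaceAll("<img\\b[^>]*?\\bsrc=\"([^\"]*)\"[^>]*>", "Image Url: $1");
    }

    public static void main(String[] args) {
        String html = "<u>a<img alt=\"x\" src=\"one.png\" /></u><img src=\"two.png\">";
        System.out.println(replaceGreedy(html)); // prints: <u>aImage Url: two.png
        System.out.println(replacePerTag(html)); // prints: <u>aImage Url: one.png</u>Image Url: two.png
    }
}
```

As the comments on the question point out, this only holds up while the generated HTML keeps the shape assumed here; an HTML parser is the safer choice if it might change.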
