Problem:
I have a servlet that generates reports, more specifically the table body of a report. It is a black box: we do not have access to the source code.
Nevertheless, it works satisfactorily, and there are no plans to rewrite or replace it anytime soon.
We need to modify its response text to update a few of the links it generates to other reports. I was thinking of doing this with a filter that finds the anchor tags and replaces them using a regex.
Research:
I ran into this question that has a regex filter. It should be what I need, but then maybe not.
I am not trying to parse HTML in the strict sense of the term, and I am not working with the full spec of the language. What I have is a subset of HTML tags that make up a table body. There are no nested tables, so the HTML subset generated by the servlet is not recursive.
I just need to find and replace the anchor targets and add an attribute to the tag.
So the question is:
I need to modify the output of a servlet in order to change all links of the kind:
<a href="http://mypage.com/servlets/reports/?a=report&id=MyReport&filters=abcdefg">
into links like:
<a href="http://myOtherPage.com/webReports/report.xhtml?id=MyReport&filters=abcdefg" target="_parent">
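To make the transformation concrete, here is a minimal sketch of the replacement I have in mind, using only java.util.regex. The class name LinkRewriter and the method rewrite are hypothetical names of my own. It assumes the servlet always emits the href in double quotes as the only attribute, writes the ampersands literally (not as &amp;), and always puts a=report first in the query string. If any of those assumptions is wrong, the pattern would need adjusting.

```java
import java.util.regex.Pattern;

// Hypothetical helper: rewrites old-style report anchors in a chunk of HTML.
final class LinkRewriter {

    // Matches the opening <a> tag of an old report link.
    // Group 1 captures everything in the query string after "a=report&".
    // Assumption: href is double-quoted and is the tag's only attribute.
    private static final Pattern OLD_LINK = Pattern.compile(
            "<a href=\"http://mypage\\.com/servlets/reports/\\?a=report&([^\"]*)\">");

    // "$1" re-inserts the captured query parameters into the new URL.
    private static final String NEW_LINK =
            "<a href=\"http://myOtherPage.com/webReports/report.xhtml?$1\" target=\"_parent\">";

    static String rewrite(String html) {
        return OLD_LINK.matcher(html).replaceAll(NEW_LINK);
    }

    public static void main(String[] args) {
        String body = "<td><a href=\"http://mypage.com/servlets/reports/"
                + "?a=report&id=MyReport&filters=abcdefg\">My report</a></td>";
        System.out.println(rewrite(body));
    }
}
```

A filter would call something like rewrite on the buffered response body before writing it to the real output stream. Anchors that do not match the old URL pattern are left untouched.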
Should I use the regex filter written by @Jeremy Stein, or is there a better solution?