I am using the Maven replacer plugin, and I have a regular expression that matches across lines and needs to be run on the input file repeatedly until all matches have been replaced. The configuration for this expression looks like this:

<regexFlags>
    <regexFlag>DOTALL</regexFlag>
</regexFlags>
<replacements>   
    <replacement>
        <token>\@([^\n\r=\@]+)\@=([^\n\r]*)(.*)(\@default\.\1\@=[^\n\r]*)(.*)</token>
        <value>@$1@=$2$3$5</value>
    </replacement>
</replacements>

The input could look like this:

@d.e.f@=y
@a.b.c@=x
@h.i.j@=aaaa
@default.a.b.c@=QQQ
@asdfasd.fasdfs.asdfa@=23423
@default.h.i.j@=234
@default.RR.TT@=393993

and I want the output to look like this:

@d.e.f@=y
@a.b.c@=x
@h.i.j@=aaaa

@asdfasd.fasdfs.asdfa@=23423

@default.RR.TT@=393993

The intention is to re-write the file, but without the tokens with a @default prefix, where another token without the prefix has already been defined.

@default.a.b.c@=QQQ and @default.h.i.j@=234 have been removed from the output because other tokens already define a.b.c and h.i.j.

The current problem I have is that the replacer plugin only replaces the first match, so my output looks like this:

@d.e.f@=y
@a.b.c@=x
@h.i.j@=aaaa

@asdfasd.fasdfs.asdfa@=23423
@default.h.i.j@=234
@default.RR.TT@=393993

Here, @default.a.b.c@=QQQ is gone, which is correct, but @default.h.i.j@=234 is still present.

If I were writing this in code, I think I could probably just loop while attempting to match on the entire output, and break when there are no matches. Is there a way to do this with the replacer plugin?
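The loop described above can be sketched in plain Java as follows. This is only an illustration of the "repeat until no match" idea, not a feature of the replacer plugin; the class name `LoopingReplacer` and the method name `stripOverriddenDefaults` are hypothetical, and the expression and replacement value are copied from the configuration above.

```java
import java.util.regex.Pattern;

final class LoopingReplacer {
    // Expression from the plugin configuration, compiled with DOTALL so that
    // '.' also matches line terminators (hypothetical standalone use).
    private static final Pattern OVERRIDDEN_DEFAULT = Pattern.compile(
            "\\@([^\\n\\r=\\@]+)\\@=([^\\n\\r]*)(.*)(\\@default\\.\\1\\@=[^\\n\\r]*)(.*)",
            Pattern.DOTALL);

    // Same replacement value as in the configuration: group 4 (the default line) is dropped.
    private static final String REPLACEMENT = "@$1@=$2$3$5";

    // Applies the replacement repeatedly until the text stops changing.
    static String stripOverriddenDefaults(String text) {
        String previous;
        String current = text;
        do {
            previous = current;
            // Each pass removes at most one overridden default, because the
            // trailing (.*) consumes the rest of the input.
            current = OVERRIDDEN_DEFAULT.matcher(previous).replaceFirst(REPLACEMENT);
        } while (!current.equals(previous));
        return current;
    }
}
```

Calling `LoopingReplacer.stripOverriddenDefaults(...)` on the sample input leaves the two blank lines where the overridden defaults used to be. The loop terminates because every successful replacement makes the text shorter.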


Edit: I may have oversimplified my example. A more realistic one is:

@d.e.f@=y
@a.b.c@=x
@h.i.j@=aaaa
@default.a.b.c@=QQQ
@asdfasd.fasdfs.asdfa@=23423
@default.h.i.j@=234
@default.RR.TT@=393993
@x.y.z@=0
@default.q.r.s@=1
@l.m.n@=8.3
@q.r.s@=78
@blah.blah.blah@=blah

This shows that a default.x.x.x=y token can precede the corresponding x.x.x=y token (as @default.q.r.s@=1 precedes @q.r.s@=78); my prior example wasn't clear about this. I do have an expression to capture this case, which looks roughly like this:

\@default\.([^\n\r=@|]+)@=([^\n\r|]*)(.*)@\1@=([^\n\r|]*)(.*)

I know line separators are missing from this even though they were in the other one - I was experimenting with removing all line separators and treating it as a single line but that hasn't helped. I can resolve this problem simply by running each replacement multiple times by copying and pasting the configurations a few times, but that is not a good solution and will fail eventually.

FrustratedWithFormsDesigner

2 Answers

I don't believe you can solve this problem as is. A workaround is to reverse the order of the file's lines top to bottom, apply a lookahead regex, and then reverse the order of the result again. The pattern would be:

@default\.(.*?)@[^\r\n]+(?=[\s\S]*@\1@)

Another way (depending on the capabilities of Maven) is to run this pattern

@(.*)(@[\s\S]*)@default\.\1.*

and replace with @$1$2 in a loop until there are no matches,

then run this pattern

@default\.(.*)@.*(?=[\s\S]*\1)

and replace with nothing, again in a loop until there are no matches.
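As a rough sketch of the two looped passes outside Maven, the Java below applies the two patterns from this answer one after the other, each until the text stops changing. The class and method names (`TwoPassDefaultStripper`, `replaceUntilStable`, `stripDefaults`) are hypothetical, and it assumes the patterns are compiled without DOTALL so that `.` stays on one line.

```java
import java.util.regex.Pattern;

final class TwoPassDefaultStripper {
    // Pass 1 (pattern from the answer): the plain token appears before its default.
    private static final Pattern DEFAULT_AFTER =
            Pattern.compile("@(.*)(@[\\s\\S]*)@default\\.\\1.*");

    // Pass 2 (pattern from the answer): the default appears before the plain token.
    private static final Pattern DEFAULT_BEFORE =
            Pattern.compile("@default\\.(.*)@.*(?=[\\s\\S]*\\1)");

    // Repeats replaceAll until a pass leaves the text unchanged.
    static String replaceUntilStable(Pattern pattern, String replacement, String text) {
        String previous;
        String current = text;
        do {
            previous = current;
            current = pattern.matcher(previous).replaceAll(replacement);
        } while (!current.equals(previous));
        return current;
    }

    // Runs pass 1 to a fixed point, then pass 2 to a fixed point.
    static String stripDefaults(String text) {
        String afterFirstPass = replaceUntilStable(DEFAULT_AFTER, "@$1$2", text);
        return replaceUntilStable(DEFAULT_BEFORE, "", afterFirstPass);
    }
}
```

Both loops terminate because each successful replacement shortens the text. Note that the second pattern only checks that the captured name occurs somewhere later in the file, so it could over-match on inputs where one token name is contained in another.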

alpha bravo
  • OK, this is my fault for oversimplifying the example. I have a pattern very similar to this, to capture situations where the `default` line comes before the non-`default` line. The problem is that if I have multiple tokens with both default and non-default variants, the patterns will match only once. A solution that works, but is not great, is simply copying and pasting each `...` element multiple times to ensure that all possible matches are replaced, but that will break as soon as I have *n* copies of the element and *n+1* default/non-default combinations. – FrustratedWithFormsDesigner Dec 13 '13 at 15:20
  • @FrustratedWithFormsDesigner then you would execute the pattern, reverse the output top to bottom, execute the same pattern again on the output then reverse the order to get the desired results. – alpha bravo Dec 13 '13 at 15:36
  • The final order of the output doesn't matter: I just need to get rid of all tokens beginning with `default` when the same token exists without the prefix. Also, these regular expressions are being executed from Maven, and I don't know of any plugins I could add to this sequence of build steps to reverse the lines in a file. I suppose I could write one, but that seems like an extreme solution. – FrustratedWithFormsDesigner Dec 13 '13 at 18:58
  • @FrustratedWithFormsDesigner made another suggestion above – alpha bravo Dec 13 '13 at 20:12
  • Yes, that is my problem: I don't know of a way to get Maven to "loop until there are no matches". Maven is a plugin-based build tool: http://maven.apache.org/ I am using a regular expression within a Maven POM file. Since Maven is not an actual programming language, I can't just write "loop until no matches"; I am working within the constraints of the available plugins. And I'm not really sure whether what I want to do can be done in Maven, with this plugin or a different one. – FrustratedWithFormsDesigner Dec 13 '13 at 21:26

It doesn't look like the replacer plugin can actually do what I want. I got around this by using regular expressions to build multiple filter files, and then applying them to the resource files.

My original goal had been to use regular expressions to build a single, clean, and tidy filter file. In the end, I discovered that I was able to get away with just using multiple filters (not as clean or tidy) and apply them in the correct order.

FrustratedWithFormsDesigner