I have a large XML document held in a Java object, and I want to replace

<countryChannel countryCode="CountryCode"/>

with 

<countryChannel countryCode="CountryCode" active="true"></countryChannel>

Here is a sample of the XML (input):

</articleMedia>
<channels>
    <countryChannel countryCode="CountryCode"/>
</channels>

</articleMedia>
<channels>
    <countryChannel countryCode="CountryCode"/>
</channels>

</articleMedia>
<channels>
    <countryChannel countryCode="CountryCode"/>
</channels>

Using a regex, how can I select only the "/>" part of every tag, that is, each "/>" that is preceded by countryChannel countryCode="CountryCode"?

I have a regex that selects the whole tag (https://regex101.com/r/NLHy2Y/1), but I need one that matches only the trailing "/>".

Squeez

1 Answer

In this case you don't even need a regex. You can use String.replace() with the literal search and replacement strings:

String input = "<countryChannel countryCode=\"CountryCode\"/>\r\nsalala\r\n<countryChannel countryCode=\"CountryCode\"/>";
String replacement = input.replace("<countryChannel countryCode=\"CountryCode\"/>", "<countryChannel countryCode=\"CountryCode\" active=\"true\"></countryChannel>");
System.out.println(replacement);

There is a catch: if you want to edit XML as text, you have to make some assumptions about how the XML is serialized. In this case I assumed that:

  1. You want to edit only those <countryChannel> tags that have a single countryCode attribute
  2. The value of that attribute is always CountryCode
  3. All those tags are serialized exactly like this: <countryChannel countryCode="CountryCode"/>
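
To see why assumption 3 matters, here is a small sketch. The class and method names are made up for illustration. It feeds three equivalent serializations of the same tag to the literal replace; only the exact spelling is rewritten, and the other two pass through unchanged.

```java
public class SerializationAssumptionDemo {
    // The exact text the literal replace looks for.
    static final String TARGET =
            "<countryChannel countryCode=\"CountryCode\"/>";
    static final String REPLACEMENT =
            "<countryChannel countryCode=\"CountryCode\" active=\"true\"></countryChannel>";

    // Literal, non-regex replacement of every occurrence of TARGET.
    static String activate(String xml) {
        return xml.replace(TARGET, REPLACEMENT);
    }

    public static void main(String[] args) {
        String exact = "<countryChannel countryCode=\"CountryCode\"/>";
        String spaced = "<countryChannel countryCode=\"CountryCode\" />";   // space before />
        String singleQuoted = "<countryChannel countryCode='CountryCode'/>"; // single quotes

        System.out.println(activate(exact));        // rewritten
        System.out.println(activate(spaced));       // left unchanged
        System.out.println(activate(singleQuoted)); // left unchanged
    }
}
```

All three inputs mean the same thing to an XML parser, which is why a text-based edit is only safe once you know how your file was written out.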

You probably want to include other country codes too. As long as the values don't contain double quotes, you can do it with the following regex: "<countryChannel countryCode=\"([^\"]*)\"/>" and use the backreference $1 in the replacement. In this case you need the String.replaceAll() method, because it interprets its first argument as a regex. This is what the code looks like:

String input = "<countryChannel countryCode=\"CountryCode123\"/>\r\nsalala\r\n<countryChannel countryCode=\"CountryCode456\"/>";
String replacement = input.replaceAll("<countryChannel countryCode=\"([^\"]*)\"/>", "<countryChannel countryCode=\"$1\" active=\"true\"></countryChannel>");
System.out.println(replacement);

Explanation: [^...] is a negated character class, i.e. it matches any character except the ones listed. So [^"]* matches a run of characters that are not double quotes. That is what we want, because the match then stops at the closing quote of the attribute value.

So check your big XML file and make sure these assumptions actually hold for it.
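
If you really do want to match only the "/>", as the question asks, a lookbehind is one option. The following is a minimal sketch under the same serialization assumptions as above; the class and method names are hypothetical. The lookbehind (?<=...) requires the tag text to sit immediately before the match without making it part of the match, so only "/>" is selected and replaced. It uses a fixed-length literal in the lookbehind, which Java's regex engine handles without trouble.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // Matches "/>" only when directly preceded by the literal tag text.
    static final Pattern SELF_CLOSE = Pattern.compile(
            "(?<=<countryChannel countryCode=\"CountryCode\")/>");

    // Replaces just the matched "/>" with the new attribute and a closing tag.
    static String activate(String xml) {
        return SELF_CLOSE.matcher(xml)
                .replaceAll(" active=\"true\"></countryChannel>");
    }

    // Returns the text of the first match, or null if there is none.
    static String firstMatch(String xml) {
        Matcher m = SELF_CLOSE.matcher(xml);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<channels><countryChannel countryCode=\"CountryCode\"/></channels>";
        System.out.println(firstMatch(input)); // prints "/>"
        System.out.println(activate(input));
    }
}
```

The result is the same as with the plain String.replace() call; the lookbehind only changes which part of the text counts as the match.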

Disclaimer:

Do not put such regexes into production. They are fine for editing files yourself, as long as you check the result manually. For production, you are better off using XSLT.
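
For reference, here is a rough sketch of what the XSLT route could look like from Java, using the javax.xml.transform API that ships with the JDK. The stylesheet and the class and method names are my own illustration, not something taken from the question: an identity template copies everything, and a second template adds active="true" to every countryChannel element that does not have that attribute yet. It assumes the input is a well-formed document with a single root element and no namespaces.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltActivate {
    // Identity transform plus one override for <countryChannel> without @active.
    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"@*|node()\">"
          + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"countryChannel[not(@active)]\">"
          + "<xsl:copy>"
          + "<xsl:apply-templates select=\"@*\"/>"
          + "<xsl:attribute name=\"active\">true</xsl:attribute>"
          + "<xsl:apply-templates select=\"node()\"/>"
          + "</xsl:copy>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the result as a string.
    static String activate(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<channels><countryChannel countryCode='CountryCode' /></channels>";
        System.out.println(activate(input));
    }
}
```

Because the input is parsed rather than matched as text, quoting style and whitespace inside the tag no longer matter. Note that the serializer decides whether an empty element is written as <countryChannel .../> or <countryChannel ...></countryChannel>; the two forms are equivalent XML.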

Tamas Rev