
I have a String content that contains (among other text) some XML. I'd like to search this XML for sensitive payment data that should be masked (e.g. a credit card number).

The string is not a single XML document (which I could parse using JAXB or traverse with DOM); it also contains other values such as headers, e.g.:

Response-Code: 200 Encoding: ISO-8859-1 Content-Type: text/xml Headers: {connection=[Keep-Alive], ... <SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"> <SOAP:Body> ... <ns2:Payment> <ns2:CreditCard Number="1234567723" />

What is the best way to find the content and replace the numbers using value.replaceAll(".", "X");? That is, how can I best find the values to be replaced inside the XML?

membersound

2 Answers


Couldn't you get the index of the String cn = "CreditCard Number=" and then replace the substring that starts right after it and runs for 16 characters (the length of a typical credit card number)?

Or am I wrong in assuming that you have the whole header as a string?

You could also use pattern matching with a regular expression.
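As a rough illustration of the regex idea, here is a minimal sketch. It assumes the attribute always appears as Number="digits" immediately after the CreditCard element name, which is a narrow assumption. The class name RegexMask and the method mask are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMask {
    // Assumption: Number is the first attribute of CreditCard and uses double quotes.
    private static final Pattern CARD =
        Pattern.compile("(CreditCard\\s+Number\\s*=\\s*\")(\\d+)(\")");

    static String mask(String content) {
        Matcher m = CARD.matcher(content);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Same masking call as in the question: every character becomes X.
            String masked = m.group(2).replaceAll(".", "X");
            String replacement = m.group(1) + masked + m.group(3);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("<ns2:CreditCard Number=\"1234567723\" />"));
    }
}
```

Because the digits are captured as a group, this does not depend on the number being exactly 16 characters long, but it still only matches the one spelling of the attribute described above.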

Nishant Lakhara
Octoshape
  • This may work for a quick-and-dirty solution in a very narrowly defined context, but be aware of its [**significant limitations**](http://stackoverflow.com/a/20219284/290085). – kjhughes Nov 26 '13 at 14:09

Be careful of taking shortcuts such as string or even regex replacements against XML. You can easily miss many variations:

  • Number could appear as an attribute on elements other than CreditCard.
  • Insignificant whitespace could intervene between the CreditCard element and Number attribute.
  • Attribute order is insignificant in XML, so Number could appear as the first attribute on one occasion but in another position on other occasions.
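To make these variations concrete, the sketch below runs one naive pattern against several equivalent or related spellings. The Type attribute, the DebitCard element, and the class name NaivePatternPitfalls are hypothetical and serve only as illustration. The single-quoted variant is an extra case not listed above, since XML allows either quote style for attribute values.

```java
import java.util.regex.Pattern;

class NaivePatternPitfalls {
    // A naive pattern tied to one exact spelling of the attribute.
    static final Pattern NAIVE = Pattern.compile("CreditCard Number=\"\\d+\"");

    static boolean matches(String xml) {
        return NAIVE.matcher(xml).find();
    }

    public static void main(String[] args) {
        String[] variants = {
            // The form shown in the question.
            "<ns2:CreditCard Number=\"1234567723\" />",
            // Line break and indentation between element name and attribute.
            "<ns2:CreditCard\n    Number=\"1234567723\" />",
            // Another (hypothetical) attribute comes first.
            "<ns2:CreditCard Type=\"VISA\" Number=\"1234567723\" />",
            // Number on a different (hypothetical) element.
            "<ns2:DebitCard Number=\"1234567723\" />",
            // Single-quoted attribute value.
            "<ns2:CreditCard Number='1234567723' />"
        };
        for (String v : variants) {
            System.out.println(matches(v) + " <- " + v.replace("\n", "\\n"));
        }
    }
}
```

Only the first variant is matched; the others would leave the number unmasked.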

See also "Can you provide some examples of why it is hard to parse XML and HTML with a regex?"

It's really not hard to do it the right way robustly:

  1. Get the XML message by using the appropriate calls in a Web Services framework, or, if you must, scan ahead to the XML lexically.
  2. Use a real XML parser. Make a simple modification to a common identity transformation/copy routine that echoes everything out except for the element/attribute value that you wish to replace.
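The two steps above could be sketched roughly as follows, using the JDK's built-in XSLT support. This is one possible reading, not a definitive implementation. It assumes the XML starts at the first '<' in the string and runs to the end as one well-formed document, and that only digits need masking. The names XmlCardMasker and mask are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlCardMasker {
    // Identity transformation plus one override: the Number attribute of any
    // element whose local name is CreditCard gets its digits replaced by X.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"*[local-name()='CreditCard']/@Number\">"
      + "<xsl:attribute name='Number'>"
      + "<xsl:value-of select=\"translate(., '0123456789', 'XXXXXXXXXX')\"/>"
      + "</xsl:attribute>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String mask(String content) {
        // Step 1 (assumption): everything before the first '<' is header text.
        int start = content.indexOf('<');
        if (start < 0) {
            return content; // no XML found, nothing to mask
        }
        String headers = content.substring(0, start);
        String xml = content.substring(start);
        try {
            // Step 2: run the modified identity transformation over the XML part.
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return headers + out;
        } catch (TransformerException e) {
            throw new IllegalStateException("XML part could not be transformed", e);
        }
    }
}
```

Because the parser handles whitespace, quoting, and attribute order, the variations listed earlier do not need special treatment. Note that the serialized output may differ from the input in insignificant ways, such as attribute order or whitespace inside tags.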
Community
kjhughes