I'd like to mask out sensitive credit card details, so I'm trying to create a matcher that checks:

  • that there is a CreditCard tag
  • that the same line has a Number field
  • that the same line has a CVC field

<CreditCard Number="123456789" CVC="111" />

Then I want to replace the values that are found. So far I have: (CreditCard.*CVC=").*?" but this matches the whole string CreditCard Number="123456789" CVC="111".

What do I have to change so that only the numbers inside either CVC or Number double quotes are matched?

membersound
  • Try XPath instead of Regex. – Bergi Nov 26 '13 at 14:34
  • What language are you using? It would probably be a much better idea to use an XML parser. – Explosion Pills Nov 26 '13 at 14:35
  • Java, but the string that contains the XML does not consist ONLY of XML; it also holds lots of other data. So I cannot use a parser... – membersound Nov 26 '13 at 14:37
  • Are those attributes always in this order? Or could CVC occur before Number? – Tim Pietzcker Nov 26 '13 at 14:38
  • [Don't parse XML with regexes](http://stackoverflow.com/questions/8577060/why-is-it-such-a-bad-idea-to-parse-xml-with-regex) – alroc Nov 26 '13 at 14:40
  • They will probably keep the same order, but I'm not sure about the future... – membersound Nov 26 '13 at 14:41
  • Why are you parsing XML with regex? You lose the benefits of using a markup language that has many excellent libraries for manipulating values the way you want. – MadConan Nov 26 '13 at 14:48
  • I know what I lose, but as I wrote above, the content I want to match is not only XML but also lots of noise. I would first have to strip out everything that does not belong to the XML request, which feels like unnecessary double work. – membersound Nov 26 '13 at 14:52

1 Answer

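Lookahead and lookbehind are the magic words: they assert what must come directly before and after a match without making it part of the match. Here is an example that matches only the digits of your CVC number: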

(?<=CVC=\")\d+(?=\")