I'm not very well versed in regex; I use it from time to time, but never for anything serious (I'm proud when I get a lookbehind to work, for example). Anyway, I am dealing with an issue where I have an XML output log that is ~9000 lines long, and it contains large Envelopes of useless information I want to pare out.

I could do this manually, but I have close to 30 of these babies to go through, so I felt like I would give a regex a go.

Here's what I'm searching for:

<_INFO_UNTIL_FIRST_LEFT_BRACE> " " <debug> _ALL_CHARS_BETWEEN_THIS_AND_THE_FOLLOWING_LITERAL_CHARACTER_SEQUENCE_: 
                <string>Device.IP.Interface.</string>
            </ParameterNames>
        </cwmp:GetParameterValues>
    </soapenv:Body>
</soapenv:Envelope>

I'm sorry that this is kind of a clunky thread, but it's quite the large chunk of text, and for some reason regexpal isn't being very nice with the .*

Joshua
  • PLEASE consider using an XML parser instead of a regex. Here's one of the many arguments on SO: http://stackoverflow.com/questions/335250/parsing-xml-with-regex-in-java – peter.murray.rust May 16 '13 at 21:33
  • @peter.murray.rust While I still ended up using regex (easier since I am guaranteed that this structure will not change), I had not looked into XML parsers before, so you gave me incentive to do so. Thank you. – Joshua May 17 '13 at 15:55

1 Answer

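Do you mean <debug>.*?</debug>? I tested it in Notepad++ and it works fine. The ? after the .* makes the quantifier non-greedy, so it matches the shortest possible string. If the block spans several lines, . also has to match line breaks (in Notepad++, tick the ". matches newline" option).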
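To make the greedy versus non-greedy difference concrete, here is a minimal Python sketch. The log excerpt is made up for illustration (the real log layout is an assumption), and re.DOTALL is used so that . can cross line breaks.

```python
import re

# Hypothetical log excerpt; the layout of the real log is an assumption.
log = (
    "keep 1\n"
    "<debug>first\nblock</debug>\n"
    "keep 2\n"
    "<debug>second block</debug>\n"
    "keep 3\n"
)

# re.DOTALL lets "." match newlines, so a block may span several lines.
lazy = re.compile(r"<debug>.*?</debug>", re.DOTALL)
greedy = re.compile(r"<debug>.*</debug>", re.DOTALL)

# The lazy pattern stops at the nearest </debug>: one match per block.
lazy_matches = lazy.findall(log)

# The greedy pattern runs to the last </debug> and swallows "keep 2".
greedy_matches = greedy.findall(log)

# Remove every debug block, leaving the other lines untouched.
cleaned = lazy.sub("", log)
print(cleaned)
```

With the lazy pattern each debug block is matched separately, so only the blocks are removed; the greedy pattern would also delete everything between the first and the last block.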

Or, if you want the match to end as soon as it reaches any tag, use something like <debug>.*?</?\w+> instead.
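A small sketch of where the any-tag pattern stops, again in Python with made-up sample strings. One thing worth knowing: \w does not match a colon, so namespaced tags such as </soapenv:Body> are skipped rather than treated as the end of the match.

```python
import re

# Hypothetical samples; the tag contents are invented for illustration.
plain = "<debug>some text <string>Device.IP.Interface.</string></debug>"
namespaced = "<debug>x</soapenv:Body></debug>"

# Ends at the first opening or closing tag whose name is made of \w characters.
any_tag = re.compile(r"<debug>.*?</?\w+>", re.DOTALL)

# Stops at <string>, the first tag after <debug>.
stopped_at = any_tag.search(plain).group(0)

# </soapenv:Body> contains ":", which \w+ cannot match, so the match
# continues until </debug>.
skipped_ns = any_tag.search(namespaced).group(0)

print(stopped_at)
print(skipped_ns)
```

If the log's tags are namespaced, the character class would need to allow the colon (for example [\w:]+) for this variant to stop at them.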

B8vrede