Please help me. I need to find all log entries that contain both <field id="0" value="0200"/> and <field id="24" value="001"/>. I am using the regex <log.+?<field id="0" value="0200"/>.+?<field id="24" value="001"/>.+?</log> in Notepad++, but it does not work very well.

<log>
  <receive>
    <isomsg direction="incoming">
      <header>6008610000</header>
      <field id="0" value="0200"/>
      <field id="3" value="440000"/>
      <field id="11" value="000001"/>
      <field id="24" value="001"/>
      <field id="41" value="12345678"/>
      <field id="42" value="0000012345678"/>
    </isomsg>
  </receive>
</log>
<log>
  <receive>
    <isomsg direction="incoming">
      <header>6008610000</header>
      <field id="0" value="0300"/>
      <field id="3" value="440000"/>
      <field id="11" value="000002"/>
      <field id="24" value="002"/>
      <field id="41" value="12345678"/>
      <field id="42" value="0000012345678"/>
    </isomsg>
  </receive>
</log>
<log>
  <receive>
    <isomsg direction="incoming">
      <header>6008610000</header>
      <field id="0" value="0200"/>
      <field id="3" value="440000"/>
      <field id="11" value="000002"/>
      <field id="24" value="001"/>
      <field id="41" value="12345678"/>
      <field id="42" value="0000012345678"/>
    </isomsg>
  </receive>
</log>
<log>
  <receive>
    <isomsg direction="incoming">
      <header>6008610000</header>
      <field id="0" value="0200"/>
      <field id="3" value="440000"/>
      <field id="11" value="000002"/>
      <field id="24" value="004"/>
      <field id="41" value="12345678"/>
      <field id="42" value="0000012345678"/>
    </isomsg>
  </receive>
</log>
danieljph
  • It's bad juju to regex match XML. It doesn't work very well, because XML is a contextual thing, where regex isn't. See: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Sobrique Mar 08 '16 at 10:55

2 Answers

First of all, you should use an XML parser for this purpose.
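As a rough illustration of the parser route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The function name find_matching_logs and the synthetic <logs> wrapper are assumptions made for this example, not something from the question; the wrapper is only there because the log has several top-level <log> elements, and it assumes the file has no XML declaration.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the log from the question.
SAMPLE = """\
<log>
  <receive>
    <isomsg direction="incoming">
      <field id="0" value="0200"/>
      <field id="24" value="001"/>
    </isomsg>
  </receive>
</log>
<log>
  <receive>
    <isomsg direction="incoming">
      <field id="0" value="0300"/>
      <field id="24" value="002"/>
    </isomsg>
  </receive>
</log>
"""


def find_matching_logs(text, wanted):
    """Return the <log> elements whose fields include every id/value pair in `wanted`."""
    # Several top-level <log> elements are not well-formed XML on their own,
    # so wrap them in a synthetic root element (hypothetical name) first.
    root = ET.fromstring("<logs>" + text + "</logs>")
    matches = []
    for log in root.iter("log"):
        # Map each field id to its value for this log entry.
        fields = {f.get("id"): f.get("value") for f in log.iter("field")}
        if all(fields.get(k) == v for k, v in wanted.items()):
            matches.append(log)
    return matches


matching_logs = find_matching_logs(SAMPLE, {"0": "0200", "24": "001"})
for log in matching_logs:
    print(ET.tostring(log, encoding="unicode"))
```

Because the fields are compared by id, their order inside <isomsg> does not matter here.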

If you really want to do this with a regex, keep in mind that:

  • the . wildcard does not match newline characters by default, so .+ stops at the end of a line. Try [\s\S]+ instead,
  • the required fields might appear in any order,
  • your regex shouldn't match across more than one <log></log> block.
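The three points above can be combined in one pattern. The following is only a sketch, written in Python so it can be tested; the names INSIDE and LOG_PATTERN are made up for this example. It uses [\s\S] to cross line breaks, a negative lookahead so a match can never run past </log>, and two lookaheads so the fields may come in either order. Other regex engines with lookahead support should accept a similar pattern, but that is an assumption to verify in your own tool.

```python
import re

# Any run of characters that does not contain "</log>", so the match
# stays inside a single <log> block. [\s\S] also matches newlines.
INSIDE = r"(?:(?!</log>)[\s\S])*?"

LOG_PATTERN = re.compile(
    "<log>"
    # Both lookaheads start at the same position, so field order is irrelevant.
    + "(?=" + INSIDE + '<field id="0" value="0200"/>)'
    + "(?=" + INSIDE + '<field id="24" value="001"/>)'
    + INSIDE
    + "</log>"
)

# Shortened sample: blocks 1 and 3 qualify, blocks 2 and 4 do not.
SAMPLE = """\
<log>
  <field id="0" value="0200"/>
  <field id="24" value="001"/>
</log>
<log>
  <field id="0" value="0300"/>
  <field id="24" value="001"/>
</log>
<log>
  <field id="24" value="001"/>
  <field id="0" value="0200"/>
</log>
<log>
  <field id="0" value="0200"/>
  <field id="24" value="004"/>
</log>
"""

matches = LOG_PATTERN.findall(SAMPLE)
for block in matches:
    print(block)
    print("-----")
```

The regex from the question fails on this kind of input because a lazy .+? is still allowed to skip over a </log> and pick up the second field from a later block.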

Edit:

If regex is the only option, and you just want to find all occurrences of the mentioned fields, you can try some ugly regex, like

((<field id="0" value="0200".*)\s*((<field.*)*\s*)*?(<field id="24" value="001".*))+?|((<field id="24" value="001".*)\s*((<field.*)*\s*)*?(<field id="0" value="0200".*))+?

which means: match text that starts with <field id="0" value="0200", then has only <field.* lines, and ends with <field id="24" value="001".*, or the same thing with the two fields in the opposite order.

kolejnik

Notepad++ does not support multiline regex.

anon