
I have XML text containing <pskc:KeyPackage> tags. Each of these tags may contain various child tags, one of which may be <pskc:IssueNo>1</pskc:IssueNo>. Using a regular expression, I want to select every <pskc:KeyPackage> tag that contains a <pskc:IssueNo>1</pskc:IssueNo> tag. How can I accomplish this?

Please see the regex at the following link: https://regex101.com/r/7HICeu/2

This is my sample input:

<pskc:KeyPackage>  

 <testTag>val1</testTag>

  <pskc:IssueNo>1</pskc:IssueNo>

  <testTag2>val2</testTag2>

</pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>2</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>3</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>1</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>2</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>3</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>1</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>2</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>3</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>1</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>2</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>3</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>1</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>2</pskc:IssueNo>

  </pskc:KeyPackage>
  <pskc:KeyPackage>

      <pskc:IssueNo>3</pskc:IssueNo>

  </pskc:KeyPackage>

I want to match the following tag:

<pskc:KeyPackage>  

 <testTag>val1</testTag>

  <pskc:IssueNo>1</pskc:IssueNo>

  <testTag2>val2</testTag2>

</pskc:KeyPackage>

I also want to match each tag that looks like this:

  <pskc:KeyPackage>

      <pskc:IssueNo>1</pskc:IssueNo>

  </pskc:KeyPackage>

To repeat, a <pskc:KeyPackage> tag may contain many different tags, as in the following example:

<pskc:KeyPackage>  

 <testTag>val1</testTag>

  <pskc:IssueNo>1</pskc:IssueNo>

  <testTag2>val2</testTag2>

</pskc:KeyPackage>

I want to match the whole <pskc:KeyPackage> tag only if it contains <pskc:IssueNo>1</pskc:IssueNo>. How can I accomplish this?

P.S. I have tried many different regexes; one of them is <pskc:KeyPackage>[\s\S]*<pskc:IssueNo>1<\/pskc:IssueNo>[\s\S]*<pskc:KeyPackage>, but it matches the whole XML string.

Thank you

Enlico
Tornike Shavishvili

1 Answer


This works:

/<pskc:KeyPackage>((?!<\/pskc:KeyPackage>).)*<pskc:IssueNo>1<\/pskc:IssueNo>.*?<\/pskc:KeyPackage>/gs

(I don't know all regex flavors, but it looks like this works for Perl, JS, and Python.)

How it works:

  • it matches <pskc:KeyPackage>,
  • followed by any number (the first *) of characters (the first .), newlines included (the s flag), each consumed only at a position where </pskc:KeyPackage> does not start (the negative lookahead (?!…)), so the match cannot run past the end of the current package,
  • followed by <pskc:IssueNo>1</pskc:IssueNo>.
  • Then it matches up to the closest (.*?) closing </pskc:KeyPackage>.
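As a rough check of the steps above, here is a minimal Python sketch. The sample text is shortened from the question, the variable names are made up for illustration, and re.DOTALL is assumed to stand in for the s flag. Python patterns are not delimited by slashes, so the \/ escapes are dropped.

```python
import re

# Shortened sample from the question: three packages, two of them with IssueNo 1.
xml = """<pskc:KeyPackage>
 <testTag>val1</testTag>
  <pskc:IssueNo>1</pskc:IssueNo>
  <testTag2>val2</testTag2>
</pskc:KeyPackage>
  <pskc:KeyPackage>
      <pskc:IssueNo>2</pskc:IssueNo>
  </pskc:KeyPackage>
  <pskc:KeyPackage>
      <pskc:IssueNo>1</pskc:IssueNo>
  </pskc:KeyPackage>
"""

pattern = re.compile(
    r"<pskc:KeyPackage>"
    r"((?!</pskc:KeyPackage>).)*"      # any characters, never stepping over a closing tag
    r"<pskc:IssueNo>1</pskc:IssueNo>"
    r".*?</pskc:KeyPackage>",           # lazily up to the nearest closing tag
    re.DOTALL,                          # "." also matches newlines, like the s flag
)

# group(0) is the whole matched package, regardless of the inner group.
matches = [m.group(0) for m in pattern.finditer(xml)]

for package in matches:
    print(package)
    print("-----")
```

Only the first and third packages should be printed; the package with IssueNo 2 is skipped because the lookahead stops the search at its own closing tag.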

Two other details:

  • forward slashes (/) need to be escaped as \/ when the regex itself is delimited by slashes, as above,
  • depending on your application, you might want to make the first group non-capturing (change the first ( to (?:), since its sole purpose is to let you apply the first * to it.
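One case where the capturing group matters, sketched in Python (the variable names are illustrative): re.findall returns the contents of the group rather than the whole match whenever the pattern contains exactly one capturing group.

```python
import re

xml = "<pskc:KeyPackage>\n  <pskc:IssueNo>1</pskc:IssueNo>\n</pskc:KeyPackage>"

# Same regex twice; the only difference is ( versus (?: for the first group.
capturing = (
    r"<pskc:KeyPackage>((?!</pskc:KeyPackage>).)*"
    r"<pskc:IssueNo>1</pskc:IssueNo>.*?</pskc:KeyPackage>"
)
non_capturing = (
    r"<pskc:KeyPackage>(?:(?!</pskc:KeyPackage>).)*"
    r"<pskc:IssueNo>1</pskc:IssueNo>.*?</pskc:KeyPackage>"
)

# With one capturing group, findall yields what the group captured,
# which here is at most a single character, not the package.
with_group = re.findall(capturing, xml, re.DOTALL)

# With no capturing group, findall yields the full matched text.
without_group = re.findall(non_capturing, xml, re.DOTALL)

print(with_group)
print(without_group)
```

With re.finditer and m.group(0), both variants give the whole package, so the change mainly matters for APIs that expose groups directly.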
Enlico