0

Each message in my XML file starts with a header node, <Fund ...>, and ends with a footer node, </Fund>. Every message contains an ID such as "33969871" (see the script below), and I wrote the following to retrieve the message associated with a given ID.

Given an ID, the script (run from bash) should find the line containing that ID, go back up to the start of the message (i.e. <Fund), then down to the end of the message (i.e. </Fund>), and print that whole XML block.

Input file

<Fund LastUpdate="2017-05-23T10:32:53.563000000">   
<ID>13779321</ID>    
</Fund>    
<Fund LastUpdate="2017-05-23T10:32:53.563000000">    
<ID>13779322</ID>    
</Fund>    
<Fund LastUpdate="2017-05-23T10:32:53.563000000">    
<ID>13779323</ID>    
</Fund>    

My awk command

/usr/xpg4/bin/awk '/\<Fund/{flag=1;found=j=0; delete a}
  flag{a[++j]=$0}                            /'33969781'/ && flag{found=1}        
       /\<\/Fund>/{flag=0                      # Ending pattern & found show our array
               if(found){for (i=1;i<=j;i++){
                          print a[i]}}}' ABC_866.xml

But I do not get the results.

  • [Don't use regex to parse context sensitive languages](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags). Use an XML parser instead, like xmlstarlet. – Aserre Jan 09 '18 at 14:09
  • What should be the final output? – RomanPerekhrest Jan 09 '18 at 14:41
  • Your XML tags were missing in this question, because you didn't use the preview window prior to submitting it. Please always preview questions and ensure they are actually readable before publishing - this will save volunteers from needing to repair your question. – halfer Jan 12 '18 at 21:58

3 Answers

1

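You could use the xpath command-line tool. Use double quotes around the ID inside the expression, so that the single quotes around the whole expression are not closed early by the shell:

xpath -q -e '//Fund/ID[text()="13779321"]/..' test.xml 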

prints

<Fund LastUpdate="2017-05-23T10:32:53.563000000">   
  <ID>13779321</ID>    
</Fund>

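for this input (note the single <root> element wrapping the Fund elements, which a well-formed XML document needs):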

<root>
  <Fund LastUpdate="2017-05-23T10:32:53.563000000">   
   <ID>13779321</ID>    
  </Fund>    
  <Fund LastUpdate="2017-05-23T10:32:53.563000000">    
    <ID>13779322</ID>    
   </Fund>    
  <Fund LastUpdate="2017-05-23T10:32:53.563000000">    
    <ID>13779323</ID>    
  </Fund>  
</root>
jschnasse
0

If every Fund element spans exactly three lines, with the ID on the middle line as in the sample, you can do it with a single grep command:

ABC_866.xml:

<Fund LastUpdate="2017-05-23T10:32:53.563000000">   
<ID>13779321</ID>    
</Fund>    
<Fund LastUpdate="2017-05-23T10:32:53.563000000">    
<ID>13779322</ID>    
</Fund>    
<Fund LastUpdate="2017-05-23T10:32:53.563000000">    
<ID>13779323</ID>    
</Fund>    

Grep command and output:

# grep -B 1 -A 1 13779322 ABC_866.xml
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779322</ID>
</Fund>

Explanation of the options:

-B N : print N lines of context before each matching line

-A N : print N lines of context after each matching line
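
A slightly stricter variant of the same idea, as a sketch only: it matches the whole <ID> element as a fixed string, so an ID that merely contains the same digits is not picked up. The function name fund_by_id and the temporary sample file are made up for illustration, and the three-lines-per-element assumption still applies.

```shell
#!/bin/sh
# Hypothetical helper: print the line before and after the matching <ID> line.
# -F treats the pattern as a fixed string; -C 1 is shorthand for -B 1 -A 1.
fund_by_id() {
    grep -F -C 1 "<ID>$1</ID>" "$2"
}

# Sample data copied from the question, written to a temporary file.
grep_sample=$(mktemp)
cat > "$grep_sample" <<'EOF'
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779321</ID>
</Fund>
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779322</ID>
</Fund>
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779323</ID>
</Fund>
EOF

fund_by_id 13779322 "$grep_sample"
```

If a Fund element ever grows beyond three lines, the fixed context counts stop lining up with the element boundaries, so this only suits input laid out like the sample.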

AFAbyss
0

With gawk's support for a multi-character RS, and assuming the files are formatted as shown. RT holds the text that matched RS, so printing $0 RT puts the closing tag back:

$ awk -v RS='</Fund>' '/13779321/{print $0 RT}' file

<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779321</ID>
</Fund>
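
For an awk without multi-character RS or RT (such as the /usr/xpg4/bin/awk used in the question), the buffer-and-flag approach from the question can be sketched as follows. This is an illustration under assumptions, not a drop-in fix: extract_fund is a made-up name, the sample file is temporary, and each tag is assumed to sit on its own line as in the sample input.

```shell
#!/bin/sh
# Hypothetical helper: print every <Fund>...</Fund> block whose <ID> equals $1.
extract_fund() {
    awk -v id="$1" '
        /<Fund[ >]/ { buf = ""; inside = 1; found = 0 }        # opening tag: reset state
        inside      { buf = buf $0 "\n" }                       # collect the current block
        inside && index($0, "<ID>" id "</ID>") { found = 1 }    # exact ID element seen
        /<\/Fund>/  { if (inside && found) printf "%s", buf; inside = 0 }
    ' "$2"
}

# Sample data copied from the question, written to a temporary file.
awk_sample=$(mktemp)
cat > "$awk_sample" <<'EOF'
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779321</ID>
</Fund>
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779322</ID>
</Fund>
<Fund LastUpdate="2017-05-23T10:32:53.563000000">
<ID>13779323</ID>
</Fund>
EOF

extract_fund 13779322 "$awk_sample"
```

Passing the ID with -v avoids splicing it into the single-quoted awk program, and index() compares it as a plain string rather than as a regular expression.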
karakfa