Given this file.xml:

  <session id = 1111>

    <query text = text1 >
            <response>
                    firstresponse1
            </response>
            <response>
                    secondresponse1
            </response>
    </query>
    <query text = secondtext >
            <response>
                    !!!aresponse2!!!
            </response>
            <response>
                    !!!aresponse3!!!
            </response>
    </query>
    <query text = thirdtext>
            <response>
                    firstreponse3
            </response>
            <response>
                    secondresponese4
            </response>
    </query>
    </session>

I want to get the text of both response elements inside the query whose text attribute is secondtext.

Expected output:

!!!aresponse2!!!

!!!aresponse3!!!

What is the most efficient way to do that?

  • 3
    Bash provides no tools for parsing XML. I expect you'd need to use external tools, probably something that lets you query based on [XPath](https://en.wikipedia.org/wiki/XPath). What have you tried? – ghoti Nov 16 '17 at 15:20
  • 3
    Note that your input file is not well-formed XML: attribute values must be quoted. – choroba Nov 16 '17 at 15:22
  • 1
    Possible duplicate of [Extract XML Value in bash script](https://stackoverflow.com/questions/17333755/extract-xml-value-in-bash-script) – R4F6 Nov 16 '17 at 15:54

1 Answer

4

The right way is to use XML/HTML parsers like xmllint or xmlstarlet.

xmllint solution:

xmllint --html --xpath "//query[@text='secondtext']/response/text()" file.xml 2>/dev/null

The output:

            !!!aresponse2!!!

            !!!aresponse3!!!
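
As a comment above notes, the sample input has unquoted attribute values, which is presumably why the command relies on the lenient --html parser and discards its error messages. If the file can be rewritten with quoted attributes, the same XPath query should also work under strict XML parsing. The following is a minimal sketch under that assumption: file_quoted.xml is a hypothetical name holding a trimmed, well-formed copy of the data, and the sed step is an addition that strips the indentation shown in the output above.

```shell
# Hypothetical well-formed copy of the input: attribute values are quoted.
cat > file_quoted.xml <<'EOF'
<session id="1111">
    <query text="text1">
        <response>
            firstresponse1
        </response>
    </query>
    <query text="secondtext">
        <response>
            !!!aresponse2!!!
        </response>
        <response>
            !!!aresponse3!!!
        </response>
    </query>
</session>
EOF

# Strict XML parsing (no --html), then trim leading/trailing whitespace
# on each line and drop the lines that end up empty.
responses=$(
    xmllint --xpath "//query[@text='secondtext']/response/text()" file_quoted.xml |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
)
printf '%s\n' "$responses"
```

With well-formed input there is no need to silence stderr, so genuine parse errors stay visible.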
RomanPerekhrest