I'm having trouble getting my regex right for the use case below.

<LOB>
            <LOBStatusInfo>
                <LOB>Mobile</LOB>
                <Status>Active</Status>
            </LOBStatusInfo>
            <LOBStatusInfo>
                <LOB>Voice</LOB>
                <Status>Active</Status>
            </LOBStatusInfo>
            <LOBStatusInfo>
                <LOB>Internet</LOB>
                <Status>Disconnect</Status>
            </LOBStatusInfo>
        </LOBStatus>

In the above XML, I want to extract only the Status corresponding to the Voice LOB (which is Active).

So far, I was able to get the LOB itself, but not the corresponding status.

P.S. I'm a newbie, so please pardon me if I haven't given enough detail.

1 Answer

Don't parse XML with regex; see "Using regular expressions with HTML tags". Instead, use XPath and a proper XML parser. What is your environment and language?

Test:

Input file

<LOB>
    <LOBStatus>
        <LOBStatusInfo>
            <LOB>Mobile</LOB>
            <Status>Active</Status>
        </LOBStatusInfo>
        <LOBStatusInfo>
            <LOB>Voice</LOB>
            <Status>Active</Status>
        </LOBStatusInfo>
        <LOBStatusInfo>
            <LOB>Internet</LOB>
            <Status>Disconnect</Status>
        </LOBStatusInfo>
    </LOBStatus>
</LOB>

Command

(just an example, now in shell, but the query can be used in any language of your choice)

xmllint --xpath '//LOB[text()="Voice"]/../Status/text()' file.xml

or

xmllint --xpath '//LOB[text()="Voice"]/following-sibling::Status/text()' file.xml

Output:

Active
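
Since the query isn't tied to the shell, here is a minimal Python sketch of the same lookup, assuming the standard-library xml.etree.ElementTree module is acceptable in your environment. ElementTree only supports a subset of XPath (no text() function, no following-sibling axis), so the sketch uses the child predicate [LOB='Voice'] rather than the exact expressions above. The names status_for and XML_DOC are made up for illustration, and the XML is inlined only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Same structure as the input file above, inlined for a self-contained sketch.
XML_DOC = """
<LOB>
    <LOBStatus>
        <LOBStatusInfo>
            <LOB>Mobile</LOB>
            <Status>Active</Status>
        </LOBStatusInfo>
        <LOBStatusInfo>
            <LOB>Voice</LOB>
            <Status>Active</Status>
        </LOBStatusInfo>
        <LOBStatusInfo>
            <LOB>Internet</LOB>
            <Status>Disconnect</Status>
        </LOBStatusInfo>
    </LOBStatus>
</LOB>
"""


def status_for(xml_text, lob_name):
    """Return the Status of the LOBStatusInfo whose LOB equals lob_name, or None.

    Hypothetical helper: lob_name is interpolated into the path as-is, so it
    must not contain a single quote.
    """
    root = ET.fromstring(xml_text.strip())
    # [LOB='...'] keeps only LOBStatusInfo elements that have a LOB child
    # with exactly that text; /Status then selects the sibling Status element.
    return root.findtext(f".//LOBStatusInfo[LOB='{lob_name}']/Status")


print(status_for(XML_DOC, "Voice"))
```

For a real file, ET.parse("file.xml").getroot() can replace ET.fromstring(...). If you need the full XPath 1.0 expressions exactly as written for xmllint, a third-party parser such as lxml is the usual choice.
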
Gilles Quénot