I have this XML file:

<?xml version="1.0" encoding="UTF-8"?>
<?mso-infoPathSolution solutionVersion="1.0.0.182" productVersion="15.0.0" PIVersion="1.0.0.0" href="http://sp01/hp/Therapy/Forms/template.xsn" name="urn:schemas-microsoft-com:office:infopath:Therapy:-myXSD-2013-03-01T10-07-30" ?>
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.3"?>
<my:myFields
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:pc="http://schemas.microsoft.com/office/infopath/2007/PartnerControls"
        xmlns:ma="http://schemas.microsoft.com/office/2009/metadata/properties/metaAttributes"
        xmlns:d="http://schemas.microsoft.com/office/infopath/2009/WSSList/dataFields"
        xmlns:q="http://schemas.microsoft.com/office/infopath/2009/WSSList/queryFields"
        xmlns:dfs="http://schemas.microsoft.com/office/infopath/2003/dataFormSolution"
        xmlns:dms="http://schemas.microsoft.com/office/2009/documentManagement/types"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2013-03-01T10:07:30"
        xmlns:xd="http://schemas.microsoft.com/office/infopath/2003"
        xml:lang="en-us">
    <my:PatientID>1</my:PatientID>
    <my:Name>name</my:Name>
    <my:Age>29.0000000000000</my:Age>
    <my:Gender>gender</my:Gender>
    <my:Date>2015-12-09</my:Date>
    <my:group1>
        <my:group2>
            <my:field1>1</my:field1>
            <my:PName>pname</my:PName>
            <my:PPrice>10000.0000000000</my:PPrice>
            <my:field11 xsi:nil="true"></my:field11>
        </my:group2>
        <my:group2>
            <my:field1>9</my:field1>
            <my:PName>pname
            </my:PName>
            <my:PPrice>10000.0000000000</my:PPrice>
            <my:field11 xsi:nil="true"></my:field11>
        </my:group2>
    </my:group1>
    <my:field4></my:field4>
    <my:field5></my:field5>
    <my:Status>false</my:Status>
    <my:Confirm>false</my:Confirm>
    <my:field6></my:field6>
    <my:field7></my:field7>
    <my:field8></my:field8>
    <my:TPrice>20000</my:TPrice>
    <my:field12></my:field12>
    <my:field13></my:field13>
    <my:insurance>1</my:insurance>
    <my:Partner>partner</my:Partner>
    <my:Doctor>doctor</my:Doctor>
</my:myFields>

I want to filter this with a regex and get only the contents of the <my:group1> tag, i.e.

<my:group1>
    <my:group2>
        <my:field1>1</my:field1>
        <my:PName>pname</my:PName>
        <my:PPrice>10000.0000000000</my:PPrice>
        <my:field11 xsi:nil="true"></my:field11>
    </my:group2>
    <my:group2>
        <my:field1>9</my:field1>
        <my:PName>pname
        </my:PName>
        <my:PPrice>10000.0000000000</my:PPrice>
        <my:field11 xsi:nil="true"></my:field11>
    </my:group2>
</my:group1>

I tried to filter it with this regex:

<my:group1>(.*\r*\n*)*<\/my:group1>

but it seems I'm going in the wrong direction. How do I filter ANY character between my keywords, including new lines?

Josh Crozier
vcmkrtchyan

2 Answers

You can use this regex (demo):

<my:group1>(.|\n|\r)*<\/my:group1>

But please, please, please use an XML parser to parse XML, not regex.
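
As a rough illustration of the parser route, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The question does not say which language is in use, so Python is an assumption, as are the helper name `extract_group1` and the trimmed-down `SAMPLE` document; only the `my` namespace URI and the element names come from the question's file.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "my" prefix in the question's document.
NS = {"my": "http://schemas.microsoft.com/office/infopath/2003/myXSD/2013-03-01T10:07:30"}

# Shortened stand-in for the question's file (assumption: same structure).
SAMPLE = """<my:myFields
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2013-03-01T10:07:30">
    <my:PatientID>1</my:PatientID>
    <my:group1>
        <my:group2>
            <my:field1>1</my:field1>
            <my:PName>pname</my:PName>
        </my:group2>
        <my:group2>
            <my:field1>9</my:field1>
            <my:PName>pname
            </my:PName>
        </my:group2>
    </my:group1>
    <my:TPrice>20000</my:TPrice>
</my:myFields>
"""


def extract_group1(xml_text):
    """Return one dict per <my:group2> found inside <my:group1> (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    group1 = root.find("my:group1", NS)
    if group1 is None:
        return []
    rows = []
    for group2 in group1.findall("my:group2", NS):
        rows.append({
            # strip() removes the stray newline inside the second <my:PName>.
            "field1": group2.findtext("my:field1", default="", namespaces=NS).strip(),
            "PName": group2.findtext("my:PName", default="", namespaces=NS).strip(),
        })
    return rows


print(extract_group1(SAMPLE))
```

The parser resolves the namespace prefix and handles newlines inside elements, so neither needs special treatment in a pattern.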

enrico.bacis

How do I filter ANY character between my keywords, including new lines?

Since the . character does not match newline characters by default, you can use the s flag so that it matches all characters, including newlines (example):

/<my:group1>(.*)<\/my:group1>/s
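
To make the effect of the flag concrete, here is a small sketch in Python, where the s flag is spelled `re.DOTALL`. Python is an assumption here (the question does not name a language), and `SAMPLE` is a made-up, shortened input rather than the question's full file.

```python
import re

# Shortened, hypothetical input with newlines between the two keywords.
SAMPLE = (
    "<root>\n"
    "<my:group1>\n"
    "  <my:group2>a</my:group2>\n"
    "</my:group1>\n"
    "<my:Status>false</my:Status>\n"
    "</root>"
)

PATTERN = r"<my:group1>(.*)</my:group1>"

# Without the flag, "." stops at the first newline, so nothing matches.
without_flag = re.search(PATTERN, SAMPLE)

# With re.DOTALL, "." also matches newlines and the block is captured.
with_flag = re.search(PATTERN, SAMPLE, re.DOTALL)

print(without_flag)
print(repr(with_flag.group(1)))
```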

Alternatively, you can use a character class that combines whitespace characters (\s) and non-whitespace characters (\S), which together match every character (example):

<my:group1>([\s\S]*)<\/my:group1>
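
One thing worth checking with either pattern is greediness. The question's file contains a single <my:group1> block, so the greedy `*` is fine there, but if an input had several blocks, a greedy match would run to the last closing tag. The sketch below (Python is an assumption, and `SAMPLE` is a made-up input with two blocks) compares the greedy form with the lazy `*?` form.

```python
import re

# Hypothetical input with two separate <my:group1> blocks.
SAMPLE = (
    "<my:group1>\nfirst\n</my:group1>\n"
    "<my:other>x</my:other>\n"
    "<my:group1>\nsecond\n</my:group1>\n"
)

# Greedy: one match that spans from the first opening tag to the last closing tag.
greedy = re.findall(r"<my:group1>([\s\S]*)</my:group1>", SAMPLE)

# Lazy: each match stops at the nearest closing tag.
lazy = re.findall(r"<my:group1>([\s\S]*?)</my:group1>", SAMPLE)

print(len(greedy))
print(lazy)
```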
Josh Crozier