
I have an XML document like the one below:

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
    xmlns="http://com/uhg/uht/uhtSoapMsg_V1"
    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
    <env:Header>
        <uhtHeader
            xmlns="http://com/uhg/uht/uhtHeader_V1">
            <consumer>COMET</consumer>
            <auditId></auditId>
            <sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
            <environment>P</environment>
            <businessService version="24">getClaimHistory</businessService>
            <status>success</status>
        </uhtHeader>
    </env:Header>
    <env:Body>
        <srvcRspn
            xmlns="http://com/uhg/uht/getClaimHistory_V24">
            <srvcErrList arrayType="srvcErrOccur[1]" type="Array">
                <srvcErrOccur>
                    <orig>Foundation</orig>
                    <rtnCd>00</rtnCd>
                    <explCd>000</explCd>
                    <desc></desc>
                </srvcErrOccur>
            </srvcErrList>
        </srvcRspn>
    </env:Body>
</env:Envelope>

I want to blank out every attribute value that contains "http" (that is, the namespace URIs), like this:

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
    xmlns=""
    xmlns:env="">
    <env:Header>
        <uhtHeader
            xmlns="">
            <consumer>COMET</consumer>
            <auditId></auditId>
            <sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
            <environment>P</environment>
            <businessService version="24">getClaimHistory</businessService>
            <status>success</status>
        </uhtHeader>
    </env:Header>
    <env:Body>
        <srvcRspn
            xmlns="">
            <srvcErrList arrayType="srvcErrOccur[1]" type="Array">
                <srvcErrOccur>
                    <orig>Foundation</orig>
                    <rtnCd>00</rtnCd>
                    <explCd>000</explCd>
                    <desc></desc>
                </srvcErrOccur>
            </srvcErrList>
        </srvcRspn>
    </env:Body>
</env:Envelope>

I have tried several approaches, but none of them worked. Can anyone suggest the fastest way to do this in VB.NET or C#?

The actual response is very large (at least about 100,000 lines of XML), and visiting every node with a For Each loop takes a long time. Is there a parsing method or a LINQ query that can do this faster?

Meghdut Saha
  • Both VB.NET and C# have exactly the same speed. The language used to access .net API means nothing to speed. – Cleptus Sep 04 '20 at 14:46
  • I cannot use For Each to traverse and check each node; what I actually need is a faster way to do it. – Meghdut Saha Sep 04 '20 at 14:47
  • You do not need to remove XML attributes, you are trying to get rid of the XML namespaces. – Cleptus Sep 04 '20 at 14:51
  • Does this answer your question? [How to remove all namespaces from XML with C#?](https://stackoverflow.com/questions/987135/how-to-remove-all-namespaces-from-xml-with-c) – Cleptus Sep 04 '20 at 14:51
  • @Cleptus Thanks for redirecting me to the correct place. It worked! – Meghdut Saha Sep 04 '20 at 15:02
  • Why are you deleting a namespace? Do you realize that this way you get a completely different xml? For example, `Meghdut Saha` without the namespace will be `Meghdut`. And it may well be two different people. – Alexander Petrov Sep 04 '20 at 15:30
  • I am deleting the namespaces only in the copy shown to end users, not in the actual response. End users need only the values, so I am hiding the namespaces for security purposes. – Meghdut Saha Sep 04 '20 at 16:18

1 Answer

I found a way to do it with a regular expression (VB.NET):

Return Regex.Replace(xmlDoc, "((?<=<|<\/)|(?<= ))[A-Za-z0-9]+:| xmlns(:[A-Za-z0-9]+)?="".*?""", "")

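It removes the xmlns declarations and the element prefixes entirely rather than leaving empty values, which serves my purpose completely. Thanks, Cleptus, for the quick reference.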
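
For comparison, here is a minimal C# sketch of the parser-based route suggested in the comments, using LINQ to XML instead of a regular expression. The class and method names (`NamespaceStripper`, `StripNamespaces`) are made up for illustration and do not come from the linked question. It assumes the input is well-formed XML and that no element carries two attributes that share a local name under different prefixes (that case would throw). Whether it beats the regex on a 100,000-line response is not something I have measured, so treat it as a starting point to benchmark.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceStripper
{
    // Hypothetical helper: returns the XML with all namespace
    // declarations and prefixes removed.
    public static string StripNamespaces(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Descendants() on an XDocument includes the root element.
        foreach (XElement element in doc.Descendants())
        {
            // Drop xmlns="..." and xmlns:prefix="..." declarations.
            element.Attributes()
                   .Where(a => a.IsNamespaceDeclaration)
                   .Remove();

            // Keep only the local part of the element name.
            element.Name = element.Name.LocalName;

            // Re-create the remaining attributes without a namespace.
            // ToList() takes a snapshot before the attributes are replaced.
            var plainAttributes = element.Attributes()
                .Select(a => new XAttribute(a.Name.LocalName, a.Value))
                .ToList();
            element.ReplaceAttributes(plainAttributes);
        }

        // Note: XDocument.ToString() does not emit the <?xml ...?> declaration.
        return doc.ToString();
    }
}
```

Unlike the regex, this only touches real element and attribute names, so text content that happens to contain something like ` word:` is left alone. The trade-off is that the whole document is loaded into memory, and malformed input (for example a closing tag whose case does not match its opening tag) makes `XDocument.Parse` throw instead of passing through.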

Meghdut Saha