1

I have the following XML string:

<row>
  <id>
    <old></old>
    <new>2151</new>
  </id>
  <name>
    <old></old>
    <new>test</new>
  </name>
  <amount>
    <old></old>
    <new>62</new>
  </amount>
</row>

I need to parse and replace the string so that the end result will have the format:

columnname|oldvalue|newvalue^

Therefore the above example would look like this:

id||2151^name||test^amount||62^

The XML string will always contain one row node. The nodes within it (ig. id, name, amount, etc) will vary and it can range for a couple of nodes to over 100 nodes. The structure will however be always the same.

Is is possible to do it in SQL server directly or do I need to use a CLR function in order to use regular expressions?

The function must be generic because the column names will be different.

Ivan-Mark Debono
  • 15,500
  • 29
  • 132
  • 263
  • What if you have another row? It will be attached to first row or will separate? How many different nodes (id, name, amount) will be? – Darka Mar 17 '15 at 06:58
  • @Darka Edited my question to answer your questions. – Ivan-Mark Debono Mar 17 '15 at 06:59
  • Sql server vs clr ?? It depend upon real scenario.We can't say by just looking at 1 small sample data .If no.of column and no . of rows are less and not very frequently use then we can use below query. – KumarHarsh Mar 17 '15 at 09:11

3 Answers3

1

You can use local-name() to retrieve a node's name:

select  col1.value('local-name(.)', 'varchar(max)') + '|' +
        col1.value('(./old)[1]', 'varchar(max)') + '|' +
        col1.value('(./new)[1]', 'varchar(max)') + '^'
from    @x.nodes('/row/*') as tbl(col1)

-->
id||2151^
name||test^
amount||62^

Example at SQL Fiddle.

Andomar
  • 232,371
  • 49
  • 380
  • 404
  • select ','+ col1.value('local-name(.)', 'varchar(max)') + ' || ' + col1.value('(./new)[1]', 'varchar(max)') + ' ^ ' from @xml.nodes('/row/*') as tbl(col1) for xml path('') – KumarHarsh Mar 17 '15 at 09:09
0

This should do, if the pattern of the XML is just same and not going to change.

string checkStr1 = @"<row>
    <id>
    <old></old>
    <new>2151</new>
    </id>
    <name>
    <old></old>
    <new>test</new>
    </name>
    <amount>
    <old></old>
    <new>62</new>
    </amount>
</row>";
Regex r1 = new Regex(@"\<([a-z]+)\>(?:\t|\r|\n|\s)*\<old\>(.*?)\<\/old\>(?:\t|\r|\n|\s)*\<new\>(.*?)\<\/new\>(?:\t|\r|\n|\s)*</[a-z]+?>");
MessageBox.Show(r1.Replace(checkStr1, m => "^" + m.Groups[1].Value + "|" + m.Groups[2].Value + "|" + m.Groups[3].Value));

And finally replace the "row" opening and closing tags...

Hope this helps...

Vinod Kumar
  • 981
  • 1
  • 6
  • 14
  • It is said, "If all you have is a hammer, everything looks like a nail." There are better ways to parse XML than [regular expressions](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Andomar Mar 17 '15 at 08:06
  • Hey Andomar, you are right about better ways for handling XML than regex. I used regex since Ivan or Darka was OK with regex.. Anyway thanks.. – Vinod Kumar Mar 17 '15 at 09:03
0

Try this,

Declare @xml xml='<row>
  <id>
    <old></old>
    <new>2151</new>
  </id>
  <name>
    <old></old>
    <new>test</new>
  </name>
  <amount>
    <old></old>
    <new>62</new>
  </amount>
</row>'


select 'id||'+  xmlData.Col.value('(./id)[1]', 'varchar(MAX)')+'^' +
'name||'+ xmlData.Col.value('(./name)[1]', 'varchar(MAX)')+'^' +
'amount||'+ xmlData.Col.value('(./amount)[1]', 'varchar(MAX)')+'^'
from  @xml.nodes('//row') xmlData(Col)
KumarHarsh
  • 5,046
  • 1
  • 18
  • 22