3

I have got fields like this:

UPDATE</transactionType><column><name>prio</name><newValue>5</newValue><oldValue>1</oldValue><newValue>aaa<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>
UPDATE</transactionType><column><name>prio</name><newValue>51</newValue><oldValue>11</oldValue><newValue>bbb<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>

and I am trying get extract text after first <newValue> from the left side. It will be either one or two numbers/letters. Also, at the same time I want to get first<oldValue> looking from left. SO the results are:

newValue oldValue
5        1
51       11
Kalenji
  • 401
  • 2
  • 19
  • 42
  • 1
    MySQL 8.0 has [REGEXP_SUBSTR()](https://dev.mysql.com/doc/refman/8.0/en/regexp.html#function_regexp-substr) but if you use 5.5, you better just fetch the whole string into your app and parse it there. – Bill Karwin Jan 17 '19 at 17:10
  • `aaa` and `bbb` don't have closing tag. Check [12.11 XML Functions](https://dev.mysql.com/doc/refman/5.5/en/xml-functions.html). Try [db-fiddle](https://www.db-fiddle.com/f/2asFey8mZy4obL1G3SwvD6/0). – wchiquito Jan 17 '19 at 21:47
  • @Kalenji is it onlya part of an XML or you have exacly what you paste here? – dgebert Jan 31 '19 at 21:02

4 Answers4

2

Since it's an incomplete XML, let's use simple string functions.

LOCATE can find a position of a sub-string.

LEFT gets a sub-string from the start till a position.

And from that sub-string the SUBSTRING_INDEX function is handy to get the characters after the final tag.

Example code:

-- test table
drop table if exists YourTable;
create table YourTable (col varchar(1000));

-- Sample data
insert into YourTable (col) values
('UPDATE</transactionType><column><name>prio</name><newValue>5</newValue><oldValue>1</oldValue><newValue>aaa<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>'),
('UPDATE</transactionType><column> <name>prio</name><newValue>51</newValue><oldValue>11</oldValue><newValue>bbb<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>');

-- Query
SELECT 
 SUBSTRING_INDEX(LEFT(col, LOCATE('</oldValue>', col)-1),'>',-1) AS oldValue,
 SUBSTRING_INDEX(LEFT(col, LOCATE('</newValue>', col)-1),'>',-1) AS newValue
FROM YourTable;

Result:

oldValue    newValue
1           5
11          51

A test on rextester here

Side-note:

In MySql 8 you could also use REGEXP_SUBSTR for this.

SELECT  
REGEXP_SUBSTR(col,'(?<=<oldValue>)[^<>]*(?=</oldValue)',1,1) as oldValue,
REGEXP_SUBSTR(col,'(?<=<newValue>)[^<>]*(?=</newValue>)',1,1) as newValue
FROM YourTable;

A test on db<>fiddle here

(But be silent about it. Some would frown upon you for using regex to parse XML. F.e. here.
But then again, an invalid XML isn't really an XML)

LukStorms
  • 28,916
  • 5
  • 31
  • 45
1

As I undersand you paste only a part of your xml field. If it is a valid xml, you can use an xpath expression by function ExtractValue

I will provide you a simple example:

Data definition

create table Test(id integer, title varchar(2000));
insert into Test(id, title) values(1, "<a><b>X</b><b>Y</b></a>");

Query

select ExtractValue(title, '/a/b[1]') from Test;

This query returns the first element b inside tag a (note the "1 in the query). The result in this case is X.

So in your case you can use two ExtractValue functions in single query to select the first newValue tag and the first oldValue tag.

Link: ExtractValue function in MySQL Documentation

dgebert
  • 1,235
  • 1
  • 13
  • 30
  • 1
    Potential problem: "NULL is returned if xml_frag contains elements which are not properly nested or closed, and a warning is generated..." – Rick James Jan 30 '19 at 19:53
1

Since it is not proper XML, let's try some low-level string tools.

mysql> SELECT SUBSTRING_INDEX(
                  SUBSTRING_INDEX(
'UPDATE</transactionType><column><name>prio</name><newValue>51</newValue><oldValue>11</oldValue><newValue>bbb<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>',
                                  '</newValue>', 1),
                              '<newValue>', -1) AS x;
+----+
| x  |
+----+
| 51 |
+----+
1 row in set (0.00 sec)

Explanation:

  • The Inner SUBSTRING_INDEX gets the substring through the first </newValue>.
  • The outer SUBSTRING_INDEX gets the text after the first <newValue>

Give it a try on other strings.

This should work for any MySQL version at least going back a decade.

<oldvalue> should work the same way, and can be a second 'column' in the SELECT.

Rick James
  • 135,179
  • 13
  • 127
  • 222
1

SUBSTRING_INDEX function is useful in some contexts. For "well formed" values that contain the specified tags, we can get a useable result. But this approach falls apart (returns potentially unexpected results) when the values aren't well formed.

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(  t.foo  ,'</oldValue>',1),'<oldValue>',-1) AS first_oldValue
     , SUBSTRING_INDEX(SUBSTRING_INDEX(  t.foo  ,'</newValue>',1),'<newValue>',-1) AS first_newValue
  FROM ( SELECT 'UPDATE</transactionType><column><name>prio</name><newValue>5</newValue><oldValue>1</oldValue><newValue>aaa<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>' AS foo 
         UNION ALL 
         SELECT 'UPDATE</transactionType><column><name>prio</name><newValue>51</newValue><oldValue>11</oldValue><newValue>bbb<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>'
       ) t
 WHERE t.foo LIKE '%<oldValue>%</oldValue>%'
   AND t.foo LIKE '%<newValue>%</newValue>%'

We can incorporate the integrity checks into the expressions in the SELECT list

SELECT CASE WHEN t.foo LIKE '%<oldValue>%</oldValue>%' THEN 
         SUBSTRING_INDEX(SUBSTRING_INDEX( t.foo ,'</oldValue>',1),'<oldValue>',-1)
       END AS first_oldValue
     , CASE WHEN t.foo LIKE '%<newValue>%</newValue>%' THEN
         SUBSTRING_INDEX(SUBSTRING_INDEX( t.foo ,'</newValue>',1),'<newValue>',-1) 
       END AS first_newValue
  FROM ( SELECT 'UPDATE</transactionType><column><name>prio</name><newValue>5</newValue><oldValue>1</oldValue><newValue>aaa<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>' AS foo 
         UNION ALL 
         SELECT 'UPDATE</transactionType><column><name>prio</name><newValue>51</newValue><oldValue>11</oldValue><newValue>bbb<oldValue>10863321</oldValue></column></row></table></businessObjectChanges>'
       ) t
spencer7593
  • 106,611
  • 15
  • 112
  • 140