5

Some external data vendor wants to give me a data field - pipe delimited string value, which I find quite difficult to deal with.

Without help from an application programming language, is there a way to transform the string value into rows?

There is a difficulty however, the field has unknown number of delimited elements.

DB engine in question is MySQL.

For example:

Input: Tuple(1, "a|b|c")

Output:

Tuple(1, "a")
Tuple(1, "b")
Tuple(1, "c")
dreftymac
  • 31,404
  • 26
  • 119
  • 182
  • possible duplicate of [Can Mysql Split a column?](http://stackoverflow.com/questions/1096679/can-mysql-split-a-column) – Yuriy Galanter Aug 19 '13 at 00:06
  • @YuriyGalanter Not a duplicate. My question asks for data transform, not data extraction. –  Aug 19 '13 at 00:07
  • 2
    Same thing - split string. Insert results into table, it's a common task in many SQL dialects. Just Google "MySQL Split string" – Yuriy Galanter Aug 19 '13 at 00:09
  • Look, it certainly won't be hard if number of delimited values are known beforehand. But in my case, the number is unknown. –  Aug 19 '13 at 00:12
  • 1
    Number of items *doesn't matter*. Take a look for example at http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/ mentioned in that SO answer. It takes string with *arbitrary* number of elements. And that's just one example, if it doesn't suit you - there're many others. E.g. you may need a function that returns a table from which you can make a SELECT to do INSERT of split data into your own table. – Yuriy Galanter Aug 19 '13 at 00:15
  • thank you @YuriyGalanter, I just thought it through, indeed the number of delimited elements does not matter, I could use a loop to deal with it. –  Aug 19 '13 at 00:29
  • **See also:** https://stackoverflow.com/a/56675141/42223 – dreftymac Jun 19 '19 at 23:12
  • **See also:** https://stackoverflow.com/questions/2938427/converting-delimited-string-to-multiple-values-in-mysql – dreftymac Jun 19 '19 at 23:13

3 Answers3

4

It may not be as difficult as I initially thought.

This is a general approach:

  1. Count number of occurrences of the delimiter length(val) - length(replace(val, '|', ''))
  2. Loop a number of times, each time grab a new delimited value and insert the value to a second table.
4

Use this function by Federico Cargnelutti:

 CREATE FUNCTION SPLIT_STR(
 x VARCHAR(255),
 delim VARCHAR(12),
 pos INT
 )
   RETURNS VARCHAR(255)
   RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
delim, '');

Usage

 SELECT SPLIT_STR(string, delimiter, position)

you will need a loop to solve your problem.

Charaf JRA
  • 8,249
  • 1
  • 34
  • 44
2

Although your issue is probably long time resolved, I was looking for a solution to the very same problem you had. I solved it with the help of a procedure referenced here with slight adaptions to serve multi-byte characters (such as the German Umlauts) in the string by using CHAR_LENGTH() instead of LENGTH().

DELIMITER $$
    CREATE FUNCTION SPLIT_STRING(val TEXT, delim VARCHAR(12), pos INT) RETURNS TEXT
    BEGIN
        DECLARE output TEXT;
        SET output = REPLACE(SUBSTRING(SUBSTRING_INDEX(val, delim, pos), CHAR_LENGTH(SUBSTRING_INDEX(val, delim, pos - 1)) + 1), delim, '');
        IF output = '' THEN
            SET output = null;
        END IF;
        RETURN output;
    END $$

    CREATE PROCEDURE TRANSFER_CELL()
    BEGIN
        DECLARE i INTEGER;
        SET i = 1;
        REPEAT
            INSERT INTO NewTuple (id, value)
            SELECT id, SPLIT_STRING(value, '|', i)
            FROM Tuple
            WHERE SPLIT_STRING(value, '|', i) IS NOT NULL;
            SET i = i + 1;
        UNTIL ROW_COUNT() = 0
        END REPEAT;
    END $$
DELIMITER ;

CALL TRANSFER_CELL() ;

DROP FUNCTION SPLIT_STRING ;
DROP PROCEDURE TRANSFER_CELL ;
Matthias Hagemann
  • 1,328
  • 1
  • 12
  • 13