1

I need one help on splitting multiple values from multiple columns into another column. Below is an example

CREATE TABLE split
(
    `Col_1`             VARCHAR(120),
    `Col_2`                 VARCHAR(50),
    `Col_3`                VARCHAR(20),
    `Col_4`                 VARCHAR(50)
);

Insert into split (Col_1,Col_2,Col_3,Col_4) values ('ABC','1','10',null);
Insert into split (Col_1,Col_2,Col_3,Col_4) values ('DEF','2,3','30,40',null);
Insert into split (Col_1,Col_2,Col_3,Col_4) values ('GHI','4,5','50','500,600,700');

select * from split;

+-------+-------+-------+-------------+
| Col_1 | Col_2 | Col_3 | Col_4       |
+-------+-------+-------+-------------+
| ABC   | 1     | 10    | NULL        |
| DEF   | 2,3   | 30,40 | NULL        |
| GHI   | 4,5   | 50    | 500,600,700 |
+-------+-------+-------+-------------+

I am no expert in this, but have been playing around and have managed to split only col_2 into multiple rows as below:

SELECT
  Col_1,Col_2,Col_3,Col_4,
  SUBSTRING_INDEX(SUBSTRING_INDEX(split.Col_2, ',', numbers.n), ',', -1) Col_2_NEW,
  SUBSTRING_INDEX(SUBSTRING_INDEX(split.Col_3, ',', numbers.n), ',', -1) Col_3_NEW,
  SUBSTRING_INDEX(SUBSTRING_INDEX(split.Col_4, ',', numbers.n), ',', -1) Col_4_NEW
FROM
  (SELECT 1 n UNION ALL SELECT 2
   UNION ALL SELECT 3 UNION ALL SELECT 4) numbers INNER JOIN split
  ON CHAR_LENGTH(split.Col_2) - CHAR_LENGTH(REPLACE(split.Col_2, ',', ''))>=numbers.n-1
ORDER BY Col_2, n;

+-------+-------+-------+-------------+-----------+-----------+-----------+
| Col_1 | Col_2 | Col_3 | Col_4       | Col_2_NEW | Col_3_NEW | Col_4_NEW |
+-------+-------+-------+-------------+-----------+-----------+-----------+
| ABC   | 1     | 10    | NULL        | 1         | 10        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 2         | 30        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 3         | 40        | NULL      |
| GHI   | 4,5   | 50    | 500,600,700 | 4         | 50        | 500       |
| GHI   | 4,5   | 50    | 500,600,700 | 5         | 50        | 600       |
+-------+-------+-------+-------------+-----------+-----------+-----------+

However, I would like to split, col_3 and col_4 into new as well, so it gives me below output.

+-------+-------+-------+-------------+-----------+-----------+-----------+
| Col_1 | Col_2 | Col_3 | Col_4       | Col_2_NEW | Col_3_NEW | Col_4_NEW |
+-------+-------+-------+-------------+-----------+-----------+-----------+
| ABC   | 1     | 10    | NULL        | 1         | 10        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 2         | 30        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 2         | 40        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 3         | 30        | NULL      |
| DEF   | 2,3   | 30,40 | NULL        | 3         | 40        | NULL      |
| GHI   | 4,5   | 50    | 500,600,700 | 4         | 50        | 500       |
| GHI   | 4,5   | 50    | 500,600,700 | 4         | 50        | 600       |
| GHI   | 4,5   | 50    | 500,600,700 | 4         | 50        | 700       |
| GHI   | 4,5   | 50    | 500,600,700 | 5         | 50        | 500       |
| GHI   | 4,5   | 50    | 500,600,700 | 5         | 50        | 600       |
| GHI   | 4,5   | 50    | 500,600,700 | 5         | 50        | 700       |
+-------+-------+-------+-------------+-----------+-----------+-----------+

I have searched all over and so far, they are splitting the row into one column only and have not been able to find problem something similar to mine. Maybe some join is missing or some union, I don't know as I am not good at queries. Can anyone help me here? without asking me to read guide or handbooks :-)

Thanks in advance

VikiT
  • 78
  • 7

2 Answers2

2

You can try one of the the recommendation in this thread here. Something along the lines of

 SELECT s.[Col_1], Split.a.value('.', 'VARCHAR(100)') AS String  
 FROM  (SELECT [Col_1], 
        CAST ('<M>' + REPLACE([Col_2], ',', '</M><M>') + '</M>' AS XML) AS String  
        FROM  split) AS s
 CROSS APPLY String.nodes ('/M') AS Split(a);

iterated over your columns should work fine.

Edit: Didn't see that this is MySQL, sorry. See below for a working solution.

The following code should work for the first two columns.

1.) Create table:

CREATE TABLE split(
    `Col_1`    VARCHAR(120),
    `Col_2`    VARCHAR(50),
    `Col_3`    VARCHAR(20),
    `Col_4`    VARCHAR(50)
);

INSERT INTO split (Col_1,Col_2,Col_3,Col_4) values ('ABC','1','10',null);
INSERT INTO split (Col_1,Col_2,Col_3,Col_4) values ('DEF','2,3','30,40',null);
INSERT INTO split (Col_1,Col_2,Col_3,Col_4) values ('GHI','4,5','50','500,600,700');

which leads to

SELECT * FROM split;

Col_1       Col_2       Col_3       Col_4
ABC         1           10          (null)
DEF         2,3         30,40       (null)
GHI         4,5         50          500,600,700

2.) Split the strings in Col_2:

SELECT
  split.Col_1,
  SUBSTRING_INDEX(SUBSTRING_INDEX(split.Col_2, ',', numbers.n), ',', -1) Col_2,
  Col_3,
  Col_4
FROM
  (select 1 n UNION ALL
  select 2 UNION ALL select 3 UNION ALL
  select 4 UNION ALL select 5) numbers INNER JOIN split
  ON CHAR_LENGTH(split.Col_2)
 -CHAR_LENGTH(REPLACE(split.Col_2, ',', ''))>=numbers.n-1
ORDER BY Col_1, Col_2;

3.) Result:

Col_1       Col_2       Col_3       Col_4
ABC         1           10          (null)
DEF         2           30,40       (null)
DEF         3           30,40       (null)
GHI         4           50          500,600,700
GHI         5           50          500,600,700

Here is a SQL fiddle with the above code: http://sqlfiddle.com/#!9/948fcb/4. You should be able to iterate from there. Just comment this post if you need more guidance.

Important caveat: This works with up to 5 comma-separated strings in one column.

Solution is inspired by fthiella's answer to SQL split values to multiple rows.

Roman
  • 4,744
  • 2
  • 16
  • 58
  • Thank you Mihai. Didn't have coffee, yet — changed the solution to MySQL. – Roman Sep 04 '18 at 15:18
  • Thank you @RomanAbashin . I followed the exacmple from fthiella example itself. But the query you gave me is only giving me split till Col_3, How do I split Col_4. This, I was able to acheive with the original query, I gave, but not able to split one more row. The final output I want is in my original questions. Thanks again. – VikiT Sep 07 '18 at 09:35
0

Just wanted to answer with final query, which is giving me desired output as required. Posting it here, as maybe it will help others.

SELECT distinct Col_1,Col_2,Col_3,Col_4,
 SUBSTRING_INDEX(SUBSTRING_INDEX(t.Col_2, ',', n.n), ',', -1) Col_2_New,
 SUBSTRING_INDEX(SUBSTRING_INDEX(t.Col_3, ',', n1.n), ',', -1) Col_3_New,
 SUBSTRING_INDEX(SUBSTRING_INDEX(t.Col_4, ',', n2.n), ',', -1) Col_4_New
  FROM split t CROSS JOIN 
  (
   SELECT a.N + b.N * 10 + 1 n
     FROM 
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) b
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) c
    ORDER BY n 
   ) n,
  (
   SELECT a.N + b.N * 10 + 1 n
     FROM 
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) b
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) c
    ORDER BY n 
   ) n1,
  (
   SELECT a.N + b.N * 10 + 1 n
     FROM 
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) b
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 ) c
    ORDER BY n 
   ) n2
WHERE coalesce(n.n,'0') <= 1 + (LENGTH(coalesce(t.Col_2,'0')) - LENGTH(REPLACE(coalesce(t.Col_2,'0'), ',', '')))
 AND coalesce(n1.n,'0') <= 1 + (LENGTH(coalesce(t.Col_3,'0')) - LENGTH(REPLACE(coalesce(t.Col_3,'0'), ',', '')))
 AND coalesce(n2.n,'0') <= 1 + (LENGTH(coalesce(t.Col_4,'0')) - LENGTH(REPLACE(coalesce(t.Col_4,'0'), ',', '')))
 ORDER BY 1,2,3,4,5,6,7 ;
VikiT
  • 78
  • 7