0

I have below column with values and I want to split values into multiple column values

First 4 character represents Location Code Second 3 character represents Cost Code Last 6 character represents Account Code

COLUMN A
---------------
AA12.F07.123456
XX34 SA8 676868
YY13_SS3_798798
HJ88.657769
.898798

Expected Output:

ColA    ColB    ColC
---------------------
AA12    F07     123456
XX34    SA8     676868
YY13    SS3     798798
HJ88    NULL    657769
NULL    NULL    898798

The last before row don't have Cost Code which is 3 digit The Last row don't have Location Code which is 4 digit

Can you please tell me how to achieve this?

  • What is this for? It probably is going to be easier to do in your logic tier... It also would have been a good idea to store this in the second format to begin with – NullUserException Oct 08 '18 at 20:27
  • DBMS is SQL Server – Danny Danny Oct 08 '18 at 20:34
  • In SQL Server, I've have a table with column having the above values and I want to split them into three columns – Danny Danny Oct 08 '18 at 20:35
  • Possible duplicate of [How to split a single column values to multiple column values?](https://stackoverflow.com/questions/5123585/how-to-split-a-single-column-values-to-multiple-column-values) – Aura Oct 08 '18 at 20:39
  • 2
    Given a table, `#t`, with a column `x`, this gets you pretty close (but there will likely be other edge cases like the 4th row that won't work) : `;WITH t AS (SELECT x = REPLACE(REPLACE(x,' ','.'),'_','.') FROM #t) SELECT PARSENAME(x,3),PARSENAME(x,2),PARSENAME(x,1) FROM t;` – Aaron Bertrand Oct 08 '18 at 20:55
  • I would do something like `SELECT a, b, c FROM Input CROSS APPLY parse_input(input_text) out`, where `parse_input` returns a row with columns `a`, `b`, and `c`. – plalx Oct 08 '18 at 21:08
  • 1
    Could there be a row like `.A4B.` or perhaps one like `H459_`? Are space dot and underscore the only valid delimiters? – ATC Oct 08 '18 at 21:36
  • @AaronBertrand `PARSENAME` was my first (and likely the most performant) thought too but it will not work here. Each string would require 2 "delimiters" for your solution to work. – Alan Burstein Oct 09 '18 at 01:27
  • @AlanBurstein well you could easily add a delimiter in cases where there is only one, all depends on where you want to sacrifice readability I guess. – Aaron Bertrand Oct 09 '18 at 01:47
  • So many comments I'm surprised nobody mentioned what a bad data structure you have in the first place... Read [Is storing a delimited list in a database column really that bad?](http://stackoverflow.com/questions/3653462/is-storing-a-delimited-list-in-a-database-column-really-that-bad), where you will see a lot of reasons why the answer to this question is **Absolutely yes!** – Zohar Peled Oct 09 '18 at 03:48

3 Answers3

1

Just use substring() and judicious use of pattern matching:

select (case when a like '[^_. ][^_. ][^_. ][^_. ][_. ]%' then left(a, 4) 
        end) as col1,
       (case when a like  '[^_. ][^_. ][^_. ][^_. ][_. ][^_. ][^_. ][^_. ][^_. ][_. ]%' then substring(a, 6, 3) 
        end) as col2,
       (case when a like '[^_. ][^_. ][^_. ][^_. ][^_. ][^_. ]' then right(a, 6) end) as col3
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

Based on your examples, the following should get you the desired results...

IF OBJECT_ID('tempdb..#TestData', 'U') IS NULL 
BEGIN   -- DROP TABLE #TestData;
    CREATE TABLE #TestData (
        ColumnA VARCHAR(17) NOT NULL 
        );

    INSERT #TestData(ColumnA) VALUES
        ('AA12.F07.123456'),
        ('XX34 SA8 676868'),
        ('YY13_SS3_798798'),
        ('HJ88.657769'),
        ('.898798');
END;

--SELECT * FROM #TestData td;

--==========================================

SELECT 
    td  .ColumnA,
    ca.ColA,
    cb.ColB,
    cc.Colc
FROM
    #TestData td
    CROSS APPLY ( VALUES (CASE WHEN td.ColumnA LIKE '[0-Z][0-Z][0-Z][0-Z][^0-Z]%' OR td.ColumnA LIKE '[0-Z][0-Z][0-Z][0-Z]' THEN SUBSTRING(td.ColumnA, 1, 4) END) ) ca (ColA)
    CROSS APPLY ( VALUES (CASE WHEN td.ColumnA LIKE '%[^0-Z][0-Z][0-Z][0-Z][^0-Z]%' THEN SUBSTRING(td.ColumnA, 6, 3) END) ) cb (ColB)
    CROSS APPLY ( VALUES (CASE WHEN td.ColumnA LIKE '%[0-Z][0-Z][0-Z][0-Z][0-Z][0-Z]' THEN RIGHT(td.ColumnA, 6) END) ) cc (Colc);

Results:

ColumnA           ColA ColB Colc
----------------- ---- ---- ------
AA12.F07.123456   AA12 F07  123456
XX34 SA8 676868   XX34 SA8  676868
YY13_SS3_798798   YY13 SS3  798798
HJ88.657769       HJ88 NULL 657769
.898798           NULL NULL 898798
Jason A. Long
  • 4,382
  • 1
  • 12
  • 17
1

What Jason Long posted is the way I would do it. Building off his answer, I simplified this to one CROSS APPLY and no CASE statements:

SELECT t.ColumnA,
       colA = SUBSTRING(t.ColumnA, col.a, 4), 
       colB = SUBSTRING(t.ColumnA, col.b, 3),
       colC = SUBSTRING(t.ColumnA, col.c, 6)
FROM   #TestData AS t
CROSS APPLY (VALUES(
  NULLIF(PATINDEX('[A-Z][A-Z][0-9][0-9][^A-Z0-9]%', t.ColumnA),0),
  NULLIF(PATINDEX('%[^A-Z0-9][A-Z][A-Z0-9][A-Z0-9][^A-Z0-9]%', t.ColumnA),0)+1,
  NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9]', t.ColumnA),0))) AS col(a,b,c);
Alan Burstein
  • 7,770
  • 1
  • 15
  • 18