-1

I have prepared this demo which I would like to use string_split or other functions in SQL server possibly in a case statement to achieve an output like this:

Writer   
Sabao Fulano, Sapato Feio, Jose Perreira
Toze Jose
Feijao Mauricio

EDIT

The difference between this question and the one suggested in the comments is that here, there isn't only comma to be split but also an "XX" that comes with every name in this fashion:

  1. “Sabao Fulano, XX Some_Random_Text” to become “Sabao Fulano”

  2. “Sapato Feio, XX, Jose Perreira, XX” to become “Sapato Feio, Jose Perreira”

Basically, if there is XX without a comma next to it, then leave only the first instance, say "Sabao Fulano", or if there is XX + "," then remove the two of them (second case above). The idea is to leave only names separated by a comma.

tavalendo
  • 857
  • 2
  • 11
  • 30
  • 2
    I don't follow your question. If your output is CSV data, then why would `STRING_SPLIT` be involved? By the way, storing CSV data in your SQL tables is a generally bad idea. If you find yourself using `STRING_SPLIT` often, it probably implies bad table design. – Tim Biegeleisen Aug 22 '18 at 11:09
  • Possible duplicate of [T-SQL split string](https://stackoverflow.com/questions/10914576/t-sql-split-string) – Ivan Starostin Aug 22 '18 at 11:15
  • `STRING_SPLIT` is a table-function, not a scalar function. You can't use it within a `CASE` expression (not statement). I really don't understand what your goal is here. – Thom A Aug 22 '18 at 11:44
  • The goal is to have just names and not XX. So XX should be filtered out. – tavalendo Aug 22 '18 at 11:50
  • Not XX or not the element that start with XX?!? I don't see "Article 3" in the requested result. It seems like you do not want any element that start with XX. If this is not the case, then simply use REPLACE to remove the ', XX' – Ronen Ariely Aug 22 '18 at 12:08
  • 1
    @Larnu there is not limitation to use STRING_SPLIT or table-function inside a CASE statement. I am not sure what you meant by this sentence "`You can't use it within a CASE expression`". Just to clarify, here is simple code: `select CASE WHEN (SELECT COUNT(*) FROM STRING_SPLIT('a,b',',') t) > 1 then 1 ELSE 0 END`. I am pretty sure that this is not what you meant ;-) – Ronen Ariely Aug 22 '18 at 13:17

2 Answers2

1

Good day,

The requirement is not fully clear since in the description you say "The goal is to have just names and not XX." but in the requested result you don't have the text "Article 3" which exists in the source.

If you simply want to remove the ", XX" then we can use simple REPLACE like bellow:

WITH WriterTbl AS (
    SELECT 'Sabao Fulano, XX, Sapato Feio, XX, Jose Perreira, XX' AS Writer UNION ALL
    SELECT 'Toze Jose, XX' UNION ALL
    SELECT 'Feijao Mauricio, XX Article 3'
)
select REPLACE(WriterTbl.Writer,', XX',', ')
from WriterTbl

But if you actually want to remove the parts that start with ", XX" and you requirement is to use STRING_SPLIT as you asked in the original question (assuming you will not change the question), then we can use STRING_SPLIT to find the parts of the text we need to remove and than we can use the function STRING_AGG in order to concatenate the text again

There is one important point which you must remember and I am pretty sure that most people do not think about it: STRING_SPLIT does not GUARANTEED the order of the result, which means that by splitting the text and re-concatenate it, you might get different order

According to these limitations and your requirements, please check this solution:

WITH WriterTbl AS (
    SELECT 'Sabao Fulano, XX, Sapato Feio, XX, Jose Perreira, XX' AS Writer UNION ALL
    SELECT 'Toze Jose, XX' UNION ALL
    SELECT 'Feijao Mauricio, XX Article 3'
),
MyCTE AS(
    select MyGroup = ROW_NUMBER() OVER (order by (select null)), t1.Writer
    from WriterTbl t1
)
SELECT STRING_AGG(t2.[value], ',') 
FROM MyCTE t1
CROSS APPLY (
    SELECT * from STRING_SPLIT (t1.Writer, ',') t2
) t2
where not t2.[value] like ' XX%'
group by MyGroup

Note! this task is NOT recommended, and I highly recommend you to re-think about your requirements and your architecture!

Sub-Note! Even if you stay with the need to get this result from this input, then you should probably create your own function and not use the built-in function STRING_SPLIT. There is no reason to split the data in order to concatenate it back. In your User Function you can simple remove the the text that start with ", XX" and close with "," or if this is the last part of the value. The solution here is according to your requirement to use STRING_SPLIT

Ronen Ariely
  • 2,336
  • 12
  • 21
0

Try this query:

select case when lastComma < firstXX then substring(writer, 1, lastComma - 1)
            else replace(replace(writer, 'XX', ''), ', ,',',') end
from (
    select *,
           charindex('XX', writer) firstXX,
           len(writer) - charindex(',', reverse(writer)) + 1 lastComma
    from WriterTbl
) a

Based on last index of a comma and first occurence of XX it's decided in outer query if you want to remove last entry or just replace XX (and additionally I added replacing double comma to one).

Michał Turczyn
  • 32,028
  • 14
  • 47
  • 69
  • Hi Michał, (1) As much as I understand (I hope I am mistaken), the OP basic requirement was to use string_split "`I would like to use string_split`". This might be a School related requirement for example, which in this case we need to follow even if this is not the best solution (as I wrote in my answer). (2) Your solution needed some adjustment for example for this input: `"'a, XX, b, XX, c, XX'"` (returns comma at the end). Or for this input: "`'a, XX b c, XX, d'`" (returns the "b c"), and so on... – Ronen Ariely Aug 22 '18 at 13:07
  • Was not able to run it. But I took @RonenAriely's answer as acceptable. Thanks for the effort and time. – tavalendo Aug 23 '18 at 09:37