Extracting somewhat reliably placed substrings from a longer string in SQL

Question

So I have a dataset that looks roughly like this:

01/02/2022, "bananas, apples and oranges", "accepted", "John"   
01/05/2022, "bananas and grapes", "accepted", "Mary"  
01/08/2022, "kale and bananas", "rejected", "John"

I need to transform it into:

01/02/2022, "bananas", "accepted", "John"
01/02/2022, "apples", "accepted", "John"
01/02/2022, "oranges", "accepted", "John"
01/05/2022, "bananas, "accepted", "Mary"

etc.

My problem is extracting the substrings for the types of fruit. The pattern is reliable in that the last delimiter is always " and ", and if there is another one before it, it's always ", ". So I can use CHARINDEX to find the position of the start of each word. But I struggle with finding the position of the end of the word. I could probably do it with a ton of CASE WHENs but is there an easier way?

(I asked this question earlier but I did a bad job of describing the problem then.)

While asking a question, you need to provide a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example): (1) DDL and sample data population, i.e. CREATE table(s) plus INSERT T-SQL statements. (2) What you need to do, i.e. logic and your code attempt implementation of it in T-SQL. (3) Desired output, based on the sample data in the #1 above. (4) Your SQL Server version (SELECT @@version;). — Yitzhak Khabinsky, Feb 08 '22 at 17:00
Are we looking at 2 columns here or 4? Include the actual table schema / columns to make it clear. Are double quotes part of the data?? — Stu, Feb 08 '22 at 17:00
If I understand correctly, you want to split a delimited string into multiple rows. There are many questions in SO related to this, such as this [this one](https://stackoverflow.com/questions/57245618/sql-server-split-delimited-string-into-rows). — Aziz, Feb 08 '22 at 17:04
I'd suggest you use a proper format such as JSON or XML, it would be far easier to parse in SQL. But if that's what you have then that's what you have. — Charlieface, Feb 08 '22 at 20:08

Aaron Bertrand · Answer 1 · 2022-02-08T17:10:26.643

Given this data:

CREATE TABLE dbo.whatever
(
  d date, 
  fruits varchar(255), 
  status varchar(12), 
  name   varchar(32)
 );

INSERT dbo.whatever VALUES
('20220102', 'bananas, apples and oranges', 'accepted', 'John'),   
('20220105', 'bananas and grapes', 'accepted', 'Mary'),
('20220108', 'kale and bananas', 'rejected', 'John');

You want to split strings but based on two delimiters. So before you split them, you can just replace one with the other:

SELECT w.d, fruit = LTRIM(f.value), w.status, w.name
FROM dbo.whatever AS w 
CROSS APPLY 
STRING_SPLIT(REPLACE(w.fruits, ' and ', ', '), ',') AS f;

Results:

d	fruit	status	name
2022-01-02	bananas	accepted	John
2022-01-02	apples	accepted	John
2022-01-02	oranges	accepted	John
2022-01-05	bananas	accepted	Mary
2022-01-05	grapes	accepted	Mary
2022-01-08	kale	rejected	John
2022-01-08	bananas	rejected	John

Example db<>fiddle

Oh. That actually looks like a great idea. Thank you! – bookhouseboy Feb 08 '22 at 17:09 — bookhouseboy, Feb 08 '22 at 17:09

Extracting somewhat reliably placed substrings from a longer string in SQL

1 Answers1