0

I have a large text file that will be loaded into a temp table. The file will have sample text like below,

We the People of the United States@ in Order to form a more perfect Union@ establish Justice@ insure domestic Tranquility@ provide for the common defense@ promote the general Welfare@ and secure the Blessings of Liberty to ourselves and our Posterity@ do ordain and establish this Constitution for the United States of America@

I want to extract the text that appears before each "@" symbol and insert each extracted line into separate records in a temp table.

CREATE TABLE #Test
(
    SerialNumber VARCHAR(10),
    Text         VARCHAR(MAX)
);   

After extracting the string the text should be inserted into the temp table.

enter image description here

So far, this is the code I can think of, Need help to extract the remaining text and insert it into the temp table.

DECLARE @test nvarchar(max)

SET @test = 'We the People of the United States@  in Order to form a more perfect Union@ establish 
             Justice@ insure domestic Tranquility@ provide for the common defense@ promote the general 
             Welfare@ and secure the Blessings of Liberty to ourselves and our Posterity@ do ordain and 
              establish this Constitution for the United States of America@'

SELECT 
    '1' AS SerialNumber, 
    LEFT(@test, CHARINDEX('@', @test) - 2) AS Text;

Thanks in advance!

Dan Guzman
  • 43,250
  • 3
  • 46
  • 71
  • Sounds like a simple _split_ function. https://stackoverflow.com/a/42848348/1690217 is a good option for this but this dup has a lot more detail: https://stackoverflow.com/q/2647/1690217 – Chris Schaller Feb 21 '21 at 04:45
  • 1
    This answer has a lot of good advice: https://stackoverflow.com/a/19935646/1690217 basically split on @ and you're good to go, if you are using SQL Sever 2016 then its built in, for everything else have a look at XML versions – Chris Schaller Feb 21 '21 at 04:48
  • @Charlieface STRING_SPLIT() not available in 2012? – dkritz Feb 21 '21 at 05:37

1 Answers1

0

First, I would make the first column an integer and not a string. Presumably, the purpose is to capture the ordering of the results.

Second, this is a little tricky. If you trust that string_split() returns substrings in order -- which it does in practice but is not documented to always do -- then you can use:

SELECT IDENTITY(int) as SerialNumber, s.value as text
INTO test1
FROM STRING_SPLIT(@test, '@') s;

The problem is that this includes a final row that is empty -- because of the final @. We need to get rid of that. If no other rows are empty, then the simplest method is to add:

WHERE s.value <> ''

That works in this case, but a general solution is more complicated.

If you don't want to trust the built-in ordering, then you can add an order by:

ORDER BY CHARINDEX('@' + s.value + '@', '@' + @test + '@')

This assumes that there are no duplicate lines in the results. Once again, that is true for this problem, but not in general.

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786