0

I have a field in a sql table but I need to parse it via charindex, but the lttle caveat is, I don't know how many pieces there are.

The field data would look like the following:

(Image: "filename=a.jpg"), (Image: "filename=b.jpg") 

But the question I'm not sure how many filenames there will be in this string, so i need to dynamically build this out this could be 1 or this could be 100.

Any suggestions?

Thanks

Shnugo
  • 66,100
  • 9
  • 53
  • 114
Peter
  • 19
  • 2

3 Answers3

0

Since you cannot know in advance how many values you will extract from each value, I would suggest to represent the results as records, not columns.

If you are using SQL Server 2016 or higher, you can use function STRING_SPLIT() to turn CSV parts to records. Then, SUBSTRING() and CHARINDEX() can be used to extract the relevant information:

declare @t table ([txt] varchar(200)) insert into @t VALUES ('(Image: "filename=a.jpg"),(Image: "filename=b.jpg")')

SELECT value, SUBSTRING(
    value, 
    CHARINDEX('=', value) + 1, 
    LEN(value) - CHARINDEX('=', value) - 2
)
FROM @t t 
CROSS APPLY STRING_SPLIT(t.txt , ',') 

Demo on DB Fiddle:

DECLARE @t table ([txt] varchar(200))
INSERT INTO @t VALUES ('(Image: "filename=a.jpg"),(Image: "filename=b.jpg")')

SELECT value, SUBSTRING(
    value, 
    CHARINDEX('=', value) + 1, 
    LEN(value) - CHARINDEX('=', value) - 2
)
FROM @t t 
CROSS APPLY STRING_SPLIT(t.txt , ',') 
GO
value                     | (No column name)
:------------------------ | :---------------
(Image: "filename=a.jpg") | a.jpg           
(Image: "filename=b.jpg") | b.jpg           

NB : this assumes that the value to extract is always located after the first equal sign and until 2 characters before the end of string. If the pattern is different, you may need to adapt the SUBSTRING()/CHARINDEX() calls.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • The OP uses v2012, so no `STRING_SPLIT()` available... Another flaw would be the fact, that `STRING_SPLIT()` does not guarantee to return the element's order correctly. This would go against *Pick the element by its position* (if needed)... [This answer](https://stackoverflow.com/a/38275075/5089204) provides a secure and fast approach using `OPENJSON` instead (but this needs v2016 too). – Shnugo Mar 05 '19 at 08:11
0

The real issue is: This is breaking 1.NF. You should never ever store more than one piece of data in one cell. Such CSV-formats are a pain in the neck and you really should use a related side table to store your image hints one by one.

Nevertheless, this can be handled:

--A mockup table

DECLARE @mockup TABLE(ID INT IDENTITY,YourString VARCHAR(1000));
INSERT INTO @mockup VALUES
 ('(Image: "filename=a.jpg"), (Image: "filename=b.jpg") ')
,('(Image: "filename=aa.jpg"), (Image: "filename=bb.jpg"), (Image: "filename=cc.jpg"), (Image: "filename=dd.jpg"), (Image: "filename=ee.jpg")');

--Pick one element by its position:

DECLARE @position INT=2;

SELECT CAST('<x>' + REPLACE(t.YourString,',','</x><x>') + '</x>' AS XML)
       .value('/x[position()=sql:variable("@position")][1]','nvarchar(max)')
FROM @mockup t;

The trick is, to transform the string to XML and use XQuery to fetch the needed element by its position. The intermediate XML looks like this:

<x>(Image: "filename=a.jpg")</x>
<x> (Image: "filename=b.jpg") </x>

You can use some more replacements and L/RTRIM() to get it cleaner.

Read table data

And if you want to create a clean side table and you need all data neatly separated, you can use a bit more of the same:

SELECT CAST('<x><y><z>' 
           + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                    t.YourString,'(','') --no opening paranthesis
                                ,')','') --no closing paranthesis
                                ,'"','') --no quotes
                                ,' ','') --no blanks
                                ,'=','</z><z>')                --Split at "="
                                ,':','</z></y><y><z>')         --Split at ":"
                                ,',','</z></y></x><x><y><z>')  --Split at ","
                    + '</z></y></x>' AS XML)
FROM @mockup t;

This returns

<x>
  <y>
    <z>Image</z>
  </y>
  <y>
    <z>filename</z>
    <z>a.jpg</z>
  </y>
</x>
<x>
  <y>
    <z>Image</z>
  </y>
  <y>
    <z>filename</z>
    <z>b.jpg</z>
  </y>
</x>

And with this you would get a clean EAV-table (

WITH Casted AS
(
    SELECT ID
          ,CAST('<x><y><z>' 
               + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                        t.YourString,'(','')
                                    ,')','')
                                    ,'"','')
                                    ,' ','')
                                    ,'=','</z><z>')
                                    ,':','</z></y><y><z>')
                                    ,',','</z></y></x><x><y><z>') 
                        + '</z></y></x>' AS XML) AS CastedToXml
    FROM @mockup t
)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ID
      ,ID AS oldId
      ,eachElement.value('y[1]/z[1]','varchar(max)') AS DataType
      ,eachElement.value('y[2]/z[1]','varchar(max)') AS ContentType
      ,eachElement.value('y[2]/z[2]','varchar(max)') AS Content
FROM Casted
CROSS APPLY CastedToXml.nodes('/x') A(eachElement)

The result

+----+-------+----------+-------------+---------+
| ID | oldId | DataType | ContentType | Content |
+----+-------+----------+-------------+---------+
| 1  | 1     | Image    | filename    | a.jpg   |
+----+-------+----------+-------------+---------+
| 2  | 1     | Image    | filename    | b.jpg   |
+----+-------+----------+-------------+---------+
| 3  | 2     | Image    | filename    | aa.jpg  |
+----+-------+----------+-------------+---------+
| 4  | 2     | Image    | filename    | bb.jpg  |
+----+-------+----------+-------------+---------+
| 5  | 2     | Image    | filename    | cc.jpg  |
+----+-------+----------+-------------+---------+
| 6  | 2     | Image    | filename    | dd.jpg  |
+----+-------+----------+-------------+---------+
| 7  | 2     | Image    | filename    | ee.jpg  |
+----+-------+----------+-------------+---------+
Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
0

I used a table value function

ALTER FUNCTION [dbo].[Fn_sqllist_to_table](@list  AS VARCHAR(8000),
                                           @delim AS VARCHAR(10))
RETURNS @listTable TABLE(
  Position INT,
  Value    VARCHAR(8000))
AS
  BEGIN
      DECLARE @myPos INT

      SET @myPos = 1

      WHILE Charindex(@delim, @list) > 0
        BEGIN
            INSERT INTO @listTable
                        (Position,Value)
            VALUES     (@myPos,LEFT(@list, Charindex(@delim, @list) - 1))

            SET @myPos = @myPos + 1

            IF Charindex(@delim, @list) = Len(@list)
              INSERT INTO @listTable
                          (Position,Value)
              VALUES     (@myPos,'')

            SET @list = RIGHT(@list, Len(@list) - Charindex(@delim, @list))
        END

      IF Len(@list) > 0
        INSERT INTO @listTable
                    (Position,Value)
        VALUES     (@myPos,@list)

      RETURN
  END 

By calling it via

select * into #test from tableX as T
cross apply [Fn_sqllist_to_table](fieldname,'(')

and then just substringed the value into the final table

Peter
  • 19
  • 2
  • Hi Peter, this splitter function is a very bad choice - at least, if performance matters... There are much better approaches. This is a *multi-statement-TVF* using a `WHILE` loop. Two *no-no*s... – Shnugo Mar 06 '19 at 08:49