2

Simple enough question I think.

I have a dataset, quite large with a bit of free-text name data. I need to to link this to our employee table.

There's a whole set of different ways people have entered the 'owner' in to this fields (John Smith, J.Smith, John Smith (JSMITH), Company:John Smith/Client: John Smith, ect.)

Most of these are fine, but the problem I have is with the ones where multiple names have been entered. For example; "John Smith / Joe Bloggs".

I have a pre-created Table-Valued function which takes in a string and a delimiter, then returns a table with the results of the split.

dbo.Split('John Smith / Joe Bloggs')

id     val
1      John Smith
2      Joe Bloggs

The issue I have is that I need these results to come back for each row within an existing dataset. So for example, my query selecting the Owner, RefNumber and OSProjectCode fro my 'ProjectActions' table containing the following data:

RefNumber    OSProjectCode   Owner
1            1234            Bill Baggins
2            1234            John Smith / Joe Bloggs

would come out looking like this:

RefNumber    OSProjectCode   Owner
1            1234            Bill Baggins
2            1234            John Smith
2            1234            Joe Bloggs

What I've tried to far is attempt to join on the results of the function - but unsurprisingly it wont let me send in the column from ProjectsActions into the function like that.

SELECT a.val AS [Owner], pa.[RefNumber], pa.[OSProjectCode]
FROM dbo.ProjectsActions pa 
INNER JOIN dbo.Split(pa.[Owner], '/') a

Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "pa.Owner" could not be bound.

The only way I can think of doing this, which seems a little too bulky and messy, is the below:

;with base as(
SELECT
     pa.RefNumber
,    pa.OSProjectCode
,    (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 1) AS [First]
,    (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 2) AS [Second]
FROM ProjectsActions pa
)
SELECT
    a.RefNumber
,   a.OSProjectCode
,   a.First AS [Owner]
FROM base a WHERE a.First IS NOT NULL
UNION ALL
SELECT
    b.RefNumber
,   b.OSProjectCode
,   b.Second AS [Owner]
FROM base b WHERE a.First IS NOT NULL

Surely there's a better way? Something more similar to my first attempt - joining to the results within each row?

Any feedback or ideas would be much appreciated.

Cheers, Scott.

EDIT: FYI if anyone comes accross this with a similar issue, but are missing the 'split' part - I use a function found elsewhere on stackoverflow. https://stackoverflow.com/a/14600765/1700309

Community
  • 1
  • 1
Scott Allen
  • 513
  • 2
  • 13
  • [How to get multiple rows into one line as a string?](http://stackoverflow.com/questions/7958816/how-to-get-multiple-rows-into-one-line-as-a-string) – Lukasz Szozda Aug 24 '15 at 17:39

1 Answers1

2

You need to use an APPLY as your join.

SELECT
    a.val AS [Owner],
    pa.[RefNumber],
    pa.[OSProjectCode]
FROM dbo.ProjectsActions pa 
    CROSS APPLY dbo.Split(pa.[Owner], '/') a

The CROSS APPLY acts like an INNER JOIN passing the row-level value into your table-valued function. If you expect split function returns NULL if it can't split the value (NULL, empty, etc), you can use OUTER APPLY so that the NULL won't drop that row out of your result set. You can also add a COALESCE to fall back to the [owner].

SELECT
    COALESCE(a.val, pa.[Owner]) AS [Owner],
    pa.[RefNumber],
    pa.[OSProjectCode]
FROM dbo.ProjectsActions pa 
    OUTER APPLY dbo.Split(pa.[Owner], '/') a
Jason W
  • 13,026
  • 3
  • 31
  • 62
  • Interesting stuff on the Outer Apply. I was going to select from the table where the row doesn't contain a '/' then using a union all against the ones which do contain a '/' with the Cross Apply. This looks much neater though. – Scott Allen Aug 24 '15 at 18:00
  • 1
    Glad it will work for you :) It's a great tool in cases such as this. – Jason W Aug 24 '15 at 18:05