2

I am using MS SQL 2005 I have a problem, which currently I am battling to get a solution.

I have a Table, with a these columns : NameList;Time

The Namelist Column has comma delimited data in it. The table data is as such:

Namelist    Time
John Smith, Jeremy Boyle, Robert Brits, George Aldrich  5
John Smith, Peter Hanson    15
Jeremy Boyle, Robert Brits  10
....

I need some sort of SQL expression that will provide me with this end result:

Name    Total_Time
John Smith  20
Jeremy Boyle    15
Robert Brits    15

Etc...... Basically the expression must find All the names in the rows and math those names with the names in the other rows and add the times together for each user.

The Idea I have is to convert the comma delimited data into rows and count the distinct records of each then somehow know what the time for it is... then multiply..... but I have no idea on how to implement it

Any Help would be much appreciated

Thanks,

JayT
  • 107
  • 1
  • 2
  • 9
  • 5
    Solution: Normalize your data. – Ignacio Vazquez-Abrams Feb 26 '10 at 12:01
  • 2
    Duplicate of http://stackoverflow.com/questions/1766532/search-if-a-string-word-exists-between-two-different-tables-in-a-comma-delimited – Lieven Keersmaekers Feb 26 '10 at 12:15
  • @Lieven, there most likely is a duplicate question for this. However, the one in your link is NOT IT. This is not the typical "split" question, it also requires a CROSS APPLY, see my answer or this question: http://stackoverflow.com/questions/2192342/how-to-seperate-the-string-value-and-create-temp-to-store-seperate-string-in-temp/2192495#2192495 – KM. Feb 26 '10 at 12:37
  • Yet another example of in how many different stupid ways can the people in non-IT departments collect their data. Then they come to their database colleagues and want them to use this data for some marketing action or some other crap.. Solution: Forbid Excel! – ercan Feb 26 '10 at 12:47
  • There's no company that runs exclusively on fully normalized databases written by IT specialists. The reason for that is that IT projects are expensive, hard to control, and prone to failure. Executive summary: Excel pwns normalization :) – Andomar Feb 26 '10 at 13:12
  • @KM - Once he has the names in a temp table, these can be used to JOIN the temp table with his namelist using a LIKE followed by a group by to SUM the times. It perhaps is not an *exact* match but given the "duplicate", OP should be able to work it out. – Lieven Keersmaekers Feb 26 '10 at 13:18

4 Answers4

5

I prefer the number table approach to split a string in TSQL

For this method to work, you need to do this one time table setup:

SELECT TOP 10000 IDENTITY(int,1,1) AS Number
    INTO Numbers
    FROM sys.objects s1
    CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)

Once the Numbers table is set up, create this split function:

CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn  char(1)      --REQUIRED, the character to split the @List string on
    ,@List     varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN 
(

    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
        FROM (SELECT
                  LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
                  FROM (
                           SELECT @SplitOn + @List + @SplitOn AS List2
                       ) AS dt
                      INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
                  WHERE SUBSTRING(List2, number, 1) = @SplitOn
             ) dt2
        WHERE ListValue IS NOT NULL AND ListValue!=''

);
GO 

You can now easily split a CSV string into a table and join on it:

select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')

OUTPUT:

ListValue
-----------------------
1
2
3
4
5
6777

(6 row(s) affected)

Your can now use a CROSS APPLY to split every row in your table like:

DECLARE @YourTable table (NameList varchar(5000), TimeOf int)
INSERT INTO @YourTable VALUES ('John Smith, Jeremy Boyle, Robert Brits, George Aldrich',  5)
INSERT INTO @YourTable VALUES ('John Smith, Peter Hanson',    15)
INSERT INTO @YourTable VALUES ('Jeremy Boyle, Robert Brits',  10)

SELECT
    st.ListValue AS NameOf, SUM(o.TimeOf) AS TimeOf
    FROM @YourTable  o
        CROSS APPLY  dbo.FN_ListToTable(',',o.NameList) AS st
    GROUP BY st.ListValue
    ORDER BY st.ListValue

OUTPUT:

NameOf                  TimeOf     
----------------------- -----------
George Aldrich          5          
Jeremy Boyle            15         
John Smith              20         
Peter Hanson            15         
Robert Brits            15         

(5 row(s) affected)

Using this, I would recommend that you alter your table design and use this output to INSERT into a new table. That would be a more normalized approach. Also Don't use reserved words for column names, it makes it a hassle. Notice how I use "NameOf" and "TimeOf", so I avoid using reserved words.

KM.
  • 101,727
  • 34
  • 178
  • 212
1

Either: Search for other answers to fix your data on the fly, slowly, and repeatedly

Or: Normalise. Why do you think normalisation exists and why people bang on about it?

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
  • +1. For "normalisation exists and why people bang on about it?" – Guru Feb 26 '10 at 12:21
  • 1
    Include a snark about SQL injection, and you have the perfect architecture astronaut answer. But you still won't have the sum of time per user :) – Andomar Feb 26 '10 at 12:43
  • @Andomar: I couldn't be bothered to adopt one of my previous answers (or yours, or KM's) if the OP could not be bothered to search first. Have you not had enough of the same question time and time again... Now, this is different... http://stackoverflow.com/questions/2340998/stored-procedure-in-sql-server-order-by-desc/2341162#2341162 – gbn Feb 26 '10 at 12:49
  • Searching for this kind of thing is not as easy for everyone. Anyway I enjoyed answering, or I wouldn't have. (Btw, not my downvote :)) – Andomar Feb 26 '10 at 13:00
1

You could create a table-valued function to split the namelist into many rows:

if object_id('dbo.fnSplitNamelist') is not null
    drop function dbo.fnSplitNamelist
go
create function dbo.fnSplitNamelist(
    @namelist varchar(max))
returns @names table (
    name varchar(50))
as 
    begin
    declare @start int
    declare @end int
    set @start = 0
    while IsNull(@end,0) <> len(@namelist) + 1
        begin
        set @end = charindex(',', @namelist, @start)
        if @end = 0
            set @end = len(@namelist) + 1

        insert into @names select ltrim(rtrim(
            substring(@namelist,@start,@end-@start)))

        set @start = @end + 1
        end
    return
    end
go

You can use a cross apply to return the names for each namelist. Then you can use group by to sum the time per user:

declare @YourTable table (namelist varchar(1000), time int)

insert into @YourTable
select 'John Smith, Jeremy Boyle, Robert Brits, George Aldrich',  5
union all select 'John Smith, Peter Hanson',  15
union all select 'Jeremy Boyle, Robert Brits',  10

select fn.name, sum(t.time)
from @YourTable t
cross apply fnSplitNamelist(t.namelist) fn
group by fn.name

This results in:

George Aldrich      5
Jeremy Boyle        15
John Smith          20
Peter Hanson        15
Robert Brits        15
Andomar
  • 232,371
  • 49
  • 380
  • 404
0

The best option is to normalise the data. Then it would be a lot easier to work with.

The second best option would be to use a recursive query to pick a name at a time from each name list and return as a list of separate names and their respecive time from each record, then use grouping to sum the times for each name.

No need for user defined functions or pre-created tables. ;)

with NameTime ([Name], [Time], Namelist)
as (
  select cast(null as varchar(100)), [Time], Namelist
  from NamelistTime
  union all
  select
    case when Pos = 0 then NameList else substring(Namelist, 1, Pos - 1) end,
    [Time],
    case when Pos = 0 then null else substring(NameList, Pos + 2, len(Namelist) - Pos - 1) end
  from (
    select [Time], Namelist, Pos = charindex(', ', Namelist)
    from NameTime
  ) x
  where Namelist is not null
)
select [Name], sum([Time])
from NameTime
where [Name] is not null
group by [Name]

In contrast, working with normalised data, it would be as simple as:

select p.Name, sum(n.Time)
from NamelistTime n
inner join Person p on p.PersonId = n.PersonId
group by p.Name
Guffa
  • 687,336
  • 108
  • 737
  • 1,005