0

Based on the great solution provided by Sergey for my c# question here: Filter out path strings beginning with strings, i went away to craft a similar thing for T-SQL. I can refactor this to use in memory table as opposed to cursors, but i would really like to see if there is a set-based approach that i can take to achieve the same goal.

Scenario: I have paths in the db, like so: C:\Users, C:\Users\cheese, D:\Shadow\stuff ,D:\Shadow. I need to filter the paths leaving the rootmost ones only (e.g. out of the 4 above, leave only C:\Users, D:\Shadow)

This is what i have right now:

ALTER PROCEDURE [dbo].[GetPaths]
@guy NVARCHAR(MAX)
AS
DECLARE 
@tempPath NVARCHAR(MAX) = '',
@Path NVARCHAR (MAX),
@filteredPath TABLE (FilteredPath NVARCHAR(MAX))
BEGIN
    SET NOCOUNT ON;
    IF (@guy IS NOT NULL)
    BEGIN
        DECLARE C Cursor FOR
    SELECT Paths.Path 
            WHERE
                Paths.Username = @guy 
            ORDER BY Paths.Path DESC
            OPEN C
                FETCH NEXT FROM C INTO @Path 
            WHILE @@Fetch_Status=0 BEGIN
                IF (CHARINDEX(@tempPath, @Path = 0)
                BEGIN
                INSERT INTO @filteredPath(FilteredPath)
                        VALUES (@Path)
                END
                SET @tempPath = @Path 
                FETCH NEXT FROM C into @Path 
            END
            CLOSE C
            DEALLOCATE C
                SELECT * FROM @filteredPath
END
END
Community
  • 1
  • 1
zaitsman
  • 8,984
  • 6
  • 47
  • 79

3 Answers3

1

It is generally better to do things in the database as a select query rather than using cursors. I believe the following will make efficient use of an index on Paths(path):

select p.*
from Paths p
where not exists (select 1
                  from Paths p2
                  where p.path like p2.path + '%' and
                        p.path <> p2.path
                 );

(Unfortunately, I cannot test this today.)

By the way, your cursor expression is missing the from clause and that might have to do with why it doesn't work.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • this is awesome! I would up it two times if i could. This was about twice as fast as the solutions offered by others (based on the execution plan) and about 5% of the performance of what i came up with. Thank you SO much! – zaitsman Feb 20 '14 at 08:30
1

Here's a thought.

If you take the length of the path and subtract the length of the path with the back-slash removed, you only want the ones with a value of 1. Something like this:

SELECT Paths.Path 
FROM   Paths
WHERE  Paths.Username = @guy 
       And Len(Path) - Len(Replace(Path, '\', '')) = 1
ORDER BY Paths.Path DESC

This assumes that your paths are all consistently formatted. If you have root paths that also include a trailing back-slash, then this solution will not work for you. Basically, this query will only returns paths with a single back-slash.

George Mastros
  • 24,112
  • 4
  • 51
  • 59
  • This works (and it would work for me also as i control the inputs). Thank you for the time you took to answer my question. – zaitsman Feb 20 '14 at 08:23
1

Can you not simply rewrite this as a query using a CASE statement to differentiate between those paths at the 'root' level and those that have sub-directories:

SELECT  DISTINCT
        CASE (CHARINDEX('\', Path, CHARINDEX('\', Path) + 1))
          WHEN 0 THEN Path
          ELSE SUBSTRING(Path, 1, (CHARINDEX('\', Path, CHARINDEX('\', Path) + 1) - 1))
        END
  FROM  Paths
  WHERE Username = @guy
Martin
  • 16,093
  • 1
  • 29
  • 48