1

I've implemented a File Explorer in a Windows application and the file structure is stored in a T-SQL database using the hierarchyid type.

There is a requirement when copying folders to deal with name clashes in a similar way to Explorer, but not exactly the same. I have achieved this in C# with a recursive method but now I am told that it needs to be done in SQL and no overwriting of folders is allowed.

For example, I have the following folder structure in the root directory: -

  • Folder
  • Folder (2)
  • Folder (4)

Elsewhere I have the following folders which I would like to copy to the root directory: -

  • Folder
  • Folder (2)
  • Folder (3)

so...

  • Folder would become Folder (3) because that name is and 3 is the next available number.
  • Folder (2) would become Folder (2) (2) because that name is taken.
  • Folder (3) would become Folder (3) (2) because Folder (3) is now taken after our rename operation above.

I honestly believe this is a recursive problem because after picking a new name, other folders may also need to be renamed to take that into account, then further folders need to take that into account, etc...

My thoughts are a recursive CTE but I am really struggling to come up with anything. Here is the test data: -

declare @existing table
(
    folderid uniqueidentifier,
    displayname varchar(20)
)

declare @folderstocopy table
(
    folderid uniqueidentifier,
    displayname varchar(20)
)

insert @existing
values (newid(),'Folder'),(newid(),'Folder (2)'), (newid(),'Folder (4)')


insert @folderstocopy
values (newid(),'Folder'),(newid(),'Folder (2)'), (newid(),'Folder (3)')

--TODO some logic to deal with name clashes and insert from @folderstocopy into @existing

select * from @existing

Thanks in advance for your help.

  • 1
    so let me get this straight....you've attempted to duplicate the file system inside a database? Why? – Mitch Wheat Jan 25 '18 at 13:53
  • https://stackoverflow.com/questions/935098/database-structure-for-tree-data-structure – Mitch Wheat Jan 25 '18 at 13:56
  • There is nothing in that structure that lends itself to recursive manipulation besides the fact that the display name semantics you stated indicates that there could be 1..N appended numbers to indicate uniqueness. From a sql perspevctive, I would move the indicator out of the displayname and let the schema indicate the duplicity. In your example, the recursion would be in the string manipulation on the name. – Ross Bush Jan 25 '18 at 14:07
  • Ross, are you able to elaborate more...do you mean add the indicator into a separate column? How would I achieve the recursion? – Kelvin Melvin Henry Handley Jan 26 '18 at 16:21

1 Answers1

1

This is not possible without using a loop/cursor of some form. A CTE is a convenient way to walk a static hierarchy, but your problem requires a dynamic hierarchy.

In a static hierarchy there is a deterministic number of steps for each entry point, regardless of how many objects pass through that point. For example, there are two reports between myself and the CEO. If I want to calculate the length of walk for several other employees the length of my walk through the hierarchy will always be two.

In your example, the length of walk changes based on the other folders you want to copy and the order that you process them. If you want to copy 'Folder' and 'Folder (3)', then either 'Folder' or 'Folder (3)' will have a different length of walk depending on which walks the hierarchy first. After the first folder processed has walked the hierarchy then the hierarchy is changed making the hierarchy dynamic.

The below code walks the existing hierarchy for each folder, but doesn't produce the desired answer because the both 'Folder' and 'Folder (3)' end up with the same name

 declare @existing table (folderid uniqueidentifier, displayname varchar(20));

declare @folderstocopy table (folderid uniqueidentifier, displayname varchar(20));

insert @existing
values
    (newid(), 'Folder')
  , (newid(), 'Folder (2)')
  , (newid(), 'Folder (4)');

insert @folderstocopy
values
    (newid(), 'Folder')
  , (newid(), 'Folder (2)')
  , (newid(), 'Folder (3)');

--TODO some logic to deal with name clashes and insert from @folderstocopy into @existing
with cte1 as
    (
        -- anchor
        select
            a.folderid
          , a.displayname
          , b.displayname as existingmatch
          , 1 as lvl
        from
            @folderstocopy as a
        left join
            @existing as b
            on
            a.displayname = b.displayname

        -- recursive
        union all
        select
            a.folderid
          , a.displayname
          , b.displayname as existingmatch
          , a.lvl + 1
        from
            cte1 as a
        inner join
            @existing as b
            on
            a.displayname + ' (' + cast(lvl + 1 as varchar(255)) + ')' = b.displayname
    )
select
    a.folderid
  , a.displayname as originaldisplayname
  , case
        when a.MaxLvl = 1
            then
            a.displayname
        else
            a.displayname + ' (' + cast(a.MaxLvl as varchar(255)) + ')' end as newdisplayname
from
    (
        select
             cte1.folderid
           , cte1.displayname
           , max(case when existingmatch is not null then lvl + 1 else lvl end) as MaxLvl
        from cte1
        group by
             cte1.folderid
           , cte1.displayname
    ) as a;

One possible solution would be to use the a CTE inside a loop as shown below. At each iteration of the loop you could resolve conflicts by adding the lowest folder id to the hierarchy.

declare @existing table (folderid uniqueidentifier, displayname varchar(20));

declare @folderstocopy table (folderid uniqueidentifier, displayname varchar(20));

insert @existing
values
    (newid(), 'Folder')
  , (newid(), 'Folder (2)')
  , (newid(), 'Folder (4)');

insert @folderstocopy
values
    (newid(), 'Folder')
  , (newid(), 'Folder (2)')
  , (newid(), 'Folder (3)');

--TODO some logic to deal with name clashes and insert from @folderstocopy into @existing
while exists (select * from @folderstocopy)
begin;
    with cte1 as
        (
            -- anchor
            select
                a.folderid
              , a.displayname
              , b.displayname as existingmatch
              , 1 as lvl
            from
                @folderstocopy as a
            left join
                @existing as b
                on
                a.displayname = b.displayname

            -- recursive
            union all
            select
                a.folderid
              , a.displayname
              , b.displayname as existingmatch
              , a.lvl + 1
            from
                cte1 as a
            inner join
                @existing as b
                on
                a.displayname + ' (' + cast(lvl + 1 as varchar(255)) + ')' = b.displayname
        )
       , cte2 as
        (
            select
                a.folderid
              , a.displayname as originaldisplayname
              , case
                    when a.MaxLvl = 1
                        then
                        a.displayname
                    else
                        a.displayname + ' (' + cast(a.MaxLvl as varchar(255)) + ')' end as newdisplayname
            from
                (
                    select
                         cte1.folderid
                       , cte1.displayname
                       , max(case when existingmatch is not null then lvl + 1 else lvl end) as MaxLvl
                    from cte1
                    group by
                         cte1.folderid
                       , cte1.displayname
                ) as a
        )
    insert into @existing (folderid, displayname)
    select
         min(folderid) as folderid
       , a.newdisplayname
    from cte2 as a
    group by
         a.newdisplayname;

    delete from @folderstocopy where folderid in (select folderid from @existing);
end;

select * from @existing;
gomory-chvatal
  • 332
  • 1
  • 3
  • 10
  • Thanks so much - that's a brilliant answer :-) – Kelvin Melvin Henry Handley Feb 12 '18 at 15:04
  • This works in most cases, but when there is an existing folder called "Folder (2)" and I add "Folder", it becomes Folder (3) instead of keeping its name. I've had a look and can't work it out - are you able to help please? This is slightly different to copying I guess, but the name clashing is still needed for adding files. – Kelvin Melvin Henry Handley Feb 23 '18 at 16:31