I'm learning C# and I have the following task:
Sync the contents of two directories. Given the paths of two directories, dir1 and dir2, dir2 should be synchronized with dir1:

  • If a file exists in dir1 but not in dir2, it should be copied.
  • If a file exists in both dir1 and dir2 but its content has changed, the file from dir1 should overwrite the one in dir2.
  • If a file exists in dir2 but not in dir1, it should be removed.

Notes:
  • Files can be extremely large.
  • dir1 can have nested folders.

Hints:
  • Read and write files asynchronously.
  • Hash the files' contents and compare the hashes, not the contents.

I have a rough idea of the logic, but I don't know how to implement it. I searched extensively for a starting point, without success.
I started with this:

using System.Security.Cryptography;

class Program
{
    static void Main(string[] args)
    {
        string sourcePath = @"C:\Users\artio\Desktop\FASassignment\root\dir1";
        string destinationPath = @"C:\Users\artio\Desktop\FASassignment\root\dir2";

        string[] dirsInSourcePath = Directory.GetDirectories(sourcePath, "*", SearchOption.AllDirectories);
        string[] dirsInDestinationPath = Directory.GetDirectories(destinationPath, "*", SearchOption.AllDirectories);

        var filesInSourcePath = Directory.GetFiles(sourcePath, "*", SearchOption.AllDirectories);
        var filesInDestinationPath = Directory.GetFiles(destinationPath,"*",SearchOption.AllDirectories);



        //Directories in source Path
        foreach (string dir in dirsInSourcePath)
        {
            Console.WriteLine("sourcePath:{0}", dir);
            Directory.CreateDirectory(dir);
        }

        //Directories in destination path
        foreach (string dir in dirsInDestinationPath)
        {
            Console.WriteLine("destinationPath:{0} ", dir);
        }

        //Files in source path
        foreach (var file in filesInSourcePath)
        {
            Console.WriteLine(Path.GetFileName(file));
        }

        //Files in destination path
        foreach (var file in filesInDestinationPath)
        {
            Console.WriteLine(Path.GetFileName(file));
        }
    }

}  

As I understand it, I should check whether dir1 contains any folders and files and, if so, copy them into dir2, and so on. But how do I do this? I've been racking my brain for two days and have no idea. Please help.

Edit: I found a solution for the first and second points:

public static void CopyFolderContents(string sourceFolder, string destinationFolder, string mask, Boolean createFolders, Boolean recurseFolders)
    {
        try
        {
            /*if (!sourceFolder.EndsWith(@"\")) { sourceFolder += @"\"; }
            if (!destinationFolder.EndsWith(@"\")) { destinationFolder += @"\"; }*/

            var exDir = sourceFolder;
            var dir = new DirectoryInfo(exDir);
            SearchOption so = (recurseFolders ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);

            foreach (string sourceFile in Directory.GetFiles(dir.ToString(), mask, so))
            {
                FileInfo srcFile = new FileInfo(sourceFile);
                string srcFileName = srcFile.Name;

                // Create a destination that matches the source structure
                FileInfo destFile = new FileInfo(destinationFolder + srcFile.FullName.Replace(sourceFolder, ""));

                if (!Directory.Exists(destFile.DirectoryName) && createFolders)
                {
                    Directory.CreateDirectory(destFile.DirectoryName);
                }

                if (srcFile.LastWriteTime > destFile.LastWriteTime || !destFile.Exists)
                {
                    File.Copy(srcFile.FullName, destFile.FullName, true);
                }
            }
        }
        catch (Exception ex)
        {
            System.Diagnostics.Debug.WriteLine(ex.Message + Environment.NewLine + Environment.NewLine + ex.StackTrace);
        }
    }  

It's not perfect, but it works. How should this function be improved so that it copies asynchronously and compares file hashes, to avoid copying identical files again?

NerdyStudent
  • Take a look at the [FileSystemWatcher Class](https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher?view=net-6.0). It can notify your app whenever something is modified or changed in the source folder. – Ibram Reda Mar 02 '22 at 20:21
  • Real-time sync is task two, but this one I have to do manually :( – NerdyStudent Mar 02 '22 at 20:29
  • So basically, you want to know how to copy files asynchronously? Then take a look at this question: [Non-blocking file copy in C#](https://stackoverflow.com/questions/882686/non-blocking-file-copy-in-c-sharp). For file hashing, see [How do I do a SHA1 File Checksum in C#?](https://stackoverflow.com/questions/1993903/how-do-i-do-a-sha1-file-checksum-in-c). Finally, to check whether a file exists, use the [File.Exists](https://learn.microsoft.com/en-us/dotnet/api/system.io.file.exists?view=net-6.0) method. All these resources should help. Happy coding! – Ibram Reda Mar 02 '22 at 20:44
  • The hashing hint is a red herring. It likely won't make things work better because we're still looking at every file on every run, and you still need to read an entire file to create its hash. Instead, use metadata about the files to detect changes: files with different modified dates or sizes are **definitely** different. You only need to compare if that metadata is the same, and then in most cases you don't need to read the entire contents of both files to detect a change. Now you WILL need to read both sets when the files ARE the same, but you'd need that and more to create the hashes, too. – Joel Coehoorn Mar 02 '22 at 22:58
  • [continued] Now if you can save a database of previously synced files for dir2 (and can trust the folder will remain unmodified outside your app), then you could use hashes to save some work, because you'd _already know the hash values_ for those files on later runs, and we're only likely to need to compare hashes if you were also likely to need to read the entire file anyway. But it doesn't seem like that's part of the spec here. – Joel Coehoorn Mar 02 '22 at 22:59
  • Now it makes sense. I used the last-modified time metadata to check the files, and it's working pretty well. – NerdyStudent Mar 02 '22 at 23:00
  • Additionally, async is not likely to help you here, at least in the way you'd first expect. Async code allows you to use multiple threads or make better/more efficient use of an existing thread by doing other CPU work while waiting on something like disk I/O... and you ARE expecting to do a lot of I/O here. However, this task is likely to be fully I/O limited. That is, by using async to let the CPU get more files moving while waiting on the ones already in progress to finish, you'll likely make things MUCH SLOWER by forcing a disk to seek around, rather than reading/writing files in sequence. – Joel Coehoorn Mar 02 '22 at 23:02
  • [continued] However, async can help here, by allowing the app to stay responsive for things like cancel requests or for showing status updates. Just not so much for improving throughput. Additionally, async can help with the steps of comparing metadata, separate from the copying of the files. – Joel Coehoorn Mar 02 '22 at 23:03
  • As I understand it, async matters much more when there are more operations to do and bigger files to move? – NerdyStudent Mar 02 '22 at 23:07
  • @NerdyStudent async will be more helpful if you have LOTS of files to evaluate, but only expect to actually need to copy a few of them. – Joel Coehoorn Mar 02 '22 at 23:07
  • Oh, I really thought I should use them in this task, but it seems these hints wouldn't improve the program that much. – NerdyStudent Mar 02 '22 at 23:09
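
The metadata-first check suggested in the comments above could be sketched roughly as follows. `ChangeDetection` and `MetadataDiffers` are hypothetical names used only for illustration; they are not part of the original post or of any library. The sketch assumes that the copy step preserves the last-write timestamp, which `File.Copy` normally does on Windows. When the method returns false, the files may still differ, so a content or hash comparison is still needed if certainty is required.

```csharp
using System;
using System.IO;

static class ChangeDetection
{
    // Hypothetical helper: returns true when the destination is missing, or when
    // its size or last-write time differs from the source. In those cases the
    // file can be copied without reading its contents first.
    public static bool MetadataDiffers(FileInfo source, FileInfo destination)
    {
        if (!destination.Exists)
        {
            return true; // nothing to compare against
        }

        if (source.Length != destination.Length)
        {
            return true; // different sizes mean different content
        }

        // Assumption: the previous sync preserved the source timestamp.
        return source.LastWriteTimeUtc != destination.LastWriteTimeUtc;
    }
}
```

Only files for which this returns false would then need the more expensive byte-by-byte or hash comparison.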

1 Answer

So, after a lot more research, I came up with this solution:

using System;
using System.Diagnostics;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        string sourcePath = @"C:\Users\artio\Desktop\FASassignment\root\dir1";
        string destinationPath = @"C:\Users\artio\Desktop\FASassignment\root\dir2";
        var source = new DirectoryInfo(sourcePath);
        var destination = new DirectoryInfo(destinationPath);

        CopyFolderContents(sourcePath, destinationPath, "*", true, true); // "*" explicitly matches every file
        DeleteAll(source, destination);
    }

    public static void CopyFolderContents(string sourceFolder, string destinationFolder, string mask, Boolean createFolders, Boolean recurseFolders)
    {
        try
        {

            var exDir = sourceFolder;
            var dir = new DirectoryInfo(exDir);
            var destDir = new DirectoryInfo(destinationFolder);

            SearchOption so = (recurseFolders ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly);

            foreach (string sourceFile in Directory.GetFiles(dir.ToString(), mask, so))
            {
                FileInfo srcFile = new FileInfo(sourceFile);
                string srcFileName = srcFile.Name;

                // Create a destination that matches the source structure
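                // Path.GetRelativePath avoids the fragile string Replace (case-sensitive, matches anywhere in the path)
                FileInfo destFile = new FileInfo(Path.Combine(destinationFolder, Path.GetRelativePath(sourceFolder, srcFile.FullName)));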

                if (!Directory.Exists(destFile.DirectoryName) && createFolders)
                {
                    Directory.CreateDirectory(destFile.DirectoryName);
                }

                //Check if src file was modified and modify the destination file
                if (srcFile.LastWriteTime > destFile.LastWriteTime || !destFile.Exists)
                {
                    File.Copy(srcFile.FullName, destFile.FullName, true);
                }
            }
        }
        catch (Exception ex)
        {
            Debug.WriteLine(ex.Message + Environment.NewLine + Environment.NewLine + ex.StackTrace);
        }
    }

    private static void DeleteAll(DirectoryInfo source, DirectoryInfo target)
    {
        if (!source.Exists)
        {
            target.Delete(true);
            return;
        }

        // Delete each existing file in target directory not existing in the source directory.
        foreach (FileInfo fi in target.GetFiles())
        {
            var sourceFile = Path.Combine(source.FullName, fi.Name);
            if (!File.Exists(sourceFile)) //Source file doesn't exist, delete target file
            {
                fi.Delete();
            }
        }

        // Delete non existing files in each subdirectory using recursion.
        foreach (DirectoryInfo diTargetSubDir in target.GetDirectories())
        {
            DirectoryInfo nextSourceSubDir = new DirectoryInfo(Path.Combine(source.FullName, diTargetSubDir.Name));
            DeleteAll(nextSourceSubDir, diTargetSubDir);
        }
    }
}  

It does everything it should. The only missing pieces are the async copy and the SHA comparison, but at least I have a working solution.
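
For the two missing pieces, a minimal sketch might look like the following. It is not a tested drop-in replacement for `File.Copy` in the code above. The names `AsyncSync`, `HashFileAsync`, `CopyFileAsync` and `CopyIfChangedAsync` are hypothetical, the 80 KB buffer size is an arbitrary assumption, and `ComputeHashAsync` requires .NET 5 or later. Both helpers stream the file, so a very large file is never loaded into memory in one piece.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Threading.Tasks;

static class AsyncSync
{
    private const int BufferSize = 81920; // assumption: 80 KB, not a tuned value

    // Streams the file through SHA-256 and returns the digest.
    public static async Task<byte[]> HashFileAsync(string path)
    {
        using var sha = SHA256.Create();
        await using var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, BufferSize, useAsync: true);
        return await sha.ComputeHashAsync(stream);
    }

    // Copies the file chunk by chunk; the destination directory must already exist.
    public static async Task CopyFileAsync(string sourcePath, string destinationPath)
    {
        await using var source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read,
            FileShare.Read, BufferSize, useAsync: true);
        await using var destination = new FileStream(destinationPath, FileMode.Create,
            FileAccess.Write, FileShare.None, BufferSize, useAsync: true);
        await source.CopyToAsync(destination);
    }

    // Copies only when the destination is missing or its hash differs.
    // Returns true if a copy was performed.
    public static async Task<bool> CopyIfChangedAsync(string sourcePath, string destinationPath)
    {
        if (File.Exists(destinationPath))
        {
            byte[] sourceHash = await HashFileAsync(sourcePath);
            byte[] destinationHash = await HashFileAsync(destinationPath);
            if (sourceHash.SequenceEqual(destinationHash))
            {
                return false; // identical content, nothing to do
            }
        }

        await CopyFileAsync(sourcePath, destinationPath);
        return true;
    }
}
```

Inside the `foreach` loop, `File.Copy(...)` could then be replaced by `await AsyncSync.CopyIfChangedAsync(srcFile.FullName, destFile.FullName)`, provided the enclosing method and `Main` are changed to `async Task`. As noted in the comments on the question, hashing both files on every run reads each of them in full, so this is unlikely to be faster than the timestamp check.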

NerdyStudent