1

I have a data model like this:

    public class AmpFile
    {
        public string filename { get; set; }
        public string actualpath { get; set; }
    }

Now I have a list of these objects, like this:

[ list member 1 ]    -    filename:  "testfile1.jpg"    -    actualpath:  "C:\testpath\testfile1.jpg" 
[ list member 2 ]    -    filename:  "brickwall.jpg"    -    actualpath:  "C:\testpath\brickwall.jpg" 
[ list member 3 ]    -    filename:  "mydata.txt"    -    actualpath:  "D:\mydata.txt" 
[ list member 4 ]    -    filename:  "testfile1.jpg"    -    actualpath:  "E:\demo\testfile1.jpg" 
[ list member 5 ]    -    filename:  "mydata.txt"    -    actualpath:  "F:\somefolder\mydata.txt" 
[ list member 6 ]    -    filename:  "testfile1.jpg"    -    actualpath:  "F:\somefolder\testfile1.jpg" 
[ list member 7 ]    -    filename:  "testfile2.jpg"    -    actualpath:  "F:\somefolder\testfile2.jpg" 
[ list member 8 ]    -    filename:  "testfile3.jpg"    -    actualpath:  "D:\testfile3.jpg" 

Now I want to find the duplicates of each member. If a member has duplicates, I want to remove both the duplicates and the member itself, so the result I want to achieve is:

[ list member 1 ]    -    filename:  "brickwall.jpg"    -    actualpath:  "C:\testpath\brickwall.jpg" 
[ list member 2 ]    -    filename:  "testfile2.jpg"    -    actualpath:  "F:\somefolder\testfile2.jpg" 
[ list member 3 ]    -    filename:  "testfile3.jpg"    -    actualpath:  "D:\testfile3.jpg" 

How can I do this?

  • Can you be more clear please, what did you call a "duplicate" (same filename ? same path ? Both ?) ? What did you want to delete (only the duplicates ? the duplicates AND the original ?) ? – kiliz May 03 '20 at 04:46
  • @kiliz yes exactly I want to find duplicate filenames like testfile1.jpg without caring the addresses and delete the duplicates and original reference (first found on the list ...) –  May 03 '20 at 04:48
  • 2
    Please don't post data like `[ list member 1 ] - filename: "testfile1.jpg" - actualpath: "C:\testpath\testfile1.jpg"` when valid C# would allow us to write and test code for an answer. – Enigmativity May 03 '20 at 05:15

6 Answers

4

You can do it with LINQ: group the items by filename with GroupBy and keep only the groups whose count is 1, as in the following code:
1 - Prepare the list of AmpFile:

List<AmpFile> ampFiles = new List<AmpFile>
{
    new AmpFile{filename="testfile1.jpg",actualpath="C:\\testpath\\testfile1.jpg"},
    new AmpFile{filename="brickwall.jpg",actualpath="C:\\testpath\\brickwall.jpg"},
    new AmpFile{filename="mydata.txt",actualpath="D:\\mydata.txt"},
    new AmpFile{filename="testfile1.jpg",actualpath="E:\\demo\\testfile1.jpg"},
    new AmpFile{filename="mydata.txt",actualpath="F:\\somefolder\\mydata.txt"},
    new AmpFile{filename="testfile1.jpg",actualpath="F:\\somefolder\\testfile1.jpg"},
    new AmpFile{filename="testfile2.jpg",actualpath="F:\\somefolder\\testfile2.jpg"},
    new AmpFile{filename="testfile3.jpg",actualpath="D:\\testfile3.jpg"},
};

2 - Call GroupBy and filter with Where:

List<AmpFile> notDuplicatedAmpFiles = ampFiles.GroupBy(x => x.filename)
    .Where(x => x.Count() == 1)
    .SelectMany(x => x)
    .ToList();

3 - Demo:

foreach(AmpFile ampFile in notDuplicatedAmpFiles)
{
    Console.WriteLine($"fileName :{ampFile.filename}, actualPath :{ampFile.actualpath}");
}

4 - Result:

fileName :brickwall.jpg, actualPath :C:\testpath\brickwall.jpg
fileName :testfile2.jpg, actualPath :F:\somefolder\testfile2.jpg
fileName :testfile3.jpg, actualPath :D:\testfile3.jpg
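
As a variation on the filter above, the same grouping can produce both the kept items and the removed ones. This is only a minimal sketch: `AmpFilePartition` and `SplitByDuplicateName` are hypothetical names that do not appear in the question, and it assumes that filenames are compared case-sensitively, as `==` does.

```csharp
using System.Collections.Generic;
using System.Linq;

public class AmpFile
{
    public string filename { get; set; }
    public string actualpath { get; set; }
}

// Hypothetical helper class, not part of the original question.
public static class AmpFilePartition
{
    // Splits the files into those whose filename occurs exactly once (Unique)
    // and those whose filename occurs more than once (Removed).
    public static (List<AmpFile> Unique, List<AmpFile> Removed) SplitByDuplicateName(
        IEnumerable<AmpFile> files)
    {
        // Group once and materialize, so both filters reuse the same groups.
        var groups = files.GroupBy(f => f.filename).ToList();

        var unique = groups.Where(g => g.Count() == 1).SelectMany(g => g).ToList();

        // Removed items come out grouped by filename, not in their original list order.
        var removed = groups.Where(g => g.Count() > 1).SelectMany(g => g).ToList();

        return (unique, removed);
    }
}
```

Calling `AmpFilePartition.SplitByDuplicateName(ampFiles)` on the list from step 1 should give the three files shown above in `Unique` and the remaining five in `Removed`.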

I hope this helps.

Mohammed Sajid
  • I'd suggest `.Skip(1).Any() == false` over `x.Count() == 1` as the first only iterates the first two elements of a group, but the latter has to iterate the entire group. – Enigmativity May 03 '20 at 05:40
  • 2
    @RoadRunner - On an `ICollection` it is, but not on a pure `IEnumerable`. – Enigmativity May 03 '20 at 07:38
  • 1
    @RoadRunner - But in this case a grouping is an `ICollection<T>` (it's an `internal class Grouping<TKey, TElement> : IGrouping<TKey, TElement>, IEnumerable<TElement>, IEnumerable, IList<TElement>, ICollection<TElement>`). Time to move on. Nothing to see here. – Enigmativity May 03 '20 at 07:42
  • thank you for this great answer , I accepted your answer because it was the best , Can you tell me how can I reverse this thing and get removed members as a separated list too ? –  May 03 '20 at 16:24
  • @DraculaBytePair you're welcome, change just the filter like ``Where(x => x.Count() >1)`` or use [**Except**](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.except?view=netcore-3.1) with a custom comparer. – Mohammed Sajid May 03 '20 at 16:47
  • @Sajid I used `.Where(x => x.Count() != 1)` and it worked well. –  May 03 '20 at 16:52
1

I'd suggest this query:

var results =
    from a in list
    group a by a.filename into gas
    where !gas.Skip(1).Any()
    from ga in gas.Take(1)
    select ga;

If you start with this data:

var list = new List<AmpFile>()
{
    new AmpFile() { filename = "testfile1.jpg", actualpath = @"C:\testpath\testfile1.jpg" },
    new AmpFile() { filename = "brickwall.jpg", actualpath = @"C:\testpath\brickwall.jpg" },
    new AmpFile() { filename = "mydata.txt", actualpath = @"D:\mydata.txt" },
    new AmpFile() { filename = "testfile1.jpg", actualpath = @"E:\demo\testfile1.jpg" },
    new AmpFile() { filename = "mydata.txt", actualpath = @"F:\somefolder\mydata.txt" },
    new AmpFile() { filename = "testfile1.jpg", actualpath = @"F:\somefolder\testfile1.jpg" },
    new AmpFile() { filename = "testfile2.jpg", actualpath = @"F:\somefolder\testfile2.jpg" },
    new AmpFile() { filename = "testfile3.jpg", actualpath = @"D:\testfile3.jpg" },
};

...then you get this result:

[image: results, listing the brickwall.jpg, testfile2.jpg and testfile3.jpg entries]

Enigmativity
  • very well solution and I didn't know I can use SQL like commands with linq , thanks for teaching me this. +1 –  May 03 '20 at 16:25
0

You can implement IEquatable<AmpFile> as in the code below. Your paths are in different folders, so by this definition of equality you do not have any duplicates. See below:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<AmpFile> files = new List<AmpFile>() {
                new AmpFile() { filename = "testfile1.jpg", actualpath = @"C:\testpath\testfile1.jpg"}, 
                new AmpFile() { filename = "brickwall.jpg", actualpath = @"C:\testpath\brickwall.jpg"}, 
                new AmpFile() { filename = "mydata.txt", actualpath = @"D:\mydata.txt"}, 
                new AmpFile() { filename = "testfile1.jpg", actualpath = @"E:\demo\testfile1.jpg"}, 
                new AmpFile() { filename = "mydata.txt", actualpath = @"F:\somefolder\mydata.txt"}, 
                new AmpFile() { filename = "testfile1.jpg" , actualpath = @"F:\somefolder\testfile1.jpg"}, 
                new AmpFile() { filename = "testfile2.jpg" , actualpath = @"F:\somefolder\testfile2.jpg"}, 
                new AmpFile() { filename = "testfile3.jpg", actualpath = @"D:\testfile3.jpg"}
            };

            List<AmpFile> output = files.Distinct().ToList();
        }
    }
    public class AmpFile : IEquatable<AmpFile>
    {
        public string filename { get; set; }
        public string actualpath { get; set; }

        public Boolean Equals(AmpFile other)
        {
            return ((this.filename == other.filename) && (this.actualpath == other.actualpath));
        }
        public override int GetHashCode()
        {
            return (this.filename + "^" + this.actualpath).GetHashCode();
        }
    }
}
jdweng
-1

If you don't mind getting a new list instead of deleting from the original list, you can do it like this (sorry for the complexity; I think it can easily be optimized, e.g. by adding breaks):

List<AmpFile> foo(List<AmpFile> files)
{
 List<AmpFile> result = new List<AmpFile>();
 bool add = false;
 foreach(AmpFile file in files)
 {
  add = true;
  foreach(AmpFile alreadyAdded in result)
  {
   if(file.filename == alreadyAdded.filename)
   {
    add = false;
   }
  }
  if(add)
  {
   result.Add(file);
  }
 }
 return result;
}

If you really need to change the original list, you can do something like this (again, it can be optimized):

void foo2(List<AmpFile> files)
{
 AmpFile[] temp = files.ToArray();
 List<AmpFile> toDelete = new List<AmpFile>();
 foreach(AmpFile file in temp)
 {
  foreach(AmpFile f in files)
  {
   if(f != file && f.filename == file.filename)
   {
    if(!toDelete.Contains(f))
    {
     toDelete.Add(f);
    }
   }
  }
 }

 foreach(AmpFile file in toDelete)
 {
   files.Remove(file);
 }
}
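
One possible way to optimize the in-place version is to count the filenames first and then remove in a single pass. This is only a sketch under assumptions: `AmpFileCleanup` and `RemoveAllWithDuplicateName` are hypothetical names, filenames are assumed to be non-null, and the comparison is case-sensitive like `==` in the loops above.

```csharp
using System.Collections.Generic;

public class AmpFile
{
    public string filename { get; set; }
    public string actualpath { get; set; }
}

// Hypothetical helper class, not part of the original question.
public static class AmpFileCleanup
{
    // Removes, in place, every file whose filename occurs more than once
    // (the duplicates and the first occurrence). Returns the number removed.
    public static int RemoveAllWithDuplicateName(List<AmpFile> files)
    {
        // First pass: count how often each filename occurs.
        var counts = new Dictionary<string, int>();
        foreach (AmpFile file in files)
        {
            // TryGetValue leaves 'seen' at 0 when the key is not present yet.
            counts.TryGetValue(file.filename, out int seen);
            counts[file.filename] = seen + 1;
        }

        // Second pass: List<T>.RemoveAll keeps the order of the remaining items.
        return files.RemoveAll(file => counts[file.filename] > 1);
    }
}
```

This does two linear passes instead of nested loops. If filenames should match regardless of case, a `StringComparer.OrdinalIgnoreCase` could be passed to the `Dictionary` constructor.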
kiliz
-1

Running two nested loops over your list is a straightforward way to do it.

List<AmpFile> ampList = new List<AmpFile>();
// Populate list

for (int i = 0; i < ampList.Count; i++)
    for (int j = ampList.Count - 1; j > i; j--) // iterate backwards so RemoveAt does not skip elements
        if (ampList[j].filename == ampList[i].filename)
            ampList.RemoveAt(j);
jscarle
  • 1) You will get some warnings because you're modifying a List while enumerating it. 2) `.Length` doesn't exist for Lists, use `.Count`. But the general idea is good – kiliz May 03 '20 at 05:13
  • 1
    Good catch on the Count, I'm writing this from memory. You won't get a warning because using the indexer isn't enumerating. You are enumerating the list when you call the enumerator, e.g. with foreach. – jscarle May 03 '20 at 05:22
  • This code doesn't work. It's keeping one of the items in a duplicate. The OP wants all items removed if they belong to a duplicate. – Enigmativity May 03 '20 at 05:36
-1
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
public class AmpFile
{
    public string filename { get; set; }
    public string actualpath { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        List<AmpFile> lstemail = new List<AmpFile>();
        lstemail.Add(new AmpFile { filename = "testfile1.jpg", actualpath = @"C:\testpath\testfile1.jpg" });
        lstemail.Add(new AmpFile { filename = "brickwall.jpg", actualpath = @"C:\testpath\brickwall.jpg" });
        lstemail.Add(new AmpFile { filename = "mydata.txt", actualpath = @"D:\mydata.txt" });
        lstemail.Add(new AmpFile { filename = "testfile1.jpg", actualpath = @"E:\demo\testfile1.jpg" });

        // Keeps the first file of each filename group (one entry per distinct filename).
        var myDistinctList = lstemail.GroupBy(i => i.filename)
            .Select(g => g.First())
            .ToList();
        lstemail = myDistinctList;

    }
}
}

I have used LINQ, which I find better to use than foreach.