I asked a related question here: Clean duplicates and their instances from a list

Now I'm facing a new situation. I have a data model like this:

    public class AmpFile
    {
        public string filename { get; set; }
        public string actualpath { get; set; }
        public int fileversion { get; set; }
    }

And example data like this:

| Index | filename        | actualpath                 | fileversion |
|-------|-----------------|----------------------------|-------------|
| 1     | demofile.opt    | d:\optfiles\demofile.opt   | 8           |
| 2     | somefile.opt    | c:\somefile.opt            | 3           |
| 3     | somefile.opt    | f:\files\somefile.opt      | 8           |
| 4     | test.opt        | c:\test.opt                | 5           |
| 5     | demofile.opt    | c:\demofile.opt            | 5           |
| 6     | anothertest.opt | f:\files\anothertest.opt   | 2           |
| 7     | somefile.opt    | c:\somefolder\somefile.opt | 1           |

Now I want to find duplicate files, meaning entries that share the same filename, keep the one with the highest fileversion, and delete the remaining duplicates.

By deleting I mean both deleting the files from disk and removing their entries from the list.

I tried to work it out with LINQ, but the harder I try, the worse the results get. I need to do this carefully and cleanly, which is why I'm asking on Stack Overflow for the best solution.

Regards.

  • You must provide the code which doesn't work correctly if you want help fixing it – Ňɏssa Pøngjǣrdenlarp May 04 '20 at 00:34
  • You won't be able to do this all in one LINQ statement. You will need to break it up into several actions through LINQ or multi-line SQL. Get the entries that, when grouped by filename, come first when ordered by fileversion descending. Then take those Index values, left join them onto the original data set, and delete (or mark for deletion) the records where the left join value is NULL – KingOfArrows May 04 '20 at 00:38
  • @KingOfArrows I don't care, I just need a good solution. And it's not SQL, it's a C# list. –  May 04 '20 at 00:42
  • @ŇɏssaPøngjǣrdenlarp I need a method, not a fix for my code. My code makes no sense. –  May 04 '20 at 00:42
  • @DraculaBytePair You said you tried figuring it out but couldn't, so I provided the first step, which is to break the problem down into several steps. I cannot provide a code solution as your question does not provide your current code implementation. – KingOfArrows May 04 '20 at 00:49
  • @KingOfArrows I'm currently doing it with multiple loops and if/else, plus the LINQ from my previous question, so it's not a good way at all. It would be really great if you could provide your method and solution. There's no code required for this question: I have a list of AmpFile with that example data. Now how can I find duplicates, get the highest version of each, keep it, and remove the other duplicates from disk and from the list? –  May 04 '20 at 00:59

2 Answers

Linq

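// Highest version per filename, computed once up front (myList is a List<AmpFile>).
var maxVersion = myList.GroupBy(x => x.filename)
                       .ToDictionary(g => g.Key, g => g.Max(x => x.fileversion));
// Drop every entry that is not at its filename's highest version (ties are all kept).
myList.RemoveAll(x => x.fileversion != maxVersion[x.filename]);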
Mertuarez

Try the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<AmpFile> files = new List<AmpFile>() {
                new AmpFile() { filename = "demofile.opt", actualpath = @"d:\optfiles\demofile.opt", fileversion = 8}, 
                new AmpFile() { filename = "somefile.opt", actualpath = @"c:\somefile.opt", fileversion = 3}, 
                new AmpFile() { filename = "somefile.opt", actualpath = @"f:\files\somefile.opt", fileversion = 8}, 
                new AmpFile() { filename = "test.opt", actualpath = @"c:\test.opt", fileversion = 5}, 
                new AmpFile() { filename = "demofile.opt", actualpath = @"c:\demofile.opt", fileversion = 5}, 
                new AmpFile() { filename = "anothertest.opt" , actualpath = @"f:\files\anothertest.opt", fileversion = 2}, 
                new AmpFile() { filename = "somefile.opt", actualpath = @"c:\somefolder\somefile.opt", fileversion = 1}
            };

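            // Sort by version (highest first), then group by name: the first
            // member of each group is the highest version, so keep it.
            List<AmpFile> output = files.OrderByDescending(x => x.fileversion)
                .GroupBy(x => x.filename)
                .Select(x => x.First())
                .ToList();

            // Same grouping, but skip the first member of each group: what is
            // left are the lower-version duplicates to delete.
            List<AmpFile> deleteFiles = files.OrderByDescending(x => x.fileversion)
                .GroupBy(x => x.filename)
                .SelectMany(x => x.Skip(1))
                .ToList();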
        }
    }
    public class AmpFile
    {
        public string filename { get; set; }
        public string actualpath { get; set; }
        public int fileversion { get; set; }
    }
}

Here are the results:

[image: screenshot of the results]
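
The code above stops at building the two lists. As a rough sketch of the remaining step the question asks for (deleting the lower-version files from disk and dropping them from the list), something like the following might work. The names AmpFileCleaner and RemoveOlderDuplicates are made up for illustration, and the error-handling policy (an entry stays in the list when its file could not be deleted) is an assumption, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class AmpFile
{
    public string filename { get; set; }
    public string actualpath { get; set; }
    public int fileversion { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class AmpFileCleaner
{
    // Keeps the highest-version entry per filename, deletes the other files
    // from disk, removes them from the list, and returns the removed entries.
    public static List<AmpFile> RemoveOlderDuplicates(List<AmpFile> files)
    {
        // Within each filename group, everything after the highest version is a duplicate.
        List<AmpFile> toDelete = files
            .GroupBy(f => f.filename)
            .SelectMany(g => g.OrderByDescending(f => f.fileversion).Skip(1))
            .ToList();

        var removed = new List<AmpFile>();
        foreach (AmpFile file in toDelete)
        {
            try
            {
                // File.Delete does not throw if the file itself is already gone.
                File.Delete(file.actualpath);
                files.Remove(file);
                removed.Add(file);
            }
            catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
            {
                // Assumed policy: leave the entry in the list so the failure stays visible.
                Console.Error.WriteLine($"Could not delete {file.actualpath}: {ex.Message}");
            }
        }
        return removed;
    }
}
```

Calling `AmpFileCleaner.RemoveOlderDuplicates(files)` mutates `files` in place. When two entries share the highest version, the one that appears first in the list is kept, because OrderByDescending is a stable sort.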

jdweng
  • The first query puts its output into a new variable, so it is not wiping any of the data. I use this solution all the time and it works. You do not understand LINQ. The list is sorted by version, so after grouping, the first member of each group is the one with the highest version. – jdweng May 04 '20 at 08:06
  • I added a picture of the results. – jdweng May 04 '20 at 08:13
  • What database? If you want to delete, then simply make the output variable the same as the input variable so you are updating the list object. Deleting is slower than filtering like I'm doing. – jdweng May 04 '20 at 10:18
  • I just provided the LINQ, not the code to delete the files. – jdweng May 04 '20 at 10:46
  • I like that SelectMany; a couple of days ago I was looking for such an operation. Some sort of UngroupBy :D – Mertuarez May 04 '20 at 12:30
  • A group produces a two-dimensional array KeyValuePair[]. So the SelectMany does ungroup; it just turns the two-dimensional array into a one-dimensional array object[] with no key. – jdweng May 04 '20 at 13:51
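
To make the "ungroup" idea from the last comments concrete, here is a small sketch. Strictly speaking, in LINQ to Objects GroupBy returns an IEnumerable<IGrouping<TKey, TElement>> rather than an array, and SelectMany concatenates the inner sequences into one flat sequence. The class UngroupDemo and the method GroupThenFlatten are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, for illustration only.
public static class UngroupDemo
{
    public static List<(string Name, int Version)> GroupThenFlatten(
        IEnumerable<(string Name, int Version)> files)
    {
        // GroupBy yields one IGrouping per distinct name; each group is itself
        // a sequence of the matching items and exposes the name as its Key.
        IEnumerable<IGrouping<string, (string Name, int Version)>> groups =
            files.GroupBy(f => f.Name);

        // SelectMany concatenates the groups back into one flat sequence.
        // The Key is dropped here, although each item still carries its own Name.
        return groups.SelectMany(g => g).ToList();
    }
}
```

The flattened result is ordered group by group (groups appear in order of each key's first occurrence), so items with the same name end up adjacent even if they were scattered in the input.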