2

I have a set of folders containing log files. Each folder is named for the date its log files were created. I read the contents of the folders dated within X days of today and store the resulting FileInfo objects in a list. So the same file name can appear in the list up to X times.

I need to keep only the latest files based on create date. So, if the list contains multiple entries where fi.FileName is the same, I need to keep the latest one, based on fi.CreateDate, and discard the others.

I tried something like this but am messing up somewhere:

files = files.GroupBy(i => new {i.FileName, i.CreateDate}).Select(i => i.Last()).ToList();
Heretic Monkey
NoBullMan

2 Answers

0

You can use a method like this to get the files to purge:

using System.IO;
using System.Linq;
using System.Collections.Generic;

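// Returns the trace files that are not currently locked, oldest first.
// IsFileLocked is not a framework method; you have to supply it yourself (see below).
public static IEnumerable<FileInfo> GetTraceFiles(bool sortByDateOnly = true)
{
  string folder = "MyFullPath";   // Can be from some instance
  string prefix = "MyTraceFile-"; // global vars
  string extension = ".log";      // and config
  var list = Directory.GetFiles(folder, prefix + "*" + extension)
                      .Where(f => !IsFileLocked(f))
                      .Select(f => new FileInfo(f))
                      .OrderBy(fi => fi.CreationTime);
  // Optionally break ties between equal creation times by full path.
  return sortByDateOnly ? list : list.ThenBy(fi => fi.FullName);
}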

And this clear method:

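// Deletes trace files, oldest first; with retain > 0 the newest ones are kept.
public static void ClearTraces(int retain = 0)
{
  // Materialize once so the folder is not scanned again by Count() and foreach.
  IEnumerable<FileInfo> list = GetTraceFiles().ToList();
  // Note the + 1: only retain - 1 of the listed files survive. That adds up to
  // `retain` files in total only if one more file (for example the trace that is
  // currently open) was skipped by GetTraceFiles as locked; drop the + 1 otherwise.
  if ( retain > 0 ) list = list.Take(list.Count() - retain + 1);
  foreach ( var fileInfo in list )
    try 
    { 
      File.Delete(fileInfo.FullName); 
    } 
    catch 
    { 
      // Best effort: ignore files that cannot be deleted.
    }
}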

As written, it keeps the most recent files according to retain, but you can adapt it by adding a Where clause that selects files created before a given date:

.Where(fi => fi.CreationTime < ...);

Also, instead of relying on the file system creation date and time, you can use a date embedded in the file name, for example MyTrace-YYYY-MM-DD@HH-MM-SS...
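As a rough illustration of the file name idea, here is a minimal sketch that extracts a timestamp from a name such as MyTrace-2021-06-03@17-00-00.log. The class name TraceFileName, the method ParseTimestamp, and the exact prefix and format string are assumptions made for this example; adjust them to whatever pattern your files actually use.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class TraceFileName
{
    // Hypothetical prefix and timestamp layout, e.g. "MyTrace-2021-06-03@17-00-00.log".
    private const string Prefix = "MyTrace-";
    private const string Format = "yyyy-MM-dd'@'HH-mm-ss";

    // Returns the timestamp embedded in the file name, or null if the name does not match.
    public static DateTime? ParseTimestamp(string path)
    {
        string name = Path.GetFileNameWithoutExtension(path);
        if (!name.StartsWith(Prefix, StringComparison.Ordinal)) return null;

        string stamp = name.Substring(Prefix.Length);
        return DateTime.TryParseExact(stamp, Format, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out DateTime result)
            ? result
            : (DateTime?)null;
    }
}
```

You could then sort or filter on ParseTimestamp(fi.Name) instead of fi.CreationTime, which keeps working after the files have been copied to another machine.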

IsFileLocked comes from:

Is there a way to check if a file is in use?
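Since IsFileLocked is not part of the framework, you have to write it yourself. The sketch below follows the usual approach of trying to open the file exclusively; the class name FileLockCheck and the string parameter (chosen to match how GetTraceFiles calls it) are assumptions for this example, not code copied from the linked question.

```csharp
using System.IO;

public static class FileLockCheck
{
    // Returns true if the file cannot be opened exclusively right now.
    // A missing file also counts as "locked" here, because
    // FileNotFoundException derives from IOException.
    public static bool IsFileLocked(string path)
    {
        try
        {
            // FileShare.None asks for exclusive access; the stream is closed immediately.
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return false;
            }
        }
        catch (IOException)
        {
            return true;
        }
    }
}
```

Keep in mind that the answer can be stale by the time you act on it, since another process may open the file right after the check. Read-only files throw UnauthorizedAccessException, which this sketch deliberately does not catch.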

  • This works except I had to change f => new FileInfo(f) to f => new FileInfo(f.Name) and also IsFileLocked does not seem to be part of FileInfo, unless it is some method I have to create myself. – NoBullMan Jun 03 '21 at 17:00
  • As @Alejandro had mentioned, using create date seems to be unreliable. I copied files from a couple of folders (called 2021-06-03 and 2021-06-04), which had files with the same create dates as the corresponding folder names, from a production server to my dev box. Now they all show create date and last-write-date as 06/04. Looks like I have to find another way to test. File names themselves don't have the date embedded in the name; rather, the folder they are in is the date they were created. – NoBullMan Jun 04 '21 at 20:05
  • @NoBullMan Indeed, unless using a file manager that can conserve dates like Total Commander. That's why I mentioned the filename pattern as a possibility of more independent use. –  Jun 04 '21 at 20:08
  • just thinking out loud here: maybe instead of list of FileInfo I need to use list of objects with one property being FileInfo and the other folder name and use folder name property for sorting. Although FileInfo does have folder name in it. Brain overheating! – NoBullMan Jun 04 '21 at 20:12
0

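Grouping by both FileName and CreateDate puts entries with different dates into separate groups, so nothing gets dropped. Sort by date, then group by FileName only: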

files = files.OrderBy(f => f.CreateDate).GroupBy(i => i.FileName).Select(i => i.Last()).ToList();

This also gives the same result:

files = files.GroupBy(i => i.FileName).Select(i => i.OrderByDescending(f => f.CreateDate).First()).ToList();
aliassce
  • This worked fine. However, I had to change the Select() to Select(i => i.OrderByDescending(f => f.DirectoryName).First()) since my files are in directories named for the dates they were created, and f.CreateDate and f.LastWriteDate proved unreliable, especially if you copy the files from one location to another and process the copies. – NoBullMan Jun 04 '21 at 21:18
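Putting the answer and the comment above together, a minimal sketch of the folder-name variant might look like this. The class LatestLogs and the method KeepLatestPerName are hypothetical names; the sketch uses the real FileInfo properties Name and Directory, and it assumes the folders are named yyyy-MM-dd so that plain string order matches date order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LatestLogs
{
    // Keeps one FileInfo per file name: the one whose parent folder name sorts last.
    // Assumption: folders are named yyyy-MM-dd, so ordinal string order is date order.
    // Assumption: file names are compared case-insensitively, as on Windows.
    public static List<FileInfo> KeepLatestPerName(IEnumerable<FileInfo> files)
    {
        return files
            .GroupBy(fi => fi.Name, StringComparer.OrdinalIgnoreCase)
            .Select(g => g.OrderByDescending(fi => fi.Directory.Name, StringComparer.Ordinal)
                          .First())
            .ToList();
    }
}
```

Because only the path is inspected, the result does not depend on creation or last-write timestamps, so it is unaffected by copying the folders to another machine.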