2

I have file names with version numbers embedded, similar to NuGet's naming scheme. Examples:

A.B.C.1.2.3.4.zip
A.B.C.1.2.3.5.zip
A.B.C.3.4.5.dll
A.B.C.1.2.3.6.zip
A.B.C.1.2.3.dll
X.Y.Z.7.8.9.0.zip
X.Y.Z.7.8.9.1.zip

Given a pattern "A.B.C.1.2.3", how do I find all those files and directories that match, regardless of version number? I support both major.minor.build.revision and major.minor.build schemes.

That is, given "A.B.C.1.2.3", return the following list:

A.B.C.1.2.3.4.zip
A.B.C.1.2.3.5.zip
A.B.C.1.2.3.6.zip
A.B.C.1.2.3.dll
A.B.C.3.4.5.dll

Bonus points for determining which file name has the highest version.

Mark Richman

7 Answers

2

If you know the filenames end with the version, you could `Split` the filename string on `.`, then iterate backwards from the end (skipping the extension) and stop at the first non-numeric string. (`TryParse` is probably good for this.) You can then `string.Join` the remaining parts to get the package name.

Do this for the search term to find the package name, then each file in the directory, and you can compare just the package names.
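
The steps above can be sketched roughly as follows. This is only an illustration of the idea, not a definitive implementation: `PackageNames`, `Extract`, `FindMatches` and the `hasExtension` flag are hypothetical names, and the sketch assumes the search term carries a version but no extension (as in "A.B.C.1.2.3").

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PackageNames
{
    // Hypothetical helper (not part of the answer): strips the trailing numeric
    // parts, and optionally the extension, from a dotted file name.
    public static string Extract(string fileName, bool hasExtension)
    {
        string[] parts = fileName.Split('.');

        // Start at the last part, or the one before it when there is an extension.
        int i = parts.Length - (hasExtension ? 2 : 1);

        // Step backwards while the parts are numeric.
        while (i >= 0 && int.TryParse(parts[i], out _))
        {
            i--;
        }

        // Parts 0..i form the package name.
        return string.Join(".", parts, 0, i + 1);
    }

    // Returns the files whose package name equals that of the search term.
    // Assumption: the search term has a version but no extension.
    public static List<string> FindMatches(IEnumerable<string> files, string searchTerm)
    {
        string wanted = Extract(searchTerm, hasExtension: false);
        return files.Where(f => Extract(f, hasExtension: true) == wanted).ToList();
    }
}
```

With this sketch, `PackageNames.Extract("Some.2.File.3.Thing.3.4.5.zip", hasExtension: true)` yields `Some.2.File.3.Thing`, because the backwards walk stops at "Thing".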

31eee384
  • The problem is having to support semantic versioning. That is, `Some.Human.Readable.Name.1.2.3.ext` or `.1.2.3.4.ext` – Mark Richman Aug 10 '15 at 17:12
  • How is that a problem? Isn't that just additional logic on top? Once this process is over, you have the numeric parts as a result of going through each array element. – 31eee384 Aug 10 '15 at 17:13
  • Won't work if you have Some.2.File.3.Thing.3.4.5.zip. The last 3 or 4 places are what represents the semantic version. – Mark Richman Aug 10 '15 at 17:14
  • I agree with @31eee384. In that example, you would stop stepping backwards once you reached "Thing". So the version would be 3.4.5 – Cory Aug 10 '15 at 17:15
  • But since you break on the first non-numeric entry (iterating backwards), you get `Some.2.File.3.Thing` as the package name. I actually made this answer in response to @Helb's comment "How would you parse "Dummy.2.Lib.1.2.3" ?" – 31eee384 Aug 10 '15 at 17:15
2

Credit to jdweng for his answer, as well as 31eee384 for his thoughts. This answer basically combines both ideas.

First, you can create a custom class like so:

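class CustomFile
{
    public string FileName { get; private set; }
    public Version FileVersion { get; private set; }

    public CustomFile(string file)
    {
        var split = file.Split(".".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

        // Index of the first version component. It must be initialized, because
        // the loop below never assigns it when every part is numeric.
        int versionIndex = 0;
        int temp;

        // Walk backwards from the part before the extension until a non-numeric part is found.
        for (int i = split.Length - 2; i >= 0; i--)
        {
            if (!Int32.TryParse(split[i], out temp))
            {
                versionIndex = i + 1;
                break;
            }
        }

        // Everything before the version is the name; the rest, minus the extension, is the version.
        FileName = string.Join(".", split, 0, versionIndex);
        FileVersion = Version.Parse(string.Join(".", split, versionIndex, split.Length - versionIndex - 1));
    }
}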

Using it to parse the filename, you can then filter based on it.

string[] input = new string[] {
    "A.B.C.D.1.2.3.4.zip",
    "A.B.C.1.2.3.5.zip",
    "A.B.C.3.4.5.dll",
    "A.B.C.1.2.3.6.zip",
    "A.B.C.1.2.3.dll",
    "X.Y.Z.7.8.9.0.zip",
    "X.Y.Z.7.8.9.1.zip"
};

var parsed = input.Select(x => new CustomFile(x));
var results = parsed
    .Where(cf => cf.FileName == "A.B.C")
    .OrderByDescending(cf=>cf.FileVersion)
    .ToList();

In this example, the first element would have the highest version.

Cory
  • Why `internal set;`, instead of `private set;`? Otherwise, nice! Didn't know `string.Join` had that overload. – 31eee384 Aug 10 '15 at 19:00
  • It should probably be private. I tend to use internal out of habit, but in this instance I think private is better. I'll update it. – Cory Aug 10 '15 at 19:05
1

Try this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] input = new string[] {
                "A.B.C.1.2.3.4.zip",
                "A.B.C.1.2.3.5.zip",
                "A.B.C.3.4.5.dll",
                "A.B.C.1.2.3.6.zip",
                "A.B.C.1.2.3.dll",
                "X.Y.Z.7.8.9.0.zip",
                "X.Y.Z.7.8.9.1.zip"
            };

            var parsed = input.Select(x => x.Split(new char[] { '.' }))
                .Select(y => new
                {
                    name = string.Join(".", new string[] { y[0], y[1], y[2] }),
                    ext = y[y.Count() - 1],
                    major = int.Parse(y[3]),
                    minor = int.Parse(y[4]),
                    build = int.Parse(y[5]),
                    revision = y.Count() == 7 ? (int?)null : int.Parse(y[6])
                }).ToList();

            var results = parsed.Where(x => (x.major >= 1) && (x.major <= 3)).ToList();

            var dict = parsed.GroupBy(x => x.name, y => y)
                .ToDictionary(x => x.Key, y => y.ToList());

            var abc = dict["A.B.C"];
        }
    }
}
jdweng
  • Looks close, but the A.B.C. bit is arbitrary. So hard coding the indexes won't work. Also, the build could contain 3 or 4 elements, depending on the type scheme. – Cory Aug 10 '15 at 17:19
  • Ah, I see the revision bit now.. Nice. However, the name could be This.Is.The.File.Name.1.2.3.4. Using your code, the filename would be This.Is.The – Cory Aug 10 '15 at 17:31
  • Don't guess what the size of the filename is. Let Mark decide. It can easily change the size of the name array by using Take(x) where x is based on the size of the split array. – jdweng Aug 10 '15 at 19:15
  • Updated the answer to make a dictionary. – jdweng Aug 10 '15 at 19:21
  • Use the filename "A.B.C.D.E.1.2.3.4.zip" and see what I'm talking about. y[0], y[1], y[2] only takes the first 3 strings, when it should take 5. Also, if it were "A.B.1.2.3.4", it would try to take 3 when it should be taking 2. – Cory Aug 10 '15 at 19:50
  • The take number is Take(X - 5), where x is the size of the array. – jdweng Aug 10 '15 at 20:36
0

Try using a regular expression, as in the example below:

    var firstPart = Console.ReadLine();

    var names = new List<string>
    {
        "A.B.C.1.2.3.4.zip",
        "A.B.C.1.2.3.5.zip",
        "A.B.C.1.2.3.6.zip",
        "A.B.C.1.2.3.dll",
        "X.Y.Z.7.8.9.0.zip",
        "X.Y.Z.7.8.9.1.zip"
    };

    // Escape the user-supplied prefix so that its dots match literally rather than any character.
    var versionRegexp = new Regex("^" + Regex.Escape(firstPart) + "\\.(\\d+\\.)(\\d+\\.)(\\d+\\.)(\\d+\\.)?\\w+$");

    foreach (var name in names)
    {
        if (versionRegexp.IsMatch(name))
        {
            Console.WriteLine(name);
            foreach (Group group in versionRegexp.Match(name).Groups)
            {
                Console.WriteLine("Index {0}: {1}", group.Index, group.Value);
            }
        }
    }

    Console.ReadKey();
0

This works using only LINQ, assuming the file name itself doesn't end with a digit:

List<string> names = new List<string> { "A.B.C.1.2.3.4.zip",
                                        "A.B.C.1.2.3.5.zip",
                                        "A.B.C.3.4.5.dll",
                                        "A.B.C.1.2.3.6.zip" ,
                                        "A.B.C.1.2.3.dll",
                                        "X.Y.Z.7.8.9.0.zip",
                                        "X.Y.Z.7.8.9.1.zip" };

var groupedFileNames = names.GroupBy(file => new string(Path.GetFileNameWithoutExtension(file)
                                                         .Reverse()
                                                         .SkipWhile(c => Char.IsDigit(c) || c == '.')
                                                         .Reverse().ToArray()));

foreach (var g in groupedFileNames)
{
    Console.WriteLine(g.Key);
    foreach (var file in g)
        Console.WriteLine("    " + file);
}
w.b
  • Won't work if you have `Some.2.File.3.Thing.3.4.5.zip`. The last 3 or 4 places are what represents the semantic version. – Mark Richman Aug 10 '15 at 17:13
0

You can use `new Version()` to compare versions, like this:

List<string> fileNames = new List<string>();
fileNames.AddRange(new[] {
    "A.B.C.1.2.3.4.zip",
    "A.B.C.1.2.3.5.zip",
    "A.B.C.3.4.5.dll",
    "A.B.C.1.2.3.6.zip",
    "A.B.C.1.2.3.dll",
    "X.Y.Z.7.8.9.0.zip",
    "X.Y.Z.7.8.9.1.zip" });

string filter = "a.b.c";

var files = fileNames
    // Keep only the file names that start with the filter
    .Where(f => f.StartsWith(filter, StringComparison.InvariantCultureIgnoreCase))
    // Extract the version number (this assumes a three-character extension)
    // and create a new Version to order by
    .OrderBy(f => new Version(f.Substring(filter.Length + 1, f.Length - filter.Length - 5)));

Results

Juan
0

First of all, I think you can use the `Version` class for comparison. I believe the function below will get you the versions of the files whose names start with a given name. It matches the starting name, then performs a non-greedy search up to a dot followed by a number and 2 or 3 further dot-number pairs, then a dot and any characters after it.

public static List<Version> GetLibraryVersions(List<string> files, string Name)
{
    string regexPattern = String.Format(@"\A{0}(?:.*?)(?:\.)(\d+(?:\.\d+){{2,3}})(?:\.)(?:.*)\Z", Regex.Escape(Name));
    Regex regex = new Regex(regexPattern);
    return files.Where(f => regex.Match(f).Success).
            Select(f => new Version(regex.Match(f).Groups[1].Value)).ToList();
}
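
For the bonus part of the question (finding the highest version), a similar regex can be combined with `Version` ordering. The sketch below is only an illustration under stated assumptions: `HighestVersion` and `Find` are hypothetical names, and unlike the function above, this variant requires the version to follow the name directly, so "A.B.C.D.1.2.3.4.zip" does not count as a match for "A.B.C".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class HighestVersion
{
    // Hypothetical helper: returns the highest version among the files named
    // "<name>.<3 or 4 numeric parts>.<extension>", or null if none match.
    public static Version Find(IEnumerable<string> files, string name)
    {
        // Escaped name, a dot, 3 or 4 dot-separated numbers, a dot, then the extension.
        var regex = new Regex(@"\A" + Regex.Escape(name) + @"\.(\d+(?:\.\d+){2,3})\.[^.]+\z");

        return files
            .Select(f => regex.Match(f))
            .Where(m => m.Success)
            .Select(m => new Version(m.Groups[1].Value))
            .OrderByDescending(v => v)   // Version implements IComparable
            .FirstOrDefault();
    }
}
```

For the file names in the question, `HighestVersion.Find(files, "A.B.C")` returns version 3.4.5, since the major component is compared first.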
laphbyte