179

I want to add batch file renaming to my application. A user can type a destination filename pattern and, after replacing some wildcards in the pattern, I need to check whether the result is a legal filename under Windows. I've tried a regular expression like [a-zA-Z0-9_]+, but it excludes many language-specific characters (e.g. umlauts). What is the best way to do such a check?

Luke Girvin
tomash

27 Answers

139

From MSDN's "Naming a File or Directory," here are the general conventions for what a legal file name is under Windows:

You may use any character in the current code page (Unicode/ANSI above 127), except:

  • < > : " / \ | ? *
  • Characters whose integer representations are 0-31 (less than ASCII space)
  • Any other character that the target file system does not allow (say, trailing periods or spaces)
  • Any of the DOS names: CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (and avoid AUX.txt, etc)
  • The file name is all periods

Some optional things to check:

  • File paths (including the file name) may not have more than 260 characters (unless they use the \\?\ prefix)
  • Unicode file paths (including the file name) may not have more than about 32,000 characters when using \\?\ (MSDN gives the limit as approximately 32,767; note that the prefix may expand directory components and cause the path to overflow that limit)
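The rules in the list above can be sketched as a single check. This is a minimal sketch, not a complete validator: the class and method names (`FileNameRules`, `IsLikelyValid`) are made up for illustration, path-length limits are not checked, and the target file system may impose further restrictions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FileNameRules
{
    // Characters the list above names as reserved.
    private static readonly char[] ReservedChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    // DOS device names from the list above, compared case-insensitively.
    private static readonly HashSet<string> DeviceNames = new HashSet<string>(
        new[] { "CON", "PRN", "AUX", "NUL" }
            .Concat(Enumerable.Range(0, 10).Select(i => "COM" + i))
            .Concat(Enumerable.Range(0, 10).Select(i => "LPT" + i)),
        StringComparer.OrdinalIgnoreCase);

    // Hypothetical helper: checks a bare file name (no directory part).
    public static bool IsLikelyValid(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;

        // Reserved punctuation and control characters (0-31).
        if (name.IndexOfAny(ReservedChars) >= 0) return false;
        if (name.Any(c => c < 32)) return false;

        // Trailing period or space; this also rejects names that are all periods.
        char last = name[name.Length - 1];
        if (last == '.' || last == ' ') return false;

        // Device names are reserved with or without an extension (AUX, AUX.txt).
        int dot = name.IndexOf('.');
        string stem = dot < 0 ? name : name.Substring(0, dot);
        return !DeviceNames.Contains(stem);
    }
}
```

Under these assumptions, `IsLikelyValid("AUX.txt")` and `IsLikelyValid("name.")` return false, while `IsLikelyValid(".gitignore")` and `IsLikelyValid("Grüße.txt")` return true.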
rory.ap
user7116
  • 12
    +1 for including reserved filenames - those were missed in previous answers. – SqlRyan Apr 20 '09 at 14:41
  • 2
    "AUX" is a perfectly usable filename if you use the "\\?\" syntax. Of course, programs that don't use that syntax have real problems dealing with it... (Tested on XP) – user9876 Dec 02 '09 at 13:19
  • 14
    The correct regex for all these conditions mentioned above is as below:`Regex unspupportedRegex = new Regex("(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|(([\\. ]+)", RegexOptions.IgnoreCase);` – whywhywhy Feb 19 '15 at 05:58
  • 5
    @whywhywhy I think you've got an extra opening bracket in that Regex. "(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|([\\. ]+)" worked for me. – Wilky Aug 20 '15 at 21:00
  • 1
    Wilky: your regex will also remove "." within the filename which are perfectly valid. – Hyndrix Nov 08 '15 at 15:12
  • This is better: ``(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\"|/<>])+)|(^([\\.]+))`` – Hyndrix Nov 08 '15 at 15:16
  • All regexes above reject filenames that begin with '.', which is allowed by the OS. – dlf Apr 05 '16 at 17:12
  • 2
    Depends on how you define "allowed". _Windows_ allows filenames that begin with a dot but _Explorer_ does not let you name a file as such, unless it also has an extension. For example, `.foo` is not allowed, but `.foo.bar` is. – Rich Jenks Apr 12 '16 at 19:57
  • 8
    I read the same article mentioned in this answer and found through experimentation that COM0 and LPT0 are also not allowed. @dlf this one works with filenames that begin with '.': `^(?!^(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)(?:\.*?(?!\.))[^\x00-\x1f\\?*:\";|\/<>]+(?<![\s.])$` – mejdev May 14 '16 at 16:43
  • 1
    Is there a library which handles all of these cases? – nawfal Apr 03 '17 at 14:41
  • 1
    @papaiatis -- "CLOCK$" works just fine for me. Windows 7. – rory.ap Aug 25 '17 at 17:22
  • BTW "the file name is all periods" rule is already contained in "trailing periods or spaces rule" – Oleg Savelyev Aug 13 '18 at 14:24
  • None of those regexes work properly. – Igor Levicki Feb 24 '23 at 01:25
102

You can get a list of invalid characters from Path.GetInvalidPathChars and GetInvalidFileNameChars.

UPD: See Steve Cooper's suggestion on how to use these in a regular expression.

UPD2: Note that according to the Remarks section in MSDN "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." The answer provided by sixlettervaliables goes into more details.

Community
Eugene Katz
  • 11
    This does not answer the question; there are many strings consisting only of valid characters (e.g. "....", "CON", strings hundreds of chars long) that are not valid filenames. – Dour High Arch Jul 21 '13 at 17:57
  • 49
    Anyone else disappointed that MS doesn't provide system level function/API for this capability instead of each developer has to cook his/her own solution? Wondering if there's a very good reason for this or just an oversight on MS part. – Thomas Nguyen Mar 21 '14 at 17:29
  • @High Arch: See answer for question "In C# check that filename is *possibly* valid (not that it exists)". (Although some clever guys closed that question in favour of this one...) – mmmmmmmm Oct 20 '15 at 19:02
69

For .NET Framework versions prior to 3.5 this should work:

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.InvalidPathChars field:

bool IsValidFilename(string testName)
{
    // InvalidPathChars is a char[], so wrap it in a string before escaping it.
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}

For .NET Framework versions after 3.0 this should work:

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars(v=vs.90).aspx

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.GetInvalidPathChars() method:

bool IsValidFilename(string testName)
{
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}

Once you know that, you should also check for different formats, e.g. c:\my\drive and \\server\share\dir\file.ext

David Clarke
Steve Cooper
  • doesn't this only test the path, not the filename? – Eugene Katz Sep 17 '08 at 12:57
  • 31
    string strTheseAreInvalidFileNameChars = new string( System.IO.Path.GetInvalidFileNameChars() ) ;     Regex regFixFileName = new Regex("[" + Regex.Escape(strTheseAreInvalidFileNameChars ) + "]"); – rao Oct 19 '10 at 14:36
  • 2
    A little research from people would work wonders. I've updated the post to reflect the changes. – Erik Philips Dec 22 '13 at 06:03
  • 1
    2nd piece of code doesn't compile. "Cannot convert from char[] to string – Paul Hunt Apr 25 '14 at 12:03
  • 1
    +1 for the code, but please replace `Path.GetInvalidPathChars()` with `Path.GetInvalidFileNameChars()` as `Path.GetInvalidPathChars()` is obsolete now – Ashkan Mobayen Khiabani Nov 02 '17 at 08:44
  • 1
    @AshkanMobayenKhiabani: InvalidPathChars is obsolete but GetInvalidPathChars does not. – IvanH Mar 31 '20 at 13:27
26

Try to use it, and trap for the error. The allowed set may change across file systems, or across different versions of Windows. In other words, if you want to know if Windows likes the name, hand it the name and let it tell you.

  • 2
    This seems to be the only one that tests against all constraints. Why are the other answers being chosen over this? – gap Mar 07 '12 at 14:53
  • 5
    @gap because it doesn't always work. For example, trying to access CON will often succeed, even though it's not a real file. – Antimony Oct 01 '12 at 14:51
  • 4
    It's always better to avoid the memory overhead of throwing an Exception, where possible, though. – Owen Blacker Oct 02 '12 at 15:40
  • 2
    Also, you might not have permissions to access it; e.g. to test it by writing, even if you can read it if it does or will exist. – CodeLurker Jul 25 '17 at 10:00
  • @OwenBlacker That's needless preoptimization. – arkon Sep 26 '22 at 18:45
24

This class cleans filenames and paths; use it like

var myCleanPath = PathSanitizer.SanitizeFilename(myBadPath, ' ');

Here's the code;

/// <summary>
/// Cleans paths of invalid characters.
/// </summary>
public static class PathSanitizer
{
    /// <summary>
    /// The set of invalid filename characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidFilenameChars;
    /// <summary>
    /// The set of invalid path characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidPathChars;

    static PathSanitizer()
    {
        // set up the two arrays -- sorted once for speed.
        invalidFilenameChars = System.IO.Path.GetInvalidFileNameChars();
        invalidPathChars = System.IO.Path.GetInvalidPathChars();
        Array.Sort(invalidFilenameChars);
        Array.Sort(invalidPathChars);

    }

    /// <summary>
    /// Cleans a filename of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns></returns>
    public static string SanitizeFilename(string input, char errorChar)
    {
        return Sanitize(input, invalidFilenameChars, errorChar);
    }

    /// <summary>
    /// Cleans a path of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns></returns>
    public static string SanitizePath(string input, char errorChar)
    {
        return Sanitize(input, invalidPathChars, errorChar);
    }

    /// <summary>
    /// Cleans a string of invalid characters.
    /// </summary>
    /// <param name="input"></param>
    /// <param name="invalidChars"></param>
    /// <param name="errorChar"></param>
    /// <returns></returns>
    private static string Sanitize(string input, char[] invalidChars, char errorChar)
    {
        // null always sanitizes to null
        if (input == null) { return null; }
        StringBuilder result = new StringBuilder();
        foreach (var characterToTest in input)
        {
            // we binary search for the character in the invalid set. This should be lightning fast.
            if (Array.BinarySearch(invalidChars, characterToTest) >= 0)
            {
                // we found the character in the array of invalid characters, so replace it.
                result.Append(errorChar);
            }
            else
            {
                // the character was not found in invalid, so it is valid.
                result.Append(characterToTest);
            }
        }

        // we're done.
        return result.ToString();
    }

}
Steve Cooper
  • 1
    your answer could be better fit here:http://stackoverflow.com/questions/146134/how-to-remove-illegal-characters-from-path-and-filenames?lq=1 – nawfal Jun 12 '13 at 12:37
23

This is what I use:

    public static bool IsValidFileName(this string expression, bool platformIndependent)
    {
        string sPattern = @"^(?!^(PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d|\..*)(\..+)?$)[^\x00-\x1f\\?*:\"";|/]+$";
        if (platformIndependent)
        {
           sPattern = @"^(([a-zA-Z]:|\\)\\)?(((\.)|(\.\.)|([^\\/:\*\?""\|<>\. ](([^\\/:\*\?""\|<>\. ])|([^\\/:\*\?""\|<>]*[^\\/:\*\?""\|<>\. ]))?))\\)*[^\\/:\*\?""\|<>\. ](([^\\/:\*\?""\|<>\. ])|([^\\/:\*\?""\|<>]*[^\\/:\*\?""\|<>\. ]))?$";
        }
        return (Regex.IsMatch(expression, sPattern, RegexOptions.CultureInvariant));
    }

The first pattern creates a regular expression containing the invalid/illegal file names and characters for Windows platforms only. The second one does the same but ensures that the name is legal for any platform.

Spook
Scott Dorman
  • 4
    sPattern regex doesn't allow files started with period character. But [MSDN says](http://msdn.microsoft.com/en-us/library/aa365247.aspx) "it is acceptable to specify a period as the first character of a name. For example, ".temp"". I would remove "\..*" to make .gitignore correct file name :) – yar_shukan Sep 10 '14 at 13:51
  • 1
    (I have incrementally made this better and deleted prev comments I left) This one is better than the answer's regex because it allows ".gitignore", "..asdf", doesn't allow '<' and '>' or the yen sign, and doesn't allow space or period at the end (which disallows names consisting only of dots): `@"^(?!(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)[^\x00-\x1F\xA5\\?*:\"";|\/<>]+(?<![\s.])$"` – mejdev May 14 '16 at 19:40
  • this fails for all files I tested. running it for C:\Windows\System32\msxml6.dll reports false. – magicandre1981 Jun 14 '16 at 08:10
  • @magicandre1981 You need to give it just the file name, not the fully qualified path. – Scott Dorman Jun 15 '16 at 02:39
  • ok, but I need to check if the full path is valid. I used now a different solution. – magicandre1981 Jun 15 '16 at 04:05
  • Your pattern fails on `.foo.bar`. – rory.ap Aug 25 '17 at 17:43
  • It also allows `<` and `>`. – rory.ap Aug 25 '17 at 17:52
19

One corner case to keep in mind, which surprised me when I first found out about it: Windows allows leading space characters in file names! For example, the following are all legal, and distinct, file names on Windows (minus the quotes):

"file.txt"
" file.txt"
"  file.txt"

One takeaway from this: Use caution when writing code that trims leading/trailing whitespace from a filename string.
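As a small illustration of that caution, here is a sketch that strips only trailing spaces and periods (which the top answer lists as disallowed) and leaves leading spaces alone. The names `FileNameTrim` and `TrimTrailingOnly` are hypothetical, and whether to trim at all is a design choice for your application.

```csharp
public static class FileNameTrim
{
    // Removes only trailing spaces and periods; leading spaces are
    // significant on Windows, so they are kept intact.
    public static string TrimTrailingOnly(string name)
    {
        return name == null ? null : name.TrimEnd(' ', '.');
    }
}
```

By contrast, calling `Trim()` on " file.txt" would silently turn it into a different file name.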

Jon Schneider
10

Simplifying Eugene Katz's answer:

bool IsFileNameCorrect(string fileName){
    return !fileName.Any(f=>Path.GetInvalidFileNameChars().Contains(f));
}

Or

bool IsFileNameCorrect(string fileName){
    return fileName.All(f=>!Path.GetInvalidFileNameChars().Contains(f));
}
tmt
  • Did you mean : "return !fileName.Any(f=>Path.GetInvalidFileNameChars().Contains(f));" ? – Jack Griffin May 23 '18 at 17:27
  • @JackGriffin Of course! Thank you for your attentiveness. – tmt May 24 '18 at 10:53
  • 1
    While this code is very nice to read, we should take into account the sorry internals of `Path.GetInvalidFileNameChars`. Take a look here: https://referencesource.microsoft.com/#mscorlib/system/io/path.cs,289 - for each character of your `fileName`, a clone of the array is created. – Piotr Zierhoffer Mar 12 '20 at 11:45
  • "DD:\\\\\AAA.....AAAA". Not valid, but for your code, it is. – Ciccio Pasticcio Jul 02 '20 at 20:04
  • Are you really comfortable with calling Path.GetInvalidFileNameChars() for each character of the file name? – Koray Aug 25 '22 at 07:56
8

Microsoft Windows: Windows kernel forbids the use of characters in range 1-31 (i.e., 0x01-0x1F) and characters " * : < > ? \ |. Although NTFS allows each path component (directory or filename) to be 255 characters long and paths up to about 32767 characters long, the Windows kernel only supports paths up to 259 characters long. Additionally, Windows forbids the use of the MS-DOS device names AUX, CLOCK$, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, CON, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9, NUL and PRN, as well as these names with any extension (for example, AUX.txt), except when using Long UNC paths (ex. \\.\C:\nul.txt or \\?\D:\aux\con). (In fact, CLOCK$ may be used if an extension is provided.) These restrictions only apply to Windows - Linux, for example, allows use of " * : < > ? \ | even in NTFS.

Source: http://en.wikipedia.org/wiki/Filename

Martin Faartoft
7

Rather than explicitly include all possible characters, you could do a regex to check for the presence of illegal characters, and report an error then. Ideally your application should name the files exactly as the user wishes, and only cry foul if it stumbles across an error.

David Clarke
ConroyP
6

I use this to get rid of invalid characters in filenames without throwing exceptions:

private static readonly Regex InvalidFileRegex = new Regex(
    string.Format("[{0}]", Regex.Escape(@"<>:""/\|?*")));

public static string SanitizeFileName(string fileName)
{
    return InvalidFileRegex.Replace(fileName, string.Empty);
}
JoelFan
6

The question is whether you are trying to determine if a path name is a legal Windows path, or if it's legal on the system where the code is running. I think the latter is more important, so personally, I'd probably decompose the full path and try to use _mkdir to create the directory the file belongs in, then try to create the file.

This way you know not only if the path contains only valid Windows characters, but if it actually represents a path that can be written by this process.

kfh
5

Also CON, PRN, AUX, NUL, COM# and a few others are never legal filenames in any directory with any extension.

4

From MSDN, here's a list of characters that aren't allowed:

Use almost any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:

  • The following reserved characters are not allowed: < > : " / \ | ? *
  • Characters whose integer representations are in the range from zero through 31 are not allowed.
  • Any other character that the target file system does not allow.
Bondolin
Mark Biek
4

To complement the other answers, here are a couple of additional edge cases that you might want to consider.

Joe
3

This is an already answered question, but just for the sake of "Other options", here's a non-ideal one:

(non-ideal because using Exceptions as flow control is a "Bad Thing", generally)

public static bool IsLegalFilename(string name)
{
    try 
    {
        var fileInfo = new FileInfo(name);
        return true;
    }
    catch
    {
        return false;
    }
}
JerKimball
2

Also the destination file system is important.

Under NTFS, some files cannot be created in specific directories, e.g. $Boot in the root.

Dominik Weber
2

Regular expressions are overkill for this situation. You can use the String.IndexOfAny() method in combination with Path.GetInvalidPathChars() and Path.GetInvalidFileNameChars().

Also note that both Path.GetInvalidXXX() methods clone an internal array and return the clone. So if you're going to be doing this a lot (thousands and thousands of times) you can cache a copy of the invalid chars array for reuse.
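A minimal sketch of that approach, with the invalid-character array fetched once and reused. The class and method names here (`InvalidCharCheck`, `HasInvalidFileNameChar`) are made up for illustration.

```csharp
using System;
using System.IO;

public static class InvalidCharCheck
{
    // Path.GetInvalidFileNameChars() returns a fresh copy on each call,
    // so fetch it once and reuse it.
    private static readonly char[] InvalidFileNameChars =
        Path.GetInvalidFileNameChars();

    // Hypothetical helper: true if the name contains any invalid character.
    public static bool HasInvalidFileNameChar(string fileName)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));
        return fileName.IndexOfAny(InvalidFileNameChars) >= 0;
    }
}
```

Keep in mind that the returned set depends on the platform the code runs on, and that a name made only of valid characters (such as CON) can still be an invalid file name.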

nhahtdh
1

Many of these answers will not work if the filename is too long and the code is running in a pre-Windows 10 environment. Similarly, think about what you want to do with periods: allowing leading or trailing periods is technically valid, but can create problems if you do not want the file to be difficult to see or delete, respectively.

This is a validation attribute I created to check for a valid filename.

public class ValidFileNameAttribute : ValidationAttribute
{
    public ValidFileNameAttribute()
    {
        RequireExtension = true;
        ErrorMessage = "{0} is an Invalid Filename";
        MaxLength = 255; // superseded in modern Windows environments
    }
    public override bool IsValid(object value)
    {
        //http://stackoverflow.com/questions/422090/in-c-sharp-check-that-filename-is-possibly-valid-not-that-it-exists
        var fileName = (string)value;
        if (string.IsNullOrEmpty(fileName)) { return true;  }
        if (fileName.IndexOfAny(Path.GetInvalidFileNameChars()) > -1 ||
            (!AllowHidden && fileName[0] == '.') ||
            fileName[fileName.Length - 1]== '.' ||
            fileName.Length > MaxLength)
        {
            return false;
        }
        string extension = Path.GetExtension(fileName);
        return (!RequireExtension || extension != string.Empty)
            && (ExtensionList==null || ExtensionList.Contains(extension));
    }
    private const string _sepChar = ",";
    private IEnumerable<string> ExtensionList { get; set; }
    public bool AllowHidden { get; set; }
    public bool RequireExtension { get; set; }
    public int MaxLength { get; set; }
    public string AllowedExtensions {
        get { return string.Join(_sepChar, ExtensionList); } 
        set {
            if (string.IsNullOrEmpty(value))
            { ExtensionList = null; }
            else {
                ExtensionList = value.Split(new char[] { _sepChar[0] })
                    .Select(s => s[0] == '.' ? s : ('.' + s))
                    .ToList();
            }
    } }

    public override bool RequiresValidationContext => false;
}

and the tests

[TestMethod]
public void TestFilenameAttribute()
{
    var rxa = new ValidFileNameAttribute();
    Assert.IsFalse(rxa.IsValid("pptx."));
    Assert.IsFalse(rxa.IsValid("pp.tx."));
    Assert.IsFalse(rxa.IsValid("."));
    Assert.IsFalse(rxa.IsValid(".pp.tx"));
    Assert.IsFalse(rxa.IsValid(".pptx"));
    Assert.IsFalse(rxa.IsValid("pptx"));
    Assert.IsFalse(rxa.IsValid("a/abc.pptx"));
    Assert.IsFalse(rxa.IsValid("a\\abc.pptx"));
    Assert.IsFalse(rxa.IsValid("c:abc.pptx"));
    Assert.IsFalse(rxa.IsValid("c<abc.pptx"));
    Assert.IsTrue(rxa.IsValid("abc.pptx"));
    rxa = new ValidFileNameAttribute { AllowedExtensions = ".pptx" };
    Assert.IsFalse(rxa.IsValid("abc.docx"));
    Assert.IsTrue(rxa.IsValid("abc.pptx"));
}
Brent
1

If you're only trying to check if a string holding your file name/path has any invalid characters, the fastest method I've found is to use Split() to break up the file name into an array of parts wherever there's an invalid character. If the result is only an array of 1, there are no invalid characters. :-)

var nameToTest = "Best file name \"ever\".txt";
bool isInvalidName = nameToTest.Split(System.IO.Path.GetInvalidFileNameChars()).Length > 1;

var pathToTest = "C:\\My Folder <secrets>\\";
bool isInvalidPath = pathToTest.Split(System.IO.Path.GetInvalidPathChars()).Length > 1;

I tried running this and other methods mentioned above on a file/path name 1,000,000 times in LinqPad.

Using Split() is only ~850ms.

Using Regex("[" + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]") is around 6 seconds.

The more complicated regular expressions fare MUCH worse, as do some of the other options, like using the various methods on the Path class to get the file name and letting their internal validation do the job (most likely due to the overhead of exception handling).

Granted, it's not very often you need to validate 1 million file names, so a single iteration is fine for most of these methods anyway. But it's still pretty efficient and effective if you're only looking for invalid characters.

Nick Albrecht
1

I got this idea from someone; I don't know who. Let the OS do the heavy lifting.

public bool IsPathFileNameGood(string fname)
{
    bool rc = Constants.Fail;
    try
    {
        this._stream = new StreamWriter(fname, true);
        rc = Constants.Pass;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message, "Problem opening file");
        rc = Constants.Fail;
    }
    return rc;
}
Luka Čelebić
KenR
0

I suggest just using Path.GetFullPath():

// Placeholder value for illustration; substitute the path you want to check.
string targetFileFullNameToBeChecked = "example.txt";
try
{
  Path.GetFullPath(targetFileFullNameToBeChecked);
}
catch (ArgumentException ex)
{
  // invalid chars found
}
Tony Sun
  • Add some explanation with answer for how this answer help OP in fixing current issue – ρяσѕρєя K Jan 10 '17 at 08:30
  • See the MSDN doc for ArgumentException; it reads: path is a zero-length string, contains only white space, or contains one or more of the invalid characters defined in GetInvalidPathChars. -or- The system could not retrieve the absolute path. – Tony Sun Apr 27 '17 at 11:12
  • In theory (according to the docs) this should work, problem is though at least in .NET Core 3.1, it does not. – Michel Jansson Apr 15 '20 at 11:11
0

My attempt:

using System.IO;

static class PathUtils
{
  public static string IsValidFullPath([NotNull] string fullPath)
  {
    if (string.IsNullOrWhiteSpace(fullPath))
      return "Path is null, empty or white space.";

    bool pathContainsInvalidChars = fullPath.IndexOfAny(Path.GetInvalidPathChars()) != -1;
    if (pathContainsInvalidChars)
      return "Path contains invalid characters.";

    string fileName = Path.GetFileName(fullPath);
    if (fileName == "")
      return "Path must contain a file name.";

    bool fileNameContainsInvalidChars = fileName.IndexOfAny(Path.GetInvalidFileNameChars()) != -1;
    if (fileNameContainsInvalidChars)
      return "File name contains invalid characters.";

    if (!Path.IsPathRooted(fullPath))
      return "The path must be absolute.";

    return "";
  }
}

This is not perfect because Path.GetInvalidPathChars does not return the complete set of characters that are invalid in file and directory names, and of course there are plenty more subtleties.

So I use this method as a complement:

public static bool TestIfFileCanBeCreated([NotNull] string fullPath)
{
  if (string.IsNullOrWhiteSpace(fullPath))
    throw new ArgumentException("Value cannot be null or whitespace.", "fullPath");

  string directoryName = Path.GetDirectoryName(fullPath);
  if (directoryName != null) Directory.CreateDirectory(directoryName);
  try
  {
    using (new FileStream(fullPath, FileMode.CreateNew)) { }
    File.Delete(fullPath);
    return true;
  }
  catch (IOException)
  {
    return false;
  }
}

It tries to create the file and returns false if there is an exception. Of course, I need to create the file, but I think it's the safest way to do that. Please also note that I am not deleting directories that have been created.

You can also use the first method to do basic validation, and then handle carefully the exceptions when the path is used.

Maxence
0

This check

static bool IsValidFileName(string name)
{
    return
        !string.IsNullOrWhiteSpace(name) &&
        name.IndexOfAny(Path.GetInvalidFileNameChars()) < 0 &&
        !Path.GetFullPath(name).StartsWith(@"\\.\");
}

filters out names with invalid chars (<>:"/\|?* and ASCII 0-31), as well as reserved DOS devices (CON, NUL, COMx). It allows leading spaces and all-dot-names, consistent with Path.GetFullPath. (Creating file with leading spaces succeeds on my system).


Used .NET Framework 4.7.1, tested on Windows 7.

Vlad
0

Windows filenames are pretty unrestrictive, so really it might not even be that much of an issue. The characters that are disallowed by Windows are:

\ / : * ? " < > |

You could easily write an expression to check if those characters are present. A better solution though would be to try and name the files as the user wants, and alert them when a filename doesn't stick.

Justin Poliey
-1

One-liner for checking for illegal chars in the string:

public static bool IsValidFilename(string testName) => !Regex.IsMatch(testName, "[" + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
Zananok
-1

In my opinion, the only proper answer to this question is to try to use the path and let the OS and filesystem validate it. Otherwise you are just reimplementing (and probably poorly) all the validation rules that the OS and filesystem already use and if those rules are changed in the future you will have to change your code to match them.

Igor Levicki
  • a name might be valid and still not work as a file, see answers above. – Hefaistos68 Aug 31 '22 at 17:01
  • I know exactly what you are talking about. But please check how Windows manages file names (some are actually devices). as I said, read the comments above that explain in details already why this approach can not be used. And I am not even starting to talk about security and auditing issues with this approach. – Hefaistos68 Sep 02 '22 at 11:05
  • I use Windows API since 1997 so I am well aware that some file names can actually be devices. Same goes for `/dev/null` on linux, but that is still valid input and output filename depending on what the user wants / is allowed to do. You can "validate" the filename with the same level of confidence you can "validate" an email address using regex, but you won't know whether you can create the file / send the email until you actually try to do it. Therefore, such validation is entirely pointless, not to mention it is not portable and can't possibly cover all present and future filesystems. – Igor Levicki Sep 02 '22 at 18:06