55
public static boolean isValidName(String text)
{
    Pattern pattern = Pattern.compile("^[^/./\\:*?\"<>|]+$");
    Matcher matcher = pattern.matcher(text);
    boolean isMatch = matcher.matches();
    return isMatch;
}

Does this method guarantee a valid filename on Windows?

Eng.Fouad
  • Why have you included `.` in the list of forbidden characters? – OpenSauce Jul 18 '11 at 11:47
  • @OpenSauce Try renaming a file to `.....` – Eng.Fouad Jul 18 '11 at 23:03
  • OK, but this method also rejects `file.txt`. Or are you not considering the extension as part of the name? – OpenSauce Jul 19 '11 at 07:40
  • Of course, I will handle the extension outside of this method. – Eng.Fouad Jul 19 '11 at 07:42
  • 5
    in that case `.....` is a valid name, since `......txt` is OK ; ) – OpenSauce Jul 19 '11 at 07:45
  • In unix, the pattern itself "^[^/./\\:*?\"<>|]+$" would be a valid filepath. – dagnelies Jul 25 '11 at 23:14
  • 2
    What do you consider to be a "file name"? Does it include extensions? Path? Named streams + attributes? Default streams? The drive letter? The beginning backslash? What about folders and files like `C:\$Extend\$RmMetadata\$Txf` on NTFS which can be opened on e.g. XP but not on Windows 7? Or `C:\$Boot`? etc. etc. – user541686 Jul 27 '11 at 07:30
  • This question is somewhat similar to [Java - How to find out whether a File name is valid?](http://stackoverflow.com/questions/893977/java-how-to-find-out-whether-a-file-name-is-valid). – Barbarrosa Jul 28 '11 at 04:01

11 Answers

93

Given the requirements specified in the previously cited MSDN documentation, the following regex should do a pretty good job:

public static boolean isValidName(String text)
{
    Pattern pattern = Pattern.compile(
        "# Match a valid Windows filename (unspecified file system).          \n" +
        "^                                # Anchor to start of string.        \n" +
        "(?!                              # Assert filename is not: CON, PRN, \n" +
        "  (?:                            # AUX, NUL, COM1, COM2, COM3, COM4, \n" +
        "    CON|PRN|AUX|NUL|             # COM5, COM6, COM7, COM8, COM9,     \n" +
        "    COM[1-9]|LPT[1-9]            # LPT1, LPT2, LPT3, LPT4, LPT5,     \n" +
        "  )                              # LPT6, LPT7, LPT8, and LPT9...     \n" +
        "  (?:\\.[^.]*)?                  # followed by optional extension    \n" +
        "  $                              # and end of string                 \n" +
        ")                                # End negative lookahead assertion. \n" +
        "[^<>:\"/\\\\|?*\\x00-\\x1F]*     # Zero or more valid filename chars.\n" +
        "[^<>:\"/\\\\|?*\\x00-\\x1F\\ .]  # Last char is not a space or dot.  \n" +
        "$                                # Anchor to end of string.            ", 
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.COMMENTS);
    Matcher matcher = pattern.matcher(text);
    boolean isMatch = matcher.matches();
    return isMatch;
}

Note that this regex does not impose any limit on the length of the filename, but a real filename may be limited to 260 or 32767 chars depending on the platform.
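If the check is called often, the pattern can be compiled once and reused. A minimal sketch, assuming the same rules as the regex above but written on fewer lines without the COMMENTS layout; the class name `WindowsFileNames` is made up for illustration:

```java
import java.util.regex.Pattern;

final class WindowsFileNames {
    // Same rules as the commented regex above, compiled once at class load.
    // (WindowsFileNames is a hypothetical name, not part of any library.)
    private static final Pattern VALID_NAME = Pattern.compile(
        "^(?!(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\\.[^.]*)?$)" // not a device name
        + "[^<>:\"/\\\\|?*\\x00-\\x1F]*"                           // valid filename chars
        + "[^<>:\"/\\\\|?*\\x00-\\x1F .]$",                        // last char: no space or dot
        Pattern.CASE_INSENSITIVE);

    private WindowsFileNames() {}

    static boolean isValidName(String text) {
        return text != null && VALID_NAME.matcher(text).matches();
    }
}
```

For example, `WindowsFileNames.isValidName("con.txt")` is rejected by the lookahead, while `"COM10"` passes because only `COM1` through `COM9` are listed.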

ridgerunner
  • 12
    You may also consider compiling the pattern only once if it is likely to be called multiple times. – Klaimmore Jul 27 '11 at 05:16
  • L"\\?\" is the prefix for > 256 characters. – MSN Aug 16 '11 at 06:00
  • 1
    This validates a path, not a simple file name. – Renato May 18 '12 at 02:30
  • Does it work? ^(?!(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\\.[^.]*)?$)[^<>:\"/\\\\|?*\\x00-\\x1F]*[^<>:\"/\\\\|?*\\x00-\\x1F\\ .]$ I tried to test it in my program and in two Java regex online testers, but it fails. – villamejia Mar 14 '15 at 01:35
  • @villamejia - You need to handle the escapes (backslashes) very carefully. Each double escape in my regex String above is actually a single escape when it reaches the regex engine. In other words, for the online testers, you probably need to change each double escape into a single escape. Good luck. – ridgerunner Mar 14 '15 at 14:23
  • Now, if only the regex approach were powerful enough to produce "user friendly" error messages, like: "Your filename was rejected because it ended with a dot, which is illegal"... – Rekin Dec 16 '15 at 09:36
  • Brilliant answer overall, I didn't find a better solution. This also considers the path slashes as invalid how it should be – BullyWiiPlaza Jan 07 '16 at 18:21
  • Here is the one for java version: `^(?!(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\.[^.]*)?$)[^<>:\"/\\|?*\x00-\x1F]*[^<>:\"/\\|?*\x00-\x1F\.]$` – Vikas Jun 01 '16 at 06:45
27

Not enough. In Windows and DOS, some words are also reserved and cannot be used as filenames:

CON, PRN, AUX, CLOCK$, NUL
COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.

See:

http://en.wikipedia.org/wiki/Filename


Edit:

Windows usually limits file names to 260 characters. But the file name must actually be shorter than that, since the complete path (such as C:\Program Files\filename.txt) is included in this character count.

This is why you might occasionally encounter an error when copying a file with a very long file name to a location that has a longer path than its current location.
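The reserved-name and length checks described above can be sketched without a regex. This is an illustration only: the class and method names are made up, the name list deliberately leaves out `CLOCK$`, `COM0` and `LPT0` (which the comments below report as not reserved on current Windows versions), and the limit is the 260-character figure quoted above, treated as including one terminating character.

```java
import java.util.Locale;
import java.util.Set;

final class ReservedNames {
    // Assumed list: device names reserved on NT-based Windows.
    private static final Set<String> RESERVED = Set.of(
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9");

    // The classic limit applies to the complete path, not just the file name.
    static final int MAX_FULL_PATH = 260;

    private ReservedNames() {}

    // True if the part before the first dot is a reserved device name.
    static boolean isReserved(String fileName) {
        int dot = fileName.indexOf('.');
        String base = dot < 0 ? fileName : fileName.substring(0, dot);
        return RESERVED.contains(base.toUpperCase(Locale.ROOT));
    }

    // Assumes 'directory' has no trailing separator; +1 accounts for it.
    static boolean fitsInPath(String directory, String fileName) {
        return directory.length() + 1 + fileName.length() < MAX_FULL_PATH;
    }
}
```

This is why the same file name can be fine in `C:\Program Files` and too long a few directories deeper: `fitsInPath` depends on both arguments.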

Monday
  • 3
    So the pattern would be: `"^(?!(COM[0-9]|LPT[0-9]|CON|PRN|AUX|CLOCK\$|NUL)$)[^./\\:*?\"<>|]+$"` – Tim Pietzcker Jul 18 '11 at 08:46
  • `CLOCK$` is only a problem in MS-DOS, Windows 3.x/9x/Me. In NT-based Windows versions (NT 3/4, 2000, Server, XP/Vista/7/8/10), it isn't a reserved file name any more. It's also worth noting that, depending on the device drivers installed, there can actually be other reserved names under MS-DOS (and presumably older Windows versions as well), such as `EMMQXXX0`, `XMSXXXX0`, `MSCD000`, `CONFIG$` (the latter exists under Windows 95). Unless you need to support DOS or ancient Windows versions, `CLOCK$` and all these other names are non-issues today. – Simon Kissane Sep 07 '19 at 09:55
  • Also: in Windows 10 (and I think earlier versions too), `COM0` and `LPT0` are not reserved. But, in addition to this list, `CONIN$` and `CONOUT$` are. – Simon Kissane Sep 07 '19 at 11:17
15

Well, I think the following method would guarantee a valid file name:

public static boolean isValidName(String text)
{
    try
    {
        File file = new File(text);
        file.createNewFile();
        if(file.exists()) file.delete();
        return true;
    }
    catch(Exception ex){}
    return false;
}

What do you think?

Eng.Fouad
  • 2
    Creating file IO just to check the validity of a file name, and then not even using the file? – J. Steen Jul 21 '11 at 10:40
  • @J. Steen but is there a better way to do it? – Eng.Fouad Jul 21 '11 at 10:41
  • 4
    I question the need to do it beforehand at all. Why do you need to validate the file name, if you're not creating a file? =) – J. Steen Jul 21 '11 at 10:44
  • 7
    +1 I actually like the simplicity and 100% guaranteed-ness of it. Also, regarding permissions, if you don't have permissions, this approach means *all* filenames are invalid for you, which is still correct! – Bohemian Jul 23 '11 at 04:28
  • @Eng.Fouad Xeno has entered an answer based on your approach, with a slight tweak. – Andrew Thompson Jul 25 '11 at 17:17
  • 1
    I don't think that is a good approach. Merely relying on the current machine doesn't guarantee a "valid filename on Windows". The best way is a regex that complies with the specification. – Klaimmore Jul 27 '11 at 05:02
  • What if this function is used to validate user input, what are you going to tell the user, when he has no permission to create file? This just doesn't feel right. – romario333 Aug 14 '11 at 18:07
  • 9
    this method has a security flaw: if a file already exists with the submitted name, it is deleted. As no steps are taken to prevent directory separators in the name, any file on the system which is writable to the calling process could be vulnerable to such an attack. The submitter doesn't specify context, but I consider a web application server likely, as this is what the majority of Java code does... – Jules Oct 31 '11 at 12:38
  • Kudos for awarding other answers with additional rep. IMHO, breaking out of the competitive mindset of StackOverflow to reward others for good work shows class. – Evan Plaice Apr 19 '12 at 17:37
  • I'd enhance this by checking if File.separator exists in filename (to prevent path manipulation) and check return value of file.createNewFile() before deleting a file (to avoid deleting existing file). – StephenNYC Sep 14 '16 at 18:55
14

A method that guarantees, generally, that a Windows filename is valid -- that it would be legal to create a file of that name -- would be impossible to implement.

It is relatively straightforward to guarantee that a Windows filename is invalid. Some of the other regexes attempt to do this. However, the original question requests a stronger assertion: a method that guarantees the filename is valid on Windows.

The MSDN reference cited in other answers indicates that a Windows filename cannot contain "Any other character that the target file system does not allow". For instance, a file name containing NUL would be invalid on some file systems, as would extended Unicode characters on some older file systems. Thus, a file called ☃.txt would be valid in some cases, but not others. So whether a hypothetical isValidName("☃") would return true depends on the underlying file system.

Suppose, however, such a function is conservative and requires the filename consist of printable ASCII characters. All modern versions of Windows natively support NTFS, FAT32, and FAT16 file formats, which accept Unicode filenames. But drivers for arbitrary filesystems can be installed, and one is free to create a filesystem that doesn't allow, for instance, the letter 'n'. Thus, not even a simple file like "snowman.txt" can be "guaranteed" to be valid.

But even with extreme cases aside, there are other complications. For instance, a file named "$LogFile" cannot exist in the root of an NTFS volume, but can exist elsewhere on the volume. Thus, without knowing the directory, we cannot know if "$LogFile" is a valid name. But even "C:\data\$LogFile" might be invalid if, say, "c:\data\" is a symbolic link to another NTFS volume root. (Similarly, "D:\$LogFile" can be valid if D: is an alias to a subdirectory of an NTFS volume.)

There are even more complications. Alternate data streams on files, for instance, are legal on NTFS volumes, so "snowman.txt:☃" may be valid. All three major Windows file systems have path length restrictions, so the validity of the file name is also a function of the path. But the length of the physical path might not even be available to isValidName if the path is a virtual alias, mapped network drive, or symbolic link rather than a physical path on the volume.

Some others have suggested an alternative: create a file by the proposed name and then delete it, returning true if and only if the creation succeeds. This approach has several practical and theoretical problems. One, as indicated earlier, is that the validity is a function both of the filename and the path, so the validity of c:\test\☃.txt might differ from the validity of c:\test2\☃.txt. Also, the function could fail to write the file for any number of reasons unrelated to the validity of the name, such as not having write permission to the directory. A third flaw is that the validity of a filename is not required to be deterministic: a hypothetical file system might, for instance, not allow a deleted file to be replaced, or (in theory) could even randomly decide if a filename is valid.

As an alternative, it's fairly straightforward to create a method isInvalidFileName(String text) that returns true if the name is guaranteed not to be valid in Windows; filenames like "aux", "*", and "abc.txt." would return true. The file-create operation would first call this check and, if it returns true, stop. Otherwise, the method could attempt to create the file, while being prepared for the edge case where the file cannot be created because the filename is invalid after all.
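A rough sketch of that two-step approach, under a few stated assumptions: the class name `SafeCreate` and the method `tryCreate` are hypothetical, the "definitely invalid" rules are limited to reserved device names, the characters the MSDN list forbids, and a trailing dot or space, and `:` is treated as invalid even though it can introduce an alternate data stream on NTFS.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.regex.Pattern;

final class SafeCreate {
    // Matches names that are certain to be rejected: a reserved device name
    // (with optional extension), any forbidden character, or a trailing dot/space.
    private static final Pattern DEFINITELY_INVALID = Pattern.compile(
        "^(?:CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(?:\\..*)?$"
        + "|[<>:\"/\\\\|?*\\x00-\\x1F]"
        + "|[ .]$",
        Pattern.CASE_INSENSITIVE);

    private SafeCreate() {}

    // true means "guaranteed invalid"; false only means "not ruled out".
    static boolean isInvalidFileName(String text) {
        return text.isEmpty() || DEFINITELY_INVALID.matcher(text).find();
    }

    // Cheap check first, then the real attempt, still prepared for failure.
    static boolean tryCreate(Path directory, String name) {
        if (isInvalidFileName(name)) {
            return false;
        }
        try {
            Files.createFile(directory.resolve(name));
            return true;
        } catch (InvalidPathException | IOException e) {
            // Invalid on this file system, already exists, no permission, ...
            return false;
        }
    }
}
```

Note the asymmetry: a `false` from `isInvalidFileName` promises nothing, so the caller still has to handle a failed creation.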

drf
8

Posting a new answer because I don't have the rep threshold to comment on Eng.Fouad's code:

public static boolean isValidName(String text)
{
    try
    {
        File file = new File(text);
        if(file.createNewFile()) file.delete();
        return true;
    }
    catch(Exception ex){}
    return false;
}

A small change to your answer that prevents deleting a pre-existing file. Files only get deleted if they were created during this method call, while the return value is the same.
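Two suggestions from the comments on this approach (probe inside the temp directory, and reject directory separators so the name cannot point elsewhere) could be combined roughly as follows. This is only a sketch: `TempProbe` and `canCreateInTemp` are made-up names, and the result still reflects the file system behind `java.io.tmpdir` on the current machine, not Windows in general.

```java
import java.io.File;
import java.io.IOException;

final class TempProbe {
    private TempProbe() {}

    static boolean canCreateInTemp(String name) {
        // Reject separators up front so the probe cannot leave the temp directory.
        if (name.isEmpty() || name.indexOf('/') >= 0 || name.indexOf('\\') >= 0) {
            return false;
        }
        File probe = new File(System.getProperty("java.io.tmpdir"), name);
        try {
            if (probe.createNewFile()) {
                probe.delete();      // only delete what this call created
                return true;
            }
            // Something with that name already exists; accept it only if it is a file.
            return probe.isFile();
        } catch (IOException | SecurityException e) {
            return false;
        }
    }
}
```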

Abdul Hfuda
  • 1
    What if this function is used to validate user input, what are you going to tell the user, when he has no permission to create file? This just doesn't feel right. – romario333 Aug 14 '11 at 18:07
  • @romario333 You are correct. But you might want to create a file in the temp directory, which at least *should* always be writable. – MC Emperor Jun 27 '15 at 15:26
7

Here you can find which file names are allowed.

The following characters are not allowed:

  • < (less than)
  • > (greater than)
  • : (colon)
  • " (double quote)
  • / (forward slash)
  • \ (backslash)
  • | (vertical bar or pipe)
  • ? (question mark)
  • * (asterisk)
  • Integer value zero, sometimes referred to as the ASCII NUL character.
  • Characters whose integer representations are in the range from 1 through 31, except for alternate data streams where these characters are allowed. For more information about file streams, see File Streams.
  • Any other character that the target file system does not allow.
phimuemue
  • 3
    Since the target file system can impose additional filename limits, the given regex does not *guarantee* a valid filename on Windows. – Sjoerd Jul 18 '11 at 07:59
6

This solution will only check if a given filename is valid according to the OS rules without creating a file.

You still need to handle other failures when actually creating the file (e.g. insufficient permissions, lack of drive space, security restrictions).

import java.io.File;
import java.io.IOException;

public class FileUtils {
  public static boolean isFilenameValid(String file) {
    File f = new File(file);
    try {
       f.getCanonicalPath();
       return true;
    }
    catch (IOException e) {
       return false;
    }
  }

  public static void main(String args[]) throws Exception {
    // true
    System.out.println(FileUtils.isFilenameValid("well.txt"));
    System.out.println(FileUtils.isFilenameValid("well well.txt"));
    System.out.println(FileUtils.isFilenameValid(""));

    //false
    System.out.println(FileUtils.isFilenameValid("test.T*T"));
    System.out.println(FileUtils.isFilenameValid("test|.TXT"));
    System.out.println(FileUtils.isFilenameValid("te?st.TXT"));
    System.out.println(FileUtils.isFilenameValid("con.TXT")); // windows
    System.out.println(FileUtils.isFilenameValid("prn.TXT")); // windows
  }
}
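A related sketch uses `java.nio.file.Paths.get`, which throws `InvalidPathException` when the platform's path parser rejects the string. What gets rejected is platform-dependent (on Windows the parser refuses characters such as `?` and `*`; on Unix-like systems essentially only the NUL character), and as far as I know it does not flag reserved device names, so treat it as a syntax check only. The class name `PathSyntax` is hypothetical.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

final class PathSyntax {
    private PathSyntax() {}

    // Asks the default file system's parser whether the string is a legal path.
    // No file is created or touched.
    static boolean isParsable(String name) {
        try {
            Paths.get(name);
            return true;
        } catch (InvalidPathException e) {
            return false;
        }
    }
}
```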
RealHowTo
  • 1
    Empty Strings should not be valid file names, but they're also pretty easy to test for. So +1. – Raven Dreamer Nov 27 '12 at 17:05
  • Think this is quite the best answer. I wanted some OS dependent validator. Anyone has any reservations on this approach? – 2c00L Feb 10 '16 at 14:15
  • It seems the behaviour changed: on windows 10 I don't get IOException anymore for "con.txt" etc. :-/ – cupiqi09 Jan 16 '21 at 23:56
6

Looks good, at least if we believe this resource: http://msdn.microsoft.com/en-us/library/aa365247%28v=vs.85%29.aspx

But I'd simplify the code. It is enough to find one of these characters to say that the name is invalid, so:

public static boolean isValidName(String text)
{
    // Match any single forbidden character (no negation); "\\\\" is a literal backslash.
    Pattern pattern = Pattern.compile("[/.\\\\:*?\"<>|]");
    return !pattern.matcher(text).find();
}

This regex is simpler and will work faster.

AlexR
2

I'm not sure how to implement it in Java (either with a regex or a custom method), but Windows applies the following rules when creating a file or directory:

  1. The name must not consist only of dots.
  2. Windows device names like AUX, CON, NUL, PRN, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 cannot be used as a file name, nor as the first segment of a file name (i.e. test1 in test1.txt).
  3. Device names are case insensitive (i.e. prn, PRN, Prn, etc. are identical).
  4. All characters above ASCII 31 may be used except "*/:<>?\|

So the program needs to stick to these rules. I hope this covers the validation rules for your question.
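One possible way to turn the four rules into Java, written as a plain method rather than a regex so that it can say which rule was broken. This is a sketch, not a complete validator: `NameRules` and `firstViolation` are made-up names, the empty-name check is an extra assumption, and the messages are illustrative.

```java
import java.util.Locale;
import java.util.Optional;

final class NameRules {
    // Rule 4: characters that may not appear even though they are above ASCII 31.
    private static final String FORBIDDEN = "\"*/:<>?\\|";

    private NameRules() {}

    // Returns a description of the first broken rule, or empty if none is broken.
    static Optional<String> firstViolation(String name) {
        if (name.isEmpty()) {
            return Optional.of("name is empty");
        }
        // Rule 1: not only dots.
        if (name.chars().allMatch(c -> c == '.')) {
            return Optional.of("name consists only of dots");
        }
        // Rules 2 and 3: first segment must not be a device name, ignoring case.
        int dot = name.indexOf('.');
        String first = (dot < 0 ? name : name.substring(0, dot)).toUpperCase(Locale.ROOT);
        if (first.matches("CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9]")) {
            return Optional.of("'" + first + "' is a reserved device name");
        }
        // Rule 4: no control characters and none of the forbidden characters.
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c <= 31 || FORBIDDEN.indexOf(c) >= 0) {
                return Optional.of("illegal character at index " + i);
            }
        }
        return Optional.empty();
    }
}
```

A caller could show the returned text to the user instead of a bare "invalid name".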

Ganesan
1

You can check all the reserved names (AUX, CON, and the like) and then use this code:

bool invalidName = GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES && 
        GetLastError() == ERROR_INVALID_NAME;

to check for any additional restrictions. But note that if you check a name in a nonexistent directory, you will get ERROR_PATH_NOT_FOUND whether the name is really valid or not.

Anyway, you should remember the old saying:

It's easier to ask for forgiveness than it is to get permission.

rodrigo
-1

How about letting the File class do your validation?

public static boolean isValidName(String text) {
    try {
        File file = new File(text);
        return file.getPath().equals(text);
    }
    catch(Exception ex){}
    return false;
}
nekno
  • I'm sorry, but the File class doesn't make any validation: File f = new File("C:\\test\\????"); f.getAbsolutePath(); returns "C:\test\????" – H-Man2 Jul 26 '11 at 14:16