20

Is there a way in Qt 4.6 to check whether a given QString is a valid filename (or directory name) on the current operating system? I want to check that the name is valid, not that the file exists.

Examples:

// Some valid names
test
under_score
.dotted-name

// Some specific names
colon:name // valid under UNIX OSes, but not on Windows
what? // valid under UNIX OSes, but still not on Windows

How would I achieve this? Is there a built-in Qt function for it?

I'd like to avoid creating an empty file, but if there is no other reliable way, I would still like to see how to do it in a "clean" way.

Many thanks.

ereOn

6 Answers

23

This is the answer I got from Silje Johansen - Support Engineer - Trolltech ASA (in March 2008 though)

However, the complexity of including locale settings and finding a unified way to query the filesystems on Linux/Unix about their functionality is close to impossible.

However, to my knowledge, all applications I know of ignore this problem.

(read: they aren't going to implement it)

Boost doesn't solve the problem either; it gives only a vague notion of the maximum path length, especially if you want to be cross-platform. As far as I know, many have tried and failed to crack this problem (at least in theory; in practice it is most definitely possible to write a program that creates valid filenames in most cases).

If you want to implement this yourself, it might be worth considering a few not immediately obvious things such as:

Complications with invalid characters

There is a difference between file system limitations and OS and software limitations. Windows Explorer, which I consider part of the Windows OS, does not fully support NTFS, for example. Files containing ':', '?', etc. can happily reside on an NTFS partition, but Explorer just chokes on them. Other than that, you can play it safe and use the recommendations from Boost Filesystem.

Complications with path length

The second problem, not fully tackled by the Boost page, is the length of the full path. Probably the only thing that is certain at this moment is that no OS/filesystem combination supports unlimited path lengths. However, statements like "Windows maximum paths are limited to 260 chars" are wrong. The Unicode API of Windows does allow you to create paths up to 32,767 UTF-16 characters long. I haven't checked, but I imagine Explorer chokes on these just as devotedly, which would make this feature utterly useless for software with any users other than yourself (on the other hand, you might prefer not to have your software choke in chorus).

There exists an old variable that goes by the name of PATH_MAX, which sounds promising, but the problem is that PATH_MAX simply isn't.

To end with a constructive note, here are some ideas on possible ways to code a solution.

  1. Use defines to make OS specific sections. (Qt can help you with this)
  2. Use the advice given on the boost page and OS and filesystem documentation to decide on your illegal characters
  3. For path length, the only workable idea that springs to mind is a binary-search trial-and-error approach, using the system call's error handling to check whether a path length is valid. This is rather crude, but it might be the only way to get accurate results on a variety of systems.
  4. Get good at elegant error handling.
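
The trial-and-error idea in point 3 can be sketched as a binary search over candidate lengths. This is only an illustration in standard C++: `longestAcceptedLength` and the `accepts` probe are hypothetical names, and the sketch assumes that acceptance is monotonic (if a length is accepted, every shorter one is too) and that length 0 counts as accepted. A real probe would attempt a system call with a path of the given length and inspect the error it returns.

```cpp
#include <cstddef>
#include <functional>

// Returns the largest length in [0, upperBound] for which `accepts` is true,
// assuming acceptance is monotonic. `accepts` is a hypothetical probe; a real
// one would call the OS and check the reported error.
std::size_t longestAcceptedLength(std::size_t upperBound,
                                  const std::function<bool(std::size_t)>& accepts)
{
    std::size_t lo = 0;           // treated as always accepted
    std::size_t hi = upperBound;  // largest candidate still possible
    while (lo < hi) {
        // Round the midpoint up so that `lo = mid` always makes progress.
        const std::size_t mid = lo + (hi - lo + 1) / 2;
        if (accepts(mid))
            lo = mid;      // mid works, so the answer is at least mid
        else
            hi = mid - 1;  // mid fails, so the answer is below mid
    }
    return lo;
}
```

With an upper bound of 32,767 this needs about 15 probes, which keeps the number of system calls small.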

Hope this has given some insights.

Andreas Haferburg
  • Thanks, very instructive and well written answer. Deserves my +1. – ereOn Jun 21 '10 at 15:29
  • 1
    Hey, thank you, that makes it my first upvote! By now I even have enough reputation to clean up the links and upvote your question... Let's dream one day a good library will emerge to tackle this very common problem. –  Jun 21 '10 at 21:25
4

Based on User7116's answer here:

How do I check if a given string is a legal/valid file name under Windows?

I quit being lazy (looking for an elegant solution) and just coded it. I got:

#include <QFileInfo>
#include <QString>
#include <QStringList>

bool isLegalFilePath(QString path)
{
    if (!path.length())
        return false;

    // Anything following the raw filename prefix should be legal.
    if (path.left(4)=="\\\\?\\")
        return true;

    // Windows filenames are not case sensitive.
    path = path.toUpper();

    // Trim the drive letter off (check the length before indexing)
    if (path.length() >= 2 && path[1]==':' && (path[0]>='A' && path[0]<='Z'))
        path = path.right(path.length()-2);

    QString illegal="<>:\"|?*";

    foreach (const QChar& c, path)
    {
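        // Check for control characters (code points below 32). Comparing the
        // UTF-16 code unit avoids toLatin1(), which returns 0 for any
        // character outside Latin-1 and would reject those characters too.
        if (c.unicode() < 32)
            return false;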

        // Check for illegal characters
        if (illegal.contains(c))
            return false;
    }

    // Check for device names in filenames
    static QStringList devices;

    if (!devices.count())
        devices << "CON" << "PRN" << "AUX" << "NUL" << "COM0" << "COM1" << "COM2"
                << "COM3" << "COM4" << "COM5" << "COM6" << "COM7" << "COM8" << "COM9" << "LPT0"
                << "LPT1" << "LPT2" << "LPT3" << "LPT4" << "LPT5" << "LPT6" << "LPT7" << "LPT8"
                << "LPT9";

    const QFileInfo fi(path);
    const QString basename = fi.baseName();

    foreach (const QString& d, devices)
    {
        // Note: Names with ':' other than with a drive letter have already been rejected.
        if (basename == d)
            return false;
    }

    // Check for trailing periods or spaces
    if (path.right(1)=="." || path.right(1)==" ")
        return false;

    // Check for pathnames that are too long (disregarding raw pathnames)
    if (path.length()>260)
        return false;

    // Exclude raw device names
    if (path.left(4)=="\\\\.\\")
        return false;

    // Since we are checking for a filename, it mustn't be a directory
    if (path.right(1)=="\\")
        return false;

    return true;
}

Features:

  • Probably faster than using regexes
  • Checks for illegal characters and excludes device names (note that '\' is not illegal, since it can be in path names)
  • Allows drive letters
  • Allows full path names
  • Allows network path names
  • Allows anything after \\?\ (raw file names)
  • Disallows anything starting with \\.\ (raw device names)
  • Disallows names ending in "\" (i.e. directory names)
  • Disallows names longer than 260 characters not starting with \\?\
  • Disallows trailing spaces and periods

Note that it does not check the length of filenames starting with \\?\, since that is not a hard and fast rule. Also note, as pointed out here, that names containing multiple backslashes and forward slashes are NOT rejected by the Win32 API.

CodeLurker
  • This does not work for e.g. umlauts. :-( "Ä" is 196 and c.toLatin1() returns -60 which is less than 32. – falkb Feb 21 '18 at 12:56
  • better use: if (c.toLatin1() > 0 && c.toLatin1() < 32) – falkb Feb 21 '18 at 13:01
  • Hmm. After looking at it again, I wonder if a control character can be encoded in a QString. If it is stored as UTF16 internally, then the control character would have to exist in the UTF standard. I'll have to look into that when I have time, or you could. – CodeLurker Feb 21 '18 at 16:13
  • I've updated it for now. It does appear control characters can be expressed as UTF-16. Does the Windows API allow e.g. a tab in a filename? Some testing is needed. – CodeLurker Feb 21 '18 at 17:15
  • The QChar::isPrint() function might be a better way to go. Turns out, they're wide characters internally, and can have control codes. – CodeLurker Feb 23 '22 at 17:03
3

I don't think that Qt has a built-in function, but if Boost is an option, you can use Boost.Filesystem's name_check functions.

If Boost isn't an option, its page on name_check functions is still a good overview of what to check for on various platforms.
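
As a rough idea of what such a name check looks like without Boost, here is a hand-rolled sketch in standard C++. It is not Boost's code. It accepts only names built from the POSIX portable filename character set (A-Z, a-z, 0-9, period, underscore, hyphen), which is stricter than any single platform requires. The function name `isPortablePosixName` is hypothetical.

```cpp
#include <string>

// Hand-rolled sketch, not Boost's implementation: true if `name` is non-empty
// and uses only the POSIX portable filename character set.
bool isPortablePosixName(const std::string& name)
{
    if (name.empty())
        return false;
    for (char c : name) {
        const bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                        (c >= '0' && c <= '9') ||
                        c == '.' || c == '_' || c == '-';
        if (!ok)
            return false;
    }
    return true;
}
```

Applied to the names in the question, it accepts test, under_score and .dotted-name, and rejects colon:name and what? on every platform, so it suits cases where portability matters more than permissiveness.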

Josh Kelley
  • `boost` is definitely an option and that is just perfect! Thank you very much! – ereOn Jun 14 '10 at 16:49
  • 4
    Note boost doesn't check for invalid names (on win) such as com,lpt,aux etc - for those who don't remember DOS days. – Martin Beckett Jun 14 '10 at 17:12
  • Right you are. The Boost windows_name() function seems to be it - for those of us using Boost. Note that it has "CLOCK$" as an illegal Windows name; but it does not seem to be. – CodeLurker Feb 23 '22 at 17:13
1

See the example (from the Digia Qt Creator sources) in: https://qt.gitorious.org/qt-creator/qt-creator/source/4df7656394bc63088f67a0bae8733f400671d1b6:src/libs/utils/filenamevalidatinglineedit.cpp

  • Broken link. It and its header now live here: "https://code.qt.io/cgit/qt-creator/qt-creator.git/tree/src/libs/utils". Note it rejects dir names with "[]{}" which are legal on Windows. – CodeLurker Feb 23 '22 at 17:21
1

This is difficult to do reliably on Windows (there are odd cases, such as a file named "com" still being invalid). Also, do you want to handle Unicode, or subst tricks that allow a filename longer than 260 characters?

There is already a good answer here: How do I check if a given string is a legal / valid file name under Windows?

Martin Beckett
0

I'd just create a simple function to validate the filename for the platform, which just searches through the string for any invalid characters. Don't think there's a built-in function in Qt. You could use #ifdefs inside the function to determine what platform you're on. Clean enough I'd say.
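
A minimal sketch of that idea, written in standard C++ so it stands alone. The names `isValidNameFor`, `isValidNameOnThisPlatform`, `kWindowsInvalid` and `kUnixInvalid` are made up for illustration, and the character sets are assumptions for a single name component (not a full path). It uses the compiler's `_WIN32` macro; in a Qt project the `Q_OS_WIN` macro would be the natural choice. It deliberately ignores reserved device names, control characters other than NUL, trailing dots or spaces, and length limits.

```cpp
#include <string>

// Characters assumed to be invalid in a single name component.
const std::string kWindowsInvalid = "<>:\"/\\|?*";
const std::string kUnixInvalid = "/";

// True if `name` is non-empty and contains neither NUL nor any character
// listed in `invalidChars`.
bool isValidNameFor(const std::string& name, const std::string& invalidChars)
{
    if (name.empty())
        return false;
    for (char c : name) {
        if (c == '\0' || invalidChars.find(c) != std::string::npos)
            return false;
    }
    return true;
}

// Picks the character set for the platform the code is compiled on.
bool isValidNameOnThisPlatform(const std::string& name)
{
#ifdef _WIN32
    return isValidNameFor(name, kWindowsInvalid);
#else
    return isValidNameFor(name, kUnixInvalid);
#endif
}
```

Keeping the character set as a parameter makes both rule sets testable on any machine, while the `#ifdef` is confined to one small wrapper.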

JimDaniel