63

If I wanted to create a string which is guaranteed not to represent a filename, I could put one of the following characters in it on Windows:

\ / : * ? | < >

e.g.

this-is-a-filename.png

?this-is-not.png

Is there any way to identify a string as 'not possibly a file' on Linux?

izb

4 Answers

52

There are almost no restrictions: apart from '/' and '\0', any byte is allowed in a file name. However, some people think it's not a good idea to allow this much flexibility.
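
To make the rule concrete, here is a minimal sketch of a check for a single file name component (not a whole path). The function name `is_valid_linux_filename` is made up for illustration. Two things in it go beyond the two forbidden bytes named above and are assumptions of this sketch: the names `.` and `..` are treated as unusable because they are reserved directory entries, and the 255-byte limit is the usual `NAME_MAX` on common Linux file systems, which may differ on yours.

```python
def is_valid_linux_filename(name: str, name_max: int = 255) -> bool:
    """Return True if `name` could name a single file on a typical Linux file system.

    Hypothetical helper: only '/' and the NUL byte are forbidden inside a name.
    Rejecting "." and ".." and the 255-byte limit are assumptions of this sketch.
    """
    # File names are byte strings on Linux; limits are counted in bytes.
    encoded = name.encode("utf-8", "surrogateescape")
    if not encoded or encoded in (b".", b".."):
        return False
    if b"/" in encoded or b"\0" in encoded:
        return False
    return len(encoded) <= name_max


if __name__ == "__main__":
    for candidate in ["this-is-a-filename.png", "?this-is-not.png", "a/b", "a\0b", ""]:
        print(repr(candidate), is_valid_linux_filename(candidate))
```

Note that `?this-is-not.png`, which is invalid on Windows, passes this check, so a string that needs to be 'not possibly a file name' on Linux has to contain '/' or '\0', or be empty.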

Vinay Sajip
23

An empty string is the only truly invalid path name on Linux (leaving aside strings that contain '\0', which cannot be passed to the system calls at all), which may work for you if you need only one invalid name. You could also use a string like "///foo", which would not be a canonical path name, although it could refer to a file ("/foo"). Another possibility would be something like "/dev/null/foo", since /dev/null has a POSIX-defined non-directory meaning. If you only need strings that could not refer to a regular file, you could use "/" or ".", since those are always directories.
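
A quick way to see how these candidates behave is to ask the kernel about each one. The sketch below is illustrative only; `stat_errno` is a made-up helper name, and the result for "///foo" depends on whether a file named /foo happens to exist on the machine.

```python
import errno
import os


def stat_errno(path: str):
    """Return the symbolic errno from os.stat(path), or None if the call succeeds."""
    try:
        os.stat(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno)
    return None


if __name__ == "__main__":
    # Candidate strings taken from the answer above.
    for candidate in ["", "/dev/null/foo", "///foo", "/", "."]:
        print(repr(candidate), stat_errno(candidate))
```

On a typical Linux system I would expect the empty string to fail with `ENOENT` and "/dev/null/foo" to fail with `ENOTDIR`, while "/" and "." succeed because they are directories.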

mark4o
  • ➕1 Very clever, thinking of `/dev/null/` as a directory! – L0j1k Apr 26 '18 at 14:50
  • `mkdir` in Matlab (R2018a) just succeeded on me for `/dev/null` with message `already exists`. Had to do `/dev/null/foo` to trigger an error. – Andreas Feb 24 '22 at 07:40
1

Technically it's not invalid, but a file whose name begins with a dash (-) will cause you a lot of trouble, because commands can mistake the name for an option.
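
One common workaround is to make sure such a name never reaches a command with a leading dash. A minimal sketch, assuming the name is a relative path; `safe_arg` is a hypothetical helper, not part of any library:

```python
import os


def safe_arg(path: str) -> str:
    """Prefix a path that starts with '-' with './' so it cannot be read as an option."""
    if path.startswith("-"):
        return os.path.join(".", path)
    return path


if __name__ == "__main__":
    print(safe_arg("-rf"))        # the dash is no longer the first character
    print(safe_arg("notes.txt"))  # ordinary names are returned unchanged
```

Many utilities also accept `--` to mark the end of options (for example `rm -- -rf` removes a file literally named `-rf`), but not every program supports that, so the `./` prefix tends to be the more portable choice.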

lordkian
-6

I personally find that a lot of the time the problem is not Linux but the applications one is using on Linux.

Take Amarok, for example. Recently I noticed that certain artists I had copied from my Windows machine were not appearing in the library. I checked and confirmed that the files were there, and then I noticed that certain characters in the folder names (named for the artist) were shown as a weird-looking square rather than an actual character.

In a shell terminal the filenames look even stranger: `/Music/Albums/Einst$'\374'rzende\ Neubauten` is one example.

While these files were definitely there, Amarok could not see them for some reason. I was able to use some shell trickery to rename them to sane versions, which I could then rename with ASCII-only characters using MusicBrainz Picard. Unfortunately, Picard was also unable to open the files until I renamed them, hence the need for a shell script.
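
The original shell script is not shown, so the following is only a sketch of the same idea in Python. It assumes the broken names were written in Latin-1 (ISO-8859-1), which is a guess based on the byte `\374` (0xFC, "ü" in Latin-1) in the example above, and that the target locale uses UTF-8. The function name `latin1_name_to_utf8` is made up for illustration.

```python
def latin1_name_to_utf8(raw_name: bytes) -> bytes:
    """Re-encode a file name from Latin-1 to UTF-8 (assumed encodings, see lead-in).

    Names that are already valid UTF-8, including plain ASCII, are returned unchanged.
    """
    try:
        raw_name.decode("utf-8")
        return raw_name
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this decode cannot fail.
        return raw_name.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    broken = b"Einst\xfcrzende Neubauten"
    print(latin1_name_to_utf8(broken).decode("utf-8"))
```

To apply it to real directories, one could list them with a bytes path, e.g. `os.listdir(b"/Music/Albums")`, and pass each old and new name to `os.rename`; test on a copy first, since a wrong guess about the source encoding produces differently garbled names.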

Overall this is a tricky area, and it gets very thorny if you are trying to synchronise a music collection between Windows and Linux where certain folder or file names contain funky characters.

The safest thing to do is stick to ASCII-only filenames.

OOPMan
  • This doesn't address the question, I'm afraid. – Mat May 16 '14 at 12:53
  • I disagree. Technically the NTFS file system supports all kinds of cool features that the primary "application" that uses it (Windows) does not consider valid or use. – OOPMan Jun 12 '14 at 13:44
  • It sounds like your Windows box used a different encoding than your Linux locale. As long as your filenames are encoded using the same encoding as your locale, you should be fine using wide chars, although you're right that it's safer to use ASCII. – sig_seg_v Mar 06 '16 at 20:41
  • ASCII is really poor for non-English languages. – maazza Feb 17 '21 at 13:45