
How can I get Python to raise an exception when opening a file with an invalid file name? For example, I have this code:

def write_html_to_file(set_name, pid, html_text):
    if not os.path.exists(HTML_DUMP_DIR_NAME):
        os.makedirs(HTML_DUMP_DIR_NAME)
        path = (HTML_DUMP_DIR_NAME + set_name + '-' + pid + '.html')
    try:
        with open(path, "w+", encoding='utf-8') as html_dump_file:
            html_dump_file.write(html_text)
    except OSError as e:
        logging.basicConfig(filename=LOG_FILE_PATH, level=logging.ERROR)
        logging.error('Failed to create html dump. '
                      + ' error=' + str(e)
                      + ' file_name=' + path)

Assume that the value of path is 'Folder1/SubFolder/Some Title: thing.html' and the file does not exist yet.

What I expect is that Python will raise an OSError with "Invalid argument", or something like that. What actually happens is that it creates a file called 'Folder1/SubFolder/Some Title'. Notice that the filename stops at the invalid character.

I know I can create my own exceptions and raise them if I detect an invalid name, but in this case that's pointless. I only care if I'm trying to do something invalid at the OS level. It seems that in this case it's failing silently, and I don't like that.

Edit: Sorry, I guess my question wasn't clear.

  • I do want it to create a file, that part I'm happy about.
  • The problem is that the file-name that gets created stops at the invalid character. I want the entire name to be there.
  • My question is: why doesn't Python raise an exception when it encounters an invalid character, so that I can handle it?
Vinay Sajip
Josh Sanders
  • What other way is there to create a file in Python? – Scott Hunter Feb 03 '19 at 23:50
  • You are creating a new file; why do you expect it to complain because it does not exist yet? – Selcuk Feb 03 '19 at 23:51
  • This code will raise `NameError` if `os.path.exists(HTML_DUMP_DIR_NAME)` is true, as `path` is never assigned. Somehow I suspect this isn't the real code. – John Gordon Feb 03 '19 at 23:51
  • "the file does not exist yet" and "a file with an invalid file name" are very different things. – Stop harming Monica Feb 03 '19 at 23:51
  • Don't use `w+`; it creates a new file when it doesn't exist. – GKE Feb 03 '19 at 23:51
  • Your mistake is assuming that "Some Title: thing.html" is invalid. This is a file with a data [stream](https://learn.microsoft.com/en-us/windows/desktop/FileIO/file-streams) named " thing.html". – Eryk Sun Feb 03 '19 at 23:52
  • Take a look at [this question](https://stackoverflow.com/questions/16208206/confused-by-python-file-mode-w) too. – GKE Feb 03 '19 at 23:52
  • Adding to @eryksun: Read up on [Alternate Data Streams (ADS)](https://learn.microsoft.com/en-us/sysinternals/downloads/streams). – ShadowRanger Feb 03 '19 at 23:54
  • I updated my question, some of the comments here seem to be confused as to what I'm asking, for example the one from GKE and Scott Hunter. @eryksun, thanks I'll update that. I'm a C++ engineer so I'm still getting used to Python :p – Josh Sanders Feb 04 '19 at 00:00
  • @JohnGordon It is the real code, but in my test cases so far I've only been working with the case in which the file does not exist yet and is being created. I'm sure I would have run into that later, at least in that case I'll have an exception to work with. – Josh Sanders Feb 04 '19 at 00:03
  • I can only reiterate that "Some Title: thing.html" is valid, but it's not doing what you expect. With the NTFS file system, all regular files have at least an anonymous $DATA stream, e.g. "Some Title::$DATA", but we can also have an unlimited number of named $DATA streams, e.g. "Some Title: .html:$DATA". Since NTFS knows it's a regular file, the $DATA stream type is optional, e.g. "Some Title: .html". Check this in CMD using `dir /r` to include the alternate data streams in the listing. – Eryk Sun Feb 04 '19 at 00:06
  • @eryksun ohhhh, I missed your comment about assuming the file is invalid. Interesting. How can I avoid this? Is it just coincidence that : is an invalid character in windows but also an identifier for a data stream? How can I get it so that the ':' makes it to the OS so that it throws an error? Edit: reading about ADS now – Josh Sanders Feb 04 '19 at 00:10
  • ":" is not an invalid character in Windows paths. Obviously it's legal for device names (e.g. "C:"), and it's also valid to delimit the stream name and type if the file system supports this (e.g. NTFS). Where it's invalid (usually, this isn't set in stone) is in the file/directory name itself. Your filename is "Some Title". The colon after that is a delimiter. It's not part of the filename. – Eryk Sun Feb 04 '19 at 00:13
  • I see, I suppose that makes sense. Got it. Okay so then I guess my question is, how do I make ':' be part of the file name that I'm trying to create in python (so that it can fail and throw an exception). Because, if I understand correctly, I've basically created a file called Folder1/SubFolder/SomeTitle with a data stream called thing.html – Josh Sanders Feb 04 '19 at 00:16
  • I guess I can just strip ':' specifically, as it's a pretty special case. – Josh Sanders Feb 04 '19 at 00:24
  • Wow. There's apparently more to consider than I thought – Josh Sanders Feb 04 '19 at 00:33

1 Answer


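So it turns out (thanks eryksun) that the ':' in the string you pass to open() is a delimiter on NTFS: it separates the file name from the name of an alternate data stream. Everything after the ':' names a stream inside the file 'Some Title' rather than being part of the file name, which is why no exception is raised.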
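If an exception is what you want, one option is to check the name yourself before calling open(). The following is only a sketch: `check_no_stream_delimiter` is a hypothetical helper name, not something from the standard library, and it assumes Windows path rules via the `ntpath` module (which can be imported on any platform).

```python
import ntpath


def check_no_stream_delimiter(path):
    """Raise ValueError if the last component of a Windows path contains ':'.

    Hypothetical helper for illustration only.
    """
    # ntpath applies Windows path rules regardless of the current OS;
    # basename() drops the drive ("C:") and the directories.
    name = ntpath.basename(path)
    if ':' in name:
        file_part, _, stream_part = name.partition(':')
        raise ValueError(
            "':' in %r would be read by NTFS as file %r with stream %r"
            % (name, file_part, stream_part))
    return path
```

Calling it with 'Folder1/SubFolder/Some Title: thing.html' raises ValueError, while a path such as 'C:\\dir\\ok.html' passes through unchanged because the drive colon is not part of the final component.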

My code works fine for any other character that is invalid in file names ('*', for example). So the solution I came up with is to replace any ':' in my filename with a '-' before I pass it to open():

base_file_name = base_file_name.replace(':', '-')

A ':' is invalid in file names anyway, so this is fine for me.
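The same replacement idea can be extended to every character that Windows reserves in file names, if you would rather sanitize than let open() fail. This is a sketch under assumptions: `sanitize_file_name` and its `replacement` parameter are hypothetical names, and it should only be applied to the base file name, not to a full path, because it also replaces slashes.

```python
import re

# Characters Windows reserves in file names, plus ASCII control characters.
_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_file_name(base_file_name, replacement='-'):
    """Replace reserved characters in a bare file name (hypothetical helper)."""
    return _RESERVED.sub(replacement, base_file_name)
```

For the name from the question, sanitize_file_name('Some Title: thing.html') gives 'Some Title- thing.html'.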

Josh Sanders