
Try:

import os, shutil

wd = os.path.abspath(os.path.curdir)
newfile = os.path.join(wd, 'testfile')
print(newfile)
with open(newfile, 'w') as f:
    f.write('Hello bugs')
shutil.move(newfile, os.path.join(wd, 'testfile:.txt')) # note the :

Now check the directory: testfile has been deleted and no other file has been created, yet the process finishes with exit code 0.

If however you issue:

shutil.move(newfile, os.path.join(wd, 'testfile:')) # note no extension

it blows with:

Traceback (most recent call last):
      File "C:/Users/MrD/.PyCharm40/config/scratches/scratch_3", line 9, in <module>
        shutil.move(newfile, os.path.join(wd, 'testfile:'))
      File "C:\_\Python27\lib\shutil.py", line 302, in move
        copy2(src, real_dst)
      File "C:\_\Python27\lib\shutil.py", line 130, in copy2
        copyfile(src, dst)
      File "C:\_\Python27\lib\shutil.py", line 83, in copyfile
        with open(dst, 'wb') as fdst:
    IOError: [Errno 22] invalid mode ('wb') or filename: 'C:\\Users\\MrD\\.PyCharm40\\config\\scratches\\testfile:'

as it should.

Is it a bug?

Context: I was testing how my code behaves when given illegal filenames (: is illegal in Windows filenames). To my amazement, my program deleted the original file (bad!) and created a zero-size file with the attributes of the original (yes, in my case a file was created, just empty), named with the given filename up to the :. So a filename like textfile:.jpg gave me a zero-byte textfile. It took a lot of debugging; here is the little critter inside the Python27\lib\shutil.py copyfile() (the line that blows above but did not blow here):

[Image: screenshot of copyfile() in Python27\lib\shutil.py]

I don't know why the file was created in my case, though, but not when running the script above.

Mr_and_Mrs_D
    See [Alternate Data Streams](http://stackoverflow.com/questions/33085253/how-do-i-read-windows-ntfss-alternate-data-stream-using-javas-io) – Peter Wood Dec 04 '15 at 21:50
  • @PadraicCunningham: it's open that should blow and doesn't on a closer look in the source - in the traceback it's seen that open would blow but here it does not. `open` is of course OS specific. shutil.move just happens to behave really badly because of that - as in unlink the file instead of blowing. I should probably edit the question but now I need a break - trust me the source this came from is rather err complex. – Mr_and_Mrs_D Dec 04 '15 at 22:19

1 Answer

This isn't a bug in Python's shutil or os modules; it's a quirk of Windows. Peter Wood's link in the comments discusses "Alternate Data Streams", an NTFS mechanism that attaches a hidden, named data stream (in effect a hidden file) to a regular, visible file. The key word there is attached: the hidden file is deleted if the file it is attached to is deleted.

It appears that a colon is used to separate the path of the regular file from the hidden file. For example, if in the command line you write:

> notepad foo.txt

Then close Notepad, and write

> notepad foo.txt:bar.txt

Notepad will open the hidden file. Go ahead and write something in it, save, and close. Typing dir at the command line will only show foo.txt, not foo.txt:bar.txt. But sure enough, if you write

> notepad foo.txt:bar.txt

the file you just edited will appear, and your changes will be intact.

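So what is happening with your Python code? shutil.move first tries os.rename and, if that fails (the copy2 frame in your traceback shows the fallback being taken), does what the documentation for shutil.move describes: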

src is copied (using shutil.copy2()) to dst and then removed.

So when you move testfile to testfile:.txt, Python first copies testfile to the hidden testfile:.txt. But then it removes testfile, and by doing so removes the hidden testfile:.txt. Therefore it appears to you that the original file has been deleted, and no new file has been created.
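As a rough illustration of that sequence, here is a simplified sketch of a rename-then-copy fallback. It is not the actual shutil source: `move_sketch` and its `rename` parameter are hypothetical names, and the parameter is there only so the fallback path can be forced on any platform.

```python
import os
import shutil


def move_sketch(src, dst, rename=os.rename):
    """Simplified stand-in for shutil.move (hypothetical, not the real source)."""
    try:
        # Fast path: a plain rename.
        rename(src, dst)
    except OSError:
        # Fallback: copy contents and metadata, then delete the source.
        shutil.copy2(src, dst)
        # If dst was a stream attached to src, deleting src removes both.
        os.unlink(src)
```

The last step is the one that matters here: the unlink runs after a successful copy, so nothing is reported as an error even when the destination disappears along with the source.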

The following snippet of code might make this clearer (I've saved it as demo.py, and I'm running it in the same, otherwise empty directory):

import os, shutil

# Create an ordinary file.
with open('test', 'w') as f:
    f.write('Hello bugs')

# Copy it into an alternate data stream attached to itself.
shutil.copy2('test', 'test:foo.txt')

# The stream can be opened and read like any other file.
with open('test:foo.txt') as f:
    print(f.read())

print('test exists? %s' % os.path.exists('test'))
print('test:foo.txt exists? %s' % os.path.exists('test:foo.txt'))
print(os.listdir('.'))

print('removing...')
os.remove('test')

print('test exists? %s' % os.path.exists('test'))
print('test:foo.txt exists? %s' % os.path.exists('test:foo.txt'))
print(os.listdir('.'))

This prints:

Hello bugs
test exists? True
test:foo.txt exists? True
['demo.py', 'test']
removing...
test exists? False
test:foo.txt exists? False
['demo.py']

This shows that we can create a normal file, write to it, copy it to its own hidden stream, and then open and read that stream just fine. os.path.exists reports that both test and its hidden attachment test:foo.txt exist, even though os.listdir only shows test. Once we delete test, test:foo.txt no longer exists either.

Lastly, you can't create a hidden data stream without a name, therefore test: is an invalid path. Python correctly raises an exception in this case.

So the Python code is actually functioning as it should under Windows -- "Alternate Data Streams" are just such a little-known "feature" that this behavior is surprising.
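If you would rather have such destinations rejected up front, one option is to check for a colon outside the drive prefix before calling shutil.move. The sketch below is only one possible approach: `has_stream_separator` and `checked_move` are hypothetical helpers, and it assumes Windows path rules by using ntpath explicitly.

```python
import ntpath
import shutil


def has_stream_separator(path):
    """Return True if path contains ':' anywhere after the drive prefix."""
    _drive, rest = ntpath.splitdrive(path)
    return ':' in rest


def checked_move(src, dst):
    """Refuse destinations that would address an alternate data stream."""
    if has_stream_separator(dst):
        raise ValueError('destination contains a colon: %r' % (dst,))
    return shutil.move(src, dst)
```

With this, checked_move(newfile, os.path.join(wd, 'testfile:.txt')) raises ValueError before anything is copied or deleted, while an ordinary destination is passed straight through to shutil.move.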

jme
  • Wow - I saw the link but was too baffled to actually see how it would apply here. This also explains why the files were created with 0 size in my case (the name of the file I moved was different so unlinking it did not delete the new file with its stream). Thanks – Mr_and_Mrs_D Dec 04 '15 at 22:50