I'm new to Python and looking for help with a problem I am having with os.walk. I have had a good look around and cannot find a solution that fits.

What the code does: it scans a hard drive or folder selected by the user and returns all the filenames, subdirectories and sizes. The result is then manipulated in pandas (not in the code below) and exported to an Excel spreadsheet in the format I want.

However, the first part of the code currently fails in Python 2.7 with the following error:

WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'E:\03. Work\Bre\Files\folder2\icons greyscale flatten\._Icon_18?10 Stainless Steel.psd'

I have tried using a raw string (r'...'), but to no avail. Perhaps I am writing it wrong.

I will note that I never get this error in 3.5, or in 2.7 when the selected folders have clean names. Because of pandas and PyInstaller problems with 3.5, I am hoping to stick with 2.7 until those problems are resolved.

import pandas as pd
import xlsxwriter
import os
from io import StringIO

#Lists for Pandas Dataframes   

fpath = []
fname = []
fext = []
sizec = []

# START #Select file directory to scan

filed = raw_input("\nSelect a directory to scan: ")    

#Scan the Hard-Drive and add to lists for Pandas DataFrames

print "\nGetting details..."
for root, dirs, files in os.walk(filed):
    for filename in files:
        f = os.path.abspath(root)  # File path
        fpath.append(f)
        fname.append(filename)  # File name
        s = os.path.splitext(filename)[1]  # File extension
        s = str(s)
        fext.append(s)
        p = os.path.join(root, filename)  # File size
        si = os.stat(p).st_size
        sizec.append(si)
print "\nDone!"

Any help would be greatly appreciated :)

fergdid
  • The `?` is not a [valid character for a Windows filename](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx); so I suspect the issue is that there is some other actual character there. What is the filename when you open it up in Windows Explorer? – Burhan Khalid Dec 20 '15 at 06:11
  • Thanks for the reply @Burhan Khalid. Any ideas on how I get os.walk to ignore these? This code is likely to be used on a lot of messy hd's with mixtures of illegal characters – fergdid Dec 20 '15 at 06:15
  • First, try opening the folder in Windows Explorer and see what the actual filename is. – Burhan Khalid Dec 20 '15 at 06:19
  • This file is hidden, and I believe root is catching it - but the filename is ._Icon_1810 Stainless Steel.psd - that square displays as a period at mid height – fergdid Dec 20 '15 at 06:25
  • Try `os.walk(unicode(filed))` and see if you get the same results. – Burhan Khalid Dec 20 '15 at 06:27
  • Man! That's a bingo... thank you so much. I feel I will use this time and time again. Results are as expected :) – fergdid Dec 20 '15 at 06:31

1 Answer

To traverse filenames that contain Unicode characters in Python 2, you need to give `os.walk` a unicode path; it then returns unicode directory and file names as well.

Your path contains a non-ASCII character. With a byte-string path that character cannot be represented, so it shows up as `?` in the exception.

If you pass in a unicode path, like this: `os.walk(unicode(filed))`, you should not get that exception.

As noted in "Convert python filenames to unicode", you will sometimes still get a byte string back if a filename is "undecodable" by Python 2.
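As a rough illustration, the scanning loop from the question could be adapted along these lines. This is only a sketch: `to_text` and `scan` are hypothetical helper names, decoding the typed path with `sys.getfilesystemencoding()` is an assumption (the plain `unicode(filed)` call above is enough for ASCII-only input), and skipping files that cannot be stat'ed is one possible way to handle messy drives, not something the original code does.

```python
from __future__ import print_function

import os
import sys


def to_text(path):
    """Return path as a text (unicode) string on Python 2 and Python 3."""
    if isinstance(path, bytes):
        # Assumption: the byte string uses the filesystem encoding.
        return path.decode(sys.getfilesystemencoding() or "utf-8")
    return path


def scan(top):
    """Walk top using a unicode path; return (folder, name, ext, size) rows."""
    rows = []
    for root, dirs, files in os.walk(to_text(top)):
        for filename in files:
            full_path = os.path.join(root, filename)
            try:
                size = os.stat(full_path).st_size
            except OSError:
                # WindowsError is a subclass of OSError; record an unknown
                # size instead of aborting the whole scan.
                size = None
            ext = os.path.splitext(filename)[1]
            rows.append((os.path.abspath(root), filename, ext, size))
    return rows


if __name__ == "__main__":
    for row in scan(os.getcwd()):
        print(row)
```

Because every name stays a unicode string, avoid calling `str()` on the extension in Python 2; that would try to encode it as ASCII and can fail on non-ASCII extensions.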

Burhan Khalid