import os
import pandas as pd

df = pd.read_csv("/Users/georgezambrano/Desktop/Code/Sales_Data/Sales_April_2019.csv")

# os.listdir() returns every entry in the folder, including hidden files
files = [file for file in os.listdir('/Users/georgezambrano/Desktop/Code/Sales_Data')]

all_months_data = pd.DataFrame()

for file in files:
    df = pd.read_csv("/Users/georgezambrano/Desktop/Code/Sales_Data/" + file)
    all_months_data = pd.concat([all_months_data, df])

all_months_data.head()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3131: invalid start byte 

I believe I'm getting the error because of a .DS_Store file that shows up when I print the list of files. However, .DS_Store isn't visible in the folder, so I'm not sure how to delete it.

Henry Ecker
George Zambrano
  • Why not simply use `if file != '.DS_Store':`, `file.endswith('.csv')`, or equivalent? Much easier than fighting with your operating system. – nanofarad Sep 09 '21 at 20:10
  • You cannot permanently remove `.DS_Store`; it contains metadata about the directory on macOS. If you delete it, the operating system will recreate it. You should be programmatically ignoring it. Or, even better, select _only_ files that end with .csv to ensure that you aren't passing other incompatible files to `read_csv`. – Henry Ecker Sep 09 '21 at 20:13
  • Additionally, sequential concatenations are extremely slow. It is better to get a list of DataFrames and concat once. See [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/q/20906474/15497888) – Henry Ecker Sep 09 '21 at 20:14
  • You could open a terminal, cd to the directory where the file is, and `rm .DS_Store`. But it might reappear. – khelwood Sep 09 '21 at 20:22
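
The suggestions in the comments (read only `.csv` files, collect the DataFrames in a list, and concatenate once) could be combined roughly as follows. This is a minimal sketch, not code from the question: the function name `load_sales_data` and its `data_dir` parameter are hypothetical, and it assumes every CSV in the folder shares the same columns.

```python
import os

import pandas as pd


def load_sales_data(data_dir):
    """Read every .csv file in data_dir into a single DataFrame.

    `load_sales_data` and `data_dir` are illustrative names, not from the question.
    """
    # Keep only names ending in .csv, so .DS_Store and other
    # non-CSV entries never reach read_csv.
    csv_names = sorted(
        name for name in os.listdir(data_dir) if name.lower().endswith(".csv")
    )

    # Build the list of DataFrames first, then concatenate once,
    # instead of calling pd.concat inside the loop.
    frames = [pd.read_csv(os.path.join(data_dir, name)) for name in csv_names]

    # pd.concat raises ValueError on an empty list, so handle that case.
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

Calling `load_sales_data("/Users/georgezambrano/Desktop/Code/Sales_Data")` would then stand in for the loop in the question, with `.DS_Store` skipped because its name does not end in `.csv`.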

1 Answer

I typically use the following:

import os

# Work_Dir is the path to the folder that holds your files
workdir = os.listdir(Work_Dir)
if '.DS_Store' in workdir:
    workdir.remove('.DS_Store')

os.listdir() returns a list of the names of all entries in the directory, including hidden ones. For Mac users this can be an issue because Finder creates a hidden .DS_Store file in folders to store metadata about them, and that file is not a CSV.

The snippet above assigns that list to a variable and checks whether .DS_Store is in it. If it is, the entry is removed from the list. Only the Python list changes; the file on disk is left untouched.
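
If other hidden files might show up as well, the same idea can be generalized to skip every name that starts with a dot. A small sketch, assuming that is acceptable for your folder; `visible_entries` is a hypothetical helper name, not part of `os`:

```python
import os


def visible_entries(directory):
    """Return the sorted entry names in `directory`, skipping dot-prefixed ones.

    `visible_entries` is an illustrative helper, not a standard-library function.
    """
    # Names beginning with "." (such as .DS_Store) are hidden by convention
    # on macOS and Linux, so they are filtered out of the listing here.
    return sorted(
        name for name in os.listdir(directory) if not name.startswith(".")
    )
```

This leaves the files on disk alone and only changes which names your code sees.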

Hope this helps :)