I have a large set of CSV files, each with some number of columns and rows. I want my code to loop through each CSV (a while loop over an index), combine the columns with specific headers into a new column, and then save the CSV to a new path. This is my code so far, but I'm getting this error:

'utf-8' codec can't decode byte 0xa9 in position 33: invalid start byte

Any ideas?

import os
import pandas as pd

# Add a new column to every CSV: a unique identifier stamp that combines
# wellkey + drillkey + lat + long + spuddate

files=['Apr 23 2018.csv','Apr 20 2018.csv']
index=0
os.chdir('file path')

#code to loop through all the files listed above
while index < len(files):
    os.chdir('file path')
    current_file=files[index]

    #unique identifier column
    df=pd.read_csv(current_file)
    df['Unique Identifier'] = (df['A'] + "-" + df['B'] + "-" + df['C'] + "-" +
                               df['D'] + "-" + df['E'])
    df.to_csv(current_file)

    #save new csv
    os.chdir('New file Path')
    index = index + 1

Thank you for your advice/comments/corrections.

Vasilis G.
Sammy
  • Possible duplicate of [UnicodeDecodeError when reading CSV file in Pandas with Python](https://stackoverflow.com/questions/18171739/unicodedecodeerror-when-reading-csv-file-in-pandas-with-python) – wwii Apr 27 '18 at 17:12
  • Welcome to SO. Please take the time to read [ask] and the other links found on that page. – wwii Apr 27 '18 at 17:12

1 Answer

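When I run into this error, the first thing I try is adding encoding='ISO-8859-1' to the pd.read_csv() call. Byte 0xa9 is not a valid start byte in UTF-8, but in ISO-8859-1 (Latin-1) it is the © sign, so the file was probably not saved as UTF-8.

So your statement will look like this: df=pd.read_csv(current_file, encoding='ISO-8859-1')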
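
For reference, here is a minimal sketch of how the whole step might look with that change applied. The function name `add_identifier` and the `dst_dir` argument are hypothetical, not from the question. It also makes two assumptions beyond the encoding fix: every column is read as text (`dtype=str`) so that numeric columns such as lat/long can be joined with "-", and the result is written to a separate folder by building the full path instead of calling os.chdir().

```python
import os

import pandas as pd


def add_identifier(src_path, dst_dir, columns, encoding="ISO-8859-1"):
    """Read one CSV, add a 'Unique Identifier' column, save a copy in dst_dir."""
    # Assumption: read every column as text so the values can be joined as strings.
    df = pd.read_csv(src_path, encoding=encoding, dtype=str)

    # Join the chosen columns row by row, using "-" as the separator.
    # Empty cells are treated as empty strings rather than NaN.
    df["Unique Identifier"] = (
        df[columns].fillna("").apply(lambda row: "-".join(row), axis=1)
    )

    # Write to the new folder under the same file name; the original is untouched.
    os.makedirs(dst_dir, exist_ok=True)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    df.to_csv(dst_path, index=False, encoding=encoding)
    return dst_path
```

You could then call it once per file, for example `for name in files: add_identifier(os.path.join('file path', name), 'New file Path', ['A', 'B', 'C', 'D', 'E'])`. If ISO-8859-1 produces odd characters, the file may be in a different encoding (cp1252 is a common one on Windows), so treat the default here as a first guess.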

Tommy