I have been trying to merge several CSV files into one, but it is showing me an error. I am new to Python; your help will be highly appreciated.

Following is my code:

import pandas as pd
import numpy as np
import glob

all_data_csv = pd.read_csv("C:/Users/Am/Documents/A.csv", encoding='utf-8') 

for f in glob.glob('*.csv'):
  df = pd.read_csv(f, encoding='utf-8')
  all_data_csv= pd.merge(all_data_csv,df ,how= 'outer')
  print(all_data_csv)

and the error shown:

Traceback (most recent call last):
  File "pandas\_libs\parsers.pyx", line 1169, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1299, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1315, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1553, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 1: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:/internship/j.py", line 8, in <module>
    df = pd.read_csv(f, encoding='utf-8')
  File "C:\Users\Amreeta Koner\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\Amreeta Koner\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\parsers.py", line 435, in _read
    data = parser.read(nrows)
  File "C:\Users\Amreeta Koner\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\parsers.py", line 1139, in read
    ret = self._engine.read(nrows)
  File "C:\Users\Amreeta Koner\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\parsers.py", line 1995, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 899, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 914, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 991, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 1123, in pandas._libs.parsers.TextReader._convert_column_data
  File "pandas\_libs\parsers.pyx", line 1176, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1299, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1315, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1553, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 1: invalid start byte
Amreeta Koner

2 Answers

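It seems your CSV file contains bytes that are not valid UTF-8 (the traceback reports byte 0x88), so at least one of the files is probably saved in a different encoding. I would check out the answer here. Hope it helps.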
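
To narrow down which file is responsible, one option is to try decoding each file as UTF-8 before handing it to pandas. The following is only a minimal sketch: the helper name find_non_utf8_files and its parameters are made up for illustration and are not part of pandas or of the original code.

```python
from pathlib import Path


def find_non_utf8_files(folder, pattern="*.csv"):
    """Return (file name, error message) pairs for files that are not valid UTF-8.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    bad = []
    for path in sorted(Path(folder).glob(pattern)):
        try:
            # Decode the raw bytes strictly; invalid UTF-8 raises UnicodeDecodeError.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as exc:
            bad.append((path.name, str(exc)))
    return bad
```

Calling find_non_utf8_files(".") from the folder that holds the CSV files should list the files that trigger the same UnicodeDecodeError, along with the offending byte and position.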

Waleed S Khan
  • Hey @Waleed S Khan, thanks for your help! So there is no error in the code but in the csv file? – Amreeta Koner Jun 17 '19 at 09:49
  • From the errors that you attached to your question, it seems that the problem is with the CSV file, as the parser is not able to decode a non-ASCII character. Did the linked answer help you? – Waleed S Khan Jun 17 '19 at 09:51
# Run the same code with one small addition: read the file as ISO-8859-1
pd.read_csv("C:/Users/Am/Documents/A.csv", header=0, encoding="ISO-8859-1")
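
Note that the traceback in the question points at the read_csv call inside the loop (line 8 of j.py), so the encoding would also need to change there, not only for A.csv. A rough sketch of that idea follows. The function names read_csv_with_fallback and merge_csv_files are hypothetical, and the choice of ISO-8859-1 as the fallback is an assumption: it accepts any byte sequence, so it will not raise, but text may come out garbled if the files are really in some other encoding.

```python
import glob

import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "ISO-8859-1")):
    """Try each encoding in turn and return the first DataFrame that parses."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, header=0, encoding=enc)
        except UnicodeDecodeError as exc:
            # Remember the failure and try the next encoding in the list.
            last_error = exc
    raise last_error


def merge_csv_files(pattern):
    """Outer-merge every CSV matching `pattern` on the columns they share."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    # Start from the first file so that no file is merged with itself.
    merged = read_csv_with_fallback(paths[0])
    for path in paths[1:]:
        merged = pd.merge(merged, read_csv_with_fallback(path), how="outer")
    return merged
```

With this sketch, merge_csv_files("*.csv") would stand in for the loop in the question. As with the original pd.merge call, the files need at least one column name in common, otherwise pandas raises a MergeError.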
vrana95
  • Thanks for your help! I have made the following changes but still the error is persistent. If possible, kindly help me with the error. – Amreeta Koner Jun 17 '19 at 11:04