I am trying to read a CSV file with pandas in PyCharm for a Python project, and I get an error when I run the code. I tried the solutions from earlier questions, such as adding the "r" prefix, doubling the backslashes, and specifying the encoding, but none of them worked.

I am calling read_csv from the pandas library. Doubling the backslashes in the file path did not help.

UPDATE: I changed the code as shown below. Apparently, one of the issues with the CSV file was that it had no header row. The following worked fine:

import pandas as pd
path = "c:/ML_Cricket/CSV/225171.csv"
df = pd.read_csv(path,error_bad_lines=False,names=["1","2","3","4","5","6","7","8","9","10","11"])
print(df)
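
The header issue mentioned in the update can be illustrated with a minimal sketch. The sample data and the column names "a", "b", "c" are made up for illustration and are not taken from the asker's file. Note also that error_bad_lines was deprecated in pandas 1.3 and removed in 2.0 in favor of on_bad_lines, so the sketch leaves it out.

```python
import io

import pandas as pd

# Hypothetical header-less data standing in for the real CSV file.
raw = "1,2,3\n4,5,6\n"

# Default behaviour: the first row is consumed as column labels.
with_default = pd.read_csv(io.StringIO(raw))

# names=... supplies the labels, so every row is kept as data.
with_names = pd.read_csv(io.StringIO(raw), names=["a", "b", "c"])

print(with_default.shape)  # the first row became the header
print(with_names.shape)    # both rows are kept as data
```

Passing header=None instead of names=... would also keep every row, with integer column labels.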

The original code that raised the error:

import pandas as pd

df = pd.read_csv("C:\ML_Cricket\CSV\221571.csv")

print(df.head())

I get this traceback:

C:\Users\abc\PycharmProjects\untitled\venv\Scripts\python.exe C:/Users/abc/.PyCharmCE2019.1/config/scratches/scratch.py
Traceback (most recent call last):
 File "C:/Users/abc/.PyCharmCE2019.1/config/scratches/scratch.py", line 3, in <module>
   df = pd.read_csv("C:\ML_Cricket\CSV\221571.csv")
 File "C:\Users\abc\PycharmProjects\untitled\venv\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
   return _read(filepath_or_buffer, kwds)
 File "C:\Users\abc\PycharmProjects\untitled\venv\lib\site-packages\pandas\io\parsers.py", line 429, in _read
   parser = TextFileReader(filepath_or_buffer, **kwds)
 File "C:\Users\abc\PycharmProjects\untitled\venv\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
   self._make_engine(self.engine)
 File "C:\Users\abc\PycharmProjects\untitled\venv\lib\site-packages\pandas\io\parsers.py", line 1122, in _make_engine
   self._engine = CParserWrapper(self.f, **self.options)
 File "C:\Users\abc\PycharmProjects\untitled\venv\lib\site-packages\pandas\io\parsers.py", line 1853, in __init__
   self._reader = parsers.TextReader(src, **kwds)
 File "pandas\_libs\parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
 File "pandas\_libs\parsers.pyx", line 686, in pandas._libs.parsers.TextReader._setup_parser_source
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
Anoop Mahajan
  • @hiroprotagonist editing the traceback as a quote has trashed the formatting. Why didn't you edit as code? :/ – roganjosh Jun 08 '19 at 12:35
  • A guess: `df = pd.read_csv("C:\ML_Cricket\CSV\221571.csv", encoding='cp1252')` – roganjosh Jun 08 '19 at 12:36
  • Did not work :( – Anoop Mahajan Jun 08 '19 at 12:47
  • @roganjosh yes, that was a bad decision... tried to fix (but that also looks bad...) do you have a better way of formatting tracebacks here? – hiro protagonist Jun 08 '19 at 12:49
  • hmm. now anky_91 converted it back to quote... – hiro protagonist Jun 08 '19 at 12:52
  • You need to escape the backslashes: `"C:\\ML_Cricket\\CSV\\221571.csv"` – rainer Jun 08 '19 at 12:53
  • @hiroprotagonist I think that, once it goes into quotation format, it cannot be undone properly. Anecdotally, and as I experienced here, I never seem to be able to get it back into correct format. Or maybe it's the pasted format that cannot be converted into code format in the first place that makes people format as quotes. A properly-pasted traceback should be convertible into code format with no issues. – roganjosh Jun 08 '19 at 12:54
  • @AnoopMahajan maybe check the encoding with [`chardet`](https://stackoverflow.com/questions/54133455/importing-csv-using-pd-read-csv-invalid-start-byte-error/54134734#54134734); also escape the `\\` as rainer suggests – anky Jun 08 '19 at 12:56
  • Tried the double backslash but it did not work – Anoop Mahajan Jun 08 '19 at 12:59
  • @roganjosh i took the source from the original edit to paste as code - i did not just undo my edit... now it looks good anyway. thanks for your advice! – hiro protagonist Jun 08 '19 at 16:25
  • "_It doesnt seem to work_" How? Are you getting the same error? A different error? Does the code run but with weird behaviour? Is there a memory leak? Is the OS blowing up? Please be specific. :) – TrebledJ Jun 08 '19 at 16:55

1 Answer

Try using forward slashes in the path:

df = pd.read_csv("c:/ML_Cricket/CSV/221571.csv")
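
One likely reason the path matters, sketched below: in a regular (non-raw) Python string literal, "\221" is an octal escape for the single character U+0091, so the file name in the original code is silently altered. That is consistent with the UnicodeEncodeError in the traceback. The asker reports that the raw-string and doubled-backslash variants did not work either, so other factors may also have been involved, for example the file name, which differs between the two snippets in the question (225171 and 221571).

```python
# Only the "\221" escape is left unescaped here, to isolate the effect;
# the other backslashes are doubled so Python emits no escape warnings.
broken = "C:\\ML_Cricket\\CSV\221571.csv"

# Three ways to write the intended path.
raw = r"C:\ML_Cricket\CSV\221571.csv"          # raw string: backslashes kept literally
doubled = "C:\\ML_Cricket\\CSV\\221571.csv"    # every backslash escaped
forward = "C:/ML_Cricket/CSV/221571.csv"       # forward slashes, generally accepted on Windows

print(repr(broken))            # shows \x91 where "\221" used to be
print(len(raw) - len(broken))  # four source characters collapsed into one
print(raw == doubled)          # the raw and doubled forms are the same string
```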
BillyN