1

Can't read in a large Excel file using read_csv: Python raises an error saying the file doesn't exist.

Smaller versions of the same Excel file open easily.

import pandas as pd
data = pd.read_csv("E:\rawdata_50K.csv")
print(data[0:5])

The top 20 lines of the Excel file load perfectly; the large version does not.

ShadowRanger

2 Answers

2

Note the r in front of the path, which is needed when the path uses Windows backslashes (\):

data = pd.read_csv(r"E:\rawdata_50K.csv")

or

Note the direction of the / in the path; forward slashes don't require the r prefix:

data = pd.read_csv("E:/rawdata_50K.csv")

File paths with pathlib:


from pathlib import Path

drive_path = Path('E:/')
file_path = drive_path / 'rawdata_50K.csv'
data = pd.read_csv(file_path)
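
Before handing the path to read_csv, it can help to confirm that Python sees the file at all. The following is only a sketch: `describe_path` is a hypothetical helper name, not part of pathlib or pandas, and the path is the one from the question.

```python
from pathlib import Path

# Same drive and file name as in the question; adjust to your own setup.
file_path = Path('E:/') / 'rawdata_50K.csv'


def describe_path(path: Path) -> str:
    """Hypothetical helper: report whether `path` points at an existing file."""
    if path.is_file():
        return f"found: {path} ({path.stat().st_size} bytes)"
    # repr() makes hidden characters, such as a stray carriage return, visible.
    return f"not found: {path!r}"


print(describe_path(file_path))
```

If this reports "not found" for a file that is definitely on disk, the path string itself is the first thing to check.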
Trenton McKinney
  • To be clear, without the raw-string `r` prefix, the path was to `E:awdata_50K.csv`, which of course does not exist. This is why you *always* use raw strings for Windows paths (and regular expressions). – ShadowRanger Aug 07 '19 at 03:11
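
The effect described in this comment can be checked directly. This is a minimal sketch using only the path from the question; the variable names are illustrative.

```python
# In a normal string literal, "\r" is the escape for a carriage return,
# so the backslash and the "r" collapse into one control character.
plain = "E:\rawdata_50K.csv"
raw = r"E:\rawdata_50K.csv"      # raw string: the backslash is kept literally
forward = "E:/rawdata_50K.csv"   # forward slash: no escaping involved

print(repr(plain))               # repr() shows the carriage return as \r
print(len(plain), len(raw))      # 17 18
print("\r" in plain)             # True: the path contains a carriage return
print("\\" in plain)             # False: no backslash is left in the string
```

The plain string is one character shorter than the raw one, so it no longer names a file that exists on drive E:.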
0

Thanks so much! The 2nd solution in Answer 5 of the 6GB answer worked well and fast....

Trying suggested methods

import pandas as pd

Fileread = pd.read_csv("E:\dataraw.csv", chunksize=500)
dfList = []
for df in Fileread:
    dfList.append(df)

df = pd.concat(dfList,sort=False)

print(df[99950:100000])

Perhaps someone can also explain why the same CSV file worked when named dataraw but did NOT work when renamed rawdata?
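
A likely explanation, consistent with the comment on the first answer: `\r` is a recognized escape sequence (carriage return), while `\d` is not, so Python leaves the backslash in place for `\d` (newer versions emit an invalid-escape warning). The sketch below assumes that reading; `parse_literal` is a hypothetical helper that evaluates the text of a string literal the way the interpreter would, with that warning silenced.

```python
import ast
import warnings


def parse_literal(source: str) -> str:
    """Hypothetical helper: evaluate the source text of a Python string literal."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # "\d" triggers an invalid-escape warning
        return ast.literal_eval(source)


# The two literals as they would be typed in a script, without the r prefix.
rawdata = parse_literal(r'"E:\rawdata.csv"')
dataraw = parse_literal(r'"E:\dataraw.csv"')

print(repr(rawdata))      # "\r" became a carriage return; the backslash is gone
print(repr(dataraw))      # "\d" is not an escape; the backslash survives
print("\\" in rawdata)    # False
print("\\" in dataraw)    # True
```

Under this reading the dataraw name only worked by accident; a raw string or forward slashes avoids depending on which letter follows the backslash.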