11

I have a .csv file on my F: drive on Windows 7 64-bit that I'd like to read into pandas and manipulate.

None of the examples I see read from anything other than a simple file name (e.g. 'foo.csv').

When I try this I get error messages that aren't making the problem clear to me:

import pandas as pd

trainFile = "F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv"
trainData = pd.read_csv(trainFile)

The error message says:

IOError: Initializing from file failed

I'm missing something simple here. Can anyone see it?

Update:

I did get more information like this:

import csv

if __name__ == '__main__':
    trainPath = 'F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv'
    trainData = []
    with open(trainPath, 'r') as trainCsv:
        trainReader = csv.reader(trainCsv, delimiter=',', quotechar='"')
        for row in trainReader:
            trainData.append(row)
    print(trainData)

I got a permission error on read. When I checked the properties of the file, I saw that it was read-only. I was able to read 892 lines successfully after unchecking it.

Now pandas is working as well. No need to move the file or amend the path. Thanks for looking.
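
For anyone hitting the same thing, here is a minimal sketch of how the read-only flag could be inspected and cleared from Python instead of through the file's Properties dialog. The helper names `describe_file_access` and `clear_read_only` are made up for illustration, and the sketch assumes only the standard `os` and `stat` modules; it is not necessarily what fixed the problem above.

```python
import os
import stat


def describe_file_access(path):
    """Report whether the file is readable and whether it is marked read-only."""
    mode = os.stat(path).st_mode
    return {
        # Can the current user open the file for reading?
        "readable": os.access(path, os.R_OK),
        # On Windows the owner-write bit mirrors the read-only attribute.
        "read_only": not (mode & stat.S_IWRITE),
    }


def clear_read_only(path):
    """Add the owner-write bit, which clears the read-only attribute on Windows."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IWRITE)


# Hypothetical usage with the path from the question:
# print(describe_file_access("F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv"))
```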

duffymo
  • 1
    honestly, your best bet is to move the file...but if you don't want to do that, try using the `os` module to change into that directory and just call `train.csv` – Ryan Saxe Jun 06 '13 at 02:09
  • 1
    have you tried providing a buffer instead of a filepath? `pd.read_csv(open(trainFile))` – loopbackbee Jun 06 '13 at 02:20
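
The buffer suggestion in the comment above could be written roughly as follows. This is a sketch, assuming pandas is installed; `read_csv_via_handle` is a hypothetical helper name. Opening the file yourself means any permission or path problem is raised by `open` with its own message, and the `with` block closes the handle afterwards.

```python
import pandas as pd


def read_csv_via_handle(path):
    """Open the file ourselves and hand the open handle to pandas."""
    # newline="" leaves line-ending handling to the CSV parser.
    with open(path, "r", newline="") as handle:
        return pd.read_csv(handle)


# Hypothetical usage with the path from the question:
# trainData = read_csv_via_handle("F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv")
```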

6 Answers

13

I cannot promise that this will work, but it's worth a shot:

import pandas as pd
import os

trainFile = "F:/Projects/Python/coursera/intro-to-data-science/kaggle/data/train.csv"

pwd = os.getcwd()
os.chdir(os.path.dirname(trainFile))
trainData = pd.read_csv(os.path.basename(trainFile))
os.chdir(pwd)
zwol
  • @RyanSaxe I'm pretty sure the thing you suggested is not the same as this and will, in fact, not work. – zwol Jun 06 '13 at 02:21
  • how is it not? you use `os.chdir` to change into the directory of the file and execute it... – Ryan Saxe Jun 06 '13 at 02:23
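
One caveat with the change-directory approach in this answer: if `read_csv` raises, the final `os.chdir(pwd)` never runs and the process stays in the data directory. A possible variant, sketched here with a hypothetical `working_directory` helper built on the standard `contextlib` module, restores the original directory even on error.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the caller's cwd is preserved.
        os.chdir(previous)


# Hypothetical usage mirroring the answer above:
# with working_directory(os.path.dirname(trainFile)):
#     trainData = pd.read_csv(os.path.basename(trainFile))
```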
5

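A better solution is to use raw string literals like r'pathname\filename' rather than 'pathname\filename', so that backslashes in the path are not interpreted as escape sequences. See Lexical Analysis in the Python language reference for more details.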
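
A short illustration of the difference, using a made-up path (`F:\temp\new.csv` is not from the question) chosen because `\t` and `\n` are both escape sequences:

```python
# In a normal string literal, "\t" becomes a tab and "\n" a newline.
plain = "F:\temp\new.csv"

# In a raw string literal the backslashes are kept as written.
raw = r"F:\temp\new.csv"

# Forward slashes, as used in the question, avoid the issue altogether.
forward = "F:/temp/new.csv"

print(repr(plain))
print(repr(raw))
print(repr(forward))
```

Note that the path in the question already uses forward slashes, so escape sequences would not explain that particular error.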

Hanan Shteingart
4

I had the same issue and resolved it.

Check that the path to your file is correct.

I initially had the path like this:

dfTrain = pd.read_csv("D:\\Kaggle\\labeledTrainData.tsv",header=0,delimiter="\t",quoting=3)

This returned an error because the path was wrong. I then changed the path as shown below, and it works fine:

dfTrain = dfTrain = pd.read_csv("D:\\Kaggle\\labeledTrainData.tsv\\labeledTrainData.tsv",header=0,delimiter="\t",quoting=3)

My earlier path was not correct. I hope you get it resolved.

Dan Lowe
  • 1
    I'm getting confused by your corrected code. It seems that you typed `dfTrain = ` and `\\labeledTrainData.tsv` twice, but if you deleted all these two, the initial path is the same as this one. Am I missing something here? – StayFoolish Mar 23 '17 at 06:40
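
One reading of the answer above is that the first path named a folder rather than the file inside it. A quick way to check what a path actually points at, sketched with a hypothetical `classify_path` helper and the standard `os.path` functions:

```python
import os


def classify_path(path):
    """Return 'file', 'directory', or 'missing' for the given path."""
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    return "missing"


# Hypothetical usage with the path from the answer:
# print(classify_path("D:\\Kaggle\\labeledTrainData.tsv"))
```

If this reports "directory" or "missing", `read_csv` cannot succeed no matter how the string is quoted.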
4

This happens to me quite often. Usually I open the csv file in Excel and save it as an xlsx file, and then it works.

So instead of

df = pd.read_csv(r"...\file.csv")

Use:

df = pd.read_excel(r"...\file.xlsx")
sheldonzy
2

If you're sure the path is correct, make sure no other programs have the file open. I got that error once, and closing the Excel file made the error go away.
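
To find out whether the operating system is refusing access (for example because of a lock or a permissions problem), it can help to open the file directly and look at the underlying error, since that is usually more specific than "Initializing from file failed". A minimal sketch, with `probe_open` as a hypothetical helper name:

```python
def probe_open(path):
    """Try to open and read one byte; return None on success or (errno, message)."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except OSError as exc:  # IOError is an alias of OSError in Python 3
        return (exc.errno, exc.strerror)
    return None


# Hypothetical usage:
# problem = probe_open(trainFile)
# if problem is not None:
#     print("Cannot read file:", problem)
```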

numbers are fun
-1

Try this:

import os
import pandas as pd


# Path components match the file location given in the question.
trainFile = os.path.join('F:', os.sep, 'Projects', 'Python', 'coursera', 'intro-to-data-science', 'kaggle', 'data', 'train.csv')
trainData = pd.read_csv(trainFile)
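
The `os.sep` after `'F:'` matters: on Windows, joining a bare drive letter with a name gives a drive-relative path, not one rooted at the top of the drive. The sketch below uses the standard `ntpath` module (the Windows implementation of `os.path`) and `pathlib.PureWindowsPath` so it behaves the same on any platform; the shortened path is for illustration only.

```python
import ntpath
from pathlib import PureWindowsPath

# Without a separator after the drive, the result is relative to the
# current directory on drive F:, not to the root of the drive.
drive_relative = ntpath.join("F:", "Projects", "train.csv")

# Adding an explicit separator anchors the path at the drive root.
absolute = ntpath.join("F:", "\\", "Projects", "train.csv")

# pathlib builds the same absolute path when given "F:/" as the anchor.
with_pathlib = PureWindowsPath("F:/", "Projects", "train.csv")

print(drive_relative)
print(absolute)
print(with_pathlib)
```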
Shtefan