
I'm trying to import data from a CSV file using pandas:

data=pd.read_csv("data.csv")

This seems to work fine. Next, I would like to set the column names with

data.columns = ['X', 'Y']

so that I can plot the data later. This is where the problem appears:

File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'X'

The CSV file has the following format:

   X   Y    
  20   120  
  25   145  
  41   160  
  62   301
...

Does anyone know what I'm doing wrong?

Thanks!

cs95
spectrum

1 Answer


You're trying to import a whitespace-separated file, but read_csv splits on commas by default. As @jezrael said in the comments, you should use:

data=pd.read_csv("data.csv", delim_whitespace=True) 

From the official doc:

delim_whitespace : boolean, default False

Equivalent to setting sep='\s+'. If this option is set to True, nothing should be passed in for the delimiter parameter.
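
To see why the separator matters, here is a small sketch that parses the sample rows from the question out of an in-memory string. The io.StringIO buffer and the variable names are my own stand-ins for data.csv, not something from the question. I use sep=r"\s+" here instead of delim_whitespace=True because, as far as I know, recent pandas releases (2.2 onwards) flag delim_whitespace as deprecated; the two are documented as equivalent.

```python
import io

import pandas as pd

# Sample rows from the question, held in memory instead of data.csv
# (an assumption made so the snippet runs without a file on disk).
raw = "   X   Y\n  20   120\n  25   145\n  41   160\n  62   301\n"

# Default separator is a comma, so every line is read as one single field.
one_col = pd.read_csv(io.StringIO(raw))

# Treating any run of whitespace as the separator splits the two fields.
two_col = pd.read_csv(io.StringIO(raw), sep=r"\s+")

print(one_col.shape)          # number of rows and columns with the default separator
print(list(two_col.columns))  # column labels with the whitespace separator
```

With the default separator there is only one column, whose label is the whole header line, so no column called 'X' exists for later lookups.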

Moreover, if you want to specify column names (which is your question):

Again from the official documentation, you can see that you should either:

  • use the names argument to specify the names the columns should take, or
  • use the header argument to tell pandas that the first line (index 0) should be parsed as the column names.

To summarize, you should be able to use any of these three statements:

data = pd.read_csv("data.csv", delim_whitespace=True, names=["X", "Y"], header=0)

data = pd.read_csv("data.csv", delim_whitespace=True, header=0)

data = pd.read_csv("data.csv", sep=r"\s+")
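
One thing to watch when passing names for a file that already has a header row: unless header=0 is given as well, pandas treats the first line as data. A minimal sketch of the difference, using a made-up in-memory sample and the hypothetical replacement names "A" and "B" so the effect is visible:

```python
import io

import pandas as pd

# Made-up sample with a header row, standing in for data.csv (assumption).
raw = "X Y\n20 120\n25 145\n"

# names only: the header line "X Y" is kept as the first data row.
names_only = pd.read_csv(io.StringIO(raw), sep=r"\s+", names=["A", "B"])

# names plus header=0: the header line is consumed and replaced by the new names.
renamed = pd.read_csv(io.StringIO(raw), sep=r"\s+", names=["A", "B"], header=0)

print(len(names_only))  # row count when the header line is read as data
print(len(renamed))     # row count when the header line is replaced
```

In the first case the columns also end up holding strings rather than numbers, which would get in the way of plotting.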

Concerning the header parameter:

Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed, the behavior is identical to header=0 and column names are inferred from the first line of the file.

Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

gcharbon