11

Hei I'm trying to read in pandas the csv file you can download from here (euribor rates I think you can imagine the reason I would like to have this file!). The file is a CSV file but it is somehow strangely oriented. If you import it in Excel file has the format

   02/01/2012,03/01/2012,04/01/2012,,,, 
1w 0.652,0.626,0.606,,,,
2w,0.738,0.716,0.700,,,,

act with first column going up to 12m (but I have give you the link where you can download a sample). I would like to read it in pandas but I'm not able to read it in the correct way. Pandas has a built-in function for reading csv files but somehow it expect to be row oriented rather than column oriented. What I would like to do is to obtain the information on the row labeled 3m and having the values and the date in order to plot the time variation of this index. But I can't handle this problem. I know I can read the data with

import pandas 
data = pandas.io.read_csv("file.csv",parse_dates=True) 

but it would work if the csv file would be somehow transpose. H

piRSquared
  • 285,575
  • 57
  • 475
  • 624
Nicola Vianello
  • 1,916
  • 6
  • 21
  • 26

2 Answers2

8

A pandas dataframe has a .transpose() method, but it doesn't like all the empty rows in this file. Here's how to get it cleaned up:

df = pandas.read_csv("hist_EURIBOR_2012.csv")  # Read the file
df = df[:15]    # Chop off the empty rows beyond 12m
df2 = df.transpose()
df2 = df2[:88]  # Chop off what were empty columns (I guess you should increase 88 as more data is added.

Of course, you can chain these together:

df2 = pandas.read_csv("hist_EURIBOR_2012.csv")[:15].transpose()[:88]

Then df2['3m'] is the data you want, but the dates are still stored as strings. I'm not quite sure how to convert it to a DateIndex.

Thomas K
  • 39,200
  • 7
  • 84
  • 86
  • 1
    In pandas 0.8.0 you'll be able to do DatetimeIndex(array_of_strings) or to_datetime(array_of_strings) to parse them to dates – Wes McKinney May 07 '12 at 18:15
  • Thank you I will try the solution proposed and eventually I will wait for pandas 0.8.0 which as far as I understand will be available this May – Nicola Vianello May 08 '12 at 08:51
  • Couldn't you use this option `skip_blank_lines : boolean, default True`? If set to False the empty rows will be no problem right? – Sebastian Nov 04 '16 at 15:58
0

I've never used pandas for csv processing. I just use the standard Python lib csv functions as these use iterators.

import csv
myCSVfile=r"c:/Documents and Settings/Jason/Desktop/hist_EURIBOR_2012.csv"
f=open(myCSVfile,"r")
reader=csv.reader(f,delimiter=',')
data=[]
for l in reader:
    if l[0].strip()=="3m":
        data.append(l)

f.close()
Jay M
  • 3,736
  • 1
  • 24
  • 33
  • Obviously you may need to clean you data after import, e.g. you chould check for data types, remove empty fields (or replace with None) This version processes the entire dataset, but only returns one line, so you could use break at that point or perhaps append other interesting data. e.g. interesting=['1w','3m','9m']; if l[0].strip() in interesting: .... – Jay M May 07 '12 at 15:30