0

I have a DataFrame that is resampled into a smaller DataFrame, which retained a datetimeindex. I transposed the dataframe and now wish to remove the dateindex and replace it with strings (labels), then export it to .csv to be used in a format that can be read by javascript (doing all data manipulation in python).

I did try to write it to a .csv without header (removing the date) and then reading it again to add the labels, but that doesn't seem very efficient.

Link to csv: https://www.dropbox.com/s/qy72yht2m7lk2pg/17_predicted.csv

Python/pandas Code:

import pandas as pd
import numpy as np
from dateutil.parser import parse
from datetime import datetime
from pandas import *

# Load csv into pandas DataFrame
df = pd.read_csv("17_predicted_dummydata.csv", parse_dates=True, dayfirst=False, keep_date_col=True, index_col=0)

#Creating a date range
df.index = pd.to_datetime(pd.date_range(start='1/1/2000 00:30:00', end='1/1/2000 05:00:00', freq='30T'))

#Rename index
df.index.name = 'date'

df_year = df.resample('D', how='sum')
df_year = np.round(df_year, 0)

df_year.index.name = 'label'
df_year.column = ['value']

df_year = df_year.T

print df_year.head()
print df_year.index.name

df_year.to_csv("17_dummy.csv") #drop index through header=False

CSV input:

    Date/Time,InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),InteriorEquipment:Electricity:Zone:3993 [J](TimeStep)
 01/01  00:30:00,0.583979872,0.428071889,0.044676234
 01/01  01:00:00,0.583979872,0.428071889,0.044676234
 01/01  01:30:00,0.583979872,0.428071889,0.044676234
 01/01  02:00:00,0.583979872,0.428071889,0.044676234
 01/01  02:30:00,0.583979872,0.428071889,0.044676234
 01/01  03:00:00,0.583979872,0.428071889,0.044676234
 01/01  03:30:00,0.583979872,0.428071889,0.044676234
 01/01  04:00:00,0.583979872,0.428071889,0.044676234
 01/01  04:30:00,0.583979872,0.428071889,0.044676234
 01/01  05:00:00,0.583979872,0.428071889,0.044676234

Proposed csv output:

    label,value
InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),6.0
InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),4.0
InteriorEquipment:Electricity:Zone:3993 [J](TimeStep),0.0

I tried to follow this (Insert a row to pandas dataframe) workaround, but couldn't make it work.

Any help is appreciated!

Community
  • 1
  • 1
Chris
  • 1,287
  • 12
  • 31
  • 2
    Could you edit this to a reproducible example that can be copied/pasted (using dummmy data if you need to)? – chrisb Jul 24 '14 at 12:08
  • Hi again, I added some more information of what I have done up to now and the .csv file I am using. Hope this makes it a bit clearer. Thanks! – Chris Jul 25 '14 at 07:41
  • Since you want to manipulate the DataFrame further using Javascript, have you tried using JSON? [`DataFrame.to_json`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html) should be easy enough to use. – Midnighter Jul 25 '14 at 09:30
  • I wish to use .csv, because it then allows me to still analyse it and share it in excel. I didn't know about that function though, and will see if I can get that working for me as a backup. – Chris Jul 25 '14 at 10:06
  • I tried to read in the data, but got `CParserError: Error tokenizing data. C error: Expected 2 fields in line 15, saw 7`. A way simpler way is to simply `reset_index()` on your data and then paste `df.head()` in the post, since we can recreate dataframes through copy & paste. – FooBar Jul 25 '14 at 11:54

1 Answers1

1

You can assign directly to the index name and columns of dataframe to make it output as you want.

In [288]: df_year.index.name = 'label'

In [289]: df_year.columns = ['value']

In [290]: print df_year.to_csv()
label,value
Equipment:Electricity:LGF,79468.0
Equipment:Electricity:GF,66724.0
Equipment:Electricity:1st,30700.0
Equipment:Electricity:2nd,24126.0
Lights:Electricity:LGF,30596.0
Lights:Electricity:GF,30596.0
Lights:Electricity:1st,14078.0
Lights:Electricity:2nd,11063.0
General:Equipment:Electricity,201018.0
General:Lights:Electricity,86334.0
Electricity:Facility,314318.0
Electricity:Building,287352.0
Electricity:Plant,6329.0
Gas:Facility,279252.0
Electricity:HVAC,20637.0
General:Fans:Electricity,3554.0
Cooling:Electricity,17083.0
Pumps:Electricity,3708.0
WaterSystems:Electricity,2621.0
chrisb
  • 49,833
  • 8
  • 70
  • 70
  • This is actually not working for me. I think the issue is maybe related to the translation that I do first. I also found that index.names does not work anymore for 0.14 (http://stackoverflow.com/questions/19851005/rename-pandas-dataframe-index). Can't figure out how it should be done though even when using these examples: http://stackoverflow.com/questions/20037966/pandas-reindexing-data-frame-issue. – Chris Jul 25 '14 at 13:18
  • Make sure you're using `index.name`, not `index.names` – chrisb Jul 25 '14 at 13:45
  • hmm, still no success. I was able to change the index to 'label' then transposed, but then have the format in your answer, except that 'value' is a date; '2000-12-32', which I can't seem to change. If I use df_year.columns = ['value'] after transpose, it throws away the 'label' and puts 'value' on the second column.. – Chris Jul 25 '14 at 15:43
  • If it wasn't clear, the steps above should be done to the already transposed data. – chrisb Jul 25 '14 at 15:59
  • I tried that, for the sake of getting it right. I updated my question with a small part of my data and my code to reproduce the situation. What I finally get is data labelled by only the date. – Chris Jul 25 '14 at 16:57
  • Again, you need to do these steps after you transpose the data - in your edited example it's before. You also have a typo, `.column` instead of `.columns` – chrisb Jul 25 '14 at 17:07
  • Ah, your example said with s, but when I do that I get an error: ValueError: Length mismatch: Expected axis has 3 elements, new values have 1 elements. It wants to rename the row names.. – Chris Jul 25 '14 at 17:35
  • Hmm, did something wrong before apparently, thanks for the help this worked. – Chris Jul 29 '14 at 09:59