0

I have pandas dateframe like this:

import pandas as pd
price_data=pd.read_csv('data.csv')  
9   CHART_EQUITY    1585151291564   SUBS    seq chart-sequence  2624
10  CHART_EQUITY    1585151291564   SUBS    key symbol  SPY
11  CHART_EQUITY    1585151291564   SUBS    1   chart-time  244.7099
12  CHART_EQUITY    1585151291564   SUBS    2   open-price  245.14
13  CHART_EQUITY    1585151291564   SUBS    3   high-price  244.35
14  CHART_EQUITY    1585151291564   SUBS    4   low-price   244.97
15  CHART_EQUITY    1585151291564   SUBS    5   close-price 193629.0
16  CHART_EQUITY    1585151291564   SUBS    6   volume  287
17  CHART_EQUITY    1585151291564   SUBS    7   chart-time  1585151220000
18  CHART_EQUITY    1585151291564   SUBS    8   chart-day   18346

1585151291564 is the timestamp,I want to use timestamp as an index and convert the data into this column format :

timestamp      open-price   high-price   low-price   close-price volume
1585151291564  245.14       244.35       244.97      244.97

Any friend can help?

Gamopo
  • 1,600
  • 1
  • 14
  • 22
williamfaith
  • 247
  • 2
  • 4
  • 9
  • df.index = df['timestamp column'] will change the index.. Rename columns with df.columns = ['timestamp',open-price',.......] will change column names – Quantum Dreamer Mar 25 '20 at 17:05
  • Does this answer your question? [How to pivot a dataframe](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – G. Anderson Mar 25 '20 at 17:12
  • Can you clarify what exactly the issue is? Please see [ask], [help/on-topic]. – AMC Mar 25 '20 at 17:44

1 Answers1

0

For the future, you should make sure to include a minimum working example. People trying to help you can't immediately run your code and get the exact dataframe you are working with. But I tried to recreate it below and used a pivot_table, as @G. Anderson suggested, to make the transformation I think you are asking for.

import pandas as pd
import numpy as np

data = np.array(
      [['open-price', 1, 245.14],
       ['high-price', 1, 244.35],
       ['low-price', 1, 244.97],
       ['close-price', 1, 244.97],
       ['open-price', 2, 246.14],
       ['high-price', 2, 245.35],
       ['low-price', 2, 245.97],
       ['close-price', 2, 245.97]]
)

df = pd.DataFrame(data, columns=['price-label', 'timestamp', 'price'])
df['timestamp'] = df['timestamp'].astype('float')
df['price'] = df['price'].astype('float')

df

   price-label  timestamp   price
0   open-price        1.0  245.14
1   high-price        1.0  244.35
2    low-price        1.0  244.97
3  close-price        1.0  244.97
4   open-price        2.0  246.14
5   high-price        2.0  245.35
6    low-price        2.0  245.97
7  close-price        2.0  245.97

newdf = pd.pivot_table(df, index='timestamp', columns='price-label')

newdf

                  price                                
price-label close-price high-price low-price open-price
timestamp                                              
1.0              244.97     244.35    244.97     245.14
2.0              245.97     245.35    245.97     246.14

To get rid of the MultiIndex on the columns you can drop the first level.

newdf = newdf.droplevel(axis=1, level=0)

newdf
price-label  close-price  high-price  low-price  open-price
timestamp                                                  
1.0               244.97      244.35     244.97      245.14
2.0               245.97      245.35     245.97      246.14
jtorca
  • 1,531
  • 2
  • 17
  • 31