I have a dataframe that looks like this, where "Date" is set as the index:

            A B C D E
Date  
1999-01-01  1 2 3 4 5
1999-01-02  1 2 3 4 5 
1999-01-03  1 2 3 4 5 
1999-01-04  1 2 3 4 5 

I'm trying to compare the percent difference between two pairs of dates. I think I can do the first bit:

start_1 = "1999-01-02"
end_1 = "1999-01-03"

start_2 = "1999-01-03"
end_2 = "1999-01-04"

Obs_1 = df.loc[end_1] / df.loc[start_1] - 1
Obs_2 = df.loc[end_2] / df.loc[start_2] - 1

The output I get from, e.g., Obs_1 looks like this:

A    0.011197
B    0.007933
C    0.012850
D    0.016678
E    0.007330
dtype: float64

I'm looking to build some correlations between Obs_1 and Obs_2. I think I need to create a new dataframe with the labels A-E as one column (or as the index), and then the data series from Obs_1 and Obs_2 as adjacent columns.

But I'm struggling! I can't 'see' what Obs_1 and Obs_2 'are' - have I created a list? A series? How can I tell? What would be the best way of combining the two into a single dataframe...say df_1.

I'm sure the answer is staring me in the face but I'm going mental trying to figure it out...and because I'm not quite sure what Obs_1 and Obs_2 'are', it's hard to search the SO archive to help me.

Thanks in advance

harrison10001
  • Does `pd.concat([Obs_1, Obs_2], axis=1)` do what you're looking for? And try `type(Obs_1)` to see what they are. – Ben.T Mar 23 '20 at 18:33
  • Thanks Ben - that's exactly what I was looking for. Brilliant!! – harrison10001 Mar 23 '20 at 18:44
  • Does this answer your question? [Combining two Series into a DataFrame in pandas](https://stackoverflow.com/questions/18062135/combining-two-series-into-a-dataframe-in-pandas) – Ben.T Mar 23 '20 at 18:48
  • That is most helpful - thank you. Just as a follow up...I set my start_2 equal to end_1 to see if that would be a more efficient way of running the code...but even after I set it back to the date in the original question, Obs_1 and Obs_2 are now being returned as dataframes not as a series. What have I done wrong? Thanks! – harrison10001 Mar 23 '20 at 18:54
  • I can't reproduce that behavior, so I'm not sure. But if you plan to do a lot of operations like this on consecutive rows, then `df.shift(-1)/df - 1` will do all of them at once, I think. – Ben.T Mar 23 '20 at 19:04
  • Fair enough. Thanks very much! – harrison10001 Mar 23 '20 at 19:06
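
For anyone landing here later, the suggestions in the comments above can be put together roughly as follows. This is only a sketch: the numbers in `df` are made up for illustration (the question's real data isn't shown), the index is assumed to be a daily `DatetimeIndex`, and the column labels `"Obs_1"`/`"Obs_2"` passed via `keys` are just a naming choice.

```python
import pandas as pd

# Hypothetical data standing in for the question's dataframe.
df = pd.DataFrame(
    {
        "A": [100.0, 101.0, 103.0, 102.0],
        "B": [50.0, 51.0, 50.5, 52.0],
        "C": [20.0, 20.5, 21.0, 20.0],
        "D": [10.0, 10.2, 10.1, 10.4],
        "E": [5.0, 5.1, 5.3, 5.2],
    },
    index=pd.to_datetime(
        ["1999-01-01", "1999-01-02", "1999-01-03", "1999-01-04"]
    ),
)
df.index.name = "Date"

start_1, end_1 = "1999-01-02", "1999-01-03"
start_2, end_2 = "1999-01-03", "1999-01-04"

# Selecting a single row with .loc gives a Series indexed by column label,
# so the ratio of two rows is a Series as well.
Obs_1 = df.loc[end_1] / df.loc[start_1] - 1
Obs_2 = df.loc[end_2] / df.loc[start_2] - 1
print(type(Obs_1))

# Put the two Series side by side; `keys` supplies the column names.
df_1 = pd.concat([Obs_1, Obs_2], axis=1, keys=["Obs_1", "Obs_2"])
print(df_1)

# Pearson correlation between the two sets of changes.
corr = df_1["Obs_1"].corr(df_1["Obs_2"])
print(corr)

# Changes between every pair of consecutive rows at once; the last row
# is NaN because there is no following row to compare against.
all_changes = df.shift(-1) / df - 1
print(all_changes)
```

Note that with `df.shift(-1) / df - 1`, the row labelled with a given date holds the change from that date to the next one, so the row for `start_1` here matches `Obs_1`.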

0 Answers