0

Here is the first dataframe:

df
                     open   high   low  close   volume         tpv   cum_volume       cum_tpv      vwap
timestamp                                                                                              
2021-02-04 04:20:00  1.41   1.41  1.40   1.40   1400.0    1967.000       1400.0  1.967000e+03  1.405000
2021-02-04 04:25:00  1.40   1.40  1.40   1.40   5510.0    7714.000       6910.0  9.681000e+03  1.401013
2021-02-04 04:30:00  1.40   2.70  1.40   2.70   8283.0   16980.150      15193.0  2.666115e+04  1.754831
2021-02-04 04:35:00  2.80   3.99  2.79   3.02  53546.0  181520.940      68739.0  2.081821e+05  3.028588
2021-02-04 04:40:00  3.14   3.15  2.37   2.38  51866.0  143150.160     120605.0  3.513322e+05  2.913082
...                   ...    ...   ...    ...      ...         ...          ...           ...       ...
2021-02-04 19:35:00  9.25   9.49  9.20   9.36  46380.0  433421.100  315556036.0  2.575511e+09  8.161819
2021-02-04 19:40:00  9.36   9.53  9.32   9.53  32975.0  310789.375  315589011.0  2.575822e+09  8.161951
2021-02-04 19:45:00  9.51   9.82  9.32   9.79  59682.0  571156.740  315648693.0  2.576393e+09  8.162218
2021-02-04 19:50:00  9.79   9.96  9.50   9.96  46546.0  452892.580  315695239.0  2.576846e+09  8.162449
2021-02-04 19:55:00  9.85  10.09  9.76  10.09  89898.0  892237.650  315785137.0  2.577738e+09  8.162951

[188 rows x 9 columns]

Here is the second dataframe:

df2
                         SMA
timestamp                   
2021-02-04 04:40:00   2.0635
2021-02-04 04:41:00   2.1355
2021-02-04 04:42:00   2.1970
2021-02-04 04:43:00   2.2590
2021-02-04 04:44:00   2.3080
...                      ...
2021-02-04 18:56:00  10.2080
2021-02-04 18:57:00  10.1705
2021-02-04 18:58:00  10.1205
2021-02-04 18:59:00  10.0665
2021-02-04 19:00:00  10.0210

[861 rows x 1 columns]

I want to add the SMA column from df2 to df only if the index exist in both df and df2. Otherwise, place the SMA value to nan in df. The indexes from df will never be deleted. How can I do that?

I have tried this line merged_df = pd.merge(df, df2, left_index=True, right_index=True, how='inner'), but I got this result:

                      open   high    low  close    volume          tpv   cum_volume       cum_tpv      vwap      SMA
timestamp                                                                                                           
2021-02-04 04:40:00   3.14   3.15   2.37   2.38   51866.0   143150.160     120605.0  3.513322e+05  2.913082   2.0635
2021-02-04 04:45:00   2.50   2.70   2.50   2.63   24269.0    63099.400     144874.0  4.144317e+05  2.860635   2.3655
2021-02-04 04:50:00   2.56   2.58   2.42   2.57   11849.0    29622.500     156723.0  4.440542e+05  2.833369   2.6650
2021-02-04 04:55:00   2.57   2.58   2.42   2.50   16916.0    42290.000     173639.0  4.863442e+05  2.800892   2.7280
2021-02-04 05:00:00   2.54   2.60   2.52   2.54   10899.0    27901.440     184538.0  5.142456e+05  2.786665   2.5665
...                    ...    ...    ...    ...       ...          ...          ...           ...       ...      ...
2021-02-04 18:40:00  10.89  10.97  10.10  10.33  105233.0  1108629.655  314656226.0  2.566818e+09  8.157530  11.4100
2021-02-04 18:45:00  10.30  10.47   9.50   9.74  150192.0  1499667.120  314806418.0  2.568317e+09  8.158402  11.0620
2021-02-04 18:50:00   9.64   9.95   9.44   9.78  158926.0  1540787.570  314965344.0  2.569858e+09  8.159177  10.5835
2021-02-04 18:55:00   9.80  10.12   9.65   9.80   91473.0   904210.605  315056817.0  2.570762e+09  8.159678  10.2565
2021-02-04 19:00:00   9.80   9.86   9.60   9.81   67126.0   653135.980  315123943.0  2.571415e+09  8.160013  10.0210

As you can see it misses at least from the beginning of df. This solution does not work.

David
  • 13
  • 5

0 Answers0