0

I have four data frames with different different data (assume the column may be same in some cases for each of this data frame) .

data frame df0 df1 df3 df4

df0

amountC1    directionC1 index_priceC1   instrument_nameC1   ivC1    priceC1 timestampC1 trade_idC1  trade_seqC1
3   sell    6592.88 BTC-10APR20-7000-C  115.34  0.0675  26-03-2020 08:34    69925302    1
0.1 buy     6688.46 BTC-10APR20-7000-C  112.57  0.07    26-03-2020 17:03    69971504    2
10  sell    6806.04 BTC-10APR20-7000-C  114.33  0.077   27-03-2020 03:11    70020981    3
0.1 sell    6788.24 BTC-10APR20-7000-C  113.92  0.0755  27-03-2020 04:47    70027722    4
1.5 sell    6686.07 BTC-10APR20-7000-C  113.98  0.069   27-03-2020 05:20    70036646    5
0.5 buy     6708.57 BTC-10APR20-7000-C  105.29  0.0635  27-03-2020 08:04    70053020    6

df1

amountC2    directionC2 index_priceC2   instrument_nameC2   ivC2    priceC2 timestampC2 trade_idC2  trade_seqC1
3   sell    6592.88 BTC-10APR20-7200-C  110.34  0.0675  26-03-2020 08:39    69925302    1
0.1 buy     6688.46 BTC-10APR20-7200-C  112.57  0.07    26-03-2020 17:03    69971504    2
10  sell    6806.04 BTC-10APR20-7200-C  114.33  0.077   27-03-2020 03:11    70020981    3
0.1 sell    6788.24 BTC-10APR20-7200-C  110.92  0.0755  27-03-2020 04:47    70027722    4
1.5 sell    6686.07 BTC-10APR20-7200-C  113.98  0.069   27-03-2020 05:20    70036646    5
0.5 buy     6708.57 BTC-10APR20-7000-C  105.29  0.0635  27-03-2020 08:04    70053020    6
0.5 buy     6708.57 BTC-10APR20-7000-C  105.29  0.0635  27-03-2020 08:04    70053020    7

I wish to perform merge or join of this data frame column wise and below is the output i wish to get

amountC1    block_trade_idC1    directionC1 index_priceC1   instrument_nameC1   ivC1    priceC1 tick_directionC1    timestampC1 trade_idC1  trade_seqC1 amountC2    directionC2 index_priceC2   instrument_nameC2   ivC2    priceC2 tick_directionC2    timestampC2 trade_idC2  trade_seqC2
    3       sell    6592.88 BTC-10APR20-7000-C  115.34  0.0675  1   26-03-2020 08:34    69925302    1   5   sell    6607.04 BTC-10APR20-7250-C  116.75  0.057   1   46:41.0 69926125    1
    0.1     buy 6688.46 BTC-10APR20-7000-C  112.57  0.07    0   26-03-2020 17:03    69971504    2   0.1 buy 6685.48 BTC-10APR20-7250-C  112.7   0.057   1   03:31.7 69971444    2
    10      sell    6806.04 BTC-10APR20-7000-C  114.33  0.077   0   27-03-2020 03:11    70020981    3   0.2 sell    6708.99 BTC-10APR20-7250-C  104.17  0.05    2   22:40.0 70054437    3
    0.1     sell    6788.24 BTC-10APR20-7000-C  113.92  0.0755  2   27-03-2020 04:47    70027722    4   0.5 buy 6703.15 BTC-10APR20-7250-C  101.21  0.0475  2   27:01.8 70054899    4
    1.5     sell    6686.07 BTC-10APR20-7000-C  113.98  0.069   2   27-03-2020 05:20    70036646    5   0.5 sell    6709.54 BTC-10APR20-7250-C  94.8    0.043   2   42:11.6 70056479    5
    0.5     buy 6708.57 BTC-10APR20-7000-C  105.29  0.0635  2   27-03-2020 08:04    70053020    6   0.5 buy 6710.71 BTC-10APR20-7250-C  95.39   0.0435  0   42:47.1 70056546    6
    0.5     buy 6713.7  BTC-10APR20-7000-C  102.11  0.0615  2   20:06.1 70054217    7   0.5 buy 6699.43 BTC-10APR20-7250-C  90.78   0.0395  2   23:36.3 70059362    7
    0.2     sell    6704.46 BTC-10APR20-7000-C  102.2   0.061   2   27:02.5 70054901    8   0.5 buy 6699.43 BTC-10APR20-7250-C  90.78   0.0395  3   23:36.3 70059363    8
                                                0.5 buy 6691.95 BTC-10APR20-7250-C  89.92   0.0385  2   32:55.2 70059866    9
                                                0.5 buy 6697.59 BTC-10APR20-7250-C  90.27   0.039   0   35:42.4 70060036    10

I tried this methods

df = pd.concat([df0,df1,df3,df4], ignore_index=True)

df = [df0, df1, df3, df4]

Both adds the data frame one by one below.

I want the copy of each data frame column wise right side. No need to do any column match just merge or join .

How to accomplish this using pandas dataframe?

Please note the df1 df2 and other data frame data must be filled from last to first just to keep the recent data of all time frame has value. example df0 has total 800 records df1 has 250 and df2 has 200 records so in this has the df1 data must be filled from record number 550 t0 800 df2 data must be filled from record number 600 to 800 so that i will have data to plot the recent changes of price for all data frames

how to do this with the merge and join

Parfait
  • 104,375
  • 17
  • 94
  • 125
Marx Babu
  • 750
  • 3
  • 11
  • 34
  • 1
    Use pd.concat with axis=1 parameter. `pd.concat([df1, df2, df3], axis=1)` – Scott Boston Apr 04 '20 at 18:24
  • Hi Marx Babu! Checkout this SO thread for tips on improving pandas questions: [link](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). General idea: If you make it so we can copy-paste your DataFrames into a python script, it's much easier to answer your questions. Sometimes this means creating example DataFrames that are smaller and easier to work with. – luc Apr 04 '20 at 18:31
  • sure i will do now the requirment got updated – Marx Babu Apr 04 '20 at 19:00

2 Answers2

0

Use the axis parameter of pd.concat() to join along columns instead of rows:

import pandas as pd

df1 = pd.DataFrame({
    'Col 1': [1, 2, 3]
})

df2 = pd.DataFrame({
    'Col 2': [4, 5, 6]
})

pd.concat([df1, df2], axis=1)

Resulting Table:

   Col 1  Col 2
0      1      4
1      2      5
2      3      6

References:

luc
  • 526
  • 3
  • 6
0

In addition to luc's answer, you can also merge on the index of each. For example:

Create a sample dataframe df1

import pandas as pd
df1 = pd.DataFrame(columns=["c1","c1a","c1b"], data = [[1,2,3],[4,5,6],[7,8,9]])
print(df1)

# | c1 | c1a | c1b |
# |----|-----|-----|
# | 1  | 2   | 3   |
# | 4  | 5   | 6   |
# | 7  | 8   | 9   |

Create a sample dataframe df2:

df2 = pd.DataFrame(columns=["c2","c2a","c2b"], data = [["a","b","c"],["d","e","f"],["g","h","i"]])
print(df2)

# | c2 | c2a | c2b |
# |----|-----|-----|
# | a  | b   | c   |
# | d  | e   | f   |
# | g  | h   | i   |

You can just merge these tables like so:

df3 = df1.merge(df2, left_index=True, right_index=True)
print(df3)

# | c1 | c1a | c1b | c2 | c2a | c2b |
# |----|-----|-----|----|-----|-----|
# | 1  | 2   | 3   | a  | b   | c   |
# | 4  | 5   | 6   | d  | e   | f   |
# | 7  | 8   | 9   | g  | h   | i   |

Note that the columns are merged based on index, preserving the original order of both df1 and df2.

sc4s2cg
  • 153
  • 9