-1

I have two very large DataFrames which I have saved as a .csv file. I want to compare the column 'cell_code' from both files and create a third csv file which contains only the rows where the column 'cell_code' has the same value. Here is just a small snippet of the DataFrames.

First:

       technology    tac  cell_code        lon        lat
8298          LTE   6710    1924499  12.157236  47.586310
34566         LTE   4106     438273  16.541019  48.133775
34567         LTE   4106     835847  16.052141  48.077284
34640         LTE  17006     302595  16.389292  48.229125
34657         LTE  17007     238850  16.413752  48.189886

Second:

      technology      tac  cell_code        lon        lat
0            LTE     7602      56541  16.529176  47.834004
1            LTE     7602      56542  16.529176  47.834004
2            LTE     7602      302595 16.529176  47.834004
3            LTE     7602      18440  16.926798  47.838448
4            LTE     7602      438273 16.926798  47.838448

You see the first and the second df have the same 'cell_code' 438273 and 302595.

Expected result:

     technology    cell_code   tac_1    tac_2       lon_1       lon_2       lat_1      lat_2
1           LTE       438273    4106     7602   16.541019   16.926798   48.133775  47.838448
2           LTE       302595   17006     7602   16.389292   16.529176   48.229125  47.834004
    

Basically I need to make a third DataFrame which I will export later on, and it should be in this format. When it finds a matching 'cell_code' the tac, lon, and lat of both files should be in the same row.

Davidoff
  • 21
  • 5

1 Answers1

1
df1.merge(df2, on=['technology' 'cell_code'], suffixes=('_1', '_2'))
Tom Ron
  • 5,906
  • 3
  • 22
  • 38