
I have a DataFrame with repeating ids:

   code_contract    days_may19
       1                8
       2                9
       3                7

       1                8
       2                9
       3                7

       1                8
       2                9
       3                7

I want to add a column from another DataFrame, 'jun19', but only its code_contract ids 1, 2 and 3 match the DataFrame above:

code_contract   days_jun_19
  1                9
  2                6
  3                56
  4                34
  5                12

The end result should be:

code_contract    days_may19      days_jun19
           1                8          9
           2                9          6
           3                7          56

           1                8          9
           2                9          6
           3                7          56

           1                8          9
           2                9          6
           3                7          56

How do I join them?

My code: the final report has 89420 rows, 5 months stacked under each other. Each month has 17884 rows, so 17884 * 5 = 89420. Each code_contract repeats in every month, and I need to keep this DataFrame with repeating ids. I also have another DataFrame, 'jun19'.

The jun19 DataFrame has 17884 rows, one per contract id, and one column, 'max days as per jun19'. That is the column I'm trying to add.

The problem is that this column gets mapped only to the first 17884 rows, and all the other ~71k rows get NaNs instead of the same values.

final_report = pd.concat([jan19, feb19, march19, april19, may19], axis=0,ignore_index=True)
col_exp = final_report.filter(like='Макс').columns
final_report[col_exp] = final_report.groupby('code_contract')[col_exp].transform('first')

The code I use to add the jun19 column:

final_report = pd.concat([final_report, jun19], axis=1)
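
A minimal sketch of what this line appears to do, using the small example tables from the top of the question as stand-ins for the real frames (the toy frames and the name `side_by_side` are assumptions for illustration). `pd.concat` with `axis=1` aligns rows by index label, not by `code_contract`, so only the index labels that exist in `jun19` receive values:

```python
import pandas as pd

# Toy stand-ins for the real data (values copied from the example tables above).
may19 = pd.DataFrame({"code_contract": [1, 2, 3], "days_may19": [8, 9, 7]})
final_report = pd.concat([may19, may19, may19], axis=0, ignore_index=True)  # index 0..8

jun19 = pd.DataFrame(
    {"code_contract": [1, 2, 3, 4, 5], "days_jun_19": [9, 6, 56, 34, 12]}
)  # index 0..4

# axis=1 places the frames side by side and matches rows on the index.
# Rows 5..8 of final_report have no partner in jun19, so they get NaN.
side_by_side = pd.concat([final_report, jun19], axis=1)
print(side_by_side)
```

In this toy case the rows that do get a value are paired by position rather than by contract id, and `code_contract` ends up duplicated as a column. That matches the symptom described above: the first block of rows is filled and the rest is NaN.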
ERJAN
  • If you need only one new column, `map` works nicely; if you need more columns, use `merge` with a left join. – jezrael Jun 08 '20 at 10:24
  • @jezrael hi again jezrael.. the problem is that not all code contract ids get filled. Only the first 3 are filled; the rest are NaN – ERJAN Jun 08 '20 at 10:29
  • Is it possible to add your code to the question? Also, is `code_contract` the index or a column? – jezrael Jun 08 '20 at 10:30
  • @jezrael, yes, I will add the code. code_contract is a column, not the index – ERJAN Jun 08 '20 at 10:32
  • @jezrael I have added the code, sir – ERJAN Jun 08 '20 at 10:37
  • Thank you for the code. So it is not working with the data in the question? Because I think this should work nicely. – jezrael Jun 08 '20 at 10:42
  • @jezrael, no, it's not working, I don't know how to add only 1 column from the jun19 df – ERJAN Jun 08 '20 at 10:44
  • One idea: how does `final_report = pd.concat([jan19.reset_index(drop=True), feb19.reset_index(drop=True), march19.reset_index(drop=True), april19.reset_index(drop=True), may19.reset_index(drop=True)], axis=0,ignore_index=True)` work instead of your first line of code? – jezrael Jun 08 '20 at 10:46
  • The final report is OK, it's how it should be: repeating contract ids. I just need to add 1 column from jun19 – ERJAN Jun 08 '20 at 10:47
  • Without data that reproduces the problem, it is hard for me to find what is wrong :( – jezrael Jun 08 '20 at 10:58
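
The `map` and left-join `merge` suggestions from the first comment could look roughly like the sketch below. It is not a confirmed fix for the real data: the toy frames reuse the example tables from the question, and the names `lookup`, `mapped` and `merged` are made up for illustration. The `map` variant assumes each `code_contract` appears only once in `jun19`.

```python
import pandas as pd

# Toy stand-ins for the real data (values copied from the question's example tables).
may19 = pd.DataFrame({"code_contract": [1, 2, 3], "days_may19": [8, 9, 7]})
final_report = pd.concat([may19, may19, may19], axis=0, ignore_index=True)
jun19 = pd.DataFrame(
    {"code_contract": [1, 2, 3, 4, 5], "days_jun_19": [9, 6, 56, 34, 12]}
)

# Option 1: one new column via map.
# Build a Series keyed by code_contract, then look up every row's id in it.
lookup = jun19.set_index("code_contract")["days_jun_19"]
mapped = final_report.assign(days_jun19=final_report["code_contract"].map(lookup))

# Option 2: left join on the key column (also works for several new columns).
# how="left" keeps every row of final_report; ids 4 and 5 from jun19 are dropped.
merged = final_report.merge(jun19, on="code_contract", how="left")

print(mapped)
print(merged)
```

Both variants match on the values of `code_contract` instead of the row index, so every repeated id gets the same June value. If a contract id in `final_report` is missing from `jun19`, or the key columns have different dtypes (for example strings in one frame and integers in the other), those rows would still come out as NaN.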
