I am working with a large dataset and had to drop some rows during cleaning, so the index is no longer consecutive. Now I have:

        A
2       2
5       4
7       5
8       6
17      6
21      8

Regardless of what column A means, I had to split it and transform it. In the end, I had two variables resulting from those operations (A_case1, A_case2), where:

print(A_case1)

2    4
7    2
17   3
21   2

print(A_case2)

5   2
8   1

Now I want to merge these two variables and join the result to the original dataframe, so that the final result is:

        A   A_case1_Case2
2       2   4
5       4   2
7       5   2
8       6   1
17      6   3
21      8   2

I have already tried pd.concat, but I could not join the result to the dataframe. Can anyone help me, please?

Alex
bonaqua

1 Answer

First merge the new dataframes A_case1 and A_case2 into the original one (df):

merged = df.merge(A_case1, left_index=True, right_index=True, how='left').merge(A_case2, left_index=True, right_index=True, how='left')

Then create your new column A_case1_case2 by joining the two intermediate ones A_case1 and A_case2:

merged['A_case1_case2'] = merged[['A_case1', 'A_case2']].apply(lambda x: ''.join(x.dropna().astype(int).astype(str)), axis=1)

And finally drop the intermediate columns A_case1 and A_case2:

merged = merged.drop(['A_case1', 'A_case2'], axis=1)
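
For reference, here is a self-contained sketch of the three steps above, run on the sample data from the question. It assumes that A_case1 and A_case2 are single-column DataFrames whose columns are named A_case1 and A_case2. The question does not show how they were built, so that construction is a guess.

```python
import pandas as pd

# Sample data copied from the question; the index has gaps after cleaning.
df = pd.DataFrame({"A": [2, 4, 5, 6, 6, 8]}, index=[2, 5, 7, 8, 17, 21])

# Assumption: both results are single-column DataFrames with these column names.
A_case1 = pd.DataFrame({"A_case1": [4, 2, 3, 2]}, index=[2, 7, 17, 21])
A_case2 = pd.DataFrame({"A_case2": [2, 1]}, index=[5, 8])

# Step 1: left-join both results onto df by index; missing rows become NaN.
merged = (
    df.merge(A_case1, left_index=True, right_index=True, how="left")
      .merge(A_case2, left_index=True, right_index=True, how="left")
)

# Step 2: per row, keep whichever value is present and turn it into a string.
merged["A_case1_case2"] = merged[["A_case1", "A_case2"]].apply(
    lambda x: "".join(x.dropna().astype(int).astype(str)), axis=1
)

# Step 3: drop the intermediate columns.
merged = merged.drop(columns=["A_case1", "A_case2"])
print(merged)
```

Note that the new column holds strings here, because the values are joined with ''.join; convert with astype(int) afterwards if numbers are needed.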
Gerd
  • When I apply the first line of code you wrote, it shows the error "can not merge DataFrame with instance of type " – bonaqua Mar 31 '20 at 19:41
  • Your data seems not to be a dataframe, but a series. Take a look at [this question](https://stackoverflow.com/questions/37968785/merging-two-dataframes) to resolve this. – Gerd Mar 31 '20 at 19:51
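
If A_case1 and A_case2 are indeed Series, as the comment above suggests, one possible approach is sketched below. It assumes the two Series have disjoint indices, as in the sample data in the question, so they can simply be concatenated and assigned to the dataframe, which aligns them by index. The sample values are copied from the question and the column names are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"A": [2, 4, 5, 6, 6, 8]}, index=[2, 5, 7, 8, 17, 21])

# Assumption: the two results are plain Series with non-overlapping indices.
A_case1 = pd.Series([4, 2, 3, 2], index=[2, 7, 17, 21])
A_case2 = pd.Series([2, 1], index=[5, 8])

# Stack the two Series; assignment to a column aligns on the index labels,
# so the order of the concatenated values does not matter.
combined = pd.concat([A_case1, A_case2])
df["A_case1_case2"] = combined
print(df)

# Alternatively, give a Series a column name so it can be passed to merge.
A_case1_frame = A_case1.to_frame("A_case1")
merged = df[["A"]].merge(A_case1_frame, left_index=True, right_index=True, how="left")
```

If the indices could overlap, the assignment would fail on the duplicate labels, so this shortcut only fits the disjoint case.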