
I have two dataframes, sdf and df. df has several columns with information about measuring stations, and one of those columns holds the respective address. sdf also has a column with addresses. I want to find out which addresses appear in both sdf and df. However, the addresses are written differently (sometimes the street name is written out, sometimes it is abbreviated). That's why I wanted to compare the first 5 characters of the street name and, where they match, create a new column with the match. Problem:

ValueError: Can only compare identically-labelled Series objects

I tried many different things, but I always get the same error or it doesn't work at all:

df['Adresses_compared'] = (df['adress'].str[0:5]==sdf['adress_c'].str[0:5]) 
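For context, here is a minimal sketch of why a comparison like the one above raises this error: `==` between two Series requires both to carry the same index labels, so two columns of different lengths cannot be compared element by element. The sample addresses are made up for illustration, since the real data is not shown; `isin` is one possible alternative, not necessarily what the asker needs.

```python
import pandas as pd

# Hypothetical sample data; the real frames are not shown in the question.
df = pd.DataFrame({'adress': ['Main Street 1', 'Park Avenue 7']})
sdf = pd.DataFrame({'adress_c': ['Main St. 1', 'Oak Road 3', 'Park Ave 7']})

# The two Series have different index labels (2 rows vs. 3 rows),
# so the element-wise comparison fails.
try:
    df['adress'].str[0:5] == sdf['adress_c'].str[0:5]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Membership test instead of element-wise comparison:
# isin() checks each prefix in df against all prefixes in sdf.
prefix_in_sdf = df['adress'].str[:5].isin(sdf['adress_c'].str[:5])
print(prefix_in_sdf.tolist())  # [True, True]
```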

I also tried it with mask:

df['Adresses_compared']= sdf.mask(sdf['adress_c']== df1.loc[0:5, 'adress'])

In the end, I want to find a kind of intersection of the addresses that occur in both data frames.
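One way such an intersection could be sketched is an inner merge on a helper column holding the first 5 characters. The sample data, the `station` column, and the helper column name `prefix` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data; column 'station' is invented for the example.
df = pd.DataFrame({
    'station': ['S1', 'S2', 'S3'],
    'adress': ['Main Street 1', 'Park Avenue 7', 'Hill Lane 2'],
})
sdf = pd.DataFrame({'adress_c': ['Main St. 1', 'Oak Road 3', 'Park Ave 7']})

# Add a helper key with the first 5 characters of each address.
left = df.assign(prefix=df['adress'].str[:5])
right = sdf.assign(prefix=sdf['adress_c'].str[:5])

# An inner merge keeps only rows whose prefix occurs in both frames.
matches = left.merge(right, on='prefix', how='inner')
print(matches[['station', 'adress', 'adress_c']])
```

A 5-character prefix is a coarse key, so different streets that share their first 5 characters would also be paired; the result may need a manual check.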

Pawel Kam
Dilo
  • Please provide a sample dataset; it would also be good if you added the desired output. – R. Baraiya Feb 24 '23 at 15:17
  • Does this answer your question? [Error"Can only compare identically-labeled Series objects" and sort\_index](https://stackoverflow.com/questions/44773017/errorcan-only-compare-identically-labeled-series-objects-and-sort-index) – Pranav Hosangadi Feb 24 '23 at 15:51
  • Thank you, but no. I think my problem was that I wanted to compare over every column instead of going over every row. – Dilo Feb 27 '23 at 14:33

1 Answer


The question is not entirely clear, but I am guessing the issue with your code is that you try to compare whole columns at once instead of going through each row of both dataframes.

Code:

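import pandas as pd

# Small example frames standing in for df and sdf
df1 = pd.DataFrame({'adress': ['AAAA A']})
df2 = pd.DataFrame({'adress_c': ['AAAA BBB ASA', 'YF832', 'A']})

# For each address in df1, collect every df2 address with the same first 5 characters
candidates = df2['adress_c'].tolist()
df1['Adresses_compared'] = df1['adress'].apply(lambda r: [c for c in candidates if r[:5] == c[:5]])

print(df1['Adresses_compared'])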

Output:

0    [AAAA BBB ASA]
Name: Adresses_compared, dtype: object

Note: Each value in the new column is a list of the matching addresses, which you can then filter further.
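As a rough sketch of that follow-up filtering, rows without any match can be dropped and the lists expanded to one row per match with `explode`. The extra sample rows and the variable names `matched` and `flat` are assumptions added for illustration.

```python
import pandas as pd

# Example frames extended with one non-matching and one extra matching row
df1 = pd.DataFrame({'adress': ['AAAA A', 'ZZZZZ 9']})
df2 = pd.DataFrame({'adress_c': ['AAAA BBB ASA', 'YF832', 'A', 'AAAA CC']})

# Same list-building step as in the answer
candidates = df2['adress_c'].tolist()
df1['Adresses_compared'] = df1['adress'].apply(
    lambda r: [c for c in candidates if r[:5] == c[:5]]
)

# Keep only rows whose list contains at least one match
matched = df1[df1['Adresses_compared'].map(len) > 0]

# Expand each list so that every match gets its own row
flat = matched.explode('Adresses_compared').reset_index(drop=True)
print(flat)
```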

R. Baraiya
  • Thank you, that actually helped and now the code is working. Yes, I was also writing a loop so that the code goes over every row, but it didn't work. I am kind of new to Python. Thanks for your help. – Dilo Feb 27 '23 at 14:32
  • Hi @Dilo, would you like to add the code you tried and the error to your question? – R. Baraiya Feb 27 '23 at 14:38