0

I have two data frames. First, saves coordinates and places

data = [['xx1', 1,10,'Madrid'], ['xx2',15,20,'London']]
df1 = pd.DataFrame(data, columns = ['precode', 'start', 'stop','city'])

The second

data2 = [['xx1', 3,11,'location1',], ['xx2',9,17,'location2'],['xx2',21,30,'location3'],['xx5',1,19,'location4']]
df2 = pd.DataFrame(data2, columns = ['precode', 'start', 'stop','locations'],)

I want to check if locations are in city. If so add in df2 the name of the cities

precode  start   stop     location       city
xx1      3       11       location1      Madrid
xx2      9       17       location2      Madrid;London # If more than one city affected, add as many as cities affected as found
xx2      21      30       location3      NaN   
xx5      1       19       location4      NaN

With the help of @Corralien I got this perfect solution from this post

df2['city'] = (
    df2.loc[df2['precode'].isin(df1['precode'])]   
       .reset_index().merge(df1, how='cross', suffixes=('', '_'))
       .query('start.between(start_, stop_) | stop.between(start_, stop_)')
       .groupby('index')['city'].apply(list).str.join(';')
)

However, In the cluster I am working, the last version they have is 1.5, so I can't use how=cross.

I have been working around and looking at other solutions but they are not exactly what I need.

  • Try `df2['city'] = ( df2.loc[df2['precode'].isin(df1['precode'])] .reset_index().assign(tmp=1).merge(df1.assign(tmp=1), on='tmp', suffixes=('', '_')).drop('tmp', 1) .query('start.between(start_, stop_) | stop.between(start_, stop_)') .groupby('index')['city'].apply(list).str.join(';') )` – jezrael Feb 11 '22 at 07:33
  • 1
    Busy at the moment, but I will try your suggestion and I will come back to you. Many thanks by the way! – Manolo Dominguez Becerra Feb 11 '22 at 09:15

0 Answers0