Python - Create new column using dataframe lookup

Asked Nov 06 '22 at 17:25

Active Nov 06 '22 at 17:29


I have 5 million records with an address field in df['address'], and information on the top 1000 UK cities in cities[['City', 'Region', 'Population']].

I want to add new columns to df for region and population. My current approach is too slow: running a string-contains check for each of the 1000 cities against all 5 million records means 5 billion boolean evaluations.

Is there a faster way to do this lookup?

This code works but is very slow.

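import numpy as np

df['Region'], df['Pop'] = np.nan, np.nan

# For each city, flag the addresses that contain its name and copy in Region and Population
for i in range(len(cities)):
    mask = df['address'].str.contains(cities.iloc[i, 0], regex=False)
    df.loc[mask, ['Region', 'Pop']] = cities.iloc[i, [1, 2]].values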
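One possible direction, sketched here under assumptions rather than offered as a tested solution for the real data: combine all city names into a single regular expression, extract the matching city from each address in one pass, then look up the region and population by city name. The sample rows and population figures below are made-up placeholders, and the helper names (`names`, `pattern`, `lookup`) are hypothetical. Note one behavioural difference: this keeps the first city found in each address, whereas the loop keeps the last matching city in `cities` order.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the real frames
# (the population figures are placeholders, not real values)
df = pd.DataFrame({"address": [
    "10 High Street, Leeds LS1 4AP",
    "22 King Road, Bristol BS1 5TT",
    "1 Farm Lane, Nowhere",
]})
cities = pd.DataFrame({
    "City": ["Leeds", "Bristol"],
    "Region": ["Yorkshire and the Humber", "South West"],
    "Population": [100, 200],
})

# Build one alternation pattern; longer names go first so a longer
# city name is preferred over a shorter one that is its prefix
names = sorted(cities["City"], key=len, reverse=True)
pattern = "(" + "|".join(re.escape(n) for n in names) + ")"

# Single pass over the addresses: first matching city name, or NaN
df["City"] = df["address"].str.extract(pattern, expand=False)

# Look up Region and Population by the extracted city name
lookup = cities.drop_duplicates("City").set_index("City")
df["Region"] = df["City"].map(lookup["Region"])
df["Pop"] = df["City"].map(lookup["Population"])
```

This scans each address once instead of once per city. Matching is still case-sensitive, as in the loop, and a plain substring match can still hit a city name inside a street name, so the result may need checking against the real addresses.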

edited Nov 06 '22 at 17:29

mozway


asked Nov 06 '22 at 17:25

Robbie G


Please provide a reproducible sample of both dataframes. – mozway Nov 06 '22 at 17:27

0 Answers