I have the following two dataframes.
`df1`:
ZIP code | Other columns |
---|---|
1011AA | ... |
1011AA | ... |
2316XH | ... |
5815NE | ... |
`df2`:
starting value ZIP code range | last value ZIP code range | Province |
---|---|---|
1000 | 1200 | North-Holland |
1201 | 1500 | South-Holland |
1501 | 1570 | North-Holland |
1571 | 1600 | Den Haag |
I want to:
- Get the first four digits of `df1["ZIP code"]`.
- Check whether these four digits fall within any range defined by `df2["starting value ZIP code range"]` and `df2["last value ZIP code range"]`.
- If there is a match, get `df2["Province"]` and add this value to a new column in `df1`.
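The three steps above could be sketched roughly as follows. This is only one possible approach, not a definitive one: it assumes the range columns hold integers, that both endpoints are inclusive, and that the ranges never overlap. It uses `pd.IntervalIndex` rather than `map`, and the sample frames are rebuilt from the tables in this post.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({"ZIP code": ["1011AA", "1011AA", "2316XH", "5815NE"]})
df2 = pd.DataFrame({
    "starting value ZIP code range": [1000, 1201, 1501, 1571],
    "last value ZIP code range": [1200, 1500, 1570, 1600],
    "Province": ["North-Holland", "South-Holland", "North-Holland", "Den Haag"],
})

# Step 1: first four characters of each ZIP code, converted to integers.
digits = df1["ZIP code"].str[:4].astype(int)

# Step 2: one interval per row of df2; closed="both" is an assumption
# that the start and end values both belong to the range.
intervals = pd.IntervalIndex.from_arrays(
    df2["starting value ZIP code range"],
    df2["last value ZIP code range"],
    closed="both",
)

# get_indexer gives the position of the containing interval, or -1 if none.
pos = intervals.get_indexer(digits.to_numpy())

# Step 3: look up the province by position; unmatched rows stay missing.
provinces = df2["Province"].to_numpy()
df1["New column"] = [provinces[p] if p != -1 else None for p in pos]
```

With only the four sample ranges shown in `df2`, the rows for `2316XH` and `5815NE` have no matching range and end up missing; they would be filled only if `df2` contains ranges that cover them.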
The difficulty is that I need to compare against a range of values, and I can only use the first four digits of the string. Most examples I found on Stack Overflow match on a single value. The desired result is:
ZIP code | New column |
---|---|
1011AA | North-Holland |
1011AA | North-Holland |
2316XH | Haarlem |
5815NE | Utrecht |
Bonus points if you can do it using `map`, for example `df1["New column"] = df1["ZIP code"].str[:4].map(... ? ...)`. However, if the `map` method is a bad idea, please suggest a better method.
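For reference, a `map`-based version might look something like the sketch below. It assumes integer, inclusive range bounds and expands every range into a plain dictionary keyed by four-digit code (the `lookup` name is just illustrative). That stays small here because there are at most 10,000 possible four-digit values, but it would scale poorly for much wider ranges.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df1 = pd.DataFrame({"ZIP code": ["1011AA", "1011AA", "2316XH", "5815NE"]})
df2 = pd.DataFrame({
    "starting value ZIP code range": [1000, 1201, 1501, 1571],
    "last value ZIP code range": [1200, 1500, 1570, 1600],
    "Province": ["North-Holland", "South-Holland", "North-Holland", "Den Haag"],
})

# Expand each inclusive range into individual four-digit keys.
# If ranges overlapped, later rows would silently overwrite earlier ones.
lookup = {
    code: province
    for start, end, province in zip(
        df2["starting value ZIP code range"],
        df2["last value ZIP code range"],
        df2["Province"],
    )
    for code in range(start, end + 1)
}

# map leaves NaN wherever the four digits are not a key in the dictionary.
df1["New column"] = df1["ZIP code"].str[:4].astype(int).map(lookup)
```

Codes that fall outside every range in `df2` come back as `NaN`, so a `.fillna(...)` could be chained on if a default label is wanted.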