0

I have two csv files with data. They both have a common column (county). The first file just has the counties while the second file has counties together with their population. I have a script which I thought would be able to create a new column of population in the first file. Note that the order of the counties in both files is totally different.

File one:

Id County
1 Nairobi
2 Mombasa
3 Kisumu
4 Nakuru

File two:

Id County Population
1 Kisumu 1,250,200
2 Nairobi 4,560,700
3 Nakuru 2,673,800
4 Mombasa 3,167,900

I wanted to create a new column in the first table as Population and parse through the second table and pick population of each county, like in the table below.

Id County Population
1 Nairobi 4,560,700
2 Mombasa 3,167,900
3 Kisumu 1,250,200
4 Nakuru 2,673,800

Below is my code, I got a bit confused on how to execute that. Please help.


data = pd.read_csv('counties.csv');
county_names = data['COUNTY']

ref_data = pd.read_csv('kenya-population-by-sex-and-county.csv', skiprows=8, header=None)
ref_data.columns = ['County', 'Male', 'Female', 'Intersex', 'Total']

list_count = []
for item in county_names.tolist():
    compare = ref_data['County'].tolist()
    pop = ref_data['Total']
    if item in compare:
        list_count.append(item)
        pop
    else:
        print(item + " is not in list")```
Henry Yik
  • 22,275
  • 4
  • 18
  • 40
Steel8
  • 143
  • 1
  • 11

1 Answers1

-1

You can simply create a pandas dataframe containing County and Population and merge it with the first dataframe where only County was there. Also there are many joining options available to fulfill your different needs.