I'm trying to create a new data frame that uses the values in the vector 'cities' as rows and the values in 'countries' as columns. Then, I want to pull occurrences from a different dataset (here called 'IPUMS') to find the weighted sum of occurrences where the city matches the row and bpld matches the column. Finally, I want to divide that by the weighted total of occurrences where bpld matches the column.
Basically, I want to find the percentage of people with birthplace (bpld) j who live in city i, and put it all in a data frame.
Here's what I have, but I keep getting the following error:
Error in `[<-.data.frame`(`*tmp*`, i, j, value = 0.0182916205267204) :
new columns would leave holes after existing columns
If you have any suggestions, or know a better way to do this, please let me know!
#create vectors of cities and countries
cities <- unique(IPUMS$city)
countries <- unique(IPUMS$bpld)
new_dataframe <- data.frame(matrix(nrow = length(cities), ncol = length(countries)), row.names = cities)
colnames(new_dataframe) <- countries
for (i in cities) {
  for (j in countries) {
    new_dataframe[i, j] <- sum(IPUMS$perwtadj[IPUMS$city == i & IPUMS$bpld == j]) /
      sum(IPUMS$perwtadj[IPUMS$bpld == j])
  }
}
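
One possible cause, assuming city and bpld are stored as numeric codes (I haven't confirmed this for this extract): inside the loop, i and j are then numbers, so new_dataframe[i, j] indexes by position rather than by name, and assigning to a column position past the last column would produce the "leave holes" error. Wrapping the indices as as.character(i) and as.character(j) may be enough to fix the loop. Below is a minimal sketch of a loop-free alternative using xtabs() and prop.table(). The toy IPUMS data frame, its codes, and its weights are made up for illustration and only mimic the column names used above.

```r
# Toy stand-in for the IPUMS extract; the numeric codes and weights are invented.
IPUMS <- data.frame(
  city     = c(10, 10, 20, 20, 20),
  bpld     = c(100, 200, 100, 200, 200),
  perwtadj = c(1, 2, 3, 4, 4)
)

# Weighted sum of perwtadj for every city (rows) by bpld (columns) pair.
weighted_counts <- xtabs(perwtadj ~ city + bpld, data = IPUMS)

# Divide each column by its total, giving the share of each
# birthplace group that lives in each city.
shares <- prop.table(weighted_counts, margin = 2)

# Convert the table to a plain data frame: cities as row names,
# bpld codes as column names.
new_dataframe <- as.data.frame.matrix(shares)
```

Because the row and column names are character strings here, a lookup such as new_dataframe["10", "100"] goes by name, and each column should sum to 1.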