
I'm trying to create a new data frame that takes the values in the list 'cities' as rows and the values in 'countries' as columns. Then I want to pull occurrences from a different dataset (here called 'IPUMS') to find the weighted sum of occurrences where IPUMS$city equals the given city and IPUMS$bpld equals the given country. Finally, I want to divide that by the total weighted occurrences where bpld equals that country.

Basically, I want to find the percentage of people with birthplace (bpld) i who live in city j, and put it all in a dataframe.

Here's what I have, but I keep getting the following error:

Error in `[<-.data.frame`(`*tmp*`, i, j, value = 0.0182916205267204) : 
new columns would leave holes after existing columns

If you have any suggestions, or know a better way to do this, please let me know!

#create vectors of cities and countries
cities <- unique(IPUMS$city)
countries <- unique(IPUMS$bpld)

new_dataframe <- data.frame(matrix(nrow = length(cities), ncol = length(countries)), row.names = cities)
colnames(new_dataframe) <- countries

for (i in cities) {
  for (j in countries) {
    new_dataframe[i, j] <- sum(IPUMS$perwtadj[IPUMS$city == i & IPUMS$bpld == j]) / 
      sum(IPUMS$perwtadj[IPUMS$bpld == j])
  }
}
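
A minimal sketch of two possible approaches, under a stated assumption: the error suggests that `city` and `bpld` are stored as numeric codes, so `new_dataframe[i, j]` treats `i` and `j` as positions rather than as row and column names. The toy `IPUMS` data frame and the names `loop_result`, `weighted`, and `shares` below are hypothetical and only for illustration; the real data are not shown in the question.

```r
# Toy stand-in for IPUMS (hypothetical values, for illustration only)
IPUMS <- data.frame(
  city     = c(10, 10, 20, 20, 30),
  bpld     = c(100, 200, 100, 200, 200),
  perwtadj = c(1, 2, 3, 4, 4)
)

# Option 1: keep the loop, but index by name (character) instead of by number
cities    <- as.character(unique(IPUMS$city))
countries <- as.character(unique(IPUMS$bpld))

loop_result <- data.frame(
  matrix(nrow = length(cities), ncol = length(countries)),
  row.names = cities
)
colnames(loop_result) <- countries

for (i in cities) {
  for (j in countries) {
    # Weighted count for (city i, birthplace j) over the weighted total for birthplace j
    loop_result[i, j] <-
      sum(IPUMS$perwtadj[IPUMS$city == i & IPUMS$bpld == j]) /
      sum(IPUMS$perwtadj[IPUMS$bpld == j])
  }
}

# Option 2: no loop. Build the weighted city-by-birthplace table in one step
weighted <- xtabs(perwtadj ~ city + bpld, data = IPUMS)

# Divide every column by its total, so each birthplace column sums to 1
shares <- prop.table(weighted, margin = 2)

# Convert the table to a plain data frame with cities as row names
new_dataframe <- as.data.frame.matrix(shares)
```

Both versions are indexed by character labels, for example `new_dataframe["20", "100"]`. The second version avoids repeatedly subsetting `IPUMS` inside a double loop, which may matter if the real dataset is large.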
ouflak
ashep
  • I'd like to try to help, but it is hard to suggest a solution without being able to inspect example data. Are you able to share any data? It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions (for example, with dput()). See the link for ways to improve your question: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Skaqqs Jan 12 '22 at 13:53
  • Note that we don't need **all** the data. A **minimal** example is best - could you share 2 or 3 cities, a few corresponding rows of `IPUMS`, and then show exactly what output you want for that sample input? – Gregor Thomas Jan 20 '22 at 18:56
