
I need to calculate the percentage change in population for Albania only. I also need to collapse the rows for each year into a single row per year instead of 12. I tried the code below, but I don't know how to handle the year part.


    structure(list(country = c("Albania", "Albania", "Albania", "Albania", 
    "Albania", "Albania", "Albania", "Albania", "Albania", "Albania", 
    "Albania", "Albania", "Albania", "Albania", "Albania", "Albania", 
    "Albania", "Albania", "Albania", "Albania", "Albania", "Albania", 
    "Albania", "Albania"), year = c(1985L, 1985L, 1985L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1986L, 1986L, 
    1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 1986L, 
    1986L), population = c(277900, 246800, 267500, 298300, 138700, 
    34200, 301400, 264200, 296700, 325800, 132500, 21100, 283900, 
    252100, 273200, 304700, 141700, 34900, 306700, 269000, 302000, 
    331600, 134800, 21400), pct.chg = c(NA, -11.1910759265923, 8.38735818476499, 
    11.5140186915888, -53.5031847133758, -75.3424657534247, 781.286549707602, 
    -12.342402123424, 12.3012869038607, 9.80788675429727, -59.3308778391651, 
    -84.0754716981132, NA, -11.2011271574498, 8.36969456564855, 11.5300146412884, 
    -53.495241220873, -75.3705010585744, 778.796561604585, -12.292142158461, 
    12.2676579925651, 9.80132450331126, -59.3486127864897, -84.1246290801187
    )), row.names = c(NA, -24L), groups = structure(list(year = 1985:1986, 
        .rows = structure(list(1:12, 13:24), ptype = integer(0), class = c("vctrs_list_of", 
        "vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df", 
    "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"))


    df <- comp %>% 
      filter(country == 'Albania') %>% 
      select(country, year, population) %>% 
      group_by(year) %>% 
      mutate(pct.chg = 100 * (population - lag(population))/lag(population))


  [1]: https://i.stack.imgur.com/U3d2E.jpg
Ama
  • Welcome to Stack Overflow. Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including example data in a plain text format - for example the output from `dput(yourdata)`. We cannot copy/paste data from images. – neilfws Nov 23 '20 at 22:32
  • You have multiple rows for each year. Do you want only one row per year in the final output? What should happen to the `population` values: take their sum, or their mean? Do you need `comp %>% filter(country == 'Albania') %>% group_by(year) %>% summarise(pop = sum(population))`? – Ronak Shah Nov 24 '20 at 04:16

1 Answer


Perhaps we need `summarise`:

    library(dplyr)
    comp %>% 
      filter(country == 'Albania') %>%
      select(country, year, population) %>% 
      group_by(year) %>% 
      summarise(pct.chg = sum(100 * (population - lag(population))/lag(population)))

akrun
  • It shows the error `summarise()` ungrouping output (override with `.groups` argument), and the percentage change is NA in every row. – Ama Nov 23 '20 at 23:30
  • @Ama Can you update your post with `dput` of that example data so that I can test – akrun Nov 23 '20 at 23:34
  • I don't think the addition should be done *after* the percentage calculation; that metric makes no sense. Perhaps `summarise(population = sum(population)) %>% mutate(pct=100*(population-lag(population))/lag(population))` is what @Ama is looking for? – Dayne Nov 24 '20 at 03:02
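
The suggestion in the last comment (sum the population within each year first, then compute the change between years) can be sketched as follows. This is a minimal sketch, not a confirmed answer from the thread: it assumes `comp` holds the `country`, `year` and `population` columns from the question's `dput()` output (rebuilt here so the block runs on its own), and the name `yearly` is only illustrative. It also suggests why the earlier attempt returned `NA`: inside each year group the first `lag()` value is `NA`, and `sum()` without `na.rm = TRUE` propagates it. The `summarise()` line about ungrouping is an informational message in dplyr 1.0 rather than an error, and passing `.groups` silences it.

```r
library(dplyr)

# Assumption: this rebuilds only the three input columns from the dput() output
# in the question (12 rows per year for Albania, 1985 and 1986).
comp <- data.frame(
  country = "Albania",
  year = rep(c(1985L, 1986L), each = 12),
  population = c(277900, 246800, 267500, 298300, 138700, 34200,
                 301400, 264200, 296700, 325800, 132500, 21100,
                 283900, 252100, 273200, 304700, 141700, 34900,
                 306700, 269000, 302000, 331600, 134800, 21400)
)

yearly <- comp %>%
  filter(country == "Albania") %>%
  group_by(year) %>%
  # Collapse the 12 rows of each year into one total; .groups = "drop"
  # returns an ungrouped result and suppresses the summarise() message.
  summarise(population = sum(population), .groups = "drop") %>%
  arrange(year) %>%
  # lag() now looks at the previous year, so the first year is NA by design.
  mutate(pct.chg = 100 * (population - lag(population)) / lag(population))

yearly
```

With this data the yearly totals are 2605100 for 1985 and 2656000 for 1986, so `pct.chg` is `NA` for 1985 and about 1.95 for 1986. Whether a sum is the right aggregate (rather than a mean) depends on what the 12 rows per year represent, which the question does not state.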