
I have a dataframe in R of the following form:

           City     Province   Poupulation
1       Bandung     JABAR       500,000
2      Surabaya     JATIM       600,000
3        Malang     JATIM       350,000
4         Bogor     JABAR       400,000
5      Semarang     JATENG      550,000
6       Cirebon     JABAR       300,000
7        Madiun     JATIM       200,000
8          Solo     JATENG      275,000
9         Tegal     JATENG      290,000

What code computes the total population of the cities in JATENG province only?

J_F
Damn Cold
  • http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega – jogo Oct 27 '16 at 14:55
  • Try `sum(df[df$Province=="JATENG","Poupulation"])` where `df` is your data frame? – aichao Oct 27 '16 at 14:59

1 Answer


Here is a dplyr solution:

library(dplyr)
df %>%
  group_by(Province) %>%
  summarise(sum = sum(Poupulation))

#  Province      sum
#    <fctr>    <dbl>
#1    JABAR  1200000
#2   JATENG  1115000
#3    JATIM  1150000

If you are only interested in the province JATENG, this will do the job:

df %>% 
  filter(Province == "JATENG") %>% 
  summarise(sum = sum(Poupulation))
#      sum
#1 1115000

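If Poupulation is not stored as a number, you will have to convert it inside summarise. Note that a plain as.numeric() is not enough when the values are text with thousands separators such as "550,000" (it returns NA for character columns and the internal level codes for factors), so strip the commas first: summarise(sum = sum(as.numeric(gsub(",", "", Poupulation)))).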
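For reference, here is a minimal base R sketch of the same idea, with no dplyr needed. The data frame below is rebuilt by hand from the table in the question, and it assumes Poupulation was read in as text with commas; if your column is already numeric, the gsub step is unnecessary. The names jateng_total and province_totals are only illustrative.

```r
# Rebuild the question's data (assumption: Poupulation is character
# with thousands separators, as it is printed in the question).
df <- data.frame(
  City = c("Bandung", "Surabaya", "Malang", "Bogor", "Semarang",
           "Cirebon", "Madiun", "Solo", "Tegal"),
  Province = c("JABAR", "JATIM", "JATIM", "JABAR", "JATENG",
               "JABAR", "JATIM", "JATENG", "JATENG"),
  Poupulation = c("500,000", "600,000", "350,000", "400,000", "550,000",
                  "300,000", "200,000", "275,000", "290,000"),
  stringsAsFactors = FALSE
)

# Strip the commas, then convert the column to numeric.
df$Poupulation <- as.numeric(gsub(",", "", df$Poupulation, fixed = TRUE))

# Total population for JATENG only.
jateng_total <- sum(df$Poupulation[df$Province == "JATENG"])

# Totals for every province at once, as a named vector.
province_totals <- tapply(df$Poupulation, df$Province, sum)

print(jateng_total)
print(province_totals)
```

Doing the conversion once on the column, rather than inside every summary call, keeps later filters and sums free of type surprises.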

J_F