I have a data frame called "Bycountry" like this:

      Countries Norders
    1   Algerie       1
    2    France       2
    3   Tunisie       3
    4   Algerie       4
    5 Allemagne       5
    6    France       6

I want to do statistics on this dataframe:

  • in the Countries column, each country listed once (no duplicates)
  • in the Norders column, the sum for each country

I have already installed the plyr and dplyr packages, and I know that I have to use mutate(), summarise() and group_by(), but I don't know in which order or how.

    Otherbycountry <- data.frame(
        Countries = c("Algerie", "France", "Tunisie", "Algerie",
                      "Allemagne", "France"),
        Norders = c(1, 2, 3, 4, 5, 6))

My current attempt returns a 1x1 tibble containing the total sum of Norders.

Jul
  • Hi @Jul, it will be easier to get help with this question if it is reproducible: https://stackoverflow.com/q/5963269/3277821 – sboysel May 02 '19 at 06:51
  • Hi @sboysel I just edited my question, can you help me on this? – Jul May 03 '19 at 13:02

1 Answer

Code

library(dplyr)
Otherbycountry %>% 
    # grouping by country
    group_by(Countries) %>% 
    # sum of Norders column (for each group) 
    summarise(Norders_sum = sum(Norders)) %>% 
    # ungroup
    ungroup()
  • Type ?group_by, ?summarise and ?ungroup inside R for more information about the functions.
  • Read the section about grouped summaries (R for Data Science by Garrett Grolemund and Hadley Wickham) for more detail.

Output

# # A tibble: 4 x 2
# Countries Norders_sum
# <fct>           <dbl>
# 1 Algerie             5
# 2 Allemagne           5
# 3 France              8
# 4 Tunisie             3
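
A possible reason for the 1x1 result described in the question, offered as an assumption because the question does not show the failing code: when plyr is attached after dplyr, plyr::summarise() masks dplyr::summarise() and ignores the grouping, so the sum is taken over the whole column. A minimal sketch that avoids the masking by calling the dplyr functions with an explicit namespace prefix, followed by a base R alternative (the names result and result_base are only illustrative):

```r
# Same data as in the question
Otherbycountry <- data.frame(
    Countries = c("Algerie", "France", "Tunisie", "Algerie",
                  "Allemagne", "France"),
    Norders = c(1, 2, 3, 4, 5, 6))

# Explicit dplyr:: prefixes, so the result does not depend on whether
# plyr was attached after dplyr
result <- dplyr::summarise(
    dplyr::group_by(Otherbycountry, Countries),
    Norders_sum = sum(Norders))

# Base R alternative that needs no extra packages
result_base <- aggregate(Norders ~ Countries, data = Otherbycountry, FUN = sum)
```

Both objects should hold one row per country. If both packages are needed in the same session, loading plyr before dplyr is the usual way to keep dplyr's versions of the shared function names on top.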