
I have a table with two columns, "name" and "grade". Values in the "name" column can repeat several times. To illustrate the problem, here is a short example table:

list <- data.frame(c("Natalia", "Alex", "Adam", "Natalia", "Natalia", "Alex", "Natalia", "Adam"), c(5, 6, 5, 4, 5, 4, 3, 4))
colnames(list) <- c("name", "grade")

I'd like to get a data frame with two columns: the unique values from "name" in the first, and the sum of grades for each name in the second. I created the first column like this:

n_occur <- data.frame(table(list$name))

and it works: I get a column of unique names from the original table.
Unfortunately, I have no idea how to sum the grades for each name. It would be something like the pseudocode below, but I don't know R syntax well, so it's a bit hard for me.

sum(list$grade) where (list$name == n_occur$Var1)

I think I should somehow combine filter with select, but I didn't manage to do that.

Shayan Shafiq
Natalia
  • Is this what you are looking for? http://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group – Gopala Jul 18 '16 at 14:14

1 Answer


Is this what you are looking for?

library(dplyr)
list %>%
   group_by(name) %>%
   summarise(sum(grade))
#Source: local data frame [3 x 2]

#     name sum(grade)
#   (fctr)      (dbl)
#1    Adam          9
#2    Alex         10
#3 Natalia         17
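
For readers without dplyr installed, the same per-name sum can be sketched in base R with `aggregate()`. This is an alternative technique, not the approach used in the answer above. The data frame is named `grades` here (a name chosen for this sketch) to avoid shadowing the base function `list`.

```r
# Rebuild the example data from the question under a non-clashing name.
grades <- data.frame(
  name  = c("Natalia", "Alex", "Adam", "Natalia", "Natalia", "Alex", "Natalia", "Adam"),
  grade = c(5, 6, 5, 4, 5, 4, 3, 4)
)

# Formula interface: sum `grade` within each distinct value of `name`.
totals <- aggregate(grade ~ name, data = grades, FUN = sum)

# One row per name; the totals match the dplyr result (Adam 9, Alex 10, Natalia 17).
print(totals)
```

The result is an ordinary data frame with the columns `name` and `grade`, so the summed column keeps a plain name rather than `sum(grade)`.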
akrun
rar
  • Thanks! I have one more question. Let's say we have one more column, e.g. the number of the week in which the person got the grade. I'd like to group the data so that, for each person and each week, I get the sum of their grades. – Natalia Jul 19 '16 at 07:36
  • Actually, I managed to do that, so I'll share the solution in case anyone needs it in the future. First I converted the data frame to a data.table: `library(data.table); dt <- setDT(list)`. Then I used .SD like this: `dt <- dt[, lapply(.SD, sum), by = c("week", "name")]`, which gave me what I needed ;) – Natalia Jul 19 '16 at 08:43
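
The per-person, per-week grouping from the follow-up comments can also be sketched in base R, as an alternative to the data.table approach. The `week` values below are invented purely for illustration (the thread never shows them), and `grades` is a name chosen for this sketch.

```r
# Example data from the question plus a hypothetical `week` column.
grades <- data.frame(
  name  = c("Natalia", "Alex", "Adam", "Natalia", "Natalia", "Alex", "Natalia", "Adam"),
  week  = c(1, 1, 1, 1, 2, 2, 2, 2),  # assumed values, not from the thread
  grade = c(5, 6, 5, 4, 5, 4, 3, 4)
)

# Two grouping variables on the right-hand side of the formula:
# one output row per (name, week) combination that occurs in the data.
by_week <- aggregate(grade ~ name + week, data = grades, FUN = sum)

print(by_week)
```

With these assumed weeks, Natalia's two week-1 grades (5 and 4) collapse into a single row with a sum of 9.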