
I have a table like this:

Jerry 2
John 3
Mark 4
John 1
Kevin 10

I want to remove duplicate entries (John in this case), but preserve their values by adding them up. The result should look like:

Jerry 2
John 4
Mark 4
Kevin 10

Any ideas on how to do this in R? I know how to remove duplicates, but not how to add up the values of the duplicated entries.

Thanks.

AdrianP.

2 Answers


We can use aggregate with the formula interface and specify FUN as sum:

aggregate(col2~Name, df1, FUN = sum)
#    Name col2
#1 Jerry    2
#2  John    4
#3 Kevin   10
#4  Mark    4

Or with data.table

library(data.table)
setDT(df1)[, .(col2 = sum(col2)), by = Name]
#    Name col2
#1: Jerry    2
#2:  John    4
#3:  Mark    4
#4: Kevin   10

Or use dplyr

library(dplyr)
df1 %>%
    group_by(Name) %>%
    summarise(col2 = sum(col2))

data

df1 <- structure(list(Name = c("Jerry", "John", "Mark", "John", "Kevin"
 ), col2 = c(2L, 3L, 4L, 1L, 10L)), .Names = c("Name", "col2"), 
 class = "data.frame", row.names = c(NA, -5L))
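For a quick self-contained check in base R, here is a minimal sketch. It assumes the same column names as above (Name, col2), rebuilds the data with data.frame() rather than structure(), and cross-checks the aggregate result against tapply, which is a swapped-in base R alternative and not part of the original answer.

```r
# Rebuild the example data; column names follow the answer above.
df1 <- data.frame(
  Name = c("Jerry", "John", "Mark", "John", "Kevin"),
  col2 = c(2L, 3L, 4L, 1L, 10L),
  stringsAsFactors = FALSE
)

# Formula interface: sum col2 within each Name.
totals <- aggregate(col2 ~ Name, data = df1, FUN = sum)

# tapply() returns the same sums as a vector named by group.
totals_vec <- tapply(df1$col2, df1$Name, sum)

# Both approaches should agree for every name.
stopifnot(all(totals$col2 == totals_vec[as.character(totals$Name)]))

print(totals)
```

tapply is handy when a named vector is enough; aggregate keeps the result as a data frame.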
akrun
  • This is an obvious dupe, why not just hammer? – zx8754 Jun 14 '16 at 21:08
  • @zx8754 It already got dupe tagged by DavidArenburg – akrun Jun 15 '16 at 02:17
  • I was looking for an answer but wasn't using the right key words, so I did not find the duplicate post. Thanks for the answer. – AdrianP. Jun 15 '16 at 17:41
  • Of course I **noticed** it before I posted that comment. The point is, you must already realize that some posts are 100% dupes, so why post an answer? And you know very well how dupe hammering works. – zx8754 Jun 15 '16 at 19:42
  • @zx8754 When I search for dupes and find one, I always hammer it. But in most cases, finding the right dupe can be a challenge, so I type the answer instead. – akrun Jun 16 '16 at 02:19
  • @zx8754 I forgot about the other important issue. At my workplace, if I use the LAN, I can't reach Google, and if I use wireless, the SO website has problems. So I normally use the LAN connection. – akrun Jun 16 '16 at 07:27
  • I understand you might have reasons, but once it is established that a post is a 100% dupe, why not delete the answer? If the dupe target doesn't fully address the question, then why not add a new answer at the target post? Anyway, it is up to you. – zx8754 Jun 16 '16 at 07:32
  • @zx8754 I would say that answers on dupes make the solution easier to find with search engines. – akrun Jun 16 '16 at 09:48

Something similar to aggregate is ddply from the plyr package (this assumes the data frame is named df and the values are in a column named Value):

library(plyr)
ddply(df, c("Name"), function(x) sum(x$Value))


#   Name V1
#1 Jerry  2
#2  John  4
#3 Kevin 10
#4  Mark  4
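The ddply call above labels the result column V1. As a minimal sketch, assuming plyr is installed and that the data frame has columns named Name and Value (these names are assumptions, since the question does not give any), passing plyr's summarise lets the result keep a descriptive column name:

```r
library(plyr)  # assumes the plyr package is installed

# Hypothetical data frame; the column names Name and Value are assumptions.
df <- data.frame(
  Name = c("Jerry", "John", "Mark", "John", "Kevin"),
  Value = c(2L, 3L, 4L, 1L, 10L),
  stringsAsFactors = FALSE
)

# Split by Name, then summarise each piece into a single row
# whose Value column holds the group total.
res <- ddply(df, "Name", summarise, Value = sum(Value))

print(res)
```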
Ronak Shah