Removing duplicates while preserving values

Question

I have a table like this:

Jerry 2
John 3
Mark 4
John 1
Kevin 10

I want to remove the duplicate entries (John in this case) but preserve their values by adding them up. The result should look like:

Jerry 2
John 4
Mark 4
Kevin 10

Any ideas on how to do this in R? I know how to remove duplicates, but not how to add up the values of the duplicated entries.

Thanks.

You can delete your post if you wish. – zx8754 Jun 15 '16 at 19:43

score 4 · Answer 1 · answered Jun 14 '16 at 17:42

We can use aggregate and specify the FUN as sum

aggregate(col2~Name, df1, FUN = sum)
#    Name col2
#1 Jerry    2
#2  John    4
#3 Kevin   10
#4  Mark    4

Or with data.table

library(data.table)
setDT(df1)[, .(col2 = sum(col2)), by = Name]
#    Name col2
#1: Jerry    2
#2:  John    4
#3:  Mark    4
#4: Kevin   10

Or use dplyr

library(dplyr)
df1 %>%
    group_by(Name) %>%
    summarise(col2 = sum(col2))

data

df1 <- structure(list(Name = c("Jerry", "John", "Mark", "John", "Kevin"
 ), col2 = c(2L, 3L, 4L, 1L, 10L)), .Names = c("Name", "col2"), 
 class = "data.frame", row.names = c(NA, -5L))
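Note that aggregate returns the groups in sorted order, whereas the expected output in the question keeps the order in which each name first appears. A minimal base R sketch, assuming the same df1 as above, that keeps first-appearance order with rowsum and reorder = FALSE (the variable names sums and result are only illustrative):

```r
# Same data as df1 above, rebuilt here so the snippet runs on its own.
df1 <- data.frame(Name = c("Jerry", "John", "Mark", "John", "Kevin"),
                  col2 = c(2L, 3L, 4L, 1L, 10L),
                  stringsAsFactors = FALSE)

# rowsum() adds up col2 within each Name; reorder = FALSE keeps the groups
# in order of first appearance instead of sorting them.
sums <- rowsum(df1$col2, df1$Name, reorder = FALSE)

# rowsum() returns a one-column matrix with the names as row names,
# so turn it back into a two-column data frame.
result <- data.frame(Name = rownames(sums),
                     col2 = unname(sums[, 1]),
                     stringsAsFactors = FALSE)
print(result)
```

This is only one option; any of the three approaches above gives the same sums, just possibly in a different row order.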

answered Jun 14 '16 at 17:42

akrun


This is an obvious dupe, why not just hammer? – zx8754 Jun 14 '16 at 21:08
@zx8754 It already got dupe tagged by DavidArenburg – akrun Jun 15 '16 at 02:17
I was looking for an answer but wasn't using the right key words, so I did not find the duplicate post. Thanks for the answer. – AdrianP. Jun 15 '16 at 17:41
Of course I **noticed** it before I posted that comment. The point is, you must already realize that some posts are 100% dupes, so why post an answer? And you know very well how dupe hammering works. – zx8754 Jun 15 '16 at 19:42
@zx8754 When I search for dupes, I always hammer with the dupe. But in most cases, finding the right dupe can be a challenge, so I type the answer instead. – akrun Jun 16 '16 at 02:19
@zx8754 Forgot about the other important issue. At my workplace, if I use the LAN, I can't get Google, and if I use wireless, the SO website has problems. So I normally use the LAN connection. – akrun Jun 16 '16 at 07:27
I understand you might have reasons, but once it is established that it is 100% dupe, why not delete? If dupe target doesn't fully address the question, then why not add a new answer at the targeted post? Anyway, it is up to you. – zx8754 Jun 16 '16 at 07:32
@zx8754 I would say that dupe answers make it easier to find with search engines. – akrun Jun 16 '16 at 09:48

score 0 · Answer 2 · answered Jun 14 '16 at 17:59

Something similar to aggregate is ddply from the plyr package:

library(plyr)
ddply(df, c("Name"), function(x) sum(x$Value))


#   Name V1
#1 Jerry  2
#2  John  4
#3 Kevin 10
#4  Mark  4
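The call above assumes a data frame df with columns Name and Value, which the answer does not show. A minimal reproducible sketch under that assumption (the df construction is hypothetical and simply mirrors the question's table), using plyr's summarise so the result column gets an explicit name instead of V1:

```r
library(plyr)

# Hypothetical input mirroring the question's table; the column names
# Name and Value are assumptions taken from the ddply call above.
df <- data.frame(Name = c("Jerry", "John", "Mark", "John", "Kevin"),
                 Value = c(2L, 3L, 4L, 1L, 10L),
                 stringsAsFactors = FALSE)

# Split by Name, sum Value within each group, and name the result column.
res <- ddply(df, "Name", summarise, Value = sum(Value))
print(res)
```

As with aggregate, the groups come back sorted by Name rather than in order of first appearance.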

answered Jun 14 '16 at 17:59

Ronak Shah
