If value in column is duplicated change value in another column

Question

I have something like this:

> x=data.frame(index=c("a","a","b","c","c","c","d"),values=c(20,10,15,5,15,5,20))
> x
  index values
1     a     20
2     a     10
3     b     15
4     c      5
5     c     15
6     c      5
7     d     20

and what I need is:

> x_i_need=data.frame(index=c("a","a","b","c","c","c","d"),values=c(10,10,15,5,5,5,20))
> x_i_need
  index values
1     a     10
2     a     10
3     b     15
4     c      5
5     c      5
6     c      5
7     d     20

So every a should get the lowest value among the a rows, every c the lowest among the c rows, and so on. In other words, I want to replace each value in column values with the lowest value among all rows that share the same index. I found a way to filter my data to get only the duplicates:

> x[duplicated(x$index) | duplicated(x$index, fromLast=TRUE),]
  index values
1     a     20
2     a     10
4     c      5
5     c     15
6     c      5

But I am not sure where to go from there. Is there a way to achieve this on my full dataset without extracting the duplicates? Is there a function that does that?

Try `x$values <- with(x, ave(values, index, FUN = min))` – Sotos Feb 27 '20 at 11:07
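
The `ave()` suggestion in the comment above can be checked against the example data from the question. This is a minimal sketch in base R; the name `expected` is introduced here only for the comparison and does not appear in the original post.

```r
# Example data from the question
x <- data.frame(index  = c("a", "a", "b", "c", "c", "c", "d"),
                values = c(20, 10, 15, 5, 15, 5, 20))

# ave() applies FUN within each group of `index` and returns a vector
# as long as `values`, so the result can be assigned straight back.
x$values <- with(x, ave(values, index, FUN = min))

# Desired result as listed in the question (hypothetical name `expected`)
expected <- c(10, 10, 15, 5, 5, 5, 20)
stopifnot(isTRUE(all.equal(x$values, expected)))
x
```

Because `ave()` keeps the original row order and length, no filtering of duplicates or joining is needed.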

MKR · Answer 1 · 2020-02-27T12:49:22.100

1

Using dplyr:

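library(dplyr)
x %>% group_by(index) %>% 
  summarise(values = min(values)) %>%   # one row per index with the group minimum
  left_join(x, by="index") %>%          # values.x = group minimum, values.y = original value
  select(index, values = values.x)      # keep the minimum under the original column name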

EDIT

As Sotos pointed out, this is not the most direct way of doing it. This would be much better:

x %>% group_by(index) %>% 
  mutate(values = min(values)) %>% 
  ungroup()  # drop the grouping so later steps work on the whole table

edited Feb 27 '20 at 12:49

answered Feb 27 '20 at 10:53

1

FYI you can do `mutate(values = ...)` instead of summarise and avoid joining and selecting – Sotos Feb 27 '20 at 11:08
What would that look like? – MKR Feb 27 '20 at 11:26
What do you mean? Just do it and see... – Sotos Feb 27 '20 at 12:40
Thanks for pointing this out :) – MKR Feb 27 '20 at 12:49

score 1 · Answer 2 · answered Feb 27 '20 at 10:58

d <- data.frame(index=c("a","a","b","c","c","c","d"),values=c(20,10,15,5,15,5,20))
d$values <- tapply(d$values, d$index, min)[d$index]
d

# index values
# 1     a     10
# 2     a     10
# 3     b     15
# 4     c      5
# 5     c      5
# 6     c      5
# 7     d     20
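
The one-liner above packs two steps into a single expression. A sketch that splits them apart, assuming the same example data; the intermediate name `mins` and the `as.character()` / `as.vector()` wrappers are additions for illustration, not part of the original answer:

```r
d <- data.frame(index  = c("a", "a", "b", "c", "c", "c", "d"),
                values = c(20, 10, 15, 5, 15, 5, 20))

# Step 1: one minimum per group, named by the group label
mins <- tapply(d$values, d$index, min)

# Step 2: look up each row's label by name.
# as.character() makes the lookup name-based even if `index` is a factor;
# as.vector() strips the names so the column is a plain numeric vector.
d$values <- as.vector(mins[as.character(d$index)])
d
```

Looking up by name rather than by position is what lets the short vector of group minima expand back to one value per row.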
