I have something like this:

> x=data.frame(index=c("a","a","b","c","c","c","d"),values=c(20,10,15,5,15,5,20))
> x
  index values
1     a     20
2     a     10
3     b     15
4     c      5
5     c     15
6     c      5
7     d     20

and what I need is:

> x_i_need=data.frame(index=c("a","a","b","c","c","c","d"),values=c(10,10,15,5,5,5,20))
> x_i_need
  index values
1     a     10
2     a     10
3     b     15
4     c      5
5     c      5
6     c      5
7     d     20

So every a gets the lowest value among the a rows, every c the lowest among the c rows, and so on. In other words, I want to replace each value in column values with the lowest value among all rows that share the same index. I found a way to filter my data to get only the duplicates:

> x[duplicated(x$index) | duplicated(x$index, fromLast=TRUE),]
  index values
1     a     20
2     a     10
4     c      5
5     c     15
6     c      5 

But I am not sure where to go from there. Is there a way to achieve this on my full dataset without extracting the duplicates? Is there a function that does this?

Alexandros

2 Answers

Using dplyr:

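library(tidyverse)
# Compute the minimum per index, then join it back onto the original rows
x %>% group_by(index) %>% 
  summarise(values = min(values)) %>% 
  left_join(x, by="index") %>% 
  # values.x is the per-group minimum; rename it back to values
  select(index, values = values.x)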

EDIT

As Sotos pointed out, this is not the most direct way of doing it. This would be much better:

x %>% group_by(index) %>% 
  mutate(values = min(values))
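
For reference, here is a self-contained sketch of the `mutate` approach, assuming only dplyr is attached rather than the whole tidyverse. The `ungroup()` call and the name `res` are additions for illustration and not part of the original answer. `ungroup()` simply drops the grouping so that later operations are not silently applied per group.

```r
library(dplyr)

x <- data.frame(index  = c("a", "a", "b", "c", "c", "c", "d"),
                values = c(20, 10, 15, 5, 15, 5, 20))

res <- x %>%
  group_by(index) %>%            # one group per distinct index
  mutate(values = min(values)) %>%  # min() is recycled to every row of the group
  ungroup()                      # assumption: the result should not stay grouped

res$values
```

Because `mutate` keeps every row (unlike `summarise`), no join back to the original data is needed.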
MKR

d <- data.frame(index=c("a","a","b","c","c","c","d"),values=c(20,10,15,5,15,5,20))
d$values <- tapply(d$values, d$index, min)[d$index]
d

# index values
# 1     a     10
# 2     a     10
# 3     b     15
# 4     c      5
# 5     c      5
# 6     c      5
# 7     d     20
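
The one-liner above packs two steps together, so here is a rough step-by-step sketch of what it appears to do. The intermediate names `mins` and `expanded` are made up for illustration, and the `as.character()` and `as.vector()` calls are defensive additions, not part of the original answer. Base R's `ave()` is shown as an alternative that may be worth considering.

```r
d <- data.frame(index  = c("a", "a", "b", "c", "c", "c", "d"),
                values = c(20, 10, 15, 5, 15, 5, 20))

# Alternative: ave() applies FUN within each group and returns a vector
# of the same length as the input
via_ave <- ave(d$values, d$index, FUN = min)

# Step 1: one minimum per distinct index, as a named 1-d array
mins <- tapply(d$values, d$index, min)

# Step 2: look up each row's index by name. as.character() guards against
# d$index being a factor, which would be used via its integer codes
# rather than its labels
expanded <- mins[as.character(d$index)]

# Step 3: drop the names and dim attributes before assigning back
d$values <- as.vector(expanded)
```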
r.user.05apr