I have a data frame containing threshold values for different data types:

threshold <- c(5, 10)
type <- c("type1", "type2")
threshold.df <- data.frame(type, threshold)

This gives:

> threshold.df
   type threshold
1 type1         5
2 type2        10

In another data frame, I have:

x <- rep(1:30, 2)
y <- x^2
type <- rep(c("type1", "type2"), each = 30)
my.df <- data.frame(x, y, type)

Which gives:

> head(my.df)
  x  y  type
1 1  1 type1
2 2  4 type1
3 3  9 type1
4 4 16 type1
5 5 25 type1
6 6 36 type1

Now, I want to replace with 0 all y values of type 1 where x is lower than the threshold.

Using dplyr, I was thinking about something like my.df %>% group_by(type) %>% mutate(y = somefunction).

But then I'm stuck on how to implement the function.

I know it can also be done using the ave function, but I end up with the same problem.

I would know how to do it with a loop, but I'm sure there are better ways in R.

Ben
  • What does "all values of type 1" mean? X values? Rows? – Gopala May 17 '17 at 13:45
  • Does this: `my.df[type == 'type1' & my.df$x < threshold.df$threshold[threshold.df$type == 'type1'], ]` OR `my.df$x[type == 'type1' & my.df$x < threshold.df$threshold[threshold.df$type == 'type1']]` give you what you want? – Gopala May 17 '17 at 13:48
  • it's all y values where x < threshold. I edited the question – Ben May 17 '17 at 13:50
  • Your solution works for type1, but then I need to repeat for type2. If I have a lot of types, I wish I could automate it. – Ben May 17 '17 at 13:56
  • Your question says you only want to do it for 'type 1' where x is.... – Gopala May 17 '17 at 13:57

2 Answers

I would just merge the data.

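library(data.table)
setDT(threshold.df)
setDT(my.df)
# Attach each row's threshold by joining on type
my.df <- merge(my.df, threshold.df, by = 'type')
# Zero out y where x is below the threshold for that row's type
my.df[x < threshold, y := 0]
my.df[, threshold := NULL]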
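
As a sketch of a variant that skips the intermediate merge, a data.table update join can look up each row's threshold and overwrite y in place. The names `threshold.dt` and `my.dt` are made up for this example, and `fifelse` assumes a reasonably recent data.table (1.12.4 or later); treat it as one possible approach rather than the definitive one.

```r
library(data.table)

# Rebuild the example data as data.tables (names are illustrative)
threshold.dt <- data.table(type = c("type1", "type2"), threshold = c(5, 10))
my.dt <- data.table(x = rep(1:30, 2),
                    type = rep(c("type1", "type2"), each = 30))
my.dt[, y := x^2]

# Update join: for rows of my.dt matching on type, set y to 0 where x is
# below that type's threshold. The i. prefix refers to threshold.dt columns.
my.dt[threshold.dt, on = "type", y := fifelse(x < i.threshold, 0, y)]

head(my.dt)
```

Because the update happens by reference, no temporary threshold column is added to `my.dt`, so there is nothing to remove afterwards.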
amatsuo_net
  • Alternatively, as a chain: `my.df[threshold.df, on = .(type)][x < threshold, y := 0][, threshold := NULL]` – mt1022 May 17 '17 at 14:02

Here is one way to do it with dplyr:

library(dplyr)

my.df %>%
  inner_join(threshold.df, by = "type") %>%  # attach each row's threshold
  mutate(y = ifelse(x < threshold & type == 'type1', 0, y)) %>%
  select(-threshold)

The first rows of the result look like this:

    x   y  type
1   1   0 type1
2   2   0 type1
3   3   0 type1
4   4   0 type1
5   5  25 type1
6   6  36 type1
7   7  49 type1
8   8  64 type1
9   9  81 type1
10 10 100 type1
11 11 121 type1
12 12 144 type1

If you want the threshold check to apply to all types and not just type1, you can do this:

my.df %>%
  inner_join(threshold.df, by = "type") %>%
  mutate(y = ifelse(x < threshold, 0, y)) %>%
  select(-threshold)
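
For comparison, the same per-type check can be sketched in base R without a join or a loop, using `match()` to look up each row's threshold. The helper vector `row.threshold` is a name introduced only for this example.

```r
# Rebuild the example data from the question
threshold.df <- data.frame(type = c("type1", "type2"), threshold = c(5, 10))
my.df <- data.frame(x = rep(1:30, 2),
                    type = rep(c("type1", "type2"), each = 30))
my.df$y <- my.df$x^2

# For every row of my.df, find the threshold belonging to its type
row.threshold <- threshold.df$threshold[match(my.df$type, threshold.df$type)]

# which() drops NA comparisons, so rows whose type has no threshold
# are left unchanged instead of raising an error
my.df$y[which(my.df$x < row.threshold)] <- 0

head(my.df)
```

This scales to any number of types, since the lookup is vectorised over all rows at once.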
Gopala