Sample data:
Group <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
value_1 <- c(1.10, 2.5, 1.7, 0.99, 1.50, 1.65, 2.5, 2.5, 1.5)
value_2 <- c(0.03, 1.3, 3.5, 0.02, 4.3, 1.2, 1.4, 1.4, 3.7)
new_variable_1 <- c(1,0,1, 1,1,0, 0,0,1)
df <- data.frame(Group, value_1, value_2, new_variable_1)
The desired output is `new_variable_1`. I want to create `new_variable_1` based on the following criteria; I am seeking two solutions.
Basic idea: look up the max value in `value_2` by group and create a dummy variable based on the values in `value_1`.
Solution 1 logic:
1. Find `max(value_2)` by group. E.g., the max value in `value_2` for group a is 3.5.
2. Find the corresponding `value_1` by group. E.g., `value_1` is 1.7 in group a.
3. Create `new_variable_1` by group that is 1 if `value_1` is less than or equal to the corresponding value from the previous step, and 0 otherwise. E.g., for group a, `value_1 <= 1.7` should show 1 and `value_1 > 1.7` should show 0.
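One possible reading of the solution 1 steps, as a minimal base R sketch on the sample data. The names `threshold` and `sol_1` are made up for illustration, and the sketch assumes every group has at least one non-missing `value_2` and no `Inf` values.

```r
# Sample data from the question
Group   <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
value_1 <- c(1.10, 2.5, 1.7, 0.99, 1.50, 1.65, 2.5, 2.5, 1.5)
value_2 <- c(0.03, 1.3, 3.5, 0.02, 4.3, 1.2, 1.4, 1.4, 3.7)
df <- data.frame(Group, value_1, value_2)

# Steps 1-2: per group, take value_1 at the row where value_2 is largest.
# which.max() returns the position of the first maximum and skips NA.
threshold <- sapply(split(df, df$Group),
                    function(g) g$value_1[which.max(g$value_2)])

# Step 3: flag rows whose value_1 is at or below their group's threshold.
# `sol_1` is an illustrative column name, not one from the question.
df$sol_1 <- as.integer(df$value_1 <= threshold[as.character(df$Group)])

print(df)
```

On the sample data this should reproduce the `new_variable_1` column given above. The same per-group expression, `value_1 <= value_1[which.max(value_2)]`, can also be used inside a grouped `dplyr` `mutate()` or a `data.table` `by =` call.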
Solution 2 logic:
Same as above, but increase the threshold value from step 2 by 10%.
1. The max value in `value_2` for group a is 3.5.
2. It corresponds to the value 1.7 of `value_1` in group a.
3. Increase that value by 10%. For group a, a 10% increase gives 1.87.
4. Create `new_variable_1`: for group a, `value_1 <= 1.87` should show 1 and `value_1 > 1.87` should show 0.
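The solution 2 variant might be sketched the same way in base R, scaling the per-group threshold by 1.10 before comparing. The names `threshold_10` and `sol_2` are hypothetical, and the sketch again assumes no `Inf` and at least one non-missing `value_2` per group.

```r
# Sample data from the question
df <- data.frame(
  Group   = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  value_1 = c(1.10, 2.5, 1.7, 0.99, 1.50, 1.65, 2.5, 2.5, 1.5),
  value_2 = c(0.03, 1.3, 3.5, 0.02, 4.3, 1.2, 1.4, 1.4, 3.7)
)

# value_1 at the group's largest value_2, raised by 10%
threshold_10 <- sapply(split(df, df$Group),
                       function(g) g$value_1[which.max(g$value_2)]) * 1.10

# 1 when value_1 is at or below the raised threshold, 0 otherwise
df$sol_2 <- as.integer(df$value_1 <= threshold_10[as.character(df$Group)])

print(threshold_10)
print(df)
```

Note that in group b the raised threshold is 1.5 * 1.10, and one row has `value_1` equal to 1.65, so that row sits right on the boundary and its flag depends on floating-point rounding. If such ties matter, comparing with a small tolerance may be safer.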
Base R, `dplyr`, `data.table` and the most efficient R code are all welcome. It's a large dataset, so groups may have different lengths, and `Inf` or `NA` may exist in `value_2`.
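How `Inf` and `NA` should be treated is not specified above, so the following is only a sketch under one assumption: non-finite `value_2` entries are ignored when locating the maximum, and a group with no finite `value_2` gets `NA`. The helper `threshold_at_max` and the small data frame are made up for illustration.

```r
# Hypothetical helper: value_1 at the largest finite value_2, or NA if none.
threshold_at_max <- function(v1, v2) {
  ok <- is.finite(v2)              # FALSE for NA, NaN, Inf and -Inf
  if (!any(ok)) return(NA_real_)
  v1[ok][which.max(v2[ok])]
}

# Made-up example with unequal group sizes and messy value_2
df <- data.frame(
  Group   = c("a", "a", "a", "b", "b", "c"),
  value_1 = c(1.1, 2.5, 1.7, 0.9, 1.5, 2.0),
  value_2 = c(0.3, Inf, 3.5, NA, 4.3, NA)
)

# One threshold per group, named by group
threshold <- mapply(threshold_at_max,
                    split(df$value_1, df$Group),
                    split(df$value_2, df$Group))

# Rows in a group without any finite value_2 end up as NA
df$new_variable_1 <- as.integer(df$value_1 <= threshold[as.character(df$Group)])

print(df)
```

If `Inf` should instead count as the maximum, dropping the `is.finite()` filter and relying on `which.max()` alone (which skips `NA` but not `Inf`) would be the corresponding change.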