My dataset contains 500 observations. Here is an example of the data structure:

df <- data.frame(rating_mean=c(3.6, 4.0, 3.7, 4.8, 3.9, 5.1, 4.1, 4.3 ),
             actual_truth=c("true", "false", "false", "true", "true", "false", "false", "true"))

I would like to return the 60 items with a rating_mean closest to the value of 3.5 for "true" stimuli, and the same for "false" stimuli (so a total of 120 items). So far I have this but it's not correct:

df50 <- df %>%   group_by(actual_truth) %>%   top_n(n = 60, wt = rating_mean - 3.5)

Thank you.

r2evans
Emma
  • `top_n(n=50, wt=abs(rating_mean-3.5))`. But if you want 60 items, why `n=50`? If this doesn't work, you'll need to be less vague by *"it's not correct"* (by including your expected output given this sample input). – r2evans Nov 04 '19 at 18:07
  • The expected output is a list of the 60 "true" stimuli with a rating_mean closest to 3.5, and the same for the "false" stimuli. – Emma Nov 04 '19 at 18:09
  • Emma, perhaps you missed my point about *"given this sample input"*. Please **literally** provide the expected output as a `data.frame` given these 8 rows and (say) `n=2` or something. It might be informative to read about fully-reproducible, MWE questions; I suggest at least one of https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. – r2evans Nov 04 '19 at 18:13
  • This may be moot given my suggestion (in my first comment) and/or @akrun's answer, in which case please keep this in mind for your next question. – r2evans Nov 04 '19 at 18:14

1 Answer

One option is to arrange by 'actual_truth' and by the absolute difference between 'rating_mean' and the target value (3.5), then group by 'actual_truth' and slice the first 60 observations of each group:

library(dplyr)
df %>% 
   arrange(actual_truth, abs(rating_mean - 3.5)) %>% 
   group_by(actual_truth) %>%
   slice(seq_len(60))
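
The attempt in the question fails because `top_n()` keeps the rows with the largest `wt`, so `rating_mean - 3.5` selects the highest ratings rather than the ones nearest 3.5. As a quick check, here is a minimal sketch that runs the same approach on the 8-row sample from the question. The names `target`, `n_keep`, `closest`, and `closest_alt` are made up for illustration, and `n_keep = 2` stands in for the 60 used on the full data. The second pipeline is an alternative, not part of the original answer, and assumes dplyr 1.0.0 or later, where `slice_min()` is available.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  rating_mean  = c(3.6, 4.0, 3.7, 4.8, 3.9, 5.1, 4.1, 4.3),
  actual_truth = c("true", "false", "false", "true", "true", "false", "false", "true")
)

target <- 3.5  # value the ratings should be close to (from the question)
n_keep <- 2    # hypothetical; use 60 on the full 500-row data

# Same approach as above: sort by distance to the target within each
# group, then keep the first n_keep rows per group
closest <- df %>%
  arrange(actual_truth, abs(rating_mean - target)) %>%
  group_by(actual_truth) %>%
  slice(seq_len(n_keep)) %>%
  ungroup()

# Alternative (assumes dplyr >= 1.0.0): pick the rows with the smallest
# distance directly; with_ties = FALSE caps each group at n_keep rows
closest_alt <- df %>%
  group_by(actual_truth) %>%
  slice_min(abs(rating_mean - target), n = n_keep, with_ties = FALSE) %>%
  ungroup()

print(closest)
```

On the sample data, both pipelines keep 3.7 and 4.0 for "false" and 3.6 and 3.9 for "true". If a group has fewer than `n_keep` rows, both simply return all rows of that group.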
akrun