I have a large data set with 15+ columns. Two of those columns are labeled `BASC_7y` and `BASC_12y`. I need to bin my data based on the following conditions:

BASC_7y>=60 & BASC_12y>=60
BASC_7y>=60 & BASC_12y<60
BASC_7y<60 & BASC_12y>=60
BASC_7y<60 & BASC_12y<60

I realized I can't just make these conditions into categorical variables because there is data in other columns that I need for each of these conditions.

I also want to rename these bins as "High Persistent", "Decreased Threshold", "Increased Threshold", and "Low Persistent".

I ultimately need to put this into a regression, but how do I bin this data and create named labels for each bin?

  • There are multiple ways to accomplish this. What have you tried and where are you getting stuck? It'd be helpful if you can post example data, expected output, and code for whatever attempts you've started already. – andrew_reece Mar 13 '23 at 20:54
  • See `?dplyr::case_when` for a friendly approach. – Gregor Thomas Mar 13 '23 at 20:55
  • Please take the tour: https://stackoverflow.com/tour And provide a minimum, reproducible example: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – John Polo Mar 13 '23 at 21:58
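
Following the `case_when` hint in the comments, here is a minimal sketch of one way to do the binning. It assumes `dplyr` is installed. The data frame `df`, the `id` and `outcome` columns, the values in them, and the new column name `BASC_group` are all hypothetical stand-ins, since no example data was posted. The label is added as a new column, so every other column stays attached to each row.

```r
library(dplyr)

# Hypothetical example data; `id` and `outcome` stand in for the other columns.
df <- data.frame(
  id       = 1:6,
  BASC_7y  = c(65, 72, 48, 55, 60, NA),
  BASC_12y = c(61, 50, 70, 40, 59, 62),
  outcome  = c(3.1, 2.4, 2.9, 1.8, 2.2, 2.7)
)

# Labels from the question, in the order of the four conditions.
group_levels <- c("High Persistent", "Decreased Threshold",
                  "Increased Threshold", "Low Persistent")

df <- df %>%
  mutate(
    # Rows matching none of the conditions (e.g. a missing score) become NA.
    BASC_group = case_when(
      BASC_7y >= 60 & BASC_12y >= 60 ~ "High Persistent",
      BASC_7y >= 60 & BASC_12y <  60 ~ "Decreased Threshold",
      BASC_7y <  60 & BASC_12y >= 60 ~ "Increased Threshold",
      BASC_7y <  60 & BASC_12y <  60 ~ "Low Persistent"
    ),
    # A factor with explicit levels makes "High Persistent" the reference group.
    BASC_group = factor(BASC_group, levels = group_levels)
  )

# Count rows per bin, including any unclassified (NA) rows.
table(df$BASC_group, useNA = "ifany")

# The factor can then go straight into a model formula (hypothetical outcome).
fit <- lm(outcome ~ BASC_group, data = df)
```

Because `BASC_group` is a factor, `lm()` dummy-codes it automatically, and rows with an `NA` group are dropped from the fit by default. If a different reference category is wanted, reordering `group_levels` should be enough.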

0 Answers