0

Thanks in advance to the community for the wealth of knowledge you all provide.

I wrote the below function to calculate a point total for men as done in the Framingham Heart Study (total points used to determine a 10 yr risk score). What I'm trying to do now is apply this function to a data frame with a set of test patients to get the point total for each individual (i.e. by row). I'm assuming I need a for loop as I've tried apply(), mapply(), etc. and I'm not having any luck with that approach. Any thoughts on how to get the function to pull in the individual values for each attribute into the function by patient? I'm new to coding so I apologize if this is a very ignorant question (I tried searching with very limited success).

risk_men <- function(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol)
{
    tot_points = 0
    
    # Adding points based off of total cholesterol
    if(total_cholesterol < 160) tot_points = tot_points + 0
    else if(total_cholesterol >= 160 & total_cholesterol < 199) tot_points = tot_points + 1
    else if(total_cholesterol >= 200 & total_cholesterol < 239) tot_points = tot_points + 2
    else if(total_cholesterol >= 240 & total_cholesterol < 279) tot_points = tot_points + 3
    else if(total_cholesterol >= 280) tot_points = tot_points + 4
    
    # Adding or subtracting points based off of age
    if(age >= 30 & age < 34) tot_points = tot_points + 0
    else if(age >= 35 & age < 39) tot_points = tot_points + 2
    else if(age >= 40 & age < 44) tot_points = tot_points + 5
    else if(age >= 45 & age < 49) tot_points = tot_points + 6
    else if(age >= 50 & age < 54) tot_points = tot_points + 8
    else if(age >= 55 & age < 59) tot_points = tot_points + 10
    else if(age >= 60 & age < 64) tot_points = tot_points + 11
    else if(age >= 65 & age < 69) tot_points = tot_points + 12
    else if(age >= 70 & age < 74) tot_points = tot_points + 14
    else if(age >= 75) tot_points = tot_points + 15
    
    # Adding points for systolic blood pressure stratified by whether or not pt on bp meds
    if(taking_bp_medication == 0)
    {
        if(systolic_bp < 120) tot_points = tot_points - 2
        else if(systolic_bp >= 120 & systolic_bp < 129) tot_points = tot_points + 0
        else if(systolic_bp >= 130 & systolic_bp < 139) tot_points = tot_points + 1
        else if(systolic_bp >= 140 & systolic_bp < 159) tot_points = tot_points + 2
        else if(systolic_bp >= 160) tot_points = tot_points + 3
    } else if(taking_bp_medication == 1)
    {
        if(systolic_bp < 120) tot_points = tot_points + 0
        else if(systolic_bp >= 120 & systolic_bp < 129) tot_points = tot_points + 2
        else if(systolic_bp >= 130 & systolic_bp < 139) tot_points = tot_points + 3
        else if(systolic_bp >= 140 & systolic_bp < 159) tot_points = tot_points + 4
        else if(systolic_bp >= 160) tot_points = tot_points + 5
    }

    # Adding points for smoking
    if(smoke == 0) tot_points = tot_points + 0
    else if(smoke == 1) tot_points = tot_points + 4
    
    # Adding points for diabetes
    if(diabetes == 0) tot_points = tot_points + 0
    else if(diabetes == 1) tot_points = tot_points + 3
    
    # Adding or subtracting points based on HDL cholesterol levels
    if(hdl_cholesterol >= 60) tot_points = tot_points - 2
    else if(hdl_cholesterol >= 50 & hdl_cholesterol < 59) tot_points = tot_points - 1
    else if(hdl_cholesterol >= 45 & hdl_cholesterol < 49) tot_points = tot_points + 0
    else if(hdl_cholesterol >= 35 & hdl_cholesterol < 44) tot_points = tot_points + 1
    else if(hdl_cholesterol < 35) tot_points = tot_points + 2
    
    return(tot_points)
}

The patients data frame with the test patients is shown here:

id sex total_cholesterol age systolic_bp diastolic_bp smoke bmi diabetes taking_bp_medication hdl_cholesterol ldl_cholesterol
1 2 323 39 131.5 85 1 24.79 0 0 NA NA
2 1 264 49 127.5 81 0 25.16 0 0 68 152
3 1 200 57 117.5 80 0 25.41 0 0 NA NA
4 1 260 41 137.5 80 1 26.89 0 0 NA NA
5 2 312 62 162.5 93.5 0 25.33 0 1 NA NA
6 1 260 41 120 72.5 1 26.36 0 0 46 221
animuson
  • 53,861
  • 28
  • 137
  • 147
  • You should use vectorized functions, not 1-value-at-a-time functions. `ifelse` is a vectorized `if() {} else{}`, but for binning numeric values you just need the `cut` function. Check out the [FAQ on binning values](https://stackoverflow.com/q/5570293/903061https://stackoverflow.com/q/5570293/903061). – Gregor Thomas Mar 08 '21 at 21:56

2 Answers2

1

Try something like this:

risk_men_vectorized <- function(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol)
{
   
    chol_points = cut(total_cholesterol, breaks = c(0, 160, 200, 240, 280, Inf), labels = 0:4, right FALSE)
    age_points = cut(age, breaks = c(0, seq(35, 70, by = 5), Inf), labels = c(0, 2, 5, 6, 8, 10, 11, 12, 14, 15), right = FALSE)
    
    # Adding points for systolic blood pressure stratified by whether or not pt on bp meds
    if(taking_bp_medication == 0) {
        bp_points = cut(systolic_bp, breaks = c(0, 120, 130, 140, 160, Inf), labels = c(-2, 0, 1, 2, 3), right = FALSE)
    } else if(taking_bp_medication == 1) {
        bp_points = cut(systolic_bp, breaks = c(0, 120, 130, 140, 160, Inf), labels = c(0, 2, 3, 4, 5), right = FALSE)
    }

    hdl_points = cut(hdl_cholesterol , breaks = c(0, 35, 45, 50, 60, Inf), labels = c(2, 1, 0, -1, -2), right FALSE)

    ## cut returns factors so we need to convert to integer before adding
    tot_points = as.integer(as.character(chol_points)) +
        as.integer(as.character(age_points)) +
        as.integer(as.character(bp_points)) +|
        as.integer(as.character(hdl_points)) +|    

    # Adding points for smoking
    ## we don't need an if() if the action is "do nothing" (like add 0)
    if(smoke == 1) tot_points = tot_points + 4
    
    # Adding points for diabetes
    if(diabetes == 1) tot_points = tot_points + 3
    
    return(tot_points)
}

This function is vectorized, so you can call it with:

your_data$points = with(your_data, risk_men_vectorized(total_cholesterol, age, taking_bp_medication, systolic_bp, smoke, diabetes, hdl_cholesterol))
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
  • 1
    Tthis is amazing! My coding world is so small lol... thank you so much for showing me this technique. The only thing I'd still like to figure out is managing NAs. A few of my patients have unrecorded values for certain columns (i.e. hdl_cholesterol) so no point total is calculated. That seems appropriate to me though since it'd be misleading to calculate the point total without factoring in all variables. Thanks Gregor! I really appreciate the help :D – Kareem Abdalla Mar 08 '21 at 22:57
0

Are you comfortable that your function's arguments will always match the column names of your data.frame?

If so rowwise might be your friend (I would also add triple dots ... to your function's signature in case things change in the future)

your.data %>% rowwise() %>% do.call( what=risk_men )

this would iterate over your data.frame row by row, giving each row (which incidently also is a list) do do.call which takes it as named arguments to the function risk_men. Which should be what you ask for.

Sirius
  • 5,224
  • 2
  • 14
  • 21
  • Oh I like this approach! Thanks for the recommendation. And yes, the function arguments will always match the column names. When I try it on the patients data frame though I end up getting an unused arguments error. Any thoughts on why that may be the case? patients %>% rowwise() %>% do.call(what = risk_men) Error in (function (total_cholesterol, age, taking_bp_medication, systolic_bp, : unused arguments (id = c(1, 2, 3, 4, 5, 6), sex = c(2, 1, 1, 1, 2, 1)... – Kareem Abdalla Mar 08 '21 at 22:31
  • Yes, see my second paragraph. there are columns in the data.frame that are not part of the argument set of the function. Add dots, and they safely go away to nowhere. So `risk_men <- function( , ... )` <-- three dots as a final argument. This will allow for additinoally provided unspecified arguments, that you may or may not make use of (_may not_ in this case) – Sirius Mar 08 '21 at 22:59