I'm trying to write a function that wraps the R tidymodels function initial_split, with an argument that lets me stratify on a different variable each time I call it.
Calling initial_split directly like this works perfectly:
split_glab <- initial_split(data, prop = 0.7, strata = sp_glabrata)
Then I wrapped it in a function and passed the column in through a species parameter:
split_data <- function(df, species) {
  initial_split(df, prop = 0.7, strata = species)
}
split_data(data, species = sp_glabrata)
But I get the following error:
Error: Can't subset columns that don't exist.
x Column `species` doesn't exist.
Of course, this column doesn't exist in my data, since species is just an argument of my function; the column I'm trying to reference is called sp_glabrata. I can't figure out how to get my function to use the column I pass in rather than the parameter name. I don't want to hard-code the column name, since I have to apply many similar functions to several columns and it would take forever.
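In case it helps, here is a minimal sketch that should reproduce the setup. The toy data frame and its elevation column are hypothetical stand-ins for my real data; only sp_glabrata matches an actual column name. The failing call is wrapped in try() so the script runs to the end.

```r
library(rsample)  # provides initial_split()

# Hypothetical toy data standing in for my real data frame
set.seed(123)
data <- data.frame(
  sp_glabrata = factor(rep(c("absent", "present"), each = 50)),
  elevation   = runif(100, min = 0, max = 1000)  # made-up predictor
)

# Direct call with the bare column name: works
split_glab <- initial_split(data, prop = 0.7, strata = sp_glabrata)

# Same call wrapped in a function: fails
split_data <- function(df, species) {
  initial_split(df, prop = 0.7, strata = species)
}

# try() captures the error instead of stopping the script
result <- try(split_data(data, species = sp_glabrata), silent = TRUE)
failed <- inherits(result, "try-error")
```

Here failed records whether the wrapped call errored, and the exact error message may differ between package versions.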
Any guidance would be appreciated!