Being a beginner in R, I need some help:
I have already written a function, call it fun(a, b, c), which returns a value, say d. Here a, b and c are values from columns in my dataset of 4 million records. The function applies some logic and returns d, which I then want to add to my dataset as a new column.
Could someone please help me with the syntax for: 1. calling a function with multiple arguments on a dataset, 2. adding the new values in d to my dataset, 3. doing this efficiently enough to handle 4 million records?
Thanks in advance.
Please see the code below:
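# hybrid FUNCTION
# Column names are passed in as strings, so columns are accessed with [[ ]] rather than $.
hybridfun <- function(df, lookup, df_year, df_name, df_id, lup_year, lup_name, lup_id_digit, lup_id_letter) {
  # Set the default label once, before the loop, so earlier matches are not overwritten
  df$new <- "NOT_SURE"
  for (i in seq_len(nrow(lookup))) {
    pos <- lookup[[lup_id_digit]][i]
    # Rows of df that match lookup row i on year, name and the id character at position pos
    hit <- df[[df_year]] == lookup[[lup_year]][i] &
      df[[df_name]] == lookup[[lup_name]][i] &
      substring(df[[df_id]], pos, pos) == lookup[[lup_id_letter]][i]
    df$new[which(hit)] <- "HYBRID"
  }
  df  # return the data frame with the new column added
}
data <- hybridfun(data, lookup, "data_year", "data_name", "data_id", "lookup_year", "lookup_name", "lookup_id_digit", "lookup_id_letter")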
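To make the goal concrete, here is a minimal sketch of a loop-free version on toy data. The values and the helper column row_id are made up for illustration; only the column names come from the call above. It assumes base R's merge() is acceptable: merge() pairs each record with the lookup rows that share its year and name, and the id character is then compared for all pairs at once. I have not checked how this behaves on 4 million records.

```r
# Toy data: made-up values, column names as in the call above
data <- data.frame(
  data_year = c(2010, 2010, 2011, 2012),
  data_name = c("CIVIC", "CIVIC", "PRIUS", "CAMRY"),
  data_id   = c("ABC3X", "ABC1X", "ZZZ9Q", "QQQ2W"),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  lookup_year      = c(2010, 2011),
  lookup_name      = c("CIVIC", "PRIUS"),
  lookup_id_digit  = c(4, 5),
  lookup_id_letter = c("3", "Q"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column so matches can be traced back to the original rows
data$row_id <- seq_len(nrow(data))

# Inner join: one row per (record, lookup row) pair that agrees on year and name
cand <- merge(data, lookup,
              by.x = c("data_year", "data_name"),
              by.y = c("lookup_year", "lookup_name"))

# substring() is vectorised over its position arguments, so each pair is
# checked against its own lookup position
is_hit <- substring(cand$data_id, cand$lookup_id_digit, cand$lookup_id_digit) ==
  cand$lookup_id_letter

# Label every record that has at least one matching lookup row
data$new <- ifelse(data$row_id %in% cand$row_id[which(is_hit)], "HYBRID", "NOT_SURE")
data$row_id <- NULL

print(data)
```

With this toy input, rows 1 and 3 should end up as "HYBRID" and rows 2 and 4 as "NOT_SURE".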