This is the function I have written:

factor_fun <- function(data1,vec){
  
  for(i in vec){
    data1[,i] = as.factor(data1[,i])
  }
}
x = c(3,5,7,8,9,12,15,17,18,19,21)

factor_fun(train_data,x)
str(train_data)

This is the result:

> factor_fun(train_data,x)
> str(train_data)
'data.frame':   13645 obs. of  22 variables:
 $ EmpID                  : int  11041 15079 18638 3941 5936 9670 16554 3301 12236 10157 ...
 $ EmpName                : chr  "John" "William" "James" "Charles" ...
 $ LanguageOfCommunication: chr  "English" "English" "English" "English" ...
 $ Age                    : int  35 26 36 29 25 35 31 32 28 31 ...
 $ Gender                 : chr  "Male" "Male" "Female" "Female" ...
 $ JobProfileIDApplyingFor: chr  "JR85289" "JR87525" "JR87525" "JR87525" ...
 $ HighestDegree          : chr  "B.Tech" "B.Tech" "PhD" "BCA" ...
 $ DegreeBranch           : chr  "Electrical" "Artificial Intelligence" "Computer Science" "Information Technology" ...
 $ GraduatingInstitute    : chr  "Tier 1" "Tier 3" "Tier 1" "Tier 2" ...
 $ LatestDegreeCGPA       : int  7 7 6 5 8 9 7 8 6 8 ...
 $ YearsOfExperince       : int  12 3 6 6 2 12 1 9 2 8 ...
 $ GraduationYear         : int  2009 2018 2015 2015 2019 2009 2020 2012 2019 2013 ...
 $ CurrentCTC             : int  21 15 15 16 24 25 12 7 21 21 ...
 $ ExpectedCTC            : int  26 19 24 24 32 29 21 17 28 31 ...
 $ MartialStatus          : chr  "Married" "Married" "Single" "Married" ...
 $ EmpScore               : int  5 5 5 5 5 4 3 3 4 3 ...
 $ CurrentDesignation     : chr  "SSE" "BA" "SDE" "SDE" ...
 $ CurrentCompanyType     : chr  "Enterprise" "MidSized" "MidSized" "Startup" ...
 $ DepartmentInCompany    : chr  "Design" "Engineering" "Engineering" "Product" ...
 $ TotalLeavesTaken       : int  20 6 19 16 10 10 8 18 7 10 ...
 $ BiasInfluentialFactor  : chr  "YearsOfExperince" "" "Gender" "Gender" ...
 $ FitmentPercent         : num  95.4 67.1 91.3 72.3 86.3 ...

I have written a function that takes a data frame and a vector of column indices and converts the matching columns into factors. When I run it on my dataset, the columns are not converted. Can someone help me with this? I know that I could use lapply or other functions, but I would appreciate an explanation of why this isn't working. Thanks. This is my first question on Stack Overflow.

Phil
Yagami_Light
  • Remove `class()`. Also, please do not post screenshots of your code, but paste it in along with a reproducible example of the data set you are using it on. – Phil Jun 04 '21 at 15:52
  • Hi! Welcome to SO. Please take the time to convert your images into actual code blocks. And it would be best to make the entire thing reproducible. Here is a good intro to making it reproducible: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Dason Jun 04 '21 at 15:52
  • Sorry @Phil, I have edited it now. I added `class()` later to check whether it was converting the columns or not, but even without `class()` it doesn't help. Did you find any mistake? – Yagami_Light Jun 04 '21 at 15:57
  • You need to save the result of the function `train_data <- factor_fun(train_data,x)` – Phil Jun 04 '21 at 16:08
  • @Phil The existing function returns NULL. – ktiu Jun 04 '21 at 16:10
  • @ktiu Yes, because the OP removed the `return()` line in an edit after my comment. – Phil Jun 04 '21 at 16:12
  • @Phil Thank you, it worked. It was a silly mistake on my side. – Yagami_Light Jun 04 '21 at 16:18

1 Answer


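It looks like you are expecting the function to change the object that you pass to it in the calling environment. R does not work that way: arguments behave as if they were passed by value (copy-on-modify), so the assignment to data1 inside the function changes only a local copy and train_data is left untouched. In addition, because the last expression in your function is the for loop, the function returns NULL.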
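To see this in isolation, here is a minimal sketch using a made-up toy data frame `toy` and a hypothetical helper `to_factor_local` (neither comes from the question). It has the same shape as the original function: it modifies its argument and returns nothing useful.

```r
# Toy data frame (hypothetical, not the asker's train_data)
toy <- data.frame(id = 1:3, grp = c("a", "b", "a"), stringsAsFactors = FALSE)

# Same structure as the function in the question
to_factor_local <- function(data1, vec) {
  for (i in vec) {
    data1[, i] <- as.factor(data1[, i])  # changes only the function's local copy
  }
  # no explicit return: the value of the for loop (NULL) is returned invisibly
}

result <- to_factor_local(toy, 2)

is.null(result)     # TRUE: the function returned NULL
is.factor(toy$grp)  # FALSE: the caller's data frame is unchanged
class(toy$grp)      # "character"
```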

One workaround would be to return data1 at the end of your function and assign it when called:

factor_fun <- function(data1,vec){
  for(i in vec){
    data1[,i] <- as.factor(data1[,i])
  }
  return(data1)
}

new_df <- factor_fun(df, 1:2)

Better yet, you could skip the for loop altogether, e.g. with the dplyr package:

factor_fun <- function(data, cols) {
  # Namespace the helpers so this also works without library(dplyr)
  dplyr::mutate(data, dplyr::across(dplyr::all_of(cols), as.factor))
}

new_df <- factor_fun(df, 1:2)
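
Since the question mentions lapply, a base R version without a loop might look like the sketch below. The function name `factor_fun_base` and the toy data frame are made up for illustration; `cols` is assumed to hold column positions or names.

```r
# Hypothetical base R variant: convert the selected columns in one step
factor_fun_base <- function(data, cols) {
  data[cols] <- lapply(data[cols], as.factor)  # list-style indexing keeps it a data frame
  data                                         # return the modified copy
}

# Toy data (not the asker's train_data)
toy <- data.frame(id  = 1:3,
                  grp = c("a", "b", "a"),
                  lvl = c("x", "y", "x"),
                  stringsAsFactors = FALSE)

toy <- factor_fun_base(toy, c(2, 3))  # reassign to keep the result
sapply(toy, class)
```

With the data from the question, the equivalent call would be `train_data <- factor_fun_base(train_data, x)`.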
ktiu