I am trying to build a predictive classification model using binary logistic regression and penalized (LASSO) logistic regression, and will eventually compare the two models. Before fitting them, I want to understand the data better and run some checks, such as a multicollinearity test, but the data types are being converted incorrectly.
The data set consists of both numeric and factor variables. I imported the data into R from a CSV file, and before importing I manually changed all the variables that were "factors" to "numeric". I then selected the specific columns I want from the whole data set. After this step I expected as.matrix to give a numeric matrix, but this is not the case.
Data<- read.csv("Test.csv")
names(Data)
attach(Data)
dim(Data)
sapply(Data,class)
ChurnFlag <- ifelse(ChurnedFlag=="Y",1,0)
#combining all the newly created variables
DataMat <- as.matrix(cbind(Data,ChurnFlag))
#selecting specifically which variables I want to analyse, which are all numeric/integer
DataMatRed <- as.matrix((DataMat[,c(4:8,10:73,92)]))
DataMatRedNum <- mapply(DataMatRed,FUN=as.numeric)
#defining the matrix as numeric
is.numeric(DataMatRedNum) #checking that it is numeric
DataMatDF <- as.data.frame(DataMatRed)
DataMatDF2 <- data.frame(DataMatRed,row.names = NULL,check.rows = FALSE,check.names = TRUE)
I expect to have a numeric matrix, not a character one, because when I try to run the colldiag
function in R it does not work and gives the following error:
Error in svd(X) : infinite or missing values in 'x'
I have checked for missing values, and there are none.
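A minimal sketch of what may be going on, using a made-up three-row data frame. The `toy` data frame and its column names are hypothetical stand-ins, not the actual Test.csv. It assumes the original data still contains at least one non-numeric column (such as the Y/N flag) at the point where as.matrix is called on the whole data frame:

```r
# Toy data frame (hypothetical): two numeric columns plus a Y/N flag
toy <- data.frame(
  tenure      = c(12, 5, 30),
  spend       = c(100.5, 80.0, 42.3),
  ChurnedFlag = c("Y", "N", "Y"),
  stringsAsFactors = FALSE
)
toy$ChurnFlag <- ifelse(toy$ChurnedFlag == "Y", 1, 0)

# as.matrix() on a data frame with any non-numeric column gives a character matrix
m_all <- as.matrix(toy)
mode_all <- mode(m_all)

# Subsetting columns of that matrix afterwards does not change its type
m_sub <- m_all[, c("tenure", "spend", "ChurnFlag")]
mode_sub <- mode(m_sub)

# is.finite() is FALSE for every element of a character matrix, which is
# one way to hit "infinite or missing values" without any NA in the data
none_finite <- !any(is.finite(m_sub))

# Selecting the numeric columns from the data frame first keeps the matrix numeric
num_cols <- sapply(toy, is.numeric)
m_num <- as.matrix(toy[, num_cols])
mode_num <- mode(m_num)

# Count non-finite cells (NA, NaN, Inf) rather than only NAs
n_bad <- sum(!is.finite(m_num))
```

If this matches the real data, selecting the numeric columns from the data frame before calling as.matrix, instead of subsetting the already converted matrix, would be one way to keep the result numeric.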