column19 <- 19
mdf[,column19] <- lapply(mdf[,column19],function(x){as.numeric(gsub(",", "", x))})
This snippet works, but it also results in duplicate values.
If there is only a single column, we don't need lapply:
mdf[, column19] <- as.numeric(gsub(",", "", mdf[, column19], fixed = TRUE))
The OP's code didn't work as expected because lapply is applied to a single column after it has been converted to a vector (mdf[, column19]). It loops through each individual element of that column and returns a list with one element per row. We are then assigning that whole list back to the single column, so R warns and the column is filled with repeated values:
column19 <- 19
mdf[,column19] <- lapply(mdf[,column19],function(x) as.numeric(gsub(",", "", x)))
Warning message:
In `[<-.data.frame`(`*tmp*`, , column19, value = list(27, 49, 510,  :
  provided 5 variables to replace 1 variables
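The mismatch behind that warning can be checked on a small case. This is only a sketch: the data frame df and its values are made up for illustration and are not the OP's data.

```r
# Hypothetical one-column data frame (illustration only).
df <- data.frame(a = c("2,7", "4,9", "5,10"), stringsAsFactors = FALSE)

vec <- df[, 1]   # drop = TRUE by default, so this is a plain character vector

# lapply visits each element of the vector separately
res <- lapply(vec, function(x) as.numeric(gsub(",", "", x)))

class(res)    # "list"
length(res)   # 3: one list element per row, not one per column
```

A list of that length is treated as that many replacement columns, which is why assigning it to one column triggers the "provided ... variables to replace 1 variables" warning.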
Instead, if we want to use the same procedure, keep the data.frame structure with either mdf[column19] or mdf[, column19, drop = FALSE] and then loop using lapply. This way, lapply sees a list with a single vector and returns a list of length one:
mdf[column19] <- lapply(mdf[column19],function(x) as.numeric(gsub(",", "", x)))
This is related to the dropping of dimensions when [ is used to select a single column (or, for a matrix, a single row), because the default is drop = TRUE.
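The effect of drop can be seen by comparing the three ways of selecting one column. The following is a minimal sketch on a made-up two-column data frame (df, a and b are hypothetical names, not from the question):

```r
# Hypothetical data frame (illustration only).
df <- data.frame(a = c("2,7", "4,9"), b = c("1,6", "3,8"),
                 stringsAsFactors = FALSE)

v  <- df[, 1]                 # dimensions dropped: character vector
d1 <- df[1]                   # list-style indexing: still a data.frame
d2 <- df[, 1, drop = FALSE]   # explicit drop = FALSE: still a data.frame

# lapply over a one-column data.frame returns a list of length one,
# which matches the single column on the left-hand side
df[1] <- lapply(df[1], function(x) as.numeric(gsub(",", "", x)))

str(df)   # column a is now numeric, column b is still character
```

Because the right-hand side has exactly one element, the assignment replaces the column without any warning.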
set.seed(24)
mdf <- as.data.frame(matrix(sample(paste(1:5, 6:10, sep=","),
5*20, replace = TRUE), 5, 20), stringsAsFactors=FALSE)