I'm trying to perform a find and replace using sub(), and apply it over multiple columns.
My dataset looks similar to this:
> mydata
  col1      col2     col3        col4
1    1     $1.40    $5.39      $23.42
2    2   $(2.40) $(38.29) $(1,239.30)
3    3 $1,302.00  $102.32      $23.10
with several numerical fields expressed in traditional accounting formatting.
I have tried writing the following function to swap out the parenthesized negatives, the thousands separators, and the dollar signs.
find_replace <- function(df, cols){
  df[, cols] <- sub('\\,','',df[, cols])
  df[, cols] <- sub('\\$','',df[, cols])
  df[, cols] <- sub('\\-','',df[, cols])
  df[, cols] <- sub('\\(','-',df[, cols])
  df[, cols] <- sub('\\)','',df[, cols])
  df[, cols] <- as.numeric(df[, cols])
}
mydata[,2:4] <- lapply(mydata[,2:4], find_replace(mydata, 2:4))
...but I keep receiving the following error when I test it on the data frame above:
Error in match.fun(FUN) :
'find_replace(mydata, 2:4)' is not a function, character or symbol
And when I try running it over my actual dataset (applying it over 6 columns and approximately 4.8 million rows), it hangs and I have to stop the operation before I get the error, but I would imagine it's the same.
Any suggestions for an efficient way to end up with the following, where all fields are numeric? I have also tried using the colClasses argument with a setClass function when reading in the CSV, similar to this approach, but without success.
> mydata
  col1    col2   col3     col4
1    1    1.40   5.39    23.42
2    2   -2.40 -38.29 -1239.30
3    3 1302.00 102.32    23.10
Thank you in advance!
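For what it's worth, the error message points at the lapply() call: its second argument has to be a function, but find_replace(mydata, 2:4) is a call that gets evaluated before lapply() ever sees it. Separately, sub() replaces only the first match in each string, while gsub() replaces all of them. Below is a minimal sketch of a per-column cleaner, assuming the columns are character (or factor) vectors. clean_accounting is a hypothetical name used for illustration, not something from the code above.

```r
# Hypothetical helper: convert one vector of accounting-formatted strings
# (e.g. "$(1,239.30)") to numeric.
clean_accounting <- function(x) {
  x <- as.character(x)                  # also copes with factor columns
  x <- gsub("[$,)]", "", x)             # drop dollar signs, commas, closing parens
  x <- sub("(", "-", x, fixed = TRUE)   # an opening paren marks a negative value
  as.numeric(x)
}

# Small stand-in for the data shown in the question
mydata <- data.frame(
  col1 = 1:3,
  col2 = c("$1.40", "$(2.40)", "$1,302.00"),
  col3 = c("$5.39", "$(38.29)", "$102.32"),
  col4 = c("$23.42", "$(1,239.30)", "$23.10"),
  stringsAsFactors = FALSE
)

# Pass the function itself to lapply(), not the result of calling it
mydata[, 2:4] <- lapply(mydata[, 2:4], clean_accounting)
str(mydata)
```

Because the helper works on one vector at a time, lapply() can hand it each column in turn. Whether this is fast enough on 4.8 million rows is untested here.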
Edit: trying the setClass option again, and using the regex from @waterling:
setClass("acntngFmt")
# [1] "acntngFmt"
setAs("character", "acntngFmt",
function(from) as.numeric(gsub("(?![.])[[:punct:]]", "", col, perl=TRUE, from)))
Input <- "A, B, C
$1.40, $(2.40), $1,302.00
$5.39, $(38.29), $102.32
$23.42, $(1,239.30), $23.10"
DF <- read.csv(textConnection(Input), header = TRUE,
colClasses = c("acntngFmt", "acntngFmt", "acntngFmt"))
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
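The 'closure' in this error is most likely col: it sits in the third positional slot of gsub(), which is the x argument, and col is a base R function. The strings to convert arrive in from. Note also that [[:punct:]] matches parentheses, so that pattern would strip the negative markers along with the dollar signs. Below is a sketch of the same setAs idea under those assumptions. It swaps in a different pattern and quotes the fields so that the thousands separators are not read as column delimiters; both changes are mine, not part of the original attempt.

```r
library(methods)  # setClass() and setAs() are defined here

setClass("acntngFmt")

# Coercion from character: the raw strings arrive in `from`
setAs("character", "acntngFmt", function(from) {
  x <- gsub("[$,)]", "", from)                # drop $, commas, closing parens
  as.numeric(sub("(", "-", x, fixed = TRUE))  # "(" becomes a minus sign
})

# Fields are quoted so that "1,302.00" stays in a single column
Input <- 'A,B,C
"$1.40","$(2.40)","$1,302.00"
"$5.39","$(38.29)","$102.32"
"$23.42","$(1,239.30)","$23.10"'

DF <- read.csv(text = Input, header = TRUE,
               colClasses = c("acntngFmt", "acntngFmt", "acntngFmt"))
str(DF)
```

If the real CSV does not quote these fields, the embedded commas would still split the values on read, so this route depends on how the file was written.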