You can use gsub to strip everything that isn't part of a number: match any character other than [:digit:], a period, or a minus sign, and replace it with nothing. Pass the result through as.numeric, and wrap it all in sapply to get it back as a matrix:
sapply(data, function(x) as.numeric(gsub("[^[:digit:].-]","",x)))
          x    y     z       q
[1,]  25.00  756  1000 1500000
[2,] -10.00 6710  5000  300000
[3,]   9.11  723  1500  500000
[4,]   1.37  659 27000 3000000
(If you just call gsub on the whole data frame, without sapply, it coerces the input with as.character and you get back a single string of digits per column rather than separate values. There may be a better way to avoid that, but I'm not sure what it is.)
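To make this reproducible, here is a minimal sketch. The original data frame isn't shown, so the input values below are assumptions chosen to match the printed output, and the helper name clean is hypothetical. It also shows one possible way to keep a data frame instead of a matrix, by assigning the result of lapply back into the columns:

```r
# Hypothetical input: the real data is not shown, so these strings are
# assumptions picked to reproduce the output above.
data <- data.frame(
  x = c("25.00%", "-10.00%", "9.11%", "1.37%"),
  y = c("75'6''", "67'10''", "72'3''", "65'9''"),
  z = c("1,000", "5,000", "1,500", "27,000"),
  q = c("$1,500,000", "$300,000", "$500,000", "$3,000,000"),
  stringsAsFactors = FALSE
)

# Drop every character that is not a digit, period, or minus sign.
clean <- function(x) as.numeric(gsub("[^[:digit:].-]", "", x))

# sapply works column by column and returns a numeric matrix.
m <- sapply(data, clean)

# Assigning into cleaned[] keeps the data frame structure and column names.
cleaned <- data
cleaned[] <- lapply(data, clean)
```

Whether the matrix or the data frame is more convenient depends on what you do with the numbers next.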
Following suggestions from Gregor, here's a variant of this solution that replaces the foot-inch format with a decimal point for better readability:
sapply(data, function(x) {
  # turn feet'inches'' into feet.inches before stripping other characters
  x <- gsub("'(\\d*)''", ".\\1", x)
  as.numeric(gsub("[^[:digit:].-]", "", x))
})
          x    y     z       q
[1,]  25.00 75.6  1000 1500000
[2,] -10.00 67.1  5000  300000
[3,]   9.11 72.3  1500  500000
[4,]   1.37 65.9 27000 3000000
(Note that in my data the inch symbol was replaced with '' (two apostrophes); you'll need to change the pattern to match whatever your data has there.)
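If you aren't sure which inch mark your data uses, a sketch like the following accepts either two apostrophes or a double quote. The function name feet_inches_to_decimal is hypothetical, and the assumption that these are the only two inch marks is mine:

```r
# Assumption: the inch mark is either '' (two apostrophes) or " (double quote).
feet_inches_to_decimal <- function(x) {
  # replace the foot mark with a period and drop the inch mark
  x <- gsub("'(\\d*)(''|\")", ".\\1", x)
  # then strip anything that is not a digit, period, or minus sign
  as.numeric(gsub("[^[:digit:].-]", "", x))
}

res <- feet_inches_to_decimal(c("75'6''", "67'10\"", "1,500"))
```

Keep in mind that the result is only a readable label, not a real decimal length: 67'10'' and 67'1'' both come out as 67.1.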
One last option converts feet and inches to centimetres (1 ft = 30.48 cm, 1 in = 2.54 cm), which gives a true decimal value:
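sapply(data, function(x) {
  if (any(grepl("'", x))) {
    # split each value into feet and inches, then convert both to cm
    inches <- strsplit(as.character(x), split = "'", fixed = TRUE)
    x <- unlist(lapply(inches, function(y) as.numeric(y[1]) * 30.48 + as.numeric(y[2]) * 2.54))
  }
  as.numeric(gsub("[^[:digit:].-]", "", x))
})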
          x       y     z       q
[1,]  25.00 2301.24  1000 1500000
[2,] -10.00 2067.56  5000  300000
[3,]   9.11 2202.18  1500  500000
[4,]   1.37 2004.06 27000 3000000
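The conversion above treats a whole column as feet and inches as soon as any value contains an apostrophe. If a column might mix formats, a per-element version is one possible alternative. This is only a sketch: to_cm is a hypothetical helper name, and it assumes values look like 75'6'' (feet, apostrophe, inches):

```r
# Hypothetical helper: convert feet'inches'' strings to centimetres and
# leave other values to the plain digit-stripping cleanup.
to_cm <- function(x) {
  x <- as.character(x)  # also copes with factor columns
  out <- as.numeric(gsub("[^[:digit:].-]", "", x))

  # only the elements that contain a foot mark are converted
  has_feet <- grepl("'", x, fixed = TRUE)
  parts <- strsplit(x[has_feet], "'", fixed = TRUE)
  feet <- as.numeric(vapply(parts, `[`, character(1), 1))
  inch <- as.numeric(vapply(parts, `[`, character(1), 2))

  out[has_feet] <- feet * 30.48 + inch * 2.54
  out
}

heights <- to_cm(c("75'6''", "67'10''", "1,500"))
```

It can be dropped into the same sapply call, for example sapply(data, to_cm).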