
I have a column in my data frame where every cell contains one or more numbers. If there are several numbers, they are separated by a space. R treats the column as a character vector. I'd like to convert the values to numeric (and, if possible, sum them right away). For example, one of my cells might look like

6 310 21 20 64

I've tried

Reduce(sum,L)

and

as.numeric(L)

but I always get Warning message:

NAs introduced by coercion

Here, L is just a sample object I created to put one of my cells into.

Ronak Shah
GaiusBaltar
  • Try `sum(as.numeric(strsplit("6 310 21 20 64",' ')[[1]]))`. You might have to modify the code a bit to apply it to all of your data. Post `dput` of your data if you need further help. – etienne Dec 01 '15 at 09:02
  • http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?s=1|2.2808 Please give some data for L – jogo Dec 01 '15 at 09:06
  • Ok, in my example L<-c("6 310 21 20 64"). But this already works fine for me, thanks! – GaiusBaltar Dec 01 '15 at 09:10
  • @etienne you could probably generalize this to `sapply(strsplit(str1,' '), function(x) sum(type.convert(x)))` – David Arenburg Dec 01 '15 at 09:53
  • @DavidArenburg: thanks, I didn't know about `type.convert` – etienne Dec 01 '15 at 10:01
  • @etienne you can use `as.numeric`. I've used `type.convert` so that it picks between `as.numeric` and `as.integer` automatically. You should post an answer anyway IMO. – David Arenburg Dec 01 '15 at 10:02

3 Answers

5

Here are two more options which work correctly on a vector (it seems)

str1 <- c("6 310 21 20 64", "6 310 21 20 64","6 310 21 20 64")
rowSums(read.table(text = str1))
## [1] 421 421 421

Or using data.table::fread

rowSums(data.table::fread(paste(str1, collapse = "\n")))
## [1] 421 421 421

Or, as mentioned in the comments by @akrun, you can use Reduce(`+`, ...) instead of rowSums(...) in both cases to avoid the matrix conversion
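A minimal sketch of that Reduce variant, reusing the str1 vector from above. It assumes every string holds the same number of values, since read.table expects a rectangular table; the name totals is only illustrative.

```r
str1 <- c("6 310 21 20 64", "6 310 21 20 64", "6 310 21 20 64")

# read.table() returns a data frame (a list of columns);
# Reduce(`+`, ...) adds those columns element-wise, giving one sum per row
# without converting the data frame to a matrix first
totals <- Reduce(`+`, read.table(text = str1))
totals
```

If the cells hold different numbers of values, the strsplit-based approach is probably the safer choice.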

David Arenburg
4

We can use scan

sum(scan(text=str1, what=numeric(), quiet=TRUE))
#[1] 421

data

str1 <- "6 310 21 20 64"
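As the comments on this answer point out, the call above returns one grand total when str1 has several elements. A hedged sketch of a per-element version wraps scan in vapply; the name per_cell is an assumption made for illustration.

```r
str1 <- c("6 310 21 20 64", "5", "1 2 3")

# Run scan() on each string separately so every cell gets its own sum;
# vapply() enforces that each result is a single numeric value
per_cell <- vapply(
  str1,
  function(s) sum(scan(text = s, what = numeric(), quiet = TRUE)),
  numeric(1),
  USE.NAMES = FALSE
)
per_cell
```

Because each string is scanned on its own, the cells may contain different numbers of values.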
akrun
  • @DavidArenburg That is a nice option. You could potentially post that as a solution – akrun Dec 01 '15 at 09:45
  • I think it's more general because your solution will give incorrect result for a vector (IMO), for example `str1 <- c("6 310 21 20 64", "6 310 21 20 64","6 310 21 20 64")` – David Arenburg Dec 01 '15 at 09:46
  • @DavidArenburg Yes, only after I posted I realized that the OP has a dataframe. – akrun Dec 01 '15 at 09:51
3

Following my comment, here is a solution, first for a single string:

sum(as.numeric(strsplit("6 310 21 20 64",' ')[[1]]))

which, applied with sapply to a whole column (here str1 holds the column's values as a character vector), gives something like this:

sapply(seq_along(str1), function(x) sum(as.numeric(strsplit(str1, ' ')[[x]])))
# 421 421 421

which could be simplified to sapply(strsplit(str1,' '), function(x) sum(type.convert(x))), thanks to David Arenburg's suggestion.
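Putting this together for an actual data frame, a minimal sketch might look like the following. The data frame df, its column nums, and the new column total are hypothetical names, and splitting on "\\s+" after trimws is an added assumption to tolerate repeated or stray spaces.

```r
# Hypothetical data: cells hold different numbers of space-separated values
df <- data.frame(
  nums = c("6 310 21 20 64", "5", "1 2 3"),
  stringsAsFactors = FALSE
)

# Split each cell on whitespace, convert the pieces to numeric, sum per cell
parts <- strsplit(trimws(df$nums), "\\s+")
df$total <- vapply(parts, function(x) sum(as.numeric(x)), numeric(1))
df$total
```

vapply is used here instead of sapply only so that the result is guaranteed to be a numeric vector with one value per row.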

etienne