1

I have been searching for an answer for a long time and can't find one. I have a column of data such as:

22 apples, 16 oranges
13 plums, 22 large green grapes
52 fig leaves, 2 peanuts

I need to extract just the numbers from each cell, add them, and put the total in a new column. Any help is greatly appreciated!

Rich Scriven
R. Miller
  • What do you mean you have a column of data? Like these are all values in the same column of a table? – bpgeck Apr 18 '16 at 23:40
  • You probably need to look at regular expressions. ?regex – Richard Telford Apr 18 '16 at 23:42
  • yes, each cell in the column contains both numbers and words. I just need the total of the numbers, put into a new column. So a cell might contain 22 apples, 16 oranges. I want 38 in a new cell. – R. Miller Apr 18 '16 at 23:43
  • Welcome to SO. Your question is unclear, and it lacks a reproducible example. In order for people to give you the best help, we need a clear statement of the problem, code to reproduce your data, your attempts to solve the problem yourself, and the desired result. See [ask], [mcve], and [How to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Rich Scriven Apr 18 '16 at 23:48
  • please edit your question to indicate that `c(38,35,54)` would be the correct result for this example ... – Ben Bolker Apr 19 '16 at 00:12

3 Answers

4

You can also do this without looping over the columns: extract the digits from a character matrix, then switch its mode to numeric. Using the `table` data frame from @crippledlambda's answer (before its `sum` column is added), we have

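# Strip every non-digit character; the matrix shape is kept
m <- gsub("\\D+", "", as.matrix(table))
# Reinterpret the remaining digit strings as numbers
mode(m) <- "numeric"
# Add across each row
rowSums(m)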
# [1] 38 35 54
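
For readers without that `table` object, here is a self-contained sketch of the same idea. It assumes the raw strings sit in a vector `x`; the names `x`, `parts`, and `totals` and the `strsplit` step are illustrative and not part of the answer above. It also assumes every comma-separated piece contains exactly one number, because all digits within a piece are concatenated.

```r
# Hypothetical input vector holding the sample data from the question
x <- c("22 apples, 16 oranges",
       "13 plums, 22 large green grapes",
       "52 fig leaves, 2 peanuts")

# Split at the commas and stack the pieces into a 3 x 2 character matrix
parts <- do.call(rbind, strsplit(x, ","))

# Drop non-digits, switch the matrix to numeric, then sum each row
m <- gsub("\\D+", "", parts)
mode(m) <- "numeric"
totals <- rowSums(m)
totals
# [1] 38 35 54
```

If the rows can have different numbers of items, `rbind` no longer gives a clean matrix, and a per-element approach is likely the safer choice.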
Rich Scriven
  • Thank you for your help everyone. It's been hard for an old chemist to suddenly get dropped into the middle of a programming pool! – R. Miller Apr 19 '16 at 19:55
3
x <- c("22 apples, 16 oranges",
       "13 plums, 22 large green grapes",
       "52 fig leaves, 2 peanuts")

Although this can also be done with base R (e.g. gsub followed by strsplit), stringr::str_extract_all is convenient.

library(stringr)
numstr <- str_extract_all(x,"[0-9]+")

Now convert the extracted strings to numeric and sum them within each element:

sapply(numstr,
       function(x) sum(as.numeric(x)))
# [1] 38 35 54
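
The base R route mentioned above (gsub followed by strsplit) might look like the sketch below. The names `digits_only`, `pieces`, `totals`, and `df` are illustrative assumptions, and the last lines show one possible way to store the totals in a new data frame column, as the question asks.

```r
x <- c("22 apples, 16 oranges",
       "13 plums, 22 large green grapes",
       "52 fig leaves, 2 peanuts")

# Replace every run of non-digits with one space, then trim the ends
digits_only <- trimws(gsub("[^0-9]+", " ", x))

# Split on spaces: one character vector of digit strings per element
pieces <- strsplit(digits_only, " ", fixed = TRUE)

# Convert each element's strings to numbers and add them up
totals <- vapply(pieces, function(p) sum(as.numeric(p)), numeric(1))

# Hypothetical data frame: keep the text and the total side by side
df <- data.frame(item = x, stringsAsFactors = FALSE)
df$total <- totals
df$total
# [1] 38 35 54
```

Unlike the matrix approach, this does not require the same number of items in every cell.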
Ben Bolker
1
input <- "22 apples, 16 oranges
13 plums, 22 large green grapes
52 fig leaves, 2 peanuts"

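# Split each input line at the comma into two columns
table <- read.table(text=input, sep=",", col.names=c("first", "second"))

# Remove leading and trailing spaces
Strip <- function(x)
  gsub("^[ ]*|[ ]*$", "", x)

# Pull out the number at the start of each stripped string
Getnumber <- function(x, pattern="^([0-9]+) (.+)$")
  as.numeric(sub(pattern, "\\1", Strip(x)))

# Add the two per-column numbers row by row
table$sum <- Getnumber(table$first) + Getnumber(table$second)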

Then you get:

> table

          first                 second sum
1     22 apples             16 oranges  38
2      13 plums  22 large green grapes  35
3 52 fig leaves              2 peanuts  54

> table$sum
[1] 38 35 54
hatmatrix