How to get a sum of numbers in a character string?

Question

My character vector looks like this:

test <- c("John got a score of 4.5 in mathematics and scored 4.3 in English and ranked 4th.", "Matthew got a score of 7.6")

The desired output is c(8.8, 7.6).

That is, for each string, the sum of the numbers that follow the "score" pattern.

I tried:

s <- as.numeric(gsub(pattern = "^\\D*score\\D*(\\d+\\.*\\d*)\\D*score*\\D*(\\d*\\.*\\d*)\\D*$", replacement = ("\\1"), test$Purpose)) + 
        as.numeric(gsub(pattern = "^\\D*score\\D*(\\d+\\.*\\d*)\\D*score*\\D*(\\d*\\.*\\d*)\\D*$", replacement = ("\\2"), test$Purpose))

However, this returns NAs.

Almost a duplicate of https://stackoverflow.com/questions/35947123/r-stringr-extract-number-after-specific-string . I got it with `str_extract_all(test, "(?i)(?<=score of\\D)\\d+.\\d+|(?i)(?<=scored\\D)\\d+.\\d+")` — Ronak Shah, Feb 20 '18 at 04:47

akrun · Answer 1 · 2018-02-20T04:22:16.343

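We can extract the numbers that follow "score of " or "scored " with a lookbehind regex and then sum them per string:

library(stringr)
# str_extract_all() returns one character vector of matches per input string
sapply(str_extract_all(test, "(?<=score of )[0-9.]+|(?<=scored )[0-9.]+"),
       function(x) sum(as.numeric(x)))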
#[1] 8.8 7.6
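
If loading stringr is not an option, roughly the same extraction can be sketched in base R with gregexpr() and regmatches(). This is only a sketch on the sample data from the question; the name `totals` is made up for illustration, and `perl = TRUE` is needed because the default regex engine does not support lookbehind.

```r
test <- c("John got a score of 4.5 in mathematics and scored 4.3 in English and ranked 4th.",
          "Matthew got a score of 7.6")

# Locate every number preceded by "score of " or "scored " (PCRE lookbehind)
m <- gregexpr("(?<=score of )[0-9.]+|(?<=scored )[0-9.]+", test, perl = TRUE)

# regmatches() returns a list with one character vector of matches per string
totals <- sapply(regmatches(test, m), function(x) sum(as.numeric(x)))
totals
```

A string with no match yields an empty vector, so its sum would be 0 rather than NA.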

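Or using tidyverse, summing every standalone number in each string (the \\b word boundaries skip the 4 in "4th"):

library(stringr)
library(dplyr)
library(purrr)
str_extract_all(test, "\\b[0-9.]+\\b") %>%
  map_dbl(~ as.numeric(.x) %>%
            sum)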
#[1] 8.8 7.6

Or, if we need only the numbers that come after "score":

str_extract_all(test, "(?<=score of )[0-9.]+|(?<=scored )[0-9.]+") %>%
  map_dbl(~ as.numeric(.x) %>%
            sum)
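
As for the NAs in the original attempt, a likely explanation is that the anchored pattern matches neither string: the first still has the digit in "4th" left over after the second capture group, and the second never contains a second "score". When gsub() finds no match it returns its input unchanged, and as.numeric() on a full sentence gives NA. The sketch below checks this on the sample vector; `pat`, `matched`, `unchanged` and `result` are illustrative names, and it uses `test` directly instead of `test$Purpose`.

```r
test <- c("John got a score of 4.5 in mathematics and scored 4.3 in English and ranked 4th.",
          "Matthew got a score of 7.6")

# The pattern from the question, unchanged
pat <- "^\\D*score\\D*(\\d+\\.*\\d*)\\D*score*\\D*(\\d*\\.*\\d*)\\D*$"

# Does the pattern match either string at all?
matched <- grepl(pat, test)

# With no match, gsub() hands back the original strings
unchanged <- gsub(pat, "\\1", test)

# Converting whole sentences to numeric produces NA (with a coercion warning)
result <- suppressWarnings(as.numeric(unchanged))
```

Extracting each number on its own, as above, avoids having to describe the whole sentence in one anchored pattern.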
