I have a data frame that contains survey responses. What is the best way to extract the numbers and change them to a double type variable?

Here is a little sample:

library(tibble)

a <- c("10.5", "about 30", "25 per month")
tibble(a)

I have tried

parse_double(a)

and it seems like I am close. Any help is appreciated.

  • Does this answer your question? [Extracting decimal numbers from a string](https://stackoverflow.com/questions/19252663/extracting-decimal-numbers-from-a-string) – camille Oct 15 '21 at 21:21
  • Also 4 more options [in this post](https://stackoverflow.com/q/28819761/5325862) and another 11 options [here](https://stackoverflow.com/q/14543627/5325862) that just need to be amended to include the decimal point – camille Oct 15 '21 at 21:25

2 Answers

We need parse_number

library(readr)
parse_number(a)
#> [1] 10.5 30.0 25.0

The difference is that parse_double is strict: it expects each string to be a complete number (for example digits and a decimal point) and returns NA with a parsing warning otherwise. parse_number instead extracts the first number from each string, dropping any non-numeric characters before or after it.
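
Since the question is about a data frame, here is a minimal sketch of the difference applied to a tibble column. The column names `a` and `a_num` and the object name `df` are only illustrative assumptions; the sample values come from the question.

```r
library(readr)
library(tibble)

# Hypothetical survey data frame with one free-text column named `a`.
df <- tibble(a = c("10.5", "about 30", "25 per month"))

# parse_double() is strict: entries with extra text fail to parse and become NA.
# suppressWarnings() only hides the parsing-failure warning for this demo.
strict <- suppressWarnings(parse_double(df$a))

# parse_number() keeps the first number it finds and drops the surrounding text.
df$a_num <- parse_number(df$a)

print(strict)
print(df)
```

Note that parse_number only keeps the first number in each string, so a response such as "between 20 and 30" would presumably need separate handling.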

data

a <- c("10.5", "about 30", "25 per month")
akrun

There is also a solution that uses only base R:

a <- c("10.5", "about 30", "25 per month")
as.numeric(gsub("[[:alpha:]]", "", a)) 
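
Removing letters works for the sample, but any leftover punctuation (for example "25/month") would make as.numeric return NA. A possible variant, sketched here under the assumption that only the first unsigned decimal number in each string is wanted, extracts the number with a regular expression instead. The helper name `extract_first_number` is hypothetical.

```r
# Hypothetical helper: return the first unsigned decimal number in each string,
# or NA when a string contains no digits (or is NA itself).
extract_first_number <- function(x) {
  # Position of the first run of digits, optionally followed by a decimal part.
  m <- regexpr("[0-9]+(\\.[0-9]+)?", x)
  hit <- !is.na(m) & m != -1
  out <- rep(NA_real_, length(x))
  # regmatches() returns only the matched substrings, one per hit.
  out[hit] <- as.numeric(regmatches(x, m))
  out
}

a <- c("10.5", "about 30", "25 per month")
extract_first_number(a)
```

This sketch ignores signs, thousands separators, and scientific notation, so the pattern would need to be extended if the survey responses contain them.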


 > start_time <- Sys.time()
 > as.numeric(gsub("[[:alpha:]]", "", a))
 [1] 10.5 30.0 25.0
 > end_time <- Sys.time()
 > end_time - start_time
 Time difference of 0.01400113 secs
 > start_time <- Sys.time()
 > parse_number(a)
 [1] 10.5 30.0 25.0
 > end_time <- Sys.time()
 > end_time - start_time
 Time difference of 0.1500092 secs

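In this single run on the three-element vector, the base R solution was faster (about 0.014 s versus 0.15 s) than the parse_number approach from akrun's answer.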
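
A single Sys.time difference on three strings can be dominated by one-off overhead, so a comparison on a larger input may be more informative. The following is a rough sketch using base system.time; the vector `big` and its size are arbitrary assumptions, and the timings will vary by machine.

```r
library(readr)

# Hypothetical larger input: the three sample strings repeated many times.
big <- rep(c("10.5", "about 30", "25 per month"), times = 10000)

# system.time() measures how long each whole expression takes to evaluate.
t_base  <- system.time(res_base  <- as.numeric(gsub("[[:alpha:]]", "", big)))
t_readr <- system.time(res_readr <- parse_number(big))

# Both approaches should agree on this input.
stopifnot(isTRUE(all.equal(res_base, res_readr)))

print(t_base["elapsed"])
print(t_readr["elapsed"])
```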

Dharman
manro