Get the number from the character string "\n 0.28\n \n " in R

Question

I would like to extract the number 0.28 from the character string "\n 0.28\n \n " in R.

Maybe I should use the sub() function, but I am not sure how to do it.

This is nearly a duplicate of these previous questions: http://stackoverflow.com/q/14543627/1036500 and http://stackoverflow.com/q/15451251/1036500 Some of the answers to those questions work here also, eg. `as.numeric(gsub("[[:alpha:]]", "", string))` — Ben, May 05 '13 at 06:47

score 11 · Accepted Answer · answered May 05 '13 at 01:00

In general, you want to learn about regular expressions, which can be intimidating, but you can also learn by example.

Here, we can do something relatively simple:

R> txt <- "\n 0.28\n \n "
R> gsub(".* ([0-9.]+).*", "\\1", txt)
[1] "0.28"
R> as.numeric(gsub(".* ([0-9.]+).*", "\\1", txt))
[1] 0.28
R>

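The (...) marks a group we "want" to capture; here we say we want digits or dots, and one or more of them (the +). The "\\1" in the replacement then recalls that captured group, so the whole matched string is replaced by just the number.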
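As a complement to the back-reference approach, here is a minimal sketch that pulls the first number-like token out directly with base R's regexpr() and regmatches(). The pattern `[0-9]+\\.?[0-9]*` and the variable names are assumptions chosen for illustration (the pattern ignores signs and exponents); it is not part of the original answer.

```r
# Sketch: extract the first number-like token instead of rewriting the string.
txt <- "\n 0.28\n \n "

# One or more digits, an optional dot, then optional further digits.
# Assumption: no sign, no exponent, and at least one leading digit.
pattern <- "[0-9]+\\.?[0-9]*"

# regexpr() locates the first match; regmatches() returns the matched text.
first_token <- regmatches(txt, regexpr(pattern, txt))
first_number <- as.numeric(first_token)

print(first_token)
print(first_number)
```

Unlike the gsub() call, this returns a zero-length result rather than the untouched input when the string contains no number, which can be easier to check for.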

Alternatively, we could just "erase" all of the \n and spaces:

R> as.numeric(gsub("[\n ]", "", txt))
[1] 0.28
R>

Great response and well explained example. Regex needs more of this +1 — Tyler Rinker, May 05 '13 at 01:14

score 8 · Answer 2 · answered May 05 '13 at 01:02

You don't need regular expressions for your use-case.

 string <-  "\n 0.28\n \n "
 as.numeric(string)
 [1] 0.28

hd1

Nice one. It seems to break as soon as there is another digit somewhere, but for the presented example it does indeed work. – Dirk Eddelbuettel May 05 '13 at 01:04

@Dirk. Wouldn't you want it to break? Using for example `txt <- " \n 1.5 \n 33 \n"` your two solutions would give respectively `33` and `1.533`. Not that your answer was bad. – flodel May 05 '13 at 01:27
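To make the comparison in this comment concrete, the sketch below runs the three approaches from the answers on the two-number string and then shows one way to return every number separately. The gregexpr() pattern and the variable names are assumptions added for illustration, not something proposed in the thread.

```r
# Sketch: how each approach behaves when the string holds two numbers.
txt2 <- " \n 1.5 \n 33 \n"

# Back-reference approach: the greedy ".*" leaves only the last number.
last_num <- as.numeric(gsub(".* ([0-9.]+).*", "\\1", txt2))

# "Erase whitespace" approach: the two numbers are glued together.
glued <- as.numeric(gsub("[\n ]", "", txt2))

# Plain as.numeric(): the string is not a single number, so the result is NA
# (with a coercion warning, suppressed here).
plain <- suppressWarnings(as.numeric(txt2))

# Assumed alternative: gregexpr() finds every match, so each number is kept.
all_nums <- as.numeric(regmatches(txt2, gregexpr("[0-9]+\\.?[0-9]*", txt2))[[1]])

print(last_num)
print(glued)
print(plain)
print(all_nums)
```

Whether failing with NA or silently picking one number is preferable depends on how much the input can be trusted, which is the point the comment raises.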

score 1 · Answer 3 · answered May 05 '13 at 01:53

The solutions so far are great and actually teach you something. If you want the dumb but simple answer, taRifx::destring will work:

> library(taRifx)
> destring("\n 0.28\n \n ")
[1] 0.28

It uses the [^...] regular expression idiom ("not") rather than back-referencing as in @Dirk's solution:

return(as.numeric(gsub(paste("[^", keep, "]+", sep = ""), "", x)))
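The negated-class idiom in that line can be tried without installing the package. The following is a minimal stand-alone sketch: `keep_only_numeric` and its default `keep = "0-9."` are hypothetical names and values chosen here, not the actual taRifx source or its defaults.

```r
# Hypothetical helper (not the taRifx implementation): delete every run of
# characters that is NOT in the `keep` set, then convert what remains.
keep_only_numeric <- function(x, keep = "0-9.") {
  # Builds a pattern such as "[^0-9.]+" : "^" inside [...] negates the class.
  pattern <- paste0("[^", keep, "]+")
  as.numeric(gsub(pattern, "", x))
}

result <- keep_only_numeric("\n 0.28\n \n ")
print(result)
```

Like the "erase whitespace" approach, this glues together any separate numbers in the input, so it is best suited to strings known to contain a single value.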
