I have this kind of data:

MWE <- c(
  "World1    2.6       -4.5         1.5          5.0       -0.2",
  "1,2",
  "G20    112.9            -4.1                1.6                        5.7                    0.4"
)

The desired output is:

[1] "    2.6       -4.5         1.5          5.0       -0.2"                                                      
[2] ""                                                                                                               
[3] "   112.9                         -4.1                    1.6                        5.7                    0.4"

I want to separate what is a number from what is not. (In this particular case, the "1,2" is a data-mining "mistake": it refers to footnotes for "G20". I mention it only because it is not a number I want to keep.)

I think the correct regex for this number format is therefore: [-+]?\\d+\\.\\d

And it works when I match the numbers themselves:

> MWE2 <- gsub("[-+]?\\d+\\.\\d","blah",MWE)  
> MWE2
[1] "World1    blah       blah         blah          blah       blah"                                                     
[2] "1,2"                                                                                                                 
[3] "G20    blah                         blah                    blah                        blah                    blah"

But I cannot isolate the values by replacing everything that is not a number with nothing. From what I have read, a negative lookahead (?! ) is what I am looking for, which gives (?![-+]?\\d+\\.\\d). It does not seem to work, even with the perl=TRUE option added:

> MWE3 <- gsub("(?![-+]?\\d+\\.\\d)", "", MWE, perl=TRUE)
> MWE3
[1] "World1    2.6       -4.5         1.5          5.0       -0.2"                                                      
[2] "1,2"                                                                                                               
[3] "G20    112.9                         -4.1                    1.6                        5.7                    0.4"
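
A possible explanation for the unchanged output: a lookahead is a zero-width assertion, so (?!...) on its own matches empty positions and consumes no characters, and replacing an empty match with "" leaves the string as it was. The sketch below is one way around this, not necessarily the only or best one. It assumes that whitespace should be kept and every other non-number character dropped, and it relies on the PCRE backtracking verbs (*SKIP)(*FAIL), which need perl=TRUE. The names num, cleaned and nums are only illustrative.

```r
MWE <- c(
  "World1    2.6       -4.5         1.5          5.0       -0.2",
  "1,2",
  "G20    112.9            -4.1                1.6                        5.7                    0.4"
)

# Same number pattern as in the question
num <- "[-+]?\\d+\\.\\d"

# The bare negative lookahead only matches empty positions,
# so removing those matches leaves the input unchanged
stopifnot(identical(gsub("(?![-+]?\\d+\\.\\d)", "", MWE, perl = TRUE), MWE))

# First alternative: match a number, then skip past it without replacing it.
# Second alternative: any remaining non-whitespace character, which is removed.
cleaned <- gsub(paste0(num, "(*SKIP)(*FAIL)|\\S"), "", MWE, perl = TRUE)
print(cleaned)

# Alternative, if the spacing does not matter: extract the numbers directly
nums <- regmatches(MWE, gregexpr(num, MWE, perl = TRUE))
print(nums)
```

With the pattern as written, a value with more than one decimal digit (for example 2.65) would keep only its first decimal; changing the trailing \\d to \\d+ should cover that case if it occurs in the real data.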
Anthony Martin