How to discriminate text from numeric and missing values in R?

Question

I have a variable in my dataset that contains three types of values: text (strings), numbers, and missing values. All of them are currently stored as a factor. I want to distinguish the text from the numeric and missing values. How can I do that?

Data <- data.frame(x=c("100","20","home","","30"))

There are three types of values here: numbers, text, and missing values. I want to find the positions of all the text values.

Use `is.numeric` and `is.na`? But you should show this data to us. — Tim Biegeleisen, Aug 24 '17 at 13:04
[How to make a great R reproducible example?](http://stackoverflow.com/questions/5963269) — Sotos, Aug 24 '17 at 13:07
it will only give the data which are strings or missing values, so I cannot distinguish missing values and strings. I will show you my data later. — Xin Chang, Aug 24 '17 at 13:07

acylam · Accepted Answer · 2017-08-24T18:19:55.053

You can extract text, numeric and missing indices separately with regex:

grep("[[:alpha:]]+", Data$x)
# [1] 3

grep("[0-9]+", Data$x)
# [1] 1 2 5

grep("^\\s*$", Data$x)
# [1] 4

To get the actual values, use value=TRUE:

grep("[[:alpha:]]+", Data$x, value = TRUE)
# [1] "home"

grep("[0-9]+", Data$x, value = TRUE)
# [1] "100" "20"  "30"

grep("^\\s*$", Data$x, value = TRUE)
# [1] ""

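[[:alpha:]]+ matches one or more alphabetic characters. The POSIX class has to sit inside a bracket expression: a bare [:alpha:] is read as the set of characters ":", "a", "l", "p", "h", which only finds "home" here because that string contains an "h".

[0-9]+ matches one or more digits. Because the pattern is not anchored, it also matches any string that merely contains a digit.

^ matches the start of the string, $ matches the end of the string, and \\s* matches zero or more whitespace characters, so ^\\s*$ matches strings that are empty or contain only whitespace.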
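
The three patterns above can also be combined into a single labelling step. The following is only a minimal sketch, not part of the original answer: `classify_values` is a hypothetical helper name, the numeric pattern is an assumption that anchors the match and allows an optional sign and decimal part, and real `NA` entries are treated as missing alongside empty strings.

```r
# Hypothetical helper (an illustration, not from the original answer):
# label each element as "missing", "numeric", or "text".
classify_values <- function(x) {
  x <- as.character(x)               # factors become plain strings
  out <- rep("text", length(x))      # default: anything else counts as text
  # Assumed numeric format: optional sign, digits, optional decimal part
  out[grepl("^\\s*-?[0-9]+(\\.[0-9]+)?\\s*$", x)] <- "numeric"
  # Empty or whitespace-only strings, plus real NA values
  out[is.na(x) | grepl("^\\s*$", x)] <- "missing"
  out
}

Data <- data.frame(x = c("100", "20", "home", "", "30"),
                   stringsAsFactors = TRUE)
kind <- classify_values(Data$x)
kind
# [1] "numeric" "numeric" "text"    "missing" "numeric"

which(kind == "text")
# [1] 3
```

Anchoring the numeric pattern with ^ and $ means a mixed value such as "a1" is labelled as text rather than numeric, which the unanchored [0-9]+ would not guarantee.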
