Check whether an element in a character vector can be converted to numeric in R

Question

How can I check whether an element of a character vector can be converted to numeric? More precisely, when the element represents a float or an integer it converts without any problems, but when it is a non-numeric string the warning “NAs introduced by coercion” occurs. I was able to check indirectly via the index of the NA value, but it would be much cleaner to do this without getting a warning.

cat1 <- c("1.12354","1.4548","1.9856","some_string")
cat2 <- c("1.45678","1.1478","1.9565","1.32315")
target <- c(0,1,1,0)
df <- data.frame(cat1, cat2, target)
catCols <- c("cat1", "cat2")

for(col in catCols){
  a <- as.numeric(unique(df[[col]]))
  if(length(which(is.na(a))) != 0){
    print(col)
    print(which(is.na(a)))
  }
}

What is your goal? The `as.numeric` function warns you if you have some non-coercible numbers. You just want to suppress the warnings? — nicola, May 14 '21 at 09:49
Does this answer your question? [Test for numeric elements in a character string](https://stackoverflow.com/questions/13638377/test-for-numeric-elements-in-a-character-string) — slamballais, May 14 '21 at 09:49
You either convert the entire vector to numeric, or you leave it as string. What do you want to achieve here? — Tim Biegeleisen, May 14 '21 at 09:50
@TimBiegeleisen I want to determine the element and the column this warning occurs at — Mine, May 14 '21 at 09:58
@nicola The goal is to determine the value and the column this warning occurs at but preferably without getting the warning. Using another method — Mine, May 14 '21 at 10:00

Ronak Shah · Accepted Answer · 2021-05-14T10:55:44.463

Perhaps you can use a regex to check whether all the values in a column are either integers or floats.

can_convert_to_numeric <- function(x) {
  all(grepl('^(?=.)([+-]?([0-9]*)(\\.([0-9]+))?)$', x, perl = TRUE))  
}

sapply(df[catCols], can_convert_to_numeric)
# cat1  cat2 
#FALSE  TRUE

Alternatively, to get the values that cannot be converted to numeric, we can use grep with invert = TRUE and value = TRUE:

values_which_cannot_be_numeric <- function(x) {
  grep('^(?=.)([+-]?([0-9]*)(\\.([0-9]+))?)$', x, perl = TRUE, invert = TRUE, value = TRUE)
}

lapply(df[catCols], values_which_cannot_be_numeric)

#$cat1
#[1] "some_string"

#$cat2
#character(0)

Regex taken from here.

If you use type.convert you don't have to worry about this at all.

df <- type.convert(df, as.is = TRUE)
str(df)

#'data.frame':  4 obs. of  3 variables:
# $ cat1  : chr  "1.12354" "1.4548" "1.9856" "some_string"
# $ cat2  : num  1.46 1.15 1.96 1.32
# $ target: int  0 1 1 0
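
Since type.convert leaves a column as character when at least one of its elements cannot be parsed as a number, the columns that stay character after conversion are the ones containing a problematic value. The following is a minimal sketch of that idea, not part of the original answer; the names `converted` and `still_character` are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question (stringsAsFactors = FALSE
# keeps the columns as character on R versions before 4.0 as well)
df <- data.frame(
  cat1 = c("1.12354", "1.4548", "1.9856", "some_string"),
  cat2 = c("1.45678", "1.1478", "1.9565", "1.32315"),
  target = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Convert every column to the most specific type that fits all of its values
converted <- type.convert(df, as.is = TRUE)

# Columns that are still character contain at least one non-numeric value
still_character <- names(converted)[vapply(converted, is.character, logical(1))]
still_character
```

This only identifies the column, not the offending element, so it works best as a quick first check.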

edited May 14 '21 at 10:55

answered May 14 '21 at 09:53


What about numeric literals with exponents? What about hexadecimals? – Tim Biegeleisen May 14 '21 at 10:00
@RonakShah I also want to find the value which cannot be converted to numeric – Mine May 14 '21 at 10:03
@moli To get the values which cannot be converted to numeric, see the updated answer. – Ronak Shah May 14 '21 at 10:56
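
To illustrate the point raised in the first comment, here is a small sketch comparing the regex from the answer with what as.numeric actually accepts. The test vector and the names `pattern`, `regex_ok` and `coerce_ok` are made up for this example. R parses both exponent notation and hexadecimal literals as numbers, whereas the regex only allows an optional sign, digits, and an optional decimal part.

```r
# Regex reproduced verbatim from the answer above
pattern <- '^(?=.)([+-]?([0-9]*)(\\.([0-9]+))?)$'

# Illustrative inputs: plain decimal, exponent, hexadecimal, non-numeric
x <- c("1.5", "1e5", "0x1A", "some_string")

# TRUE where the regex considers the string numeric
regex_ok <- grepl(pattern, x, perl = TRUE)

# TRUE where as.numeric() can actually parse the string
coerce_ok <- !is.na(suppressWarnings(as.numeric(x)))

data.frame(x, regex_ok, coerce_ok)
```

The two columns disagree for "1e5" and "0x1A": the regex rejects them, but as.numeric converts them. Whether that matters depends on whether such literals can appear in the data.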

Rui Barradas · Answer 2 · 2022-05-15T17:42:32.530

One solution is to write a function that returns the indices of the NA values and apply it to the columns you want.

check_num <- function(x){
  y <- suppressWarnings(as.numeric(x))
  if(anyNA(y)){
    which(is.na(y))
  } else invisible(NULL)
}
lapply(df[catCols], check_num)
#$cat1
#[1] 4
#
#$cat2
#NULL

The function above returns NULL if all values can be converted to numeric. This next function follows the same method of determining which vector elements can be converted but returns integer(0) if all can be converted.

check_num2 <- function(x){
  y <- suppressWarnings(as.numeric(x))
  which(is.na(y))
}
lapply(df[catCols], check_num2)
#$cat1
#[1] 4
#
#$cat2
#integer(0)
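
Since the question asks for both the position and the value, the same approach can be extended to return them together. This is a sketch under a few assumptions: the helper name `non_numeric_values` is hypothetical, the input is coerced with as.character first so that factor columns (the default on R versions before 4.0) are not silently converted to their integer codes, and elements that were already NA in the input are not reported as conversion failures.

```r
# Example data from the question
df <- data.frame(
  cat1 = c("1.12354", "1.4548", "1.9856", "some_string"),
  cat2 = c("1.45678", "1.1478", "1.9565", "1.32315"),
  target = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)
catCols <- c("cat1", "cat2")

# Hypothetical helper: index and value of each element that fails to convert
non_numeric_values <- function(x) {
  x <- as.character(x)                    # guard against factor input
  y <- suppressWarnings(as.numeric(x))    # coerce without emitting the warning
  bad <- is.na(y) & !is.na(x)             # NA created by coercion, not already present
  data.frame(index = which(bad), value = x[bad], stringsAsFactors = FALSE)
}

res <- lapply(df[catCols], non_numeric_values)
res
```

Each list element is a data frame with one row per offending value; a column where everything converts yields a data frame with zero rows.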
