1

After writing Inf (as numeric) into excel, I read it back again and get Inf as Char. For NaN, it becomes NA. For NA, it remains NA as numeric, which is supposed to be.

I used both readxl and writexl. I tried to install tibble and failed.

library(readxl)
library(writexl)

x <- c(1,NA)
y <- c(2,NaN)
z <- c(3,Inf)

d0 <- data.frame(x,y,z)

write_xlsx(d0, 'Test.xlsx')
d1 <- read_xlsx('Test.xlsx')

str(d0$x)
str(d0$y)
str(d0$z)

str(d1$x)
str(d1$y)
str(d1$z)

Here are results: Notice NaN becomes NA; Inf (numeric) becomes Inf (character)

> str(d0$x)
 num [1:2] 1 NA
> str(d0$y)
 num [1:2] 2 NaN
> str(d0$z)
 num [1:2] 3 Inf

> str(d1$x)
 num [1:2] 1 NA
> str(d1$y)
 num [1:2] 2 NA
> str(d1$z)
 chr [1:2] "3" "Inf"

I am looking for explanations and remedies during write and read phases. I am not looking for solutions after reading in such as setting as.numeric(d1$z). I am verifying dataset output by third party using the same R code.

Thanks for your help!

massisenergy
  • 1,764
  • 3
  • 14
  • 25
ESG
  • 11
  • 3
  • Thanks @joran for a quick response. If you open the saved excel file, you will see blank cells for NA and NaN; plus string "Inf" for Inf. When you read them back, NA and NaN become NA. "Inf" become Chr (the entire column becomes chr). Excel carries 4 special values: empty cell, null, #N/A, #DIV/0!. I wonder why write_xlsx does not carry an option for writing Inf as true infinite (here #DIV/0!). When you read it back, you will get Inf as numeric. Of course, Excel does not distinguish +Inf and -Inf. Another complication: for 0/0, it is NaN in R and becomes #DIV/0! in Excel. – ESG Jul 21 '19 at 04:13

1 Answers1

1

See this SO question about infinity in excel for some more info. Infinity is not available in excel. Before writing to excel you could set the Inf values to .Machine$double.xmax. This will be represented as #NULL! in excel. But reading #NULL! in again will automatically be represented as Inf in R. Strangely enough -Inf is represented in excel as the same value #NULL!.

library(readxl)
library(writexl)

d0 <- data.frame(x = c(1, NA),
                 y = c(2, NaN),
                 z = c(3, Inf),
                 z1 = c(4, -Inf))

d0[d0 == Inf] <- .Machine$double.xmax 
d0[d0 == -Inf] <- .Machine$double.xmax * -1


write_xlsx(d0, 'Test.xlsx')
d1 <- read_xlsx('Test.xlsx')
d1

# A tibble: 2 x 4
      x     y     z    z1
  <dbl> <dbl> <dbl> <dbl>
1     1     2     3     4
2    NA    NA   Inf  -Inf

If you need to see the numbers in excel you would have to replace the Inf values with 9.99999E+307 and replace this number back to Inf when after you read the data back into R.

phiver
  • 23,048
  • 14
  • 44
  • 56