0

I am quite new to R. I need to work with a number of lists that I have in excel, but when importing them with readxl the numbers seem to be changing.

I have a spreadsheet (example here), and when I subtract the sum of the tab "target_intputs" to the sum of the tab "target_outputs" in excel, the total is 0 (as it should be, see "comparison" tab). However, after importing it to R, if I compare them using print(sum(target_inputs) - sum(target_outputs)), it gives me 6.007031e-08 as the total.

Here is a sample of the code:

library(readxl)
target_inputs <- read_excel("C:\\A Matrix - Table 5 2019-20.xlsx", sheet="target_inputs")
target_outputs <- read_excel("C:\\A Matrix - Table 5 2019-20.xlsx", sheet="target_outputs")

target_inputs <- as.numeric(target_inputs)
target_outputs <- as.numeric(target_outputs)

print(sum(target_inputs) - sum(target_outputs))

I assumed there could be a problem with the decimal points, so I tried the round() function and options(digits=20), etc., but I cannot get rid of the problem. This seems to be happening only in R, as when I export it back again using the code below, the problem seems to disappear.

library(writexl)
write_xlsx(target_inputs, "C:\\target_inputs.xlsx")
write_xlsx(target_outputs, "C:\\target_outputs.xlsx")

Any help would be much appreciated, thanks!

---- EDIT 1 ---- This happens with some datasets, others don't give me the problem.

---- EDIT 2 ---- I just noticed that if I increase the decimal points with options(digits=22) I can see changes in the decimal points of the values. For example, in excel the first value of "target_inputs" is 23923.72846564920000000, and after importing to R target_inputs[[1]] gives me 23923.72846564929932356. But then, exporting it back to xlsx gets the values back to the original... I don't get what I could be doing wrong

vabm
  • 285
  • 7
  • 16
  • 1
    sidenote: ```read_excel``` has an argument to make all loaded in information numeric ```col_types = "numeric"```, maybe it helps to read the excel files in as numeric? – Omniswitcher Jun 16 '22 at 12:10

1 Answers1

0

I think I found the issue. After realising that the decimal points were changing I found this post. Seems to be related to the floating-point precision. The solution (at least for me) is to round and adjust the numbers in excel before importing them to R. Cheers

vabm
  • 285
  • 7
  • 16