2

I have a Microsoft Excel (.xlsx) file which I would like to load into R. I've done this before and used read.csv(), which always worked fine, but with this file something strange happens. Something seems to go wrong with one column, which contains a row name and large numbers with values of 13 digits. This column is - no matter how I try to load the file into R - converted to its scientific notation.

The problem can be simulated as follows: in Excel, type in the first column first row, a random row name. Type in the first column second row a random big number, e.g. 6345157890027. Then save the file as .csv. Let's suppose I want to open this file in R: TestData <- read.csv(file = "Test.csv", head = TRUE) and then View(TestData)

The number 6345157890027 is now displayed as 6.345158e+12, and information is lost somewhere during import. Now, I've tried to solve this multiple ways:

  • In both the .xlsx and the .csv file, I converted the row into number format
  • Tried all .csv formats that Excel allows me to save it in
  • Copied the whole worksheet to a new Excel file
  • Used the Import Data function in Excel
  • Save the worksheet as .txt and import this into R using File->Import Dataset function

However, none of these solved the problem: the value always seems to appear in scientific notation. Does someone knows a way to make sure either Excel or R does not transform large numbers into scientific notation?

Silhouettes
  • 145
  • 1
  • 10
  • 1
    Are you sure you are losing information? What happens when you use `print(..., digits = 15)` – G. Grothendieck Apr 07 '20 at 16:16
  • You are right, the information is not lost, it's just displayed differently. However, still I would like like to be able to display the information as a whole number. – Silhouettes Apr 07 '20 at 18:21
  • Use `options(digits = 15)` – G. Grothendieck Apr 07 '20 at 20:10
  • Thanks for your comments. But I should clarify myself: I see that with using this option, it is possible to display the value in the R console in the non-scientific manner. However, when I use for example ```View(TestData)``` I still see the scientific notation. Is there a way to overcome this, it would make things more convenient for me – Silhouettes Apr 08 '20 at 11:41
  • Can you provide a reproducible example where it does not work. This works for me to produce all digits: `options(digits = 15); DF <- data.frame(x = 6345157890027); View(DF)` – G. Grothendieck Apr 08 '20 at 12:25

3 Answers3

1

I solved this problem by using the format function, as proposed in another post by rnso (see How to prevent scientific notation in R?):

> xx = 100000000000
> xx
[1] 1e+11
> format(xx, scientific=F)
[1] "100000000000"

Also worked perfectly for my wrongly displayed column in my data frame

Silhouettes
  • 145
  • 1
  • 10
1

One can just declare options(scipen = 999) and make sure that the environment does not change large numbers to exponential form.

Sky_7
  • 77
  • 1
  • 10
0

you could use the as.bit64 function from the bit64 package. Just make sure to not use 64bit numbers to reference other R objects. This solutions is good for numbers found in the +-2^63 range.

Daniel O
  • 4,258
  • 6
  • 20
  • If you haven’t lost any information as your other comment states then this solution will not help you. Your numbers are not big enough to need 64bit storage. – Daniel O Apr 07 '20 at 18:32