How to avoid introducing NA while converting numeric values to integer

Question

Here is my data

mydata <- structure(list(GroupA_1 = c(400730000, 0, 0, 0, 4442200000, 0, 
0, 0, 0, 4482700000, 0, 0, 0, 0, 0), GroupA_2 = c(375840000, 
0, 0, 38008000, 7963200000, 0, 0, 0, 164980000, 4102700000, 0, 
0, 0, 0, 89135000), GroupA_3 = c(342230000, 0, 0, 0, 6705700000, 
14662000, 0, 0, 0, 4.665e+09, 0, 0, 0, 0, 0), GroupA_4 = c(311840000, 
0, 0, 0, 4611900000, 0, 0, 0, 148700000, 5108300000, 0, 0, 0, 
0, 123910000)), .Names = c("GroupA_1", "GroupA_2", "GroupA_3", 
"GroupA_4"), class = "data.frame", row.names = c("first1", "first2", 
"first3", "first4", "first5", "first6", "first7", "first8", "first9", 
"first10", "first11", "first12", "first13", "first14", "first15"
))

I load the data like this

mydata <- read.table("path to mydata.txt", header=TRUE, row.names = 1)

Then I run str() on it and see that the values are numeric:

'data.frame':   15 obs. of  4 variables:
 $ GroupA_1: num  4.01e+08 0.00 0.00 0.00 4.44e+09 ...
 $ GroupA_2: num  3.76e+08 0.00 0.00 3.80e+07 7.96e+09 ...
 $ GroupA_3: num  3.42e+08 0.00 0.00 0.00 6.71e+09 ...
 $ GroupA_4: num  3.12e+08 0.00 0.00 0.00 4.61e+09 ...

I try to convert them to integer like this:

mydata2 <- data.frame(sapply(mydata, as.integer))

which introduces NA into the data

Warning messages:
1: In lapply(X = X, FUN = FUN, ...) :
  NAs introduced by coercion to integer range
2: In lapply(X = X, FUN = FUN, ...) :
  NAs introduced by coercion to integer range
3: In lapply(X = X, FUN = FUN, ...) :
  NAs introduced by coercion to integer range
4: In lapply(X = X, FUN = FUN, ...) :
  NAs introduced by coercion to integer range

How can I convert my data to integer without introducing NA? I don't see any reason for the NAs to appear.

@Abdou I want to convert them from numeric to integer. It does not make sense to call as.numeric on values that are already numeric, does it? – nik, Aug 09 '16 at 08:36
Your numbers are too large. For reference, see `.Machine$integer.max` which, on my 64bit machine, is roughly 2.14e+09 (`2^31 - 1`), well below your higher numbers of 4.44e+09 (just over `2^32`). — r2evans, Aug 09 '16 at 08:37
@r2evans Right, but does that mean we cannot convert them to integer at all? If so, is it just because of their values? – nik, Aug 09 '16 at 08:41
Correct. See [some discussion](http://www.win-vector.com/blog/2015/06/r-in-a-64-bit-world/) and one (perhaps of many) workaround: [`bit64`](https://cran.r-project.org/web/packages/bit64/index.html). (Caveat: I've not worked with that package.) — r2evans, Aug 09 '16 at 08:45
@RHertel's comment ("must you use integers, or are numerics sufficient") is salient, but I'll take it a step further (as has been recommended in other channels): if you don't do calculations on them and only infrequently need to do valuation, consider converting them to character strings. (You can trivially check for zero or negative. You can get basic scale with `nchar`. Uniqueness is preserved. Relative comparisons might be a bit more work. ***It all depends*** on your use-case.) — r2evans, Aug 09 '16 at 08:52
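
The two suggestions in the comments above (stay with numerics, or switch to character strings) might look like this in base R. This is only a sketch: the vector `x` is a hypothetical subset of the values in the question, and it assumes the values are whole numbers below 2^53, the point up to which a double stores integers exactly.

```r
# Hypothetical subset of the values in the question
x <- c(400730000, 4442200000, 7963200000)

# Doubles hold whole numbers exactly up to 2^53, so these values
# lose nothing by staying numeric
all(abs(x) < 2^53)        # TRUE
all(x == round(x))        # TRUE: no fractional parts

# Character strings, if no arithmetic is needed
s <- format(x, scientific = FALSE, trim = TRUE)
s                         # "400730000" "4442200000" "7963200000"
nchar(s)                  # 9 10 10: a rough measure of scale
```

Whether either option is enough depends on what is done with the values afterwards.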

score 2 · Answer 1 · edited May 23 '17 at 10:28

EDITED:

R's integer type is 32-bit, so it is restricted to values smaller than 2147483648 (2^31); anything larger becomes NA when coerced. See "Struggling with integers (maximum integer size)".
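
A quick way to see the limit, and to find which values would be affected, is sketched below. The vector `x` is a hypothetical two-value sample taken from GroupA_1, not the full data.

```r
# Largest value R's integer type can hold
.Machine$integer.max                       # 2147483647

# Two values from GroupA_1: one below the limit, one above it
x <- c(400730000, 4442200000)

# Values above the limit become NA; the warning is suppressed here
as_int <- suppressWarnings(as.integer(x))
is.na(as_int)                              # FALSE TRUE

# Flag the values that would not survive the conversion
too_big <- abs(x) > .Machine$integer.max
x[too_big]
```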

Your commands work for smaller numbers:

mydata2 <- mydata/1000
mydata3 <- data.frame(sapply(mydata2, as.integer))
str(mydata3)

'data.frame':   15 obs. of  4 variables:
 $ GroupA_1: int  400730 0 0 0 4442200 0 0 0 0 4482700 ...
 $ GroupA_2: int  375840 0 0 38008 7963200 0 0 0 164980 4102700 ...
 $ GroupA_3: int  342230 0 0 0 6705700 14662 0 0 0 4665000 ...
 $ GroupA_4: int  311840 0 0 0 4611900 0 0 0 148700 5108300 ...

answered Aug 09 '16 at 08:41

mkt

thanks but I need the exact values as integer without any manipulation – nik Aug 09 '16 at 08:42
I understand, but you're bumping up against a limit in R. See the link in my edited answer (and the comment by @r2evans above, which I missed before posting). – mkt Aug 09 '16 at 08:48
In one of the answers in the link you will find a package `int64`, which apparently allows you to use larger integers. I haven't worked with it, but it might be worth a look in your case – KenHBS Aug 09 '16 at 09:55
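
For anyone who wants to try the 64-bit route, here is a minimal sketch using `bit64`, the package linked in the comments under the question (not `int64`). It assumes `bit64` is installed, and the vector `x` and data frame `df` are hypothetical stand-ins for the real data. I have not checked how well the rest of your workflow copes with the `integer64` class.

```r
# Assumes bit64 is installed: install.packages("bit64")
library(bit64)

# Hypothetical subset of the values in the question
x <- c(400730000, 4442200000, 7963200000)

# as.integer64() stores the values as 64-bit integers,
# so values above 2^31 - 1 are kept instead of becoming NA
x64 <- as.integer64(x)
any(is.na(x64))           # FALSE
class(x64)                # "integer64"

# Same idea applied to every column of a (hypothetical) data frame
df <- data.frame(GroupA_1 = c(400730000, 0, 4442200000))
df[] <- lapply(df, as.integer64)
sapply(df, class)
```

Not every function understands `integer64`, so it is worth checking the results of any later computation on these columns.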
