43

What would be the easiest way to convert a number to base 2 in R, as a string (for example, 5 would be converted to "0000000000000101")? There is intToBits, but it returns a vector of strings rather than a single string:

> intToBits(12)
 [1] 00 00 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[26] 00 00 00 00 00 00 00

I have tried some other functions, but had no success:

> toString(intToBits(12))
[1] "00, 00, 01, 01, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00"
Jay
  • 1
    `intToBits` does _not_ return a vector of strings. It returns a raw vector. Notice the vector has 32 elements. That's one element for each bit (since R uses 32-bit integers). I can't think of a situation where it would be useful to represent a number as a literal string of bits... what are you trying to do? – Joshua Ulrich Jul 07 '11 at 17:10
  • I'm working on some examples in cryptanalysis, and it is nice to be able to show keys as bit sequences, "011010110", etc. – Jay Jul 07 '11 at 17:17
  • 2
    @DWin: It's actually listed as "GNU R statistical computation and graphics system" in Debian, and the project page says it's a GNU project, that's why I called it GNU R. Not that I'm picky about these things -- I got used to saying "GNU R" to help disambiguate (doing a Google search for "R" isn't really useful). – Jay Jul 07 '11 at 17:23
  • 6
    It annoys the R Core to see it referred to as GNU R. Since they are the authors, I figure they get the final say. And searching on GNU R is going to miss a majority of what is on the Net. Use "r-project" as a term or use RSiteSearch() or rseek as search engines. Some people report success with "r:language" as a Google term. – IRTFM Jul 07 '11 at 17:35
  • 1
    @42- Tough luck. If it annoys the core authors, they shouldn’t list it as a GNU project. Yet they did, and continue doing so on the official site. – Konrad Rudolph Mar 06 '16 at 13:48

11 Answers

29

paste(rev(as.integer(intToBits(12))), collapse="") does the job

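paste with the collapse parameter collapses the vector into a single string. You have to use rev to get the conventional bit order, though, because intToBits returns the least significant bit first.

as.integer converts each raw element (printed as 00 or 01) to a plain 0 or 1, which removes the extra zeros.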

nico
25

Note that intToBits() returns a 'raw' vector, not a character vector (strings). My answer is a slight extension of @nico's original answer that removes the leading "0" from each bit:

paste(sapply(strsplit(paste(rev(intToBits(12))),""),`[[`,2),collapse="")
[1] "00000000000000000000000000001100"

To break down the steps, for clarity:

# bit pattern for the 32-bit integer '12'
x <- intToBits(12)
# reverse so the most significant bit comes first (intToBits returns the least significant bit first)
x <- rev(x)
# convert to character
x <- as.character(x)
# Extract only the second element (remove leading "0" from each bit)
x <- sapply(strsplit(x, "", fixed = TRUE), `[`, 2)
# Concatenate all bits into one string
x <- paste(x, collapse = "")
x
# [1] "00000000000000000000000000001100"

Or, as @nico showed, we can use as.integer() as a more concise way to remove the leading zero from each bit.

x <- rev(intToBits(12))
x <- paste(as.integer(x), collapse = "")
# [1] "00000000000000000000000000001100"

Just for copy-paste convenience, here's a function version of the above:

dec2bin <- function(x) paste(as.integer(rev(intToBits(x))), collapse = "")
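One thing to watch with the one-liner above: given a vector, it pastes all the bits into a single long string. A minimal sketch of a vectorized variant follows. The name `dec2bin_vec` is made up for illustration, and it relies on intToBits() returning 32 raw elements per input integer.

```r
# Hypothetical vectorized variant: one 32-character string per input value.
dec2bin_vec <- function(x) {
  # intToBits() yields 32 raw values per integer, least significant bit first;
  # filling a 32-row matrix column-wise gives one column per input value.
  m <- matrix(as.integer(intToBits(x)), nrow = 32)
  # Reverse the rows so the most significant bit comes first, then join each column.
  apply(m[32:1, , drop = FALSE], 2, paste, collapse = "")
}

dec2bin_vec(c(5L, 12L))
```

This avoids calling the function once per element with sapply(), at the cost of building an intermediate matrix.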
Brian Stamper
Joshua Ulrich
  • @bubakazouba: in this example, `12` is a 32-bit integer. Why do you think it has too many bits? What's to fix? – Joshua Ulrich Nov 09 '15 at 03:53
  • I'm sorry, I just meant it's a lot of bits for what I need; I didn't mean "fix" as in there is something wrong to be repaired. I just meant: is there a simple way to vary the number of bits? – bubakazouba Nov 09 '15 at 03:54
  • @bubakazouba: in short, no. Base R only has 32-bit integers. If you know the number can be represented in a smaller number of bits (e.g. a byte or short), you could extract only the right-most X bits using `substr`. But you should really be using `readBin` and `writeBin` to deal with binary data. – Joshua Ulrich Nov 09 '15 at 04:05
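As a rough sketch of the `substr` idea from the comment above (the helper name `dec2bin_width` and its `bits` argument are made up for illustration), one could keep only the right-most bits of the 32-character string. It assumes the value is non-negative and actually fits in the requested width; otherwise the higher bits are silently dropped.

```r
# Hypothetical helper: return only the lowest `bits` bits as a string.
dec2bin_width <- function(x, bits = 16L) {
  stopifnot(length(x) == 1, bits >= 1, bits <= 32)
  # Full 32-bit pattern, most significant bit first
  full <- paste(as.integer(rev(intToBits(x))), collapse = "")
  # Keep the right-most `bits` characters
  substr(full, 32 - bits + 1, 32)
}

dec2bin_width(5)      # "0000000000000101", the 16-bit form from the question
dec2bin_width(12, 8)  # "00001100"
```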
20

I think you can use the intToBin() function from the R.utils package:

> library(R.utils)

> intToBin(12)
[1] "1100"

> typeof(intToBin(12))
[1] "character"
BenMorel
dlacos
16

intToBits is limited to 32-bit integers (at most 2^31 - 1), but what if we want to convert 1e10 to binary? Here is a function for converting floating-point numbers to binary, assuming they are big integers stored as numeric.

dec2bin <- function(fnum) {
  bin_vect <- rep(0, 1 + floor(log(fnum, 2)))
  while (fnum >= 2) {
    pow <- floor(log(fnum, 2))
    bin_vect[1 + pow] <- 1
    fnum <- fnum - 2^pow
  } # while
  bin_vect[1] <- fnum %% 2
  paste(rev(bin_vect), collapse = "")
} #dec2bin

This function begins to lose digits after 2^53 (about 9.007199e15), but works fine for smaller numbers.

microbenchmark(dec2bin(1e10+111))
# Unit: microseconds
#                 expr     min       lq     mean   median      uq    max neval
# dec2bin(1e+10 + 111) 123.417 125.2335 129.0902 126.0415 126.893 285.64   100
dec2bin(9e15)
# [1] "11111111110010111001111001010111110101000000000000000"
dec2bin(9e15 + 1)
# [1] "11111111110010111001111001010111110101000000000000001"
dec2bin(9.1e15 + 1)
# [1] "100000010101000110011011011011011101001100000000000000"
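For comparison, here is a minimal sketch of a different technique, repeated division by 2 instead of logarithms, which also copes with an input of 0. This is not the answer's code. The name `dec2bin_safe` is hypothetical, and the sketch assumes a single non-negative whole number below 2^53.

```r
# Hypothetical variant using %% and %/% on doubles (exact for whole numbers below 2^53).
dec2bin_safe <- function(x) {
  stopifnot(length(x) == 1, is.finite(x), x >= 0, x == floor(x), x < 2^53)
  if (x == 0) return("0")
  digits <- numeric()
  while (x > 0) {
    digits <- c(x %% 2, digits)  # prepend the current least significant bit
    x <- x %/% 2
  }
  paste(digits, collapse = "")
}

dec2bin_safe(0)   # "0"
dec2bin_safe(12)  # "1100"
```

Growing `digits` inside the loop is fine here because at most 53 iterations are possible.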
inscaven
  • I don't need to convert such large numbers, but great answer anyway! +1 – Jay Jun 23 '15 at 12:55
  • 2
    I faced a problem where I need to operate with big numbers and after searching on stackoverflow for a solution finally wrote my own code :) – inscaven Jun 23 '15 at 13:24
  • I upvoted this answer for covering the case for the big integers stored as numeric. I would be glad to see that inscaven's answer will cover the fractional cases as well: `dec2bin(0.3) # Error in rep(0, 1 + floor(log(fnum, 2))) : invalid 'times' argument.` Also, notice that `dec2bin(0) # Error in rep(0, 1 + floor(log(fnum, 2))) : invalid 'times' argument`. Hence, the case 0 must be handled properly. – Erdogan CEVHER Jun 05 '19 at 22:40
  • @inscaven, have you also implemented the inverse bin2dec for large 'bin' strings? – mshaffer Sep 14 '21 at 21:37
  • @mshaffer if you have a binary string `bs` you can use this one-liner `sum(2^(nchar(bs) - stringi::stri_locate_all(bs, fixed = "1")[[1]][,1]))`; remember it can correctly handle binary strings of no more than 53 characters – inscaven Sep 15 '21 at 08:51
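The inverse conversion discussed in these comments can also be sketched in base R without stringi. The function name `bin2dec` is taken from the comment, but this implementation is only an illustration, and the same 53-character limit applies because the result is a double.

```r
# Hypothetical base-R inverse: binary string -> numeric value.
bin2dec <- function(bs) {
  bits <- as.integer(strsplit(bs, "", fixed = TRUE)[[1]])
  stopifnot(length(bits) <= 53, all(bits %in% c(0L, 1L)))
  # Weight each bit by its power of two; the left-most character is the most significant bit.
  sum(bits * 2^(rev(seq_along(bits)) - 1))
}

bin2dec("1100")  # 12
```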
6

Oh, but what do you do if you have a 64-bit integer, as enabled by the bit64 package? Every answer given, other than that of @epwalsh, will not operate on the 64-bit integer because the C-based internals of R and R.utils do not support it. @epwalsh's solution is great and works in R if you load the bit64 package first, except that, being loop-based R code, it is dog slow (all speeds are relative).

o.dectobin <- function(y) {
  # find the binary sequence corresponding to the decimal number 'y'
  stopifnot(length(y) == 1, mode(y) == 'numeric')
  q1 <- (y / 2) %/% 1
  r <- y - q1 * 2
  res = c(r)
  while (q1 >= 1) {
    q2 <- (q1 / 2) %/% 1
    r <- q1 - q2 * 2
    q1 <- q2
    res = c(r, res)
  }
  return(res)
}

dat <- sort(sample(0:.Machine$integer.max,1000000))
system.time({sapply(dat,o.dectobin)})
#   user  system elapsed 
# 61.255   0.076  61.256 

We can make this better if we byte compile it...

library(compiler)
c.dectobin <- cmpfun(o.dectobin)
system.time({sapply(dat,c.dectobin)})
#   user  system elapsed 
# 38.260   0.010  38.222 

... but it is still pretty slow. We can get substantially faster if we write our own internals in C (which is what I have done here borrowing from @epwalsh's code - I'm not a C programmer, obviously)...

library(Rcpp)
library(inline)
library(compiler)
intToBin64.worker <- cxxfunction( signature(x = "string") , '    
#include <string>
#include <iostream>
#include <sstream>
#include <algorithm>
// Convert the string to an integer
std::stringstream ssin(as<std::string>(x));
int64_t y;  // long is only 32 bits on some platforms (e.g. 64-bit Windows)
ssin >> y;

// Prep output string
std::stringstream ssout;


// Do some math
int64_t q2;
int64_t q1 = (y / 2) / 1;
int64_t r = y - q1 * 2;
ssout << r;
while (q1 >= 1) {
q2 = (q1 / 2) / 1;
r = q1 - q2 * 2;
q1 = q2;
ssout << r;
}


// Finalize string
//ssout << r;
//ssout << q1;
std::string str = ssout.str();
std::reverse(str.begin(), str.end());
return wrap(str);
', plugin = "Rcpp" )

system.time(sapply(as.character(dat),intToBin64.worker))
#   user  system elapsed 
#  7.166   0.010   7.168 

Barranka
russellpierce
  • 6
    ... which I now notice is entirely absurd because bit64 has a as.bitstring function that is twice as fast as my Rcpp function... but I'll leave this here as a monument to folly and as a potential reminder of how to bridge from integer64 to C++ and back... but definitely see the bit64 source code if you need a more efficient way to do just that. – russellpierce May 14 '15 at 14:13
  • 1
    Your "monument to folly" comment made me think of: https://despair.com/products/mistakes. – Joshua Ulrich Apr 04 '18 at 11:33
  • I wonder if just re-tooling the internal `intToBits` to handle wider inputs wouldn't do just fine? https://github.com/wch/r-source/blob/21ac5ee817a45d98361da324285c77e2f9c4f73d/src/main/raw.c#L124-L142 – MichaelChirico Apr 04 '18 at 12:40
  • @MichaelChirico IIRC bit64 implements 64 bit integers under the hood in two doubles. So, I'd be a little surprised if things ran smoothly just by changing the vector and loop bounds. Also... oddly surprised to see comments on this low ranked answer after three years. =) – russellpierce Apr 04 '18 at 12:57
  • 2
    bit64 implements `integer64` as one double--a `REALSXP`. A double is 64 bits, as is a 64-bit integer. Same amount of memory, but the contents are represented differently. My comment was due to being on this page to address @MichaelChirico's comment / edit to my answer. I happened to read your comment, which made me smile and think of that link. – Joshua Ulrich Apr 04 '18 at 13:53
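For reference, the bit64 route mentioned in the comments might look like the sketch below. It assumes the bit64 package is installed (the code is skipped otherwise) and uses its `as.integer64()` and `as.bitstring()` functions. The example value is arbitrary.

```r
# Assumes bit64 is installed; the block is a no-op otherwise.
if (requireNamespace("bit64", quietly = TRUE)) {
  # 2^53 + 1 cannot be stored exactly in a double, so it is passed as a string
  big <- bit64::as.integer64("9007199254740993")
  print(bit64::as.bitstring(big))  # one character per bit
}
```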
6

Have a look at the R.utils package - there you have a function called intToBin...

http://rss.acs.unt.edu/Rdoc/library/R.utils/html/intToBin.html

Chris
5

This function will take a decimal number and return the corresponding binary sequence, i.e. a vector of 1's and 0's

dectobin <- function(y) {
  # find the binary sequence corresponding to the decimal number 'y'
  stopifnot(length(y) == 1, mode(y) == 'numeric')
  q1 <- (y / 2) %/% 1
  r <- y - q1 * 2
  res = c(r)
  while (q1 >= 1) {
    q2 <- (q1 / 2) %/% 1
    r <- q1 - q2 * 2
    q1 <- q2
    res = c(r, res)
  }
  return(res)
}
petew
2

Try »binaryLogic«

library(binaryLogic)

ultimate_question_of_life_the_universe_and_everything <- as.binary(42)

summary(ultimate_question_of_life_the_universe_and_everything)
#>   Signedness  Endianess value<0 Size[bit] Base10
#> 1   unsigned Big-Endian   FALSE         6     42

> as.binary(0:3, n=2)
[[1]]
[1] 0 0

[[2]]
[1] 0 1

[[3]]
[1] 1 0

[[4]]
[1] 1 1
lemon
1

--originally added as an edit to @JoshuaUlrich's answer since it's entirely a corollary of his and @nico's; he suggested I add a separate answer since it introduces a package outside his ken--

Since @JoshuaUlrich's answer is so functional (6 back-to-back functions), I find the pipe (%>%) operator of magrittr/tidyverse makes the following solution more elegant:

library(magrittr)

intToBits(12) %>% rev %>% as.integer %>% paste(collapse = '')
# [1] "00000000000000000000000000001100"

We can also add one final as.integer call to truncate all those leading zeros:

intToBits(12) %>% rev %>% as.integer %>% paste(collapse = '') %>% as.integer
# [1] 1100

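(note, of course, that the result is again stored as an integer, meaning R treats it as 1100 in base 10, not as 12 in base 2; the trick also only works for short bit strings, since anything longer than 10 digits exceeds R's integer range and yields NA)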
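If the goal is only to drop the leading zeros while keeping a string, a regular expression may be a safer alternative to the final as.integer call. This is a minimal sketch; the helper name `strip_leading_zeros` is made up for illustration.

```r
# Hypothetical helper: remove leading zeros but always keep at least one digit.
strip_leading_zeros <- function(bits) sub("^0+(?=.)", "", bits, perl = TRUE)

full <- paste(as.integer(rev(intToBits(12))), collapse = "")
strip_leading_zeros(full)  # "1100"
```

The lookahead `(?=.)` means an all-zero input comes back as "0" rather than as an empty string.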

Note that @ramanudle's (and others', notably @russellpierce, who gives a C++ implementation) approach is often the standard suggested in low-level languages as it's quite an efficient approach (and it works for any number that can be stored in R, i.e., not limited to integer range).

Also worth mentioning that the C implementation of intToBits is remarkably straightforward -- see https://en.wikipedia.org/wiki/Bitwise_operations_in_C for the parts that may be unfamiliar to R-only users

MichaelChirico
0

Here's a recursive function that converts a positive integer to any base from 2 to 10. The function works by repeatedly dividing by the base and converting the quotient to the target base, by calling itself. The digits of the answer are the remainders of each division along the way.

convertBase <- function(x, base=2L, g="") {
  if (x < 1) return(g)
  convertBase( x %/% base, base, paste0(x %% base, g) )
}

For example, convertBase(545,6) will first divide 545 by 6, giving 90 remainder 5. So "5" is the rightmost digit, and then the function calls convertBase(90,6,"5") which divides 90 by 6 giving 15 remainder 0. Thus "0" is the next digit (moving left), and the function calls convertBase(15,6,"05") which divides 15 by 6 giving 2 remainder 3, so the next digits (again moving left) are "3" and finally "2", returning "2305". The default base is 2 (binary); for example convertBase(12) gives "1100".

If x is 0 (or negative), the function returns "". If x is not integral, the function won't work. If you need to convert to a base larger than 10, the function as I've presented it won't work but it is not hard to adapt.
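One possible adaptation for larger bases is sketched below. It is an iterative illustration, not the recursive function above; the name `convertBaseN` and the choice of 0-9 followed by A-Z as digit symbols (bases up to 36) are assumptions.

```r
# Hypothetical adaptation: non-negative whole number -> string in base 2..36.
convertBaseN <- function(x, base = 2L) {
  stopifnot(length(x) == 1, x >= 0, x == floor(x), base >= 2, base <= 36)
  symbols <- c(0:9, LETTERS)  # "0".."9", "A".."Z"
  if (x == 0) return("0")
  out <- character(0)
  while (x > 0) {
    out <- c(symbols[x %% base + 1], out)  # remainder selects the next digit, moving left
    x <- x %/% base
  }
  paste(out, collapse = "")
}

convertBaseN(545, 6)   # "2305"
convertBaseN(255, 16)  # "FF"
```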

Montgomery Clift
-2
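decimal.number <- 5
i <- 1  # R vectors are 1-indexed; starting at 0 silently drops the first bit
result <- numeric()
while (decimal.number > 0) {
  remainder <- decimal.number %% 2  # current least significant bit
  result[i] <- remainder
  decimal.number <- decimal.number %/% 2
  i <- i + 1
}
# bits were collected least significant first, so reverse before joining
paste(rev(result), collapse = "")
# [1] "101"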
Kenzo_Gilead
  • While this code may answer the question, providing additional context regarding **how** and **why** it solves the problem would improve the answer's long-term value. – Alexander Feb 11 '18 at 17:01