1

For an algorithm I am currently implementing I need to handle a previous step I am not completely sure it is computationally tractable. This step requires to generate all binary words of length n, for an arbitrary n (it can be big, but in practice shouldn't be bigger than 50). If I remember well, this has exponential complexity (O(2^n)), which is no good.

A naive recursive implementation might be the following:

def gen(n: Int, acc: List[String]) : List[String] = {
  if (n == 0) {
    acc
  } else {
    if (acc.length == 0) {
      gen(n - 1, List("0", "1"))
    } else {
      gen(n - 1, (for (i <- acc) yield i ++ "0") ++ (for (j <- acc) yield j ++ "1"))
    }
  }
}
gen(4, List())  //List(0000, 1000, 0100, 1100, 0010, 1010, 0110, 1110, 0001, 1001, 0101, 1101, 0011, 1011, 0111, 1111)

This works fine for small n's and quickly freezes my computer as n increases.

This problem can also be seen as obtaining the binary representation of all natural numbers [0,2^n - 1], which can be easily parallelizable, but this does not work anyway for big values of n, since the number of elements is huge.

Even if this were possible, another problem is that most data structures have a limit size (Int.MaxValue for most of them), but that is another story :)

Does this problem actually have a solution?

jarandaf
  • 4,297
  • 6
  • 38
  • 67
  • 1
    Possible duplicate of [Fastest way to generate all binary strings of size n into a boolean array?](http://stackoverflow.com/questions/8374079/fastest-way-to-generate-all-binary-strings-of-size-n-into-a-boolean-array) – j_random_hacker Feb 23 '16 at 15:34
  • This question has probably hundreds of duplicates. Please search before asking. – j_random_hacker Feb 23 '16 at 15:34
  • My question is whether it exists an efficient solution to this problem. All "possible" duplicates were asking for a solution (not worrying about efficiency or big values of n) – jarandaf Feb 23 '16 at 15:37
  • 2
    Obviously the size of the solution is 2^N for N bits, there is no way around that. – Jesper Feb 23 '16 at 15:38
  • 4
    Generating n things cannot take less than n units of time, and if you want to store them all, less than n units of space. There are exactly 2^n things (binary sequences of length n). How can a more efficient "solution" possibly exist? – j_random_hacker Feb 23 '16 at 15:38
  • 1
    "this does not work anyway for big values of n, since the number of elements is huge." Any way of producing the answer you want will be huge for big n. So a better question is "why do you need this?" -since your current question is *exactly* the same as "how do I produce a list of strings of the binary representation of all integers from 0 to 1 << n, any order will do?" – The Archetypal Paul Feb 23 '16 at 15:47

3 Answers3

1

Since scala apparently supports BigInteger - I haven't got the slightest clue about coding in Scala - you can simply use it to represent those words. The rest is pretty simple:
All binary words of length n are in [0 , 1 << n). Simply start with 0 as starting-value and increment:

for (bint <- 0L until 1L << n)
    process(bint)

or

0L until (1L<<n) foreach process

This will produce all words that match ordered lexicographically.
The more important question: if n = 40, you'll already wind up with 2^40 words. Even if you only used 40 bits = 5 bytes per word, you'd wind up with a total of 5TB of data. I doubt you'll be capable of handling that amount of data. There should be a better approach than generating that list.

The Archetypal Paul
  • 41,321
  • 20
  • 104
  • 134
  • The Scala syntax for `for` loops is not the same as in Java / C / C++; your example is not syntactically valid in Scala. – Jesper Feb 23 '16 at 15:35
  • @Jesper as I've already written (1. sentence btw. -.-): I don't have any idea about scala. I know it's a JVM-language and that's it. I haven't written a single line of scala-code in my life. That code is just supposed to show how these words can be generated more easily. Translating this into scala will be OPs job. –  Feb 23 '16 at 15:37
1

You can use a Stream:

The class Stream implements lazy lists where elements are only evaluated when they are needed.

Create a Stream for your scenario is pretty straightforward:

def numbers(n: BigInt): Stream[BigInt] = n #:: numbers(n + 1)

You can then use take to get/generate only the n first numbers:

val stream = numbers(0).take(n)

And then transform it to your binary String representation:

 val stream = numbers(0).take(10).map(_.toString(2))

This will also return a Stream. After that, you can do whatever you need with the stream, per instance:

 stream.foreach(println)

Not sure about the performance implications, but it is another alternative you can try.

marcospereira
  • 12,045
  • 3
  • 46
  • 52
0

You could use a Range to do it. If Integers are not enough to represent, use NumericRange over Long or BigInt:

val numbers = NumericRange[BigInt](0,1000,1)

You can convert a number to a binary string with .toString(2) when you need it. And you can convert the Range to a List or whatever you want.

It would be good to know however what you need this for. I cannot think of a good reason to iterate over that many numbers or even store them. If this is part of a solution to another problem, there is probably a better way to approach it.

dth
  • 2,287
  • 10
  • 17