3

I am trying to get a Fibonacci sequence of 5 million elements.

This code aborts abnormally, when I pass 1000 as a parameter.

def self.fibo_seq(limit)
  result_array = [0,1]
  return result_array if limit < 2
   while result_array.length <= limit
     result_array << result_array[-1] + result_array[-2]
   end
  return result_array
end
res= Multiple.fibo_seq(5_000_000)
print res

Error: [1]    22382 killed     ruby fibo.rb

Example output:

# >> [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, , 1...] upto 5 Million elements
Santosh Mohanty
  • 468
  • 1
  • 6
  • 23
  • 1
    Overflow, most likely. You can't just have huge numbers in variables in most languages. – pulsejet Mar 15 '17 at 19:06
  • 1
    @RadialApps: in ruby you can. No problems whatsoever. – Sergio Tulentsev Mar 15 '17 at 19:06
  • @SergioTulentsev, you're right I guess (just looked it up). My second guess then would be insufficient memory for allocation for a bigger data type when it gets big enough. – pulsejet Mar 15 '17 at 19:11
  • 2
    Huge numbers are expensive to store in memory, run your program and open another tab and run `top -o mem`. – Anthony Mar 15 '17 at 19:12
  • I'm seeing really strange results where strings of the same bytesize as it's Numeric equivalent are keeping memory overhead at 1/40th the size... – Anthony Mar 15 '17 at 19:44
  • 1
    The code should handle 1000 correctly. BTW, `50_00_000` is 5,000,000 and should be written `5_000_000` if you're going to use that notation. The details are important when programming. – the Tin Man Mar 15 '17 at 20:42

4 Answers4

4

Storing the first 5000000 Fibonacci numbers using YARV's implementation of Integer uses exactly 1084762047712 bytes (assuming 8 bits per byte) on a 64 bit platform. That's close to one TiByte (0.9865853351 TiByte, to be precise). And that's just the space for the numbers themselves, there's also the overhead for the array (a couple of bytes) and the pointers inside the array (a little bit less than 5000000 times 8, or a little over 38 MiByte).

Computing those 5000000 numbers even without storing them (only remembering the last 2 to avoid re-computation), took a little over 20 minutes on my late 2011 model MacBook Pro. Computing them while at the same time allocating 1 TiByte of RAM is going to be much slower. If you don't have 1 TiByte of RAM, and the OS starts swapping out to disk, it is going to be orders of magnitude slower, even if you have a blazing RAID of SSDs connected via FibreChannel.

In order to print the array, it needs to be represented as a string first. Even just the commas and spaces without the numbers are already 4999999*2 characters, which need close to 10 MiByte of RAM (assuming a single-byte character set). If you tried to print out just the commas and spaces, you would need about 2500 pages of DIN A4 paper, or 1250 sheets if you print double-sided. Office paper is typically sold in stacks of 500 sheets which are roughly 5 cm high, so you have 2.5 stacks about 12.5 cm high just for the commas and spaces.

The total number of digits, and thus characters (and bytes) for the 5000000 numbers, is roughly 2.7 trillion digits, that's about 2.5 TiByte of RAM for the final string to be printed. Printing that out on DIN A4 double-sided would result in a stack of paper 33 km high, 4 times the height of Mt. Everest.

All in all, at the point that you are calling print, your program needs about 3.5 TiByte of RAM.

Printing to the console is actually surprisingly slow, on my standard macOS Terminal.app, I get about 1 MiByte/s, which means that not only will calculating the 5000000 numbers take at least tens of minutes without even counting the time for allocating all those objects and all that RAM, not only will your program use 3.5 TiByte of RAM, just the act of displaying the final array on the terminal will take about one month.

tl;dr summary: 5000000 Fibonacci numbers are big.

Jörg W Mittag
  • 363,080
  • 75
  • 446
  • 653
2

The problem with this program is likely the memory constraints. But do you really need all those numbers? If yes, then you better get more hardware.

Otherwise, if you need just the five millionth number in the sequence, you could greatly speed up your program by storing just the two last numbers.

The final step of improvement: calculating an arbitrary member of fibonacci sequence in constant time! - "Find The Millionth Fibonacci in Java".

Community
  • 1
  • 1
Sergio Tulentsev
  • 226,338
  • 43
  • 373
  • 367
1

Generating a Fibonacci sequence of 5M is one problem because of the memory and time involved.

After generating it the next question becomes reusing those results so you don't have to do it twice. Even if it'd fit, storing the sequence in memory is silly since the code or machine crashing will force regenerating the values, and if you need 5,000,000 you could wait a long time before the application is ready to do something useful, so put them on disk, either in a flat file, or into a database where you can retrieve just the specific value you need relatively quickly.

Here's simple code to generate a flat file that I tested up to 25,000 before I got bored and stopped it. It seemed to do fine for that test but I imagine it'll slow down as Ruby shuffles things around. What the upper limit is I don't know and lack the patience to find out.

limit = ARGV.shift.to_i

puts "#{limit} iterations"

File.open('fibonacci.out', 'w') do |fo|
  ary = [0, 1]
  fo.puts ary
  break if limit < 2

  (limit - ary.length).times do |i|
    next_nbr = ary[-1] + ary[-2]
    ary.shift
    ary.push(next_nbr) 

    fo.puts next_nbr
    print 2 + i, "\r"
  end

  puts
end

You might gain a little speed by getting rid of ary.

Running that with

ruby test.rb 5

resulted in "fibonacci.out" containing:

0
1
1
2
3

which seems about right.

There are Fibonacci generators for databases but if they're recursive you'll take out your DBM eventually as it tries to generate big numbers so using a simple generator then storing the values into a table seems more reasonable.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
0

That is probably better than the alternative. The 5 millionth fibonacci number would have about a million digits. Ignoring the time to calculate it, storing all of those would take a terabyte of memory and at least another 2 terabytes of memory or storage for the output.

The bottom line, if you want to do this you cannot do it on your average desktop machine, nor should you do it in Ruby.

For those who asked how I got the number:

According to Wikipedia https://en.wikipedia.org/wiki/Fibonacci_number#Magnitude the number of digits is approximately 0.2090 times n, so for 5 the millionth number that would be about a million digits. I haven't looked closely at Ruby's BigNumber implementation but I assumed 2 digits per byte which is the simplest representation for decimal arithmetic. You could pack a few more bits in (3 digits in 10 bits instead of 2 digits in 8) but it does not change the results that much here.

For the whole array I just used the standard formula for the sum of an arithmetic series. (n/2)*(a0 + an): 5,000,000 / 2 * 1,000,000 or 2.5e12 digits. At 2 digits per byte, that would be about a terabyte of internal memory (Not counting the overhead that Ruby is adding with its internal structure and indirection).

If you print it out or store it, you can count on 1 digit per byte (in UTF-8) so that would take 2.5 terabytes, not counting the 4,999,999 commas.

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
Marc Rohloff
  • 1,332
  • 7
  • 8
  • How do you calculate 1-2 TB? Not saying you're wrong, but what's the math behind that figure? – Glyoko Mar 15 '17 at 19:28
  • @Glyoko: the length of a number representation in base `b` is roughly `log_b(n)`. The conversion factor between decimal and binary is roughly 3 to 10 (10^3 == 1000, 2^10 == 1024). fib(5000000) is roughly 7*10^1044000, so over a million decimal digits, which translates to roughly 3.47 million bits, or 423 KiByte. Not sure where the 1 TiByte number comes from. I think Marc refers to the size of the whole array. I'm just in the process of crunching the numbers myself. – Jörg W Mittag Mar 15 '17 at 19:52
  • I think you are confusing Ruby's `BigDecimal` which is an arbitrary precision **decimal** type with Ruby's `Integer` which is an arbitrary size **binary** integer type. `Integer` uses exactly 8 (binary) digits per byte, or about 2.5 decimal digits. However, like you wrote yourself, that doesn't have a big influence on your argument, though. – Jörg W Mittag Mar 15 '17 at 21:51
  • Another fun fact: displaying the array on the terminal will take about one month on my machine (late 2011 MacBook Pro with the standard macOS Terminal.app). – Jörg W Mittag Mar 15 '17 at 22:14
  • Jörg, no I meant `Bignum`, even though Ruby 2.4 removed it the implementation is still there behind the scenes. You are right though it is implemented as a bitstring. – Marc Rohloff Mar 16 '17 at 01:32