How come Haskell got a segmentation fault when the vector is very large but under memory limit?

Question

I wanted to write a program that operates on a really large array, and does a lot of random access read/write operations. I figured vector is the most suitable way of doing it in Haskell, so I wrote a simple program to test its performance:

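import Data.Int
import qualified Data.Vector.Unboxed.Mutable as UM

-- Number of Int32 elements to allocate (about 4 GB of element data).
n :: Int
n = 1000000000

main = do
    a <- UM.new n             -- allocate a mutable unboxed vector of n elements
    UM.read a 42 :: IO Int32  -- read back a single element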

However, when I ran it, it failed with a segmentation fault:

$ ghc -O2 test.hs
$ ./test
Segmentation fault (core dumped)

This machine has more than enough memory for the array. Here is the output of free -h:

             total       used       free     shared    buffers     cached
Mem:          251G       150G       100G       672K       419M       141G
-/+ buffers/cache:       9.2G       242G
Swap:         255G       870M       255G

Was it because Haskell's vector package cannot handle very large arrays? Can I modify my code so that it can work on large arrays without too much performance compromise?

Edit: My GHC version is 7.10.2.20150906, and my vector version is 0.11.0.0. This is on a 64-bit Linux machine, with

> maxBound :: Int
9223372036854775807

If you change `n` to be small does it still yield a segmentation fault? — Bakuriu, Sep 18 '15 at 07:03
I would report it as a bug. FWIW `n = 536870912` is the smallest value for which the program segfaults on my machine (OS X - 64bit GHC 7.10.2). — ErikR, Sep 18 '15 at 07:05
@user5402 Which is exactly `2**29`. Looks like the package cannot handle vectors bigger than that. — Bakuriu, Sep 18 '15 at 07:09
vector seems to hold index in `Int`: http://stackoverflow.com/questions/3429291/haskell-int-and-integer — ymonad, Sep 18 '15 at 07:25
On the machine I am on right now, `maxBound :: Int` gives 9223372036854775807, so here `Int` is 64 bits. Maybe the OP has installed a 32 bit version of GHC on a 64 bit OS? — chi, Sep 18 '15 at 08:38
Interesting, I get the Segfault on Ubuntu 14.04 when compiled with GHC-7.10.1, but not with GHC-7.8.2. — leftaroundabout, Sep 18 '15 at 10:03
Could you un-accept my answer? It's clearly wrong, though it was the first explanation that came to my mind, so I want to delete it (cannot if it is accepted...). I did some testing and I can reproduce leftaroundabout's behaviour: ghc 7.8 doesn't segfault. You should provide specific details about the version of ghc you are using. Also, if you can, try installing a different ghc version and see if that segfaults too. — Bakuriu, Sep 18 '15 at 13:32
No idea whether this is related but it looks like ghc 7.10.1 *did* change *something* in vector. See [this issue](https://ghc.haskell.org/trac/ghc/ticket/10800) for example. Maybe the time for compilation wasn't the only thing that changed. — Bakuriu, Sep 18 '15 at 13:36
The version of GHC doesn't matter for me, only the version of vector. vector-0.11.0.0 crashes under either 7.8.4 or 7.10.1, vector-0.10.12.2 works under either. I reported it at https://github.com/haskell/vector/issues/98 — Reid Barton, Sep 18 '15 at 14:26
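
For reference, the sizes mentioned in the question and comments can be turned into byte counts. The sketch below is only arithmetic, not a diagnosis: `bytesFor` is a hypothetical helper, it counts element data only (no array header), and it assumes a 64-bit `Int` as reported in the question. It shows that the requested array needs about 4 GB, well under the free memory shown, and that the smallest failing `n` reported above corresponds to exactly 2^31 bytes.

```haskell
import Data.Int (Int32)
import Foreign.Storable (sizeOf)

-- Hypothetical helper: bytes of element data for n unboxed Int32 values.
-- Ignores any array header; assumes Int is 64 bits wide.
bytesFor :: Int -> Int
bytesFor n = n * sizeOf (undefined :: Int32)

main :: IO ()
main = do
    print (bytesFor 1000000000)                    -- n from the question: 4000000000
    print (bytesFor 536870912)                     -- smallest failing n: 2147483648
    print (bytesFor 536870912 == 2 ^ (31 :: Int))  -- True
```

Whether that 2^31 boundary is actually what triggers the crash is not established by the comments; it is just where the reported threshold lands.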

score 4 · Accepted Answer · answered Sep 22 '15 at 15:45

This is due to a bug in primitive that is apparently fixed in the recently-released primitive-0.6.1.0. I suggest you add a lower bound on primitive to your project accordingly.
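
A lower bound like this would normally go in the `build-depends` field of the project's `.cabal` file. The fragment below is a minimal sketch: the executable name `test` and the other fields are assumptions based on the `test.hs` file in the question, and only the `primitive >= 0.6.1.0` constraint comes from this answer.

```cabal
-- Hypothetical stanza; only the primitive bound is taken from the answer.
executable test
  main-is:          test.hs
  build-depends:    base
                  , vector
                  , primitive >= 0.6.1.0
  default-language: Haskell2010
```

With this in place, the build tool should refuse to pick an older primitive when resolving dependencies.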

Reid Barton
