I want to write a program that operates on a very large array and performs many random-access reads and writes. I figured the vector package is the most suitable way to do this in Haskell, so I wrote a simple program to test its performance:

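import Data.Int
import qualified Data.Vector.Unboxed.Mutable as UM

-- Number of Int32 elements to allocate (about 4 GB).
n :: Int
n = 1000000000

main = do
    -- Allocate a mutable unboxed vector of n elements.
    a <- UM.new n
    -- Read back a single element.
    UM.read a 42 :: IO Int32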

However, when I ran it, it failed with a segmentation fault:

$ ghc -O2 test.hs
$ ./test
Segmentation fault (core dumped)

This machine has more than enough memory for the array. Here is the output of free -h:

             total       used       free     shared    buffers     cached
Mem:          251G       150G       100G       672K       419M       141G
-/+ buffers/cache:       9.2G       242G
Swap:         255G       870M       255G

Is this because Haskell's vector package cannot handle very large arrays? Can I modify my code so that it works on large arrays without compromising performance too much?


Edit: My GHC version is 7.10.2.20150906, and my vector version is 0.11.0.0. This is on a 64-bit Linux machine, with

> maxBound :: Int
9223372036854775807
xzhu
  • `free -m` would be more readable. – Sibi Sep 18 '15 at 06:41
  • If you change `n` to be small does it still yield a segmentation fault? – Bakuriu Sep 18 '15 at 07:03
  • @Bakuriu: No, at least not on my machine. – Zeta Sep 18 '15 at 07:03
  • I would report it as a bug. FWIW `n = 536870912` is the smallest value for which the program segfaults on my machine (OS X - 64bit GHC 7.10.2). – ErikR Sep 18 '15 at 07:05
  • @user5402 Which is exactly `2**29`. Looks like the package cannot handle vectors bigger than that. – Bakuriu Sep 18 '15 at 07:09
  • Not reproducible here. OS-specific? – n. m. could be an AI Sep 18 '15 at 07:14
  • vector seems to hold index in `Int`: http://stackoverflow.com/questions/3429291/haskell-int-and-integer – ymonad Sep 18 '15 at 07:25
  • On the machine I am on right now, `maxBound :: Int` gives 9223372036854775807, so here `Int` is 64 bits. Maybe the OP has installed a 32-bit version of GHC on a 64-bit OS? – chi Sep 18 '15 at 08:38
  • Interesting, I get the segfault on Ubuntu 14.04 when compiled with GHC-7.10.1, but not with GHC-7.8.2. – leftaroundabout Sep 18 '15 at 10:03
  • Could you un-accept my answer? It's clearly wrong, though it was the first explanation that came to my mind, so I want to delete it (I cannot if it is accepted...). I did some testing and I can reproduce leftaroundabout's behaviour: ghc 7.8 doesn't segfault. You should provide specific details about the version of ghc you are using. Also, if you can, try installing a different ghc version and see if that segfaults too. – Bakuriu Sep 18 '15 at 13:32
  • No idea whether this is related but it looks like ghc 7.10.1 *did* change *something* in vector. See [this issue](https://ghc.haskell.org/trac/ghc/ticket/10800) for example. Maybe the time for compilation wasn't the only thing that changed. – Bakuriu Sep 18 '15 at 13:36
  • The version of GHC doesn't matter for me, only the version of vector. vector-0.11.0.0 crashes under either 7.8.4 or 7.10.1, vector-0.10.12.2 works under either. I reported it at https://github.com/haskell/vector/issues/98 – Reid Barton Sep 18 '15 at 14:26
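
For reference, the sizes discussed in the comments can be worked out with a few lines of Haskell. This is only arithmetic on the numbers quoted above (the names `thresholdElems`, `questionElems` and `bytesFor` are made up for this sketch). It shows that the smallest failing length reported, 2^29 `Int32` elements, corresponds to exactly 2^31 bytes, one more than the largest signed 32-bit value. That observation alone does not establish the cause of the crash.

```haskell
import Data.Int (Int32)
import Foreign.Storable (sizeOf)

-- Smallest failing length reported in the comments (2^29 elements).
thresholdElems :: Integer
thresholdElems = 536870912

-- Length used in the question.
questionElems :: Integer
questionElems = 1000000000

-- Bytes needed to store the given number of Int32 elements.
bytesFor :: Integer -> Integer
bytesFor elems = elems * fromIntegral (sizeOf (undefined :: Int32))

main :: IO ()
main = do
    print (bytesFor thresholdElems)            -- 2147483648
    print (bytesFor thresholdElems == 2 ^ 31)  -- True
    print (toInteger (maxBound :: Int32))      -- 2147483647
    print (bytesFor questionElems)             -- 4000000000
```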

1 Answer

This is due to a bug in the primitive package that is apparently fixed in the recently released primitive-0.6.1.0. I suggest you add a lower bound on primitive (>= 0.6.1.0) to your project accordingly.
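
As a rough sketch of such a bound, assuming the project is built with Cabal (the question does not say how it is built), the `build-depends` field of the project's `.cabal` file might look like the fragment below. Only the `primitive` bound comes from this answer; the other entries are placeholders for whatever the project already depends on.

```cabal
build-depends:
    base,
    vector,
    primitive >= 0.6.1.0
```

Listing `primitive` explicitly, even though the program only imports modules from `vector`, is what lets the build tool rule out the older, affected releases.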

Reid Barton