137

Even trivially small Haskell programs turn into gigantic executables.

I've written a small program that GHC compiled to a binary exceeding 7 MB!

What can cause even a small Haskell program to be compiled to such a huge binary?

What, if anything, can I do to reduce this?

Danubian Sailor
  • 3
    Have you tried just stripping it? – Fred Foo May 24 '11 at 19:11
  • 24
    Run the program `strip` on the binary to remove the symbol table. – Fred Foo May 24 '11 at 19:20
  • 1
    @tm1rbt: Run `strip test`. This command removes some debug information from the program and makes it smaller. – fuz May 24 '11 at 19:20
  • 8
    As an aside your data types in the 3D math library should be stricter for performance reasons: `data M3 = M3 !V3 !V3 !V3` and `data V3 = V3 !Float !Float !Float`. Compile with `ghc -O2 -funbox-strict-fields`. – Don Stewart May 24 '11 at 19:24
  • 9
    This post is discussed on [meta](http://meta.stackoverflow.com/questions/270881/what-to-do-with-a-highly-voted-rotten-link-only-question). – Patrick Hofman Sep 07 '14 at 21:32

2 Answers

226

Let's see what's going on. Try:

  $ du -hs A
  13M   A

  $ file A
  A: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), 
     dynamically linked (uses shared libs), for GNU/Linux 2.6.27, not stripped

  $ ldd A
    linux-vdso.so.1 =>  (0x00007fff1b9ff000)
    libXrandr.so.2 => /usr/lib/libXrandr.so.2 (0x00007fb21f418000)
    libX11.so.6 => /usr/lib/libX11.so.6 (0x00007fb21f0d9000)
    libGLU.so.1 => /usr/lib/libGLU.so.1 (0x00007fb21ee6d000)
    libGL.so.1 => /usr/lib/libGL.so.1 (0x00007fb21ebf4000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fb21e988000)
    libm.so.6 => /lib/libm.so.6 (0x00007fb21e706000)
    ...      

You see from the ldd output that GHC has produced a dynamically linked executable, but only the C libraries are dynamically linked! All the Haskell libraries are copied in verbatim.

Aside: since this is a graphics-intensive app, I'd definitely compile with ghc -O2.

There are two things you can do.

Stripping symbols

An easy solution is to strip the binary:

$ strip A
$ du -hs A
5.8M    A

Strip discards symbols from the object file. They are generally only needed for debugging.
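
To see how much stripping would save without giving up the unstripped binary, one option is to strip a temporary copy and compare the sizes. This is only a minimal sketch, assuming a POSIX shell with `strip` (from binutils) on the PATH; the helper names `size_bytes` and `strip_savings` are made up for illustration and are not part of any tool:

```shell
#!/bin/sh
# size_bytes FILE: print the size of FILE in bytes (hypothetical helper).
size_bytes() {
    wc -c < "$1" | tr -d '[:space:]'
}

# strip_savings FILE: strip a temporary copy of FILE and report both sizes.
# The original file is left untouched (hypothetical helper).
strip_savings() {
    copy=$(mktemp) || return 1
    cp "$1" "$copy" || { rm -f "$copy"; return 1; }
    before=$(size_bytes "$copy")
    if strip "$copy"; then
        after=$(size_bytes "$copy")
        echo "before: $before bytes, after: $after bytes"
        rm -f "$copy"
    else
        rm -f "$copy"
        return 1
    fi
}
```

With these functions loaded into the shell, `strip_savings A` reports the byte counts for the executable from the example, leaving `A` itself unstripped.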

Dynamically linked Haskell libraries

More recently, GHC has gained support for dynamic linking of both C and Haskell libraries. Most distros now distribute a version of GHC built to support dynamic linking of Haskell libraries. Shared Haskell libraries may be shared amongst many Haskell programs, without copying them into the executable each time.

At the time of writing, Linux and Windows are supported.

To link the Haskell libraries dynamically, compile your program with -dynamic, like so:

 $ ghc -O2 --make -dynamic A.hs

Also, any libraries you want to be shared should be built with --enable-shared:

 $ cabal install opengl --enable-shared --reinstall     
 $ cabal install glfw   --enable-shared --reinstall

And you'll end up with a much smaller executable that has both its C and Haskell dependencies resolved dynamically.

$ ghc -O2 -dynamic A.hs                         
[1 of 4] Compiling S3DM.V3          ( S3DM/V3.hs, S3DM/V3.o )
[2 of 4] Compiling S3DM.M3          ( S3DM/M3.hs, S3DM/M3.o )
[3 of 4] Compiling S3DM.X4          ( S3DM/X4.hs, S3DM/X4.o )
[4 of 4] Compiling Main             ( A.hs, A.o )
Linking A...

And, voilà!

$ du -hs A
124K    A

which you can strip to make even smaller:

$ strip A
$ du -hs A
84K A

An eensy weensy executable, built up from many dynamically linked C and Haskell pieces:

$ ldd A
    libHSOpenGL-2.4.0.1-ghc7.0.3.so => ...
    libHSTensor-1.0.0.1-ghc7.0.3.so => ...
    libHSStateVar-1.0.0.0-ghc7.0.3.so =>...
    libHSObjectName-1.0.0.0-ghc7.0.3.so => ...
    libHSGLURaw-1.1.0.0-ghc7.0.3.so => ...
    libHSOpenGLRaw-1.1.0.1-ghc7.0.3.so => ...
    libHSbase-4.3.1.0-ghc7.0.3.so => ...
    libHSinteger-gmp-0.2.0.3-ghc7.0.3.so => ...
    libHSghc-prim-0.2.0.0-ghc7.0.3.so => ...
    libHSrts-ghc7.0.3.so => ...
    libm.so.6 => /lib/libm.so.6 (0x00007ffa4ffd6000)
    librt.so.1 => /lib/librt.so.1 (0x00007ffa4fdce000)
    libdl.so.2 => /lib/libdl.so.2 (0x00007ffa4fbca000)
    libHSffi-ghc7.0.3.so => ...
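
A quick way to check whether the Haskell libraries really are dynamically linked is to count the `libHS*` entries in the `ldd` output. The following is a minimal sketch, assuming a POSIX shell; `count_hs_libs` is a hypothetical helper name, and it reads `ldd` output on standard input so it can also be tried on saved output:

```shell
#!/bin/sh
# count_hs_libs: read `ldd` output on stdin and print the number of lines
# that mention a Haskell shared library (names containing "libHS").
# grep -c exits non-zero when the count is 0, so `|| true` keeps the
# function's exit status successful in that case (hypothetical helper).
count_hs_libs() {
    grep -c 'libHS' || true
}
```

Used as `ldd A | count_hs_libs`, it prints 0 when no Haskell libraries are dynamically linked, and a positive count for an executable built with -dynamic.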

One final point: even on systems with static linking only, you can use -split-objs to get one .o file per top-level function, which can further reduce the size of statically linked libraries. It requires GHC itself to be built with -split-objs enabled, which some systems forget to do.

Don Stewart
  • 7
    when is dynamic linking due to arrive for ghc on the mac? – Carter Tazio Schonwald May 25 '11 at 00:14
  • 1
    ...doesn't `cabal install` strip the installed binary by default? – hvr May 25 '11 at 11:34
  • 1
    doing so on Windows seems to make the resulting file un-runnable, it complains about missing libHSrts-ghc7.0.3.dll – is7s May 26 '11 at 11:10
  • Apparently the DLLs for the shared libraries are kept within the original package folders of these libraries. Is there any way to force ghc to store these libraries in a certain location, such as "C:\Windows\System32" for example? – is7s May 27 '11 at 19:03
  • @hvr Good question! And how can we do the same with `cabal`, without ghc (for example [Snap](http://snapframework.com/) projects are built with `cabal install`)? – Andriy Drozdyuk Jul 04 '12 at 01:19
  • I keep getting the `Could not find module \`Prelude' Perhaps you haven't installed the "dyn" libraries for package \`base'?` error message... – Andriy Drozdyuk Jul 29 '12 at 05:56
  • @is7s To get this to work on Windows, the DLL files must be in your path, so either you can copy all the DLLs to an existing PATH location, or set PATH to point to all the locations (there are a slew of DLLs in a lot of folders). I think it's a lot easier on Unix as the shared object files are installed into standard locations. – mydoghasworms Feb 01 '13 at 12:58
  • 3
    will this binary be working on other Linux machines after these procedures? – Incerteza Sep 05 '14 at 16:21
  • 1
    Hi OP from 2011! I'm from the future and can tell you that the pandoc executable on Ubuntu 16.04 is a fat 50 MB, and it's not going to change, based on http://packages.ubuntu.com/zesty/pandoc . Message to near-future self and others: contact the package maintainer and ask if `enable-shared` was considered. https://launchpad.net/ubuntu/+source/pandoc/+bugs – Stéphane Gourichon Jan 23 '17 at 12:39
  • 1
    For those on macs, the equivalent of `ldd` appears to be `otool -L` – James McMahon Apr 13 '17 at 02:00
13

GHC links Haskell libraries statically by default. That is, the entire OpenGL bindings are copied into your program. As they are quite big, your program gets unnecessarily inflated. You can work around this by using dynamic linking, although it isn't enabled by default.

fuz
  • 5
    You can dynamically link libraries to work around this. Not sure why it matters what is default, the flag is simple enough. – Thomas M. DuBuisson May 24 '11 at 19:47
  • 5
    The problem is that "any libraries you want to be shared should be built with `--enable-shared`", so if your Haskell Platform comes with libraries built without `--enable-shared`, you have to recompile the base libraries, which can be quite painful. – nponeccop Oct 04 '12 at 13:16