
I wrote a small C module to improve performance, but GHC doesn't inline foreign functions, and the call overhead cancels out the speedup. For example, test.h:

int inc (int x);

test.c:

#include "test.h"
int inc(int x) {return x + 1;}

Test.hs:

{-# LANGUAGE ForeignFunctionInterface #-}
module Test (inc) where
import Foreign
import Foreign.C
foreign import ccall unsafe "test.h inc" c_inc :: CInt -> CInt
inc = fromIntegral . c_inc . fromIntegral
{-# INLINE c_inc #-}
{-# INLINE inc #-}

Main.hs:

import System.Environment
import Test
main = do {args <- getArgs; putStrLn . show . inc . read . head $ args }

Building:

$ gcc -O2 -c test.c
$ ghc -O3 test.o Test.hs
$ ghc --make -O3 test.o Main
$ objdump -d Main > Main.as

In the end, Main.as contains callq <inc> instructions instead of the inlined inc instructions I wanted.

leventov
  • You expect GHC to inline a C function in its generated code? This might work if you use the -fvia-C option; otherwise it's hopeless (since it would require GHC to read the C code and generate code for it). – augustss Jan 07 '13 at 16:05
  • Not possible in the absence of link-time optimisation. One (hacky) approach to try is to compile both Haskell and C to LLVM bitcode, combine the .bc files with `llvm-link`, optimise with `opt` and then emit executable code with `llc`. – Mikhail Glushenkov Jan 07 '13 at 16:07
  • @MikhailGlushenkov, could you sketch the sequence of build commands? I couldn't find out how to obtain `.bc` files from Haskell code. – leventov Jan 07 '13 at 16:23
  • Provide a type for `inc`. Currently you're converting to Integer; is that what you intend? That's going to swamp the FFI overhead (around 900ns per call, IIRC) – Don Stewart Jan 07 '13 at 17:16
  • @leventov `-keep-llvm-files` gives you `.ll` assembly which you can then compile with `llvm-as`. – Mikhail Glushenkov Jan 07 '13 at 17:42

1 Answer


GHC won't inline C code via its asm backend or LLVM backend. Typically you're only going to call into C for performance reasons if the thing you are calling really costs a lot. Incrementing an int isn't such a thing, as we already have primops for that.
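For comparison, here is a minimal sketch of the same increment written in plain Haskell. The name `incPure` is hypothetical and not part of the question's code. With optimisation on, addition on `Int` should compile down to the primitive machine addition, and GHC is free to inline it at call sites:

```haskell
module Main (main) where

-- Hypothetical pure-Haskell counterpart of the question's C `inc`.
-- No FFI call is involved, so GHC can inline it like any other function.
incPure :: Int -> Int
incPure x = x + 1
{-# INLINE incPure #-}

main :: IO ()
main = print (incPure 41)
```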

Now, if you compile via C, you may get GCC to inline things (check the generated assembly).

However, there are already some things you can do to minimize the call cost:

foreign import ccall unsafe "test.h inc" c_inc :: CInt -> CInt

inc = fromIntegral . c_inc . fromIntegral

Provide a type signature for inc. Without one, the type defaults to Integer, and you're paying precious cycles on the conversions.
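A minimal sketch of what an explicit signature looks like. To keep it linkable without the question's test.o, it imports libc's `abs` as a stand-in for `c_inc`; the wrapper name `absInt` is hypothetical. With the signature `Int -> Int`, both `fromIntegral` calls are fixed to conversions between `Int` and `CInt` instead of defaulting to `Integer`:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CInt (..))

-- Stand-in for the question's c_inc: libc's abs, so no extra object
-- file is needed. The import is marked unsafe, as in the question.
foreign import ccall unsafe "stdlib.h abs" c_abs :: CInt -> CInt

-- The explicit signature pins both conversions to Int <-> CInt.
absInt :: Int -> Int
absInt = fromIntegral . c_abs . fromIntegral

main :: IO ()
main = print (absInt (-5))
```

The same signature can be applied to the question's wrapper as `inc :: Int -> Int`.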

Mark the call as "unsafe", as you do, so that the runtime is not bookmarked prior to the call.

Measure the FFI call overhead; it should be in the nanoseconds. However, if you find it still too expensive, you can write a new primop and jump to it directly. But you'd better have your criterion numbers first.
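One way to get those numbers is a small criterion benchmark that compares an unsafe foreign call against a pure Haskell equivalent. This is only a sketch under assumptions: it needs the criterion package installed, it uses libc's `abs` as a stand-in for the question's `inc` so that it links without test.o, and the names `absFFI` and `absPure` are hypothetical:

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Criterion.Main (bench, bgroup, defaultMain, whnf)
import Foreign.C.Types (CInt (..))

-- Stand-in for the question's c_inc (assumption: any cheap C function
-- shows the per-call overhead equally well).
foreign import ccall unsafe "stdlib.h abs" c_abs :: CInt -> CInt

-- Wrapper that goes through the FFI.
absFFI :: Int -> Int
absFFI = fromIntegral . c_abs . fromIntegral

-- Pure Haskell baseline for the same operation.
absPure :: Int -> Int
absPure = abs

main :: IO ()
main = defaultMain
  [ bgroup "abs"
      [ bench "ffi (unsafe ccall)" $ whnf absFFI (-5)
      , bench "pure Haskell"       $ whnf absPure (-5)
      ]
  ]
```

Compile with optimisation (for example `ghc -O2`) and run the resulting binary; the gap between the two reported times is a rough estimate of the cost of one foreign call.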

Don Stewart
  • Actually my "inc" is the set of branchless SSE min-max functions: https://gist.github.com/4476908 – leventov Jan 07 '13 at 17:47
  • Ah, I see: you really do want new primops then. Are you duplicating some of the work in http://hackage.haskell.org/trac/ghc/ticket/3557 ? – Don Stewart Jan 07 '13 at 18:03
  • Generally no, but maybe these min-max instructions are specifically covered in the ticket; I haven't studied it in detail. – leventov Jan 07 '13 at 19:42