Currently, I’m writing on my master thesis. The resulting prototype is called MateVM which is a method-based Java Just-In-Time Compiler, written in Haskell. Although, it’s a rather low-level application, Haskell is a perfect match (well, that’s my opinion ;-)).

Anyways, for dynamic code generation we use Harpy, which is only available for x86 at the moment (no PowerPC sigh). People ask us, why we don’t just use LLVM.

The real reason is probably because we want to mess around with compiler challenges on our own. It would be rather boring to throw everything at LLVM and we are done. Apart from that, there’s already a JVM which targets LLVM, but written in C++.

The official reason: LLVM compilation is too slow for a Just-In-Time Compiler. At least this was my argument, but I don’t have evidence for this yet. And, I don’t know anything about compilation speed of Harpy. So, let’s change that.

Simple example

Assume that we want to emit a function, which calculates a + (b * c), where b is fixed and a and c are function arguments. One can solve that by using a loop, but we want to show compile-time behaviour, therefore we unroll the loop b-times.

LLVM

Here is the code for LLVM:

module Main where

import Data.Word
import Data.Int
import Control.Monad
import LLVM.Core
import LLVM.ExecutionEngine

mSomeFun :: CodeGenModule (Function (Int32 -> Int32 -> IO Int32))
mSomeFun =
  createFunction ExternalLinkage $ \ x y -> do
    r <- foldM (\a _ -> add a y) x [1..4000000]
    ret r

main = do
  someFun <- simpleFunction mSomeFun
  someFun 1337 432 >>= print

b = 4000000 should be large enough. Note, that the function is executed after compilation:

$ time ./llvmmain
1728001337

real    0m6.983s
user    0m6.732s
sys     0m0.232s

Harpy

This API requires a bit more dirty work than LLVM does, for example FFI and manual register selection.

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign hiding (xor)
import Foreign.C.Types
import Control.Monad
import Harpy


type SomeFunType = CInt -> CInt -> IO CInt

foreign import ccall "dynamic"
   call_int :: FunPtr SomeFunType -> SomeFunType


mSomeFun :: CodeGen () () (FunPtr SomeFunType)
mSomeFun = do
  mov eax (Disp 0x4, esp)
  mov ebx (Disp 0x8, esp)
  replicateM_ 4000000 (add eax ebx)
  ret
  liftM castPtrToFunPtr getEntryPoint

main = do
  (_, Right someFun) <- runCodeGen mSomeFun () ()
  call_int someFun 1337 432 >>= print

And here how it performs:

$ time ./harpymain
1728001337

real    0m0.163s
user    0m0.136s
sys     0m0.024s

Wow: With Harpy it just takes 2.33% compared to the LLVM execution time. Very nice!

For what it’s worth, the generated code is actually the same, except that LLVM chooses ecx instead of ebx for the second argument.

Conclusion

The comparison isn’t fair of course:

  • the example is far too simple, e.g. there are no labels involed
  • one can disable optimization passes at LLVM, which would reduce compile-time
  • both targets have differents goals, Harpy is just a not-so-stupid assembler, while LLVM is more

However, I think everyone accepts the conclusion, that LLVM has an certain overhead. Therefore, using Harpy was and is the right decision for MateVM.



blog comments powered by Disqus

Author:

lewurm

Published

17 September 2012

Tags