Instructions
Now that we have the basic infrastructure in place we’ll wrap the raw llvm-hs AST nodes inside a collection of helper functions to push instructions onto the stack held within our monad.
Instructions in LLVM are either numbered sequentially (%0, %1, …) or given explicit variable names (%a, %foo, …). For example, the arguments to the following function are named values, while the result of the add instruction is unnamed.
define i32 @add(i32 %a, i32 %b) {
  %1 = add i32 %a, %b
  ret i32 %1
}
In llvm-hs both kinds of name are represented by a single sum type with the constructors UnName and Name. For most of our purposes we will simply use numbered expressions and map the numbers to identifiers within our symbol table. Every instruction we add increments an internal counter; to accomplish this we add a fresh name supply.
fresh :: Codegen Word
fresh = do
  i <- gets count
  modify $ \s -> s { count = 1 + i }
  return $ i + 1
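To see the counter in action outside the full code generator, here is a minimal sketch. It assumes the mtl package and replaces the tutorial's codegen state with a hypothetical one-field record, CounterState, so the monad here (Supply) is a stand-in rather than the real Codegen.

```haskell
import Control.Monad.State (State, evalState, gets, modify)

-- Hypothetical stand-in for the codegen state: only the counter is modelled.
newtype CounterState = CounterState { count :: Word }

type Supply = State CounterState

-- Same logic as fresh above: bump the counter and return the new value.
fresh :: Supply Word
fresh = do
  i <- gets count
  modify $ \s -> s { count = 1 + i }
  return $ i + 1

-- Draw three numbers in a row, starting from a counter of zero.
threeNames :: [Word]
threeNames = evalState (sequence [fresh, fresh, fresh]) (CounterState 0)
-- threeNames == [1, 2, 3]
```

Since the counter is incremented before it is returned, the first number handed out is 1 when the state starts at 0.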
Throughout our code, however, we will also refer to named values within the module. These have a special data type, Name (with an associated IsString instance so that Haskell can automatically perform the boilerplate coercions between String types), for which we’ll create a second name supply: a map which guarantees that our block names are unique.
type Names = Map.Map String Int
uniqueName :: String -> Names -> (String, Names)
uniqueName nm ns =
  case Map.lookup nm ns of
    Nothing -> (nm, Map.insert nm 1 ns)
    Just ix -> (nm ++ show ix, Map.insert nm (ix+1) ns)
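As a quick illustration of how this supply behaves when the same name is requested several times, the sketch below threads one Names map through a list of requests. The helper uniqueNames is hypothetical and exists only for this example; uniqueName is repeated so the block stands alone.

```haskell
import qualified Data.Map as Map
import Data.List (mapAccumL)

type Names = Map.Map String Int

uniqueName :: String -> Names -> (String, Names)
uniqueName nm ns =
  case Map.lookup nm ns of
    Nothing -> (nm, Map.insert nm 1 ns)
    Just ix -> (nm ++ show ix, Map.insert nm (ix + 1) ns)

-- Hypothetical helper: thread a single supply through a list of requests.
uniqueNames :: [String] -> [String]
uniqueNames = snd . mapAccumL step Map.empty
  where
    -- mapAccumL wants (accumulator, output), so swap the pair around.
    step ns nm = let (nm', ns') = uniqueName nm ns in (ns', nm')

-- uniqueNames ["entry", "if.then", "entry", "entry"]
--   == ["entry", "if.then", "entry1", "entry2"]
```

One limitation worth knowing: the map only tracks base names, so a caller that explicitly asks for entry1 after entry has already been duplicated gets entry1 back a second time. That is harmless as long as block names never end in a digit.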
Since we can now work with named LLVM values we need a couple of helper functions for constructing operands that refer to them.
local :: Name -> Operand
local = LocalReference double
externf :: Name -> Operand
externf = ConstantOperand . C.GlobalReference double
Our function externf will emit a named value which refers to a toplevel function (@add) in our module or to an externally declared function (@putchar). For instance:
declare i32 @putchar(i32)
define i32 @add(i32 %a, i32 %b) {
  %1 = add i32 %a, %b
  ret i32 %1
}
define void @main() {
  %1 = call i32 @add(i32 0, i32 97)
  call i32 @putchar(i32 %1)
  ret void
}
Since we’d like to refer to values on the stack by named quantities we’ll implement a simple symbol table as an association list letting us assign variable names to operand quantities and subsequently look them up when used.
assign :: String -> Operand -> Codegen ()
assign var x = do
  lcls <- gets symtab
  modify $ \s -> s { symtab = (var, x) : lcls }
getvar :: String -> Codegen Operand
getvar var = do
  syms <- gets symtab
  case lookup var syms of
    Just x -> return x
    Nothing -> error $ "Local variable not in scope: " ++ show var
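A consequence of using an association list is that rebinding a variable shadows the earlier binding rather than replacing it, because new entries go on the front and lookup returns the first match. The sketch below shows this with a simplified table in which operands are modelled as plain strings; SymbolTable and bind are hypothetical names used only here.

```haskell
-- Simplified symbol table: operands are modelled as plain strings.
type SymbolTable = [(String, String)]

-- Pure counterpart of assign: put the new binding in front.
bind :: String -> String -> SymbolTable -> SymbolTable
bind var x tbl = (var, x) : tbl

-- Bind x twice; the second binding is the one that lookup finds.
table :: SymbolTable
table = bind "x" "%3" (bind "x" "%1" [])

shadowed :: Maybe String
shadowed = lookup "x" table   -- Just "%3"

missing :: Maybe String
missing = lookup "y" table    -- Nothing (getvar would call error here)
```

The older binding is still present in the list, so the table grows with every assignment. For the small functions compiled in this tutorial that cost is negligible.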
Now that we have a way of naming instructions we’ll create an internal function to take an llvm-hs AST node and push it on the current basic block stack. We’ll return the left hand side reference of the instruction. These nodes come in two flavors: ordinary instructions and terminators. Every basic block has exactly one terminator, and the last basic block in a function must end in a ret.
instr :: Instruction -> Codegen Operand
instr ins = do
  n <- fresh
  let ref = UnName n
  blk <- current
  let i = stack blk
  modifyBlock (blk { stack = (ref := ins) : i } )
  return $ local ref
terminator :: Named Terminator -> Codegen (Named Terminator)
terminator trm = do
  blk <- current
  modifyBlock (blk { term = Just trm })
  return trm
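Because instr conses each new instruction onto the front of the stack, the list holds the newest instruction first, so whatever later turns a block into output has to reverse it. The toy model below makes that visible. It assumes the mtl package, represents instructions as plain strings, and uses hypothetical names (Block, Gen, render) in place of the tutorial's real block state and llvm-hs types.

```haskell
import Control.Monad.State (State, execState, gets, modify)

-- Toy block state: numbered instructions (newest first) plus a terminator.
data Block = Block
  { stack :: [(Word, String)]
  , term  :: Maybe String
  , count :: Word
  }

type Gen = State Block

-- Push an instruction and return the number naming its result.
instr :: String -> Gen Word
instr ins = do
  n <- gets ((+ 1) . count)
  modify $ \b -> b { count = n, stack = (n, ins) : stack b }
  return n

-- Record the block's single terminator.
terminator :: String -> Gen ()
terminator t = modify $ \b -> b { term = Just t }

-- Emit the block in program order: reverse the stack, then the terminator.
render :: Block -> [String]
render b =
  [ "%" ++ show n ++ " = " ++ ins | (n, ins) <- reverse (stack b) ]
    ++ maybe [] (: []) (term b)

example :: [String]
example = render (execState body (Block [] Nothing 0))
  where
    body = do
      x <- instr "fadd double %a, %b"
      y <- instr ("fmul double %" ++ show x ++ ", %a")
      terminator ("ret double %" ++ show y)
-- example ==
--   ["%1 = fadd double %a, %b", "%2 = fmul double %1, %a", "ret double %2"]
```

Consing and reversing once at the end keeps each push constant-time, which is the usual reason for building the list backwards.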
Using the instr function we now wrap the AST nodes for basic arithmetic operations on floating point values.
fadd :: Operand -> Operand -> Codegen Operand
fadd a b = instr $ FAdd NoFastMathFlags a b []
fsub :: Operand -> Operand -> Codegen Operand
fsub a b = instr $ FSub NoFastMathFlags a b []
fmul :: Operand -> Operand -> Codegen Operand
fmul a b = instr $ FMul NoFastMathFlags a b []
fdiv :: Operand -> Operand -> Codegen Operand
fdiv a b = instr $ FDiv NoFastMathFlags a b []
On top of the basic arithmetic functions we’ll add the basic control flow operations which will allow us to direct the control flow between basic blocks and return values.
br :: Name -> Codegen (Named Terminator)
br val = terminator $ Do $ Br val []
cbr :: Operand -> Name -> Name -> Codegen (Named Terminator)
cbr cond tr fl = terminator $ Do $ CondBr cond tr fl []
ret :: Operand -> Codegen (Named Terminator)
ret val = terminator $ Do $ Ret (Just val) []
Finally we’ll add several “effect” instructions which invoke memory and evaluation side-effects. The call instruction takes a named function reference and a list of arguments and invokes the function at the current position. The alloca instruction creates a pointer to a stack-allocated, uninitialized value of the given type.
call :: Operand -> [Operand] -> Codegen Operand
call fn args = instr $ Call Nothing CC.C [] (Right fn) (toArgs args) [] []
alloca :: Type -> Codegen Operand
alloca ty = instr $ Alloca ty Nothing 0 []
store :: Operand -> Operand -> Codegen Operand
store ptr val = instr $ Store False ptr val Nothing 0 []
load :: Operand -> Codegen Operand
load ptr = instr $ Load False ptr Nothing 0 []
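The call wrapper relies on a helper, toArgs, that is not defined in this section. Since the Call node takes each argument paired with a list of parameter attributes, a plausible definition simply attaches an empty attribute list to every operand. The sketch below is an assumption along those lines, written polymorphically so it compiles without llvm-hs; in the real code the type variables would be Operand and the llvm-hs parameter attribute type.

```haskell
-- Assumed definition: pair every argument with an empty attribute list.
-- `operand` and `attr` are type variables standing in for the llvm-hs types.
toArgs :: [operand] -> [(operand, [attr])]
toArgs = map (\x -> (x, []))

-- Illustration with characters standing in for operands.
demoArgs :: [(Char, [Int])]
demoArgs = toArgs "ab"
-- demoArgs == [('a', []), ('b', [])]
```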