nlvm (the nim-level virtual machine?) is an LLVM-based compiler for the Nim language.
From Nim's point of view, it's a backend just like C or JavaScript - from LLVM's point of view, it's a language frontend that emits IR.
When I started on this little project, I knew neither llvm nor Nim. Therefore, I'd especially like to thank the friendly folks at the #nim channel that never seemed to tire of my nooby questions. Also, thanks to all tutorial writers out there, on llvm, programming and other topics, for providing such fine sources of copy-pa... er, inspiration!
Questions, patches, improvement suggestions and reviews welcome. When you find bugs, feel free to fix them as well :)
Fork and enjoy!
Jacek Sieka (arnetheduck on gmail point com)
nlvm is generally at par with nim in terms of features, with the following notable differences:

- code is generated directly via LLVM IR, without a C compiler step
- gdb/lldb debug information with source stepping, type information etc
- wasm32 support with no extra tooling

Most things from nim work just fine (see notes below however!):
- ... nim to nlvm!)
- importc works without needing C header files - the declaration in the .nim file needs to be accurate

Test coverage is not too bad either.
How you could contribute:

- add new targets (osx and windows should be easy, arm would be nice)
- help nlvm generate better IR - optimizations, builtins, exception handling..
- help make upstream nim more nlvm-compatible
nlvm does not:

- understand C - as a consequence, header, emit and similar pragmas will not work - neither will the fancy importcpp/C++ features

To do what I do, you will need:

- a C/C++ compiler (gcc most of the time)

Start with a clone:
cd $SRC
git clone https://github.com/arnetheduck/nlvm.git
cd nlvm && git submodule update --init
We will need a few development libraries installed, mainly due to how nlvm
processes library dependencies (see dynlib section below):
# Fedora
sudo dnf install pcre-devel openssl-devel sqlite-devel ninja-build
# Debian, ubuntu etc
sudo apt-get install libpcre3-dev libssl-dev libsqlite3-dev ninja-build
Compile nlvm (if needed, this will also build nim and llvm):
make
Compile with itself and compare:
make compare
Run test suite:
make test
make stats
You can link statically to LLVM to create a stand-alone binary - this will use a more optimized version of LLVM as well, but takes longer to build:
make STATIC_LLVM=1
If you want a faster nlvm, you can also try the release build - it will be called nlvmr:
make STATIC_LLVM=1 nlvmr
When you update nlvm from git, don't forget the submodule:
git pull && git submodule update
To build a docker image, use:
make docker
To run the built nlvm docker image, use:
docker run -v $(pwd):/code/ nlvm c -r /code/test.nim
On the command line, nlvm is mostly compatible with nim.
When compiling, nlvm will generate a single .o file with all code from your project and link it using $CC - this helps it pick the right flags for linking with the C library.
cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c fizzbuzz
If you want to see the generated LLVM IR, use the -c option:
cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c -c fizzbuzz
less fizzbuzz.ll
You can then run the LLVM optimizer on it:
opt -Os fizzbuzz.ll | llvm-dis
... or compile it to assembly (.s):
llc fizzbuzz.ll
less fizzbuzz.s
Apart from the code of your .nim files, the compiler will also mix in the compatibility library found in nlvm-lib/.
Generally, the nim compiler pipeline looks something like this:
nim --> c files --> IR --> object files --> executable
In nlvm, we remove one step and bunch all the code together:
nim --> IR --> single object file --> executable
Going straight to the IR means it's possible to express nim constructs more clearly, allowing llvm to understand the code better and thus do a better job at optimization. It also helps keep compile times down, because the c-to-IR step can be avoided.
The practical effect of generating a single object file is similar to gcc -fwhole-program -flto - it is expensive in terms of memory, but results in slightly smaller and faster binaries. Notably, the IR-to-machine-code step, including any optimizations, is repeated in full for each recompile.
nim uses a runtime dynamic library loading scheme to gain access to shared libraries. When compiling, no linking is done - instead, when running your application, nim will try to open anything the user has installed.
nlvm does not support the {.dynlib.} pragma - instead, you can use {.passL.} and normal system linking.
# works with `nim`
proc f() {. importc, dynlib: "mylib" .}
# works with both `nim` and `nlvm`
{.passL: "-lmylib".}
proc f() {. importc .}
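Should the library live outside the default linker search path, the same pragma can carry extra linker flags - a minimal sketch, where /opt/mylib is a hypothetical install prefix and g a hypothetical function exported by the library:

# hypothetical install prefix - adjust `-L` to wherever the library really lives
{.passL: "-L/opt/mylib/lib -lmylib".}
proc g() {. importc .}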
When nim compiles code, it will generate c code which may include other c code, from headers or directly via emit statements. This means nim has direct access to symbols declared in the c file, which can be both a feature and a problem.
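As an example of the problem, the following emit-based trick works with nim's C backend, because the string is pasted straight into the generated c file - nlvm never produces a c file to paste it into, so it cannot work (a sketch; magic is just an illustrative name):

# works with `nim` only - the emitted `c` is compiled together with the
# generated code; `nlvm` has no `c` file to paste this into
{.emit: """
static int magic(void) { return 42; }
""".}
proc magic(): cint {.importc, nodecl.}
echo magic()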
In nlvm, {.header.} directives are ignored - nlvm looks strictly at the signature of the declaration, meaning the declaration must exactly match the c header file or subtle ABI issues and crashes ensue!
# When `nim` encounters this, it will emit `jmp_buf` in the `c` code without
# knowing the true size of the type, letting the `c` compiler determine it
# instead.
type C_JmpBuf {.importc: "jmp_buf", header: "<setjmp.h>".} = object
# nlvm instead ignores the `header` directive completely and will use the
# declaration as written. Failure to correctly declare the type will result
# in crashes and subtle bugs - memory will be overwritten or fields will be
# read from the wrong offsets.
#
# The following works with both `nim` and `nlvm`, but requires you to be
# careful to match the binary size and layout exactly (note how `bycopy`
# sometimes help to further nail down the ABI):
when defined(linux) and defined(amd64):
type
C_JmpBuf {.importc: "jmp_buf", bycopy.} = object
abi: array[200 div sizeof(clong), clong]
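With the type declared as above, the matching functions can be imported without headers as well - a sketch assuming glibc on linux/amd64, which exports _setjmp/_longjmp (the variants that skip the signal mask) as plain symbols; the c_ names are just local choices:

# header-less imports matching the declaration above; correctness again
# depends entirely on the signatures being ABI-accurate
proc c_setjmp(env: var C_JmpBuf): cint {.importc: "_setjmp".}
proc c_longjmp(env: var C_JmpBuf, ret: cint) {.importc: "_longjmp", noreturn.}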
# In `nim`, `C` constant defines are often imported using the following trick,
# which makes `nim` emit the right `C` code so that the value from the header
# can be read (no writing of course, even though it's a `var`!)
#
# assuming a c header with: `#define RTLD_NOW 2`
# works for nim:
var RTLD_NOW* {.importc: "RTLD_NOW", header: "<dlfcn.h>".}: cint
# both nlvm and nim (note how these values often can be platform-specific):
when defined(linux) and defined(amd64):
  const RTLD_NOW* = cint(2)
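The constant can then be combined with a header-less importc of the function that consumes it - a sketch assuming linux/amd64, with libdl pulled in via {.passL.} (newer glibc versions ship dlopen in libc itself):

# `dlopen` imported by symbol name only - no <dlfcn.h> involved
{.passL: "-ldl".}
proc dlopen(filename: cstring, flags: cint): pointer {.importc.}

let handle = dlopen("libm.so.6", RTLD_NOW)
echo "libm loaded: ", not handle.isNil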
wasm32 support is still very bare-bones, so you will need to do a bit of tinkering to get it to work.
Presently, the wasm32-unknown-unknown target is mapped to --os:standalone and --cpu:wasm32 - this choice represents a very raw wasm engine with 32-bit little-endian integers and pointers - in the future, the nim standard library and system.nim will need to be updated to support WASM system interfaces like emscripten or WASI.
To compile wasm files, you will thus need a panicoverride.nim - a minimal example looks like this and discards any errors:
proc rawoutput(s: string) = discard
proc panic(s: string) {.noreturn.} = discard
After placing the above code in your project folder, you can compile .nim code to wasm32:
nlvm c -c --nlvm.target=wasm32-unknown-unknown myfile.nim
less myfile.ll
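For reference, a minimal myfile.nim could look like the following sketch - it sticks to system-level features, since the standard library is not available on this bare target, and uses exportc so the function keeps an unmangled name:

# minimal module for the bare wasm32 target - no OS services, no stdlib
proc add(a, b: int32): int32 {.exportc.} =
  a + b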
To go from there, follow the steps found here.