This is a toy project of mine, with the goal of making a compiler for C, written in C, which is able to compile itself.
Clone and build from source, and the binary will be placed in bin/lacc
.
git clone https://github.com/larmel/lacc.git
cd lacc
./configure
make
make install
Default configuration includes some basic probing of the environment to detect which machine and libc we are compiling for.Recognized platforms currently include Linux with glibc or musl, and OpenBSD.
Certain standard library headers, such as stddef.h
and stdarg.h
, contain definitions that are inherently compiler specific, and are provided specifically for lacc under lib/.If you want to use bin/lacc
directly without installing the headers, you can override the location by setting --libdir
to point to this folder directly.
The install
target will copy the binary and headers to the usual locations, or as configured with --prefix
or --bindir
.There is also an option to set DESTDIR
.Execute make uninstall
to remove all the files that were copied.
See ./configure --help
for options to customize build and install parameters.
Command line interface is kept similar to GCC and other compilers, using mostly a subset of the same flags and options.
-E Output preprocessed.
-S Output GNU style textual x86_64 assembly.
-c Output x86_64 ELF object file.
-dot Output intermediate representation in dot format.
-o Specify output file name. If not speficied, default to input file
name with suffix changed to '.s', '.o' or '.dot' when compiling
with -S, -c or -dot, respectively. Otherwise use stdout.
-std= Specify C standard, valid options are c89, c99, and c11.
-I Add directory to search for included files.
-w Disable warnings.
-g Generate debug information (DWARF).
-O{0..3} Set optimization level.
-D X[=] Define macro, optionally with a value. For example -DNDEBUG, or
-D 'FOO(a)=a*2+1'.
-f[no-]PIC Generate position-independent code.
-v Print verbose diagnostic information. This will dump a lot of
internal state during compilation, and can be useful for debugging.
--help Print help text.
Arguments that do not match any option are taken to be input files.If no compilation mode is specified, lacc will act as a wrapper for the system linker /bin/ld
.Some common linker flags are supported.
-Wl, Specify linker options, separated by commas.
-L Add linker include directory.
-l Add linker library.
-shared Passed to linker as is.
-rdynamic Pass -export-dynamic to linker.
As an example invocation, here is compiling test/c89/fact.c to object code, and then using the system linker to produce the final executable.
bin/lacc -c test/c89/fact.c -o fact.o
bin/lacc fact.o -o fact
The program is part of the test suite, calculating 5! using recursion, and exiting with the answer.Running ./fact
followed by echo $?
should print 120
.
The compiler is written in C89, counting around 19k lines of code in total.There are no external dependencies except for the C standard libary, and some system calls required to invoke the linker.
The implementation is organized into four main parts; preprocessor, parser, optimizer, and backend, each in their own directory under src/.In general, each module (a .c
file typically paired with an .h
file defining the public interface) depend mostly on headers in their own subtree.Declarations that are shared on a global level reside in include/lacc/.This is where you will find the core data structures, and interfaces between preprocessing, parsing, and code generation.
Preprocessing includes reading files, tokenization, macro expansion, and directive handling.The interface to the preprocessor is peek(0)
, peekn(1)
, consume(1)
, try_consume(1)
, and next(0)
, which looks at a stream of preprocessed struct token
objects.These are defined in include/lacc/token.h.
Input processing is done completely lazily, driven by the parser calling these functions to consume more input.A buffer of preprocessed tokens is kept for lookahead, and filled on demand when peeking ahead.
Code is modeled as control flow graph of basic blocks, each holding a sequence of three-address code statements.Each external variable or function definition is represented by a struct definition
object, defining a single struct symbol
and a CFG holding the code.The data structures backing the intermediate representation can be found in include/lacc/ir.h.
Visualizing the intermediate representation is a separate output target.If the -dot option is specified, a dot formatted text file is produced as output.
bin/lacc -dot -I include src/backend/compile.c -o compile.dot
dot -Tps compile.dot -o compile.ps
Below is an example from a function found in src/backend/compile.c, showing a slice of the complete graph.The full output can be generated as a PostScript file by running the commands shown.
Each basic block in the graph has a list of statements, most commonly IR_ASSIGN
, which assigns an expression (struct expression
) to a variable (struct var
).Expressions also contain variable operands, which can encode memory locations, addresses and dereferenced pointers at a high level.
DIRECT
operands refer to memory at *(&symbol + offset)
, where symbol is a variable or temporary at a specific location in memory (for example stack).ADDRESS
operands represent exactly the address of a DIRECT
operand, namely (&symbol + offset)
.DEREF
operands refer to memory pointed to by a symbol (which must be of pointer type).The expression is *(symbol + offset)
, which requires two load operations to map to assembly. Only DEREF
and DIRECT
variables can be target for assignment, or l-value.IMMEDIATE
operands hold a constant number, or string. Evaluation of immediate operands do constant folding automatically.The parser is hand coded recursive descent, with main parts split into src/parser/declaration.c, src/parser/initializer.c, src/parser/expression.c, and src/parser/statement.c.The current function control flow graph, and the current active basic block in that graph, are passed as arguments to each production.The graph is gradually constructed as new three-address code instructions are added to the current block.
The following example shows the parsing rule for bitwise or expressions, whichadds a new IR_OP_OR
operation to the current block.Logic in eval_expr
will ensure that the operands value
and block->expr
are valid, terminating in case of an error.
static struct block *inclusive_or_expression(
struct definition *def,
struct block *block)
{
struct var value;
block = exclusive_or_expression(def, block);
while (try_consume('|')) {
value = eval(def, block, block->expr);
block = exclusive_or_expression(def, block);
block->expr = eval_or(def, block, value, eval(def, block, block->expr));
}
return block;
}
The latest evaluation result is always stored in block->expr
.Branching is done by instantiating new basic blocks and maintaining pointers.Each basic block has a true and false branch pointer to other blocks, which is how branches and gotos are modeled.Note that at no point is there any syntax tree structure being built.It exists only implicitly in the recursion.
The main motivation for building a control flow graph is to be able to do dataflow analysis and optimization.The current capabilities here are still limited, but it can easily be extended with additional and more advanced analysis and optimization passes.
Liveness analysis is used to figure out, at every statement, which symbols may later be read.The dataflow algorithm is implemented using bit masks for representing symbols, numbering them 1-63.As a consequence, optimization only works on functions with less than 64 variables.The algorithm also has to be very conservative, as there is no pointer alias analysis (yet).
Using the liveness information, a transformation pass doing dead store elimination can remove IR_ASSIGN
nodes which provably do nothing, reducing the size of the generated code.
There are three backend targets: textual assembly code, ELF object files, and dot for the intermediate representation.Each struct definition
object yielded from the parser is passed to the src/backend/compile.c module.Here we do a mapping from intermediate control flow graph representation down to a lower level IR, reducing the code to something that directly represents x86_64 instructions.The definition for this can be found in src/backend/x86_64/encoding.h.
Depending on function pointers set up on program start, the instructions are sent to either the ELF backend, or text assembly.The code to output text assembly is therefore very simple, more or less just a mapping between the low level IR instructions and their GNU syntax assembly code.See src/backend/x86_64/assemble.c.
Dot output is a separate pipeline that does not need low level IR to be generated.The compile module will simply forward the CFG to src/backend/graphviz/dot.c.
Testing is done by comparing the runtime output of programs compiled with lacc and the system compiler (cc).A collection of small standalone programs used for validation can be found under the test/ directory.Tests are executed using check.sh, which will validate preprocessing, assembly, and ELF outputs.
$ test/check.sh bin/lacc test/c89/fact.c
[-E: Ok!] [-S: Ok!] [-c: Ok!] [-c -O1: Ok!] :: test/c89/fact.c
A complete test of the compiler is done by going through all test cases on a self-hosted version of lacc.
make -C test
This will first use the already built bin/lacc
to produce bin/bootstrap/lacc
, which in turn is used to build bin/selfhost/lacc
.Between the bootstrap and selfhost stages, the intermediate object files are compared for equality.If everything works correctly, these stages should produce identical binaries.The compiler is ''good'' when all tests pass on the selfhost binary.This should always be green, on every commit.
It is hard to come up with a good test suite covering all possible cases.In order to weed out bugs, we can use csmith to generate random programs that are suitable for validation.
./csmith.sh
The csmith.sh script runs csmith to generate an infinite sequence of random programs until something fails the test harness.It will typically run thousands of tests without failure.
The programs generated by Csmith contain a set of global variables, and functions making mutations on these.At the end, a checksum of the complete state of all variables is output.This checksum can then be compared against different compilers to find discreptancies, or bugs.See doc/random.c for an example program generated by Csmith, which is also compiled correctly by lacc.
When a bug is found, we can use creduce to make a minimal repro.This then will end up as a new test case in the normal test suite.
Some effort has been put into making the compiler itself fast (although the generated code is still very much unoptimized).Serving as both a performance benchmark and correctness test, we use the sqlite database engine.The source code is distributed as a single ~7 MB (7184634 bytes) large C file spanning more than 200 K lines (including comments and whitespace), which is perfect for stress testing the compiler.
The following experiments were run on a laptop with an i5-7300U CPU, compiling version 3.20.1 of sqlite3.Measurements are made from compiling to object code (-c).
It takes less than 200 ms to compile the file with lacc, but rather than time we look at a more accurate sampling of CPU cycles and instructions executed.Hardware performance counter data is collected with perf stat
, and memory allocations with valgrind --trace-children=yes
.In valgrind, we are only counting contributions from the compiler itself (cc1
executable) while running GCC.
Numbers for lacc is from an optimized build produced by make CC='clang -O3 -DNDEBUG' bin/lacc
.Each compiler is invoked with arguments -c sqlite/sqlite3.c -o foo.o
.
Compiler | Cycles | Instructions | Allocations | Bytes allocated | Result size |
---|---|---|---|---|---|
lacc | 426,826,782 | 622,119,382 | 51,467 | 26,944,883 | 1,590,482 |
tcc (0.9.27) | 245,142,166 | 397,514,762 | 2,909 | 23,020,917 | 1,409,716 |
gcc (9.3.0) | 9,958,514,599 | 14,524,274,665 | 1,546,790 | 1,111,331,606 | 1,098,408 |
clang (10.0.0) | 4,351,456,211 | 5,466,808,426 | 1,434,072 | 476,529,342 | 1,116,992 |
There is yet work to be done to get closer to TCC, which is probably one of the fastest C compilers available.Still, we are within reasonable distance from TCC performance, and an order of magnitude better than GCC and clang.
From the above table, we can see that the size of the sqlite object file generated by lacc is larger than those generated by other compilers, suggesting that we output less optimal code.
To compare the relative quality of code generated from lacc and GCC, we can look at the number of dynamic instructions executed by the selfhost binary versus a binary built by GCC.We run the same test as above, compiling sqlite to object code.Targets for the test are the default compiler build (bin/lacc
) produced by GCC, and the selfhost binary (bin/selfhost/lacc
) produced by lacc.Both of these targets are produced without any optimizations enabled, and without defining NDEBUG
.
Compiler | Cycles | Instructions |
---|---|---|
lacc | 946,315,653 | 1,481,608,726 |
lacc (selfhost) | 1,417,112,690 | 2,046,996,115 |
The selfhosted binary is slower to compile sqlite than the compiler built by GCC, showing that lacc indeed generates rather inefficient code.Improving the backend with better instruction selection is a priority, so these numbers should hopefully get closer in the future.
These are some useful resources for building a C compiler targeting x86_64.
最常用的BIF之一,返回调用进程的pid。 语法 (Syntax) self() 参数 (Parameters) 没有 返回值 (Return Value) 返回调用进程的pid。 例如 (For example) -module(helloworld). -export([start/0]). start() -> io:fwrite("~p~n",[self()]). 输出 (
在定义类的过程中,无论是显式创建类的构造方法,还是向类中添加实例方法,都要求将 self 参数作为方法的第一个参数。例如,定义一个 Person 类: 那么,self 到底扮演着什么样的角色呢?本节就对 self 参数做详细的介绍。 事实上,Python 只是规定,无论是构造方法还是实例方法,最少要包含一个参数,并没有规定该参数的具体名称。之所以将其命名为 self,只是程序员之间约定俗成的一种习
问题内容: 在子类中重写方法时,我经常这样做: 我的问题是:super(type(self),self)是否有捷径? 问题答案: 不要那样做:如果仅可以将其用作第一个参数,那么就不必将它放在第一位。您必须在此处传递实际的类,而不是如果该类已被子类化的表达式可能会更改的表达式。 super的第一个参数必须是包含当前方法定义的类,因为您要告诉super在基础列表中从哪里开始搜索。 Python 3知道
主要内容:语法,示例SQL SELF JOIN 用于将一个表和自身连接,就好像存在两个表一样。为了区分两个表,在 SQL 语句中需要至少重命名一个表。 自连接通常用于将表的某个字段与该表的同一字段的其它值进行比较。 语法 SELF JOIN 的基本语法如下: 您看,SQL 并没有 SELF JOIN 关键字,而是使用 WHERE 子句来达到自连接的目的。 示例 自连接的语法比较简单,但是结果往往不是那么容易理解,让我
Updates Yarn to the latest version. yarn self-update Note: self-update is not available. See policies for enforcing versions within a project
Vdt模板编译的结果,会添加如下代码 function(obj) { var self = this.data, scope = obj; .... } vdt.render() 方法这样调用模板函数 var vdt = { render: function() { template.call(vdt, data); ... } } 所以 this 模板中t
Self Service Password 是一个 Web 应用,可以让用户修改 LDAP 中的记录密码。支持标准的 LDAPv3 目录服务,包括:OpenLDAP, OpenDS, ApacheDS, Sun Oracle DSEE, Novell, Active Directory 等
self-hosted-services This repository contains everything you need to start self-hosting a core set of privacy-preserving services that I have found helpful, all run via a common Docker Compose configu