#fix can't use gdb for debug #26

Open
wants to merge 258 commits into
base: main
8a8ef17
Add type cast
rui314 Aug 11, 2019
47d0f32
Implement usual arithmetic conversion
rui314 Sep 26, 2020
371bcd6
Report an error on undefined/undeclared functions
rui314 Aug 21, 2020
1b3f754
Handle return type conversion
rui314 Apr 18, 2020
7073ef4
Handle function argument type conversion
rui314 Mar 22, 2020
6cc4564
Add _Bool type
rui314 Aug 28, 2020
25031d4
Add character literal
rui314 Aug 11, 2019
3890917
Add enum
rui314 Aug 11, 2019
6cb743f
Support file-scope functions
rui314 Sep 4, 2020
cf3bb29
Allow for-loops to define local variables
rui314 Aug 11, 2019
b5d061d
Add +=, -=, *= and /=
rui314 Oct 7, 2020
bc0bd86
Add pre ++ and --
rui314 Oct 7, 2020
5383d6b
Add post ++ and --
rui314 Apr 13, 2020
e56cbd6
Add hexadecimal, octal and binary number literals
rui314 Aug 13, 2019
283b737
Add ! operator
rui314 Aug 13, 2019
25210cc
Add ~ operator
rui314 Aug 13, 2019
268cfe9
Add % and %=
rui314 Oct 7, 2020
07b502d
Add &, |, ^, &=, |= and ^=
rui314 Oct 7, 2020
3b15151
Add && and ||
rui314 Oct 7, 2020
0f1a59c
Add a notion of an incomplete array type
rui314 Aug 13, 2019
4049732
Decay an array to a pointer in the func param context
rui314 Aug 14, 2019
76ea11e
Add a notion of an incomplete struct type
rui314 Oct 7, 2020
126e3c9
Add goto and labeled statement
rui314 Sep 4, 2020
69534d4
Resolve conflict between labels and typedefs
rui314 Aug 30, 2020
551f41d
Add break statement
rui314 Aug 15, 2019
d9d8a78
Add continue statement
rui314 Aug 27, 2020
e06d3c4
Add switch-case
rui314 Aug 15, 2019
2738009
Add <<, >>, <<= and >>=
rui314 Oct 7, 2020
1bb85f9
Add ?: operator
rui314 Aug 17, 2019
0f86cdb
Add constant expression
rui314 Aug 17, 2019
c79ba43
Support local variable initializers
rui314 Aug 18, 2019
a7beccd
Initialize excess array elements with zero
rui314 Sep 18, 2020
fa46fc3
Skip excess initializer elements
rui314 Apr 16, 2020
c8eb974
Add string literal initializer
rui314 Aug 18, 2019
e19416e
Allow to omit array length if an initializer is given
rui314 Sep 4, 2020
c2f325b
Handle struct initializers for local variables
rui314 Aug 18, 2019
a90687a
Allow to initialize a struct with other struct
rui314 Jul 9, 2020
68af1cd
Handle union initializers for local variables
rui314 Jul 20, 2020
609283e
Add global initializer for scalar and string
rui314 Sep 4, 2020
513f415
Add struct initializer for global variable
rui314 Aug 18, 2019
f969405
Handle union initializers for global variable
rui314 Jul 20, 2020
eefa214
Allow parentheses in initializers to be omitted
rui314 Aug 18, 2019
5e824d3
Allow extraneous braces for scalar initializer
rui314 Apr 16, 2020
4115543
Allow extraneous comma at the end of enum or initializer list
rui314 Oct 6, 2020
0c4f152
Emit uninitialized global data to .bss instead of .data
rui314 Aug 19, 2020
37fbe39
Add flexible array member
rui314 Sep 20, 2020
762b976
Allow to initialize struct flexible array member
rui314 Sep 20, 2020
d3e6bd9
Accept `void` as a parameter list
rui314 Aug 19, 2019
2d959a0
Align global variables
rui314 Aug 19, 2020
859ce40
Add extern
rui314 Sep 4, 2020
069559e
Handle extern declarations in a block
rui314 Sep 4, 2020
41687ed
Add _Alignof and _Alignas
rui314 Sep 4, 2020
3fd941c
[GNU] Allow a variable as an operand of _Alignof
rui314 Sep 25, 2020
a1e0907
Add static local variables
rui314 Sep 4, 2020
e0b5c4b
Add compound literals
rui314 Aug 23, 2019
89bd048
Add return that doesn't take any value
rui314 Aug 27, 2020
36a99d1
Add static global variables
rui314 Sep 4, 2020
bac1a3a
Add do ... while
rui314 Aug 24, 2019
f1c72ab
Align stack frame to 16 byte boundaries
rui314 Sep 13, 2020
e56dfc9
Handle a function returning bool, char or short
rui314 Sep 13, 2020
6033be2
Allow to call a variadic function
rui314 Oct 7, 2020
2258e01
Add va_start to support variadic functions
rui314 Aug 25, 2019
046fd9e
Check the number of function arguments
rui314 Sep 26, 2020
e919d1f
Add `signed` keyword
rui314 Oct 6, 2020
cba5663
Add unsigned integral types
rui314 Aug 28, 2020
fb00249
Add U, L and LL suffixes
rui314 Sep 21, 2019
068e0d5
Use long or ulong instead of int for some expressions
rui314 Mar 24, 2020
b345ef8
When comparing two pointers, treat them as unsigned
rui314 Sep 11, 2020
25ec849
Handle unsigned types in the constant expression
rui314 Mar 26, 2020
7ab9353
Ignore const, volatile, auto, register, restrict or _Noreturn.
rui314 Sep 2, 2020
8514610
Ignore "static" and "const" in array-dimensions
rui314 Aug 31, 2020
1b6cf19
Allow to omit parameter name in function declaration
rui314 Sep 4, 2020
93b191b
Add floating-point constant
rui314 Sep 27, 2020
60e682e
Add "float" and "double" local variables and casts
rui314 Sep 12, 2020
932853f
Add flonum ==, !=, < and <=
rui314 Sep 22, 2020
6e26bed
Add flonum +, -, * and /
rui314 Sep 22, 2020
8c533a1
Handle flonum for if, while, do, !, ?:, || and &&
rui314 Sep 13, 2020
cac0ab2
Allow to call a function that takes/returns flonums
rui314 Aug 27, 2020
5957b4f
Allow to define a function that takes/returns flonums
rui314 Aug 27, 2020
bd6c06e
Implement default argument promotion for float
rui314 Apr 30, 2020
b1c2ef2
Support variadic function with floating-point parameters
rui314 May 1, 2020
ab90669
Add flonum constant expression
rui314 May 1, 2020
c8b8a12
Add "long double" as an alias for "double"
rui314 Aug 26, 2020
c5a6162
Add stage2 build
rui314 Aug 24, 2019
0ed5844
Add function pointer
rui314 Sep 3, 2020
3455895
Decay a function to a pointer in the func param context
rui314 Aug 31, 2020
809c0e4
Add usual arithmetic conversion for function pointer
rui314 Sep 2, 2020
fdf60e5
Split cc1 from compiler driver
rui314 Aug 15, 2020
e0c91ed
Run "as" command unless -S is given
rui314 Oct 8, 2020
0093ad6
Accept multiple input files
rui314 Aug 18, 2020
e57f205
Run "ld" unless -c is given
rui314 Sep 19, 2020
2a77cbf
Add a do-nothing preprocessor
rui314 Aug 18, 2020
db25108
Add the null directive
rui314 Mar 30, 2020
181db7b
Add #include "..."
rui314 Sep 3, 2020
3049c70
Skip extra tokens after `#include "..."`
rui314 Apr 21, 2020
3d7234f
Add -E option
rui314 Aug 20, 2020
a7f8216
Add #if and #endif
rui314 Aug 20, 2020
8d11fb7
Skip nested #if in a skipped #if-clause
rui314 Mar 30, 2020
cb24e04
Add #else
rui314 Mar 30, 2020
482a4bb
Add #elif
rui314 Mar 28, 2020
406d071
Add objlike #define
rui314 Mar 29, 2020
b4ef316
Add #undef
rui314 Mar 29, 2020
70442c3
Expand macros in the #if and #elif argument context
rui314 Aug 20, 2020
db5b291
Do not expand a token more than once for the same objlike macro
rui314 Mar 29, 2020
e141a91
Add #ifdef and #ifndef
rui314 Mar 29, 2020
2b0baf2
Add zero-arity funclike #define
rui314 Aug 18, 2020
952355f
Add multi-arity funclike #define
rui314 Mar 30, 2020
dfff759
Allow empty macro arguments
rui314 Mar 30, 2020
09c5b33
Allow parenthesized expressions as macro arguments
rui314 Mar 29, 2020
e3ce007
Do not expand a token more than once for the same funclike macro
rui314 Aug 31, 2020
995aa74
Add macro stringizing operator (#)
rui314 Aug 29, 2020
c5e6f6f
Add macro token-pasting operator (##)
rui314 Aug 29, 2020
76c681c
Use chibicc's preprocessor for all tests
rui314 Aug 30, 2020
d6f33a8
Add defined() macro operator
rui314 Aug 31, 2020
c7dcbbf
Replace remaining identifiers with 0 in macro constexpr
rui314 Mar 31, 2020
c0b5931
Preserve newline and space during macro expansion
rui314 Aug 29, 2020
7c5d75d
Support line continuation
rui314 Mar 31, 2020
88a5529
Add #include <...>
rui314 Aug 30, 2020
e54105d
Add -I<dir> option
rui314 Sep 25, 2020
ca0dee1
Add default include paths
rui314 Sep 19, 2020
d5ca66f
Add #error
rui314 Apr 1, 2020
4559ea7
Add predefine macros such as __STDC__
rui314 Apr 1, 2020
ad60fe1
Add __FILE__ and __LINE__
rui314 Aug 30, 2020
e5ab35a
Add __VA_ARGS__
rui314 Apr 1, 2020
10102b3
Add __func__
rui314 Sep 4, 2020
5eec02c
[GNU] Add __FUNCTION__
rui314 Sep 4, 2020
2d737a1
Concatenate adjacent string literals
rui314 Apr 26, 2020
61529cc
Recognize wide character literal
rui314 Jul 20, 2020
4116262
Add stdarg.h, stdbool.h, stddef.h, stdalign.h and float.h
rui314 Aug 30, 2020
3830b25
Add va_arg()
rui314 Aug 16, 2020
4357e08
Self-host: including preprocessor, chibicc can compile itself
rui314 Aug 30, 2020
6b317dc
Support passed-on-stack arguments
rui314 Aug 27, 2020
f76716c
Support passed-on-stack parameters
rui314 Jul 30, 2020
b0e187b
Allow struct parameter
rui314 Aug 27, 2020
5195030
Allow struct argument
rui314 Jul 31, 2020
7f42093
Allow to call a fucntion returning a struct
rui314 Aug 27, 2020
7dd60c1
Allow to define a function returning a struct
rui314 Sep 4, 2020
b36506c
Allow variadic function to take more than 6 parameters
rui314 Sep 4, 2020
e64975f
Add va_copy()
rui314 Sep 4, 2020
421a1a3
Dereferencing a function shouldn't do anything
rui314 Apr 23, 2020
7b5c86c
Tokenize numeric tokens as pp-numbers
rui314 Sep 27, 2020
6a66595
Add -D option
rui314 Aug 27, 2020
0fe8a7b
Add -U option
rui314 Aug 18, 2020
665b2b1
Add bitfield
rui314 Aug 27, 2020
a077f81
Support global struct bitfield initializer
rui314 Sep 6, 2020
d42d5b4
Handle op=-style assignments to bitfields
rui314 Aug 27, 2020
5f8e758
Handle zero-width bitfield member
rui314 Aug 23, 2020
66a1fe2
Do not allow to obtain an address of a bitfield
rui314 Aug 23, 2020
c5dbab8
Write to an in-memory buffer before writing to an actual output file
rui314 Aug 18, 2020
1ac475c
Ignore -O, -W and -g and other flags
rui314 Aug 18, 2020
8809f6c
Turn on -Wall compiler flag and fix compiler warnings
rui314 May 10, 2020
a03f271
Make an array of at least 16 bytes long to have alignment of at least…
rui314 May 10, 2020
ae60eaf
Make "main" to implicitly return 0
rui314 May 11, 2020
d7cf3ce
Add anonymous struct and union
rui314 Aug 23, 2020
50faed9
Add __DATE__ and __TIME__ macros
rui314 May 17, 2020
68a4f94
[GNU] Add __COUNTER__ macro
rui314 Jun 8, 2020
69e6206
Canonicalize newline character
rui314 Jun 9, 2020
12dbc90
Add \u and \U escape sequences
rui314 Aug 18, 2020
68fcfb2
Accept multibyte character as wide character literal
rui314 May 6, 2020
cc0818f
Add UTF-16 character literal
rui314 Sep 27, 2020
dac24e5
Add UTF-32 character literal
rui314 Jul 6, 2020
8824778
Add UTF-8 string literal
rui314 Sep 27, 2020
fbe75f9
Add UTF-16 string literal
rui314 Sep 26, 2020
29a7909
Add UTF-32 string literal
rui314 Sep 26, 2020
6c904b6
Add wide string literal
rui314 Jul 5, 2020
f80c7ac
Add UTF-16 string literal initializer
rui314 Jul 7, 2020
477a7c0
Add UTF-32 string literal initializer
rui314 Jul 7, 2020
3d417da
Define __STDC_UTF_{16,32}__ macros
rui314 Jul 8, 2020
0037a0b
Allow multibyte UTF-8 character in identifier
rui314 May 4, 2020
f4e3cbf
[GNU] Accept $ as an identifier character
rui314 Jul 19, 2020
efb5832
Allow to concatenate regular string literals with L/u/U string literals
rui314 Oct 4, 2020
a87ff7a
Skip UTF-8 BOM markers
rui314 Oct 15, 2020
f7468b1
Add array designated initializer
rui314 Sep 18, 2020
083e767
Allow array designators to initialize incomplete arrays
rui314 Sep 20, 2020
96ebbe9
[GNU] Allow to omit "=" in designated initializers
rui314 Jul 23, 2020
11b0dc6
Add struct designated initializer
rui314 Oct 6, 2020
7db650a
Add union designated initializer
rui314 Aug 15, 2020
405315b
Handle struct designator for anonymous struct member
rui314 Jul 19, 2020
2319938
Improve error message for multibyte characters
rui314 Jul 20, 2020
b96c22d
Add #line
rui314 Jul 22, 2020
68dffc3
[GNU] Add line marker directive
rui314 Jul 22, 2020
ff1f2d1
[GNU] Add __TIMESTAMP__ macro
rui314 Jul 22, 2020
6c0ccaa
[GNU] Add __BASE_FILE__ macro
rui314 Aug 20, 2020
b2746d4
Add __VA_OPT__
rui314 Jul 23, 2020
15c76f9
[GNU] Handle ,##__VA_ARG__
rui314 Jul 23, 2020
9b4e257
Ignore #pragma
rui314 Sep 19, 2020
ec78c2e
[GNU] Support GCC-style variadic macro
rui314 Aug 29, 2020
d13b10a
Add typeof
rui314 Jul 23, 2020
86cf40f
[GNU] Add __builtin_types_compatible_p
rui314 Sep 6, 2020
ec844b9
Add _Generic
rui314 Jul 24, 2020
7f2f303
[GNU] Allow sizeof(<function type>)
rui314 Jul 24, 2020
91b284c
[GNU] Add ?: operator with omitted operand
rui314 Sep 13, 2020
e13b1d6
Add basic "asm" statement
rui314 Aug 27, 2020
9620653
Handle inline functions as static functions
rui314 Sep 4, 2020
c147cde
Do not emit static inline functions if referenced by no one
rui314 Sep 13, 2020
6c2e7f9
Use __attribute__((format(print, ...))) to find programming errors
rui314 Aug 2, 2020
24a1bdb
Add -idirafter option
rui314 Sep 19, 2020
18e0053
Add offsetof
rui314 Aug 15, 2020
6d8aeb6
Add tentative definition
rui314 Sep 4, 2020
1886c09
Add -fcommon and -fno-common flags
rui314 Aug 19, 2020
1284e90
Add thread-local variable
rui314 Sep 7, 2020
b19ec6f
Add -include option
rui314 Sep 19, 2020
86003b6
Add -x option
rui314 Sep 19, 2020
ac2b8ef
Make -E to imply -xc
rui314 Sep 28, 2020
f1656f1
Add alloca()
rui314 Sep 4, 2020
1481d1b
Add sizeof() for VLA
rui314 Sep 4, 2020
d401baf
Add pointer arithmetic for VLA
rui314 Sep 3, 2020
e048508
Support sizeof(typename) where typename is a VLA
rui314 Aug 25, 2020
bb8e09c
Do not define __STDC_NO_VLA__
rui314 Aug 25, 2020
c6ae421
Add -l option
rui314 Aug 25, 2020
442ea3c
Add -s option
rui314 Sep 19, 2020
5a9622e
Emit size and type for symbols
rui314 Sep 10, 2020
470645d
Recognize .a and .so files
rui314 Aug 25, 2020
d385947
Add long double
rui314 Aug 29, 2020
0b9218c
[GNU] Support case ranges
rui314 Aug 30, 2020
f37ec40
[GNU] Support array range designator
rui314 Sep 20, 2020
f92240d
[GNU] Support labels-as-values
rui314 Sep 4, 2020
9652442
[GNU] Treat labels-as-values as compile-time constant
rui314 Sep 20, 2020
6b8df9c
Add string hashmap
rui314 Sep 1, 2020
c688a49
Use hashmap for macro name lookup
rui314 Sep 1, 2020
55a40b2
Use hashmap for block-scope lookup
rui314 Oct 7, 2020
0fb4106
Use hashmap for keyword lookup
rui314 Sep 3, 2020
a3618bb
Add -M option
rui314 Sep 3, 2020
a3523f2
Add -MF option
rui314 Aug 18, 2020
8db128c
Add -MP option
rui314 Aug 18, 2020
f33e711
Add -MT option
rui314 Aug 18, 2020
004cbb0
Add -MD option
rui314 Aug 18, 2020
205bb4d
Add -MQ option
rui314 Sep 2, 2020
7f2a0dc
Add -MMD option
rui314 Sep 19, 2020
fb50b62
Add -fpic and -fPIC options
rui314 Sep 8, 2020
240265d
Cache file search results
rui314 Sep 19, 2020
fcf7403
Add include guard optimization
rui314 Oct 5, 2020
fdf57bf
[GNU] Add "#pragma once"
rui314 Oct 5, 2020
866cb64
[GNU] Add #include_next
rui314 Sep 25, 2020
3c3ab57
Add -static option
rui314 Oct 9, 2020
b2db713
Add -shared option
rui314 Oct 9, 2020
cf2ee2c
Add -L option
rui314 Sep 13, 2020
f955405
Add -Wl, option
rui314 Sep 19, 2020
9bc5c64
Add -Xlinker option
rui314 Sep 20, 2020
296ce42
Add scripts to test third-party apps
rui314 Sep 13, 2020
aa9a9d3
Add atomic_compare_exchange
rui314 Sep 15, 2020
1bb084e
Add atomic_exchange
rui314 Sep 15, 2020
7a9996a
Add _Atomic and atomic ++, -- and op= operators
rui314 Sep 15, 2020
8e6af0d
Complete stdatomic.h
rui314 Sep 16, 2020
23be507
Add test/thirdparty/cpython.sh
rui314 Sep 20, 2020
364fab6
redefinition
rui314 Sep 23, 2020
86e9f21
Add __attribute__((packed))
rui314 Oct 7, 2020
e054dd7
Add __attribute__((aligned(N)) for struct declaration
rui314 Sep 25, 2020
bf436cc
Update README
rui314 Sep 30, 2020
d576106
#fix can't use gdb for debug
brewlin Nov 18, 2020
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,5 +4,7 @@
**/*.s
**/a.out
/tmp*
/thirdparty
/chibicc
/test/*.exe
/stage2
36 changes: 30 additions & 6 deletions Makefile
@@ -1,26 +1,50 @@
-CFLAGS=-std=c11 -g -fno-common
+CFLAGS=-std=c11 -g -fno-common -Wall -Wno-switch
 
 SRCS=$(wildcard *.c)
 OBJS=$(SRCS:.c=.o)
 
 TEST_SRCS=$(wildcard test/*.c)
 TESTS=$(TEST_SRCS:.c=.exe)
 
+# Stage 1
+
 chibicc: $(OBJS)
 	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)
 
 $(OBJS): chibicc.h
 
 test/%.exe: chibicc test/%.c
-	$(CC) -o- -E -P -C test/$*.c | ./chibicc -o test/$*.s -
-	$(CC) -o $@ test/$*.s -xc test/common
+	./chibicc -Iinclude -Itest -c -o test/$*.o test/$*.c
+	$(CC) -pthread -o $@ test/$*.o -xc test/common
 
 test: $(TESTS)
 	for i in $^; do echo $$i; ./$$i || exit 1; echo; done
-	test/driver.sh
+	test/driver.sh ./chibicc
+
+test-all: test test-stage2
+
+# Stage 2
+
+stage2/chibicc: $(OBJS:%=stage2/%)
+	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)
+
+stage2/%.o: chibicc %.c
+	mkdir -p stage2/test
+	./chibicc -c -o $(@D)/$*.o $*.c
+
+stage2/test/%.exe: stage2/chibicc test/%.c
+	mkdir -p stage2/test
+	./stage2/chibicc -Iinclude -Itest -c -o stage2/test/$*.o test/$*.c
+	$(CC) -pthread -o $@ stage2/test/$*.o -xc test/common
+
+test-stage2: $(TESTS:test/%=stage2/test/%)
+	for i in $^; do echo $$i; ./$$i || exit 1; echo; done
+	test/driver.sh ./stage2/chibicc
+
+# Misc.
 
 clean:
-	rm -rf chibicc tmp* $(TESTS) test/*.s test/*.exe
+	rm -rf chibicc tmp* $(TESTS) test/*.s test/*.exe stage2
 	find * -type f '(' -name '*~' -o -name '*.o' ')' -exec rm {} ';'
 
-.PHONY: test clean
+.PHONY: test clean test-stage2
210 changes: 209 additions & 1 deletion README.md
@@ -1 +1,209 @@
This is the reference implementation of https://www.sigbus.info/compilerbook.
# chibicc: A Small C Compiler

(The old master has moved to
[historical/old](https://github.com/rui314/chibicc/tree/historical/old)
branch. This is a new one uploaded in September 2020.)

chibicc is yet another small C compiler that implements most C11
features. Even though it still probably falls into the "toy compilers"
category just like other small compilers do, chibicc can compile
several real-world programs, including [Git](https://git-scm.com/),
[SQLite](https://sqlite.org),
[libpng](http://www.libpng.org/pub/png/libpng.html) and chibicc
itself, without making modifications to the compiled programs.
Generated executables of these programs pass their corresponding test
suites. So, chibicc actually supports a wide variety of C11 features
and is able to compile hundreds of thousands of lines of real-world C
code correctly.

chibicc is developed as the reference implementation for a book I'm
currently writing about C compilers and low-level programming.
The book covers the vast topic with an incremental approach; in the first
chapter, readers will implement a "compiler" that accepts just a single
number as a "language", which will then gain one feature at a time in each
section of the book until the language that the compiler accepts matches
what the C11 spec specifies. I took this incremental approach from [the
paper](http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf) by Abdulaziz
Ghuloum.
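
As a rough illustration of that first step, here is a minimal sketch of a
"compiler" whose whole input language is a single decimal number. The helper
name `compile_number` and its buffer-based interface are assumptions made
for this example, not code taken from the book or from the chibicc history;
the output is x86-64 assembly in AT&T syntax whose `main` returns the number.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: "compiles" a source string that consists of a single
// decimal integer into assembly text. Returns the number of characters
// written to buf, or -1 if the input is not exactly one number.
int compile_number(const char *src, char *buf, size_t size) {
  char *end;
  long val = strtol(src, &end, 10);
  if (end == src || *end != '\0')
    return -1; // empty input or trailing garbage
  return snprintf(buf, size,
                  "  .globl main\n"
                  "main:\n"
                  "  mov $%ld, %%rax\n"
                  "  ret\n",
                  val);
}
```

Calling `compile_number("42", buf, sizeof buf)` fills `buf` with a complete
assembly program. Each later feature would then extend what the input
language accepts.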

Each commit of this project corresponds to a section of the book. For this
purpose, not only the final state of the project but each commit was
carefully written with readability in mind. Readers should be able to learn
how a C language feature can be implemented just by reading one or a few
commits of this project. For example, this is how
[while](https://github.com/rui314/chibicc/commit/773115ab2a9c4b96f804311b95b20e9771f0190a),
[[]](https://github.com/rui314/chibicc/commit/75fbd3dd6efde12eac8225d8b5723093836170a5),
[?:](https://github.com/rui314/chibicc/commit/1d0e942fd567a35d296d0f10b7693e98b3dd037c),
and [thread-local
variable](https://github.com/rui314/chibicc/commit/79644e54cc1805e54428cde68b20d6d493b76d34)
are implemented. If you have plenty of spare time, it might be fun to read
it from the [first
commit](https://github.com/rui314/chibicc/commit/0522e2d77e3ab82d3b80a5be8dbbdc8d4180561c).

If you like this project, please consider purchasing a copy of the book
when it becomes available! 😀 I publish the source code here to give people
early access to it, because I was planning to do that anyway with a
permissive open-source license after publishing the book. If I don't charge
for the source code, it doesn't make much sense to me to keep it private. I
hope to publish the book in 2021.
You can sign up [here](https://forms.gle/sgrMWHGeGjeeEJcX7) to receive a
notification when a free chapter is available online or the book is published.

I pronounce chibicc as _chee bee cee cee_. "chibi" means "mini" or
"small" in Japanese. "cc" stands for C compiler.

## Status

chibicc supports almost all mandatory features and most optional
features of C11 as well as a few GCC language extensions.

Features that are often missing in a small compiler but supported by
chibicc include (but are not limited to):

- Preprocessor
- float, double and long double (x87 80-bit floating point numbers)
- Bit-fields
- alloca()
- Variable-length arrays
- Compound literals
- Thread-local variables
- Atomic variables
- Common symbols
- Designated initializers
- L, u, U and u8 string literals
- Functions that take or return structs as values, as specified by the
x86-64 SystemV ABI
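
To make a few of these items concrete, the following is a small C11 sketch
that exercises bit-fields, compound literals, designated initializers, a
variable-length array, and functions that take and return structs by value.
All names here are made up for illustration; the snippet is not taken from
chibicc's test suite.

```c
typedef struct {
  int x, y;
} Point;

struct Flags {
  unsigned ready : 1; // 1-bit bit-field
  unsigned mode : 3;  // 3-bit bit-field, holds 0..7
};

// Returns a struct by value, built from a compound literal with
// designated initializers (given out of declaration order on purpose).
Point make_point(int x, int y) {
  return (Point){.y = y, .x = x};
}

// Takes a struct by value and returns |x| + |y|.
int manhattan(Point p) {
  return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

// Sums 1..n through a variable-length array; n must be positive.
int sum_to(int n) {
  int vla[n];
  for (int i = 0; i < n; i++)
    vla[i] = i + 1;
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += vla[i];
  return sum;
}
```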

chibicc does not support complex numbers, K&R-style function prototypes
and GCC-style inline assembly. Digraphs and trigraphs are intentionally
left out.

chibicc outputs a simple but nice error message when it finds an error in
source code.

There's no optimization pass. chibicc emits terrible code which is probably
at least twice as slow as GCC's output. I have a plan to add an
optimization pass once the frontend is done.

I'm using Ubuntu 20.04 for x86-64 as a development platform. I made a
few small changes so that chibicc works on Ubuntu 18.04, Fedora 32 and
Gentoo 2.6, but portability is not my goal at this moment. It may or
may not work on systems other than Ubuntu 20.04.

## Internals

chibicc consists of the following stages:

- Tokenize: A tokenizer takes a string as an input, breaks it into a list
of tokens and returns them.

- Preprocess: A preprocessor takes a list of tokens as input and outputs
a new list of macro-expanded tokens. It interprets preprocessor
directives while expanding macros.

- Parse: A recursive descent parser constructs abstract syntax trees
from the output of the preprocessor. It also adds a type to each AST
node.

- Codegen: A code generator emits assembly text for given AST nodes.
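
The stages above can be sketched in miniature as follows. This is a
simplified illustration for expressions such as `1+2-3`, not chibicc's
actual code: the preprocessing stage and the type annotation are left out,
and every name (`tokenize`, `parse`, `compile`, the `TK_*` and `ND_*` kinds)
is an assumption chosen for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { TK_NUM, TK_PUNCT, TK_EOF } TokenKind;
typedef struct Token Token;
struct Token {
  TokenKind kind;
  Token *next;
  long val; // value if kind == TK_NUM
  char op;  // '+' or '-' if kind == TK_PUNCT
};

typedef enum { ND_NUM, ND_ADD, ND_SUB } NodeKind;
typedef struct Node Node;
struct Node {
  NodeKind kind;
  Node *lhs, *rhs; // operands of ND_ADD and ND_SUB
  long val;        // value if kind == ND_NUM
};

static Token *new_token(TokenKind kind, Token *cur) {
  Token *tok = calloc(1, sizeof(Token));
  tok->kind = kind;
  cur->next = tok;
  return tok;
}

// Tokenize: string -> linked list of tokens (NULL on an unknown character).
Token *tokenize(const char *p) {
  Token head = {0};
  Token *cur = &head;
  while (*p) {
    if (isspace((unsigned char)*p)) {
      p++;
    } else if (isdigit((unsigned char)*p)) {
      char *end;
      cur = new_token(TK_NUM, cur);
      cur->val = strtol(p, &end, 10);
      p = end;
    } else if (*p == '+' || *p == '-') {
      cur = new_token(TK_PUNCT, cur);
      cur->op = *p++;
    } else {
      return NULL;
    }
  }
  new_token(TK_EOF, cur);
  return head.next;
}

static Node *new_node(NodeKind kind, Node *lhs, Node *rhs, long val) {
  Node *node = calloc(1, sizeof(Node));
  node->kind = kind;
  node->lhs = lhs;
  node->rhs = rhs;
  node->val = val;
  return node;
}

// Parse: tokens -> AST for  expr = num (("+" | "-") num)*
// Returns NULL on a syntax error.
Node *parse(Token *tok) {
  if (!tok || tok->kind != TK_NUM)
    return NULL;
  Node *node = new_node(ND_NUM, NULL, NULL, tok->val);
  tok = tok->next;
  while (tok->kind == TK_PUNCT) {
    NodeKind kind = tok->op == '+' ? ND_ADD : ND_SUB;
    tok = tok->next;
    if (tok->kind != TK_NUM)
      return NULL;
    node = new_node(kind, node, new_node(ND_NUM, NULL, NULL, tok->val), 0);
    tok = tok->next;
  }
  return tok->kind == TK_EOF ? node : NULL;
}

// Codegen: AST -> x86-64 assembly text; the result ends up in %rax.
static void gen_expr(FILE *out, Node *node) {
  if (node->kind == ND_NUM) {
    fprintf(out, "  mov $%ld, %%rax\n", node->val);
    return;
  }
  gen_expr(out, node->rhs);
  fputs("  push %rax\n", out);
  gen_expr(out, node->lhs);
  fputs("  pop %rdi\n", out);
  fputs(node->kind == ND_ADD ? "  add %rdi, %rax\n" : "  sub %rdi, %rax\n",
        out);
}

// Runs all stages; returns 0 on success, -1 on a tokenize or parse error.
int compile(const char *src, FILE *out) {
  Node *node = parse(tokenize(src));
  if (!node)
    return -1;
  fputs("  .globl main\nmain:\n", out);
  gen_expr(out, node);
  fputs("  ret\n", out);
  return 0;
}
```

For example, `compile("1+2-3", stdout)` prints an assembly program for the
expression. Each stage consumes only the previous stage's output, so a
preprocessing pass could be slotted in between `tokenize` and `parse`
without changing either of them.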

## Contributing

When I find a bug in this compiler, I go back to the original commit that
introduced the bug and rewrite the commit history as if there were no such
bug from the beginning. This is an unusual way of fixing bugs, but as a
part of a book, it is important to keep every commit bug-free.

Thus, I do not take pull requests in this repo. You can send me a pull
request if you find a bug, but it is very likely that I will read your
patch and then apply that to my previous commits by rewriting history. I'll
credit your name somewhere, but your changes will be rewritten by me before
being submitted to this repository.

Also, please assume that I will occasionally force-push my local repository
to this public one to rewrite history. If you clone this project and make
local commits on top of it, your changes will have to be rebased by hand
when I force-push new commits.

## Design principles

chibicc's core value is its simplicity and the readability of its source
code. To achieve this goal, I was careful not to be too clever when
writing code. Let me explain what that means.

Oftentimes, as you get used to the code base, you are tempted to
_improve_ the code using more abstractions and clever tricks.
But those kinds of _improvements_ don't always improve readability for
first-time readers and can actually hurt it. I tried to avoid the
pitfall as much as possible. I wrote this code not for me but for
first-time readers.

If you take a look at the source code, you'll find a couple of
dumb-looking pieces of code. These are intentionally written that way
(though in some places I might actually be missing something).
Here are a few notable examples:

- The recursive descent parser contains many similar-looking functions
for similar-looking generative grammar rules. You might be tempted
to _improve_ it to reduce the duplication using higher-order functions
or macros, but I thought that that's too complicated. It's better to
allow small duplications instead.

- chibicc doesn't try too hard to save memory. An entire input source
file is read to memory first before the tokenizer kicks in, for example.

- Slow algorithms are fine if we know that n isn't too big.
For example, we use a linked list as a set in the preprocessor, so
the membership check takes O(n) where n is the size of the set. But
that's fine because we know n is usually very small.
And even if n can be very big, I stick with a simple slow algorithm
until it is proved by benchmarks that that's a bottleneck.

- Each AST node type uses only a few members of the `Node` struct.
Other unused `Node` members are just a waste of memory at runtime.
We could save memory using unions, but I decided to simply put everything
in the same struct instead. I believe the inefficiency is negligible.
Even if it matters, we can always change the code to use unions
at any time. I wanted to avoid premature optimization.

- chibicc always allocates heap memory using `calloc`, which is a
variant of `malloc` that zero-fills the allocated memory. `calloc` is
slightly slower than `malloc`, but that should be negligible.

- Last but not least, chibicc allocates memory using `calloc` but never
calls `free`. Allocated heap memory is not freed until the process exits.
I'm sure that this memory management policy (or lack thereof) looks
very odd, but it makes sense for short-lived programs such as compilers.
DMD, a compiler for the D programming language, uses the same memory
management scheme for the same reason, for example [1].
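
Several of these choices fit in one short sketch: a set of strings kept as
a singly linked list, with an O(n) membership check, nodes allocated with
`calloc`, and no `free` anywhere. The type and function names (`StrSet`,
`strset_contains`, `strset_add`) are hypothetical and chosen for this
example; chibicc's own preprocessor code may look different.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// A set of strings stored as a singly linked list.
typedef struct StrSet StrSet;
struct StrSet {
  StrSet *next;
  const char *name;
};

// O(n) membership check: walk the list and compare each name.
bool strset_contains(const StrSet *set, const char *name) {
  for (; set; set = set->next)
    if (strcmp(set->name, name) == 0)
      return true;
  return false;
}

// Returns a set that also contains `name`. The new node comes from calloc,
// so its members start out zeroed, and it is intentionally never freed.
StrSet *strset_add(StrSet *set, const char *name) {
  if (strset_contains(set, name))
    return set;
  StrSet *node = calloc(1, sizeof(StrSet));
  node->name = name;
  node->next = set;
  return node;
}
```

An empty set is just `NULL`, so `strset_add(NULL, "FOO")` creates a
one-element set. As long as n stays small, the linear scan costs little and
the code stays easy to read.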

## About the Author

I'm Rui Ueyama. I'm the creator of [8cc](https://github.com/rui314/8cc),
which is a hobby C compiler, and also the original creator of the current
version of [LLVM lld](https://lld.llvm.org) linker, which is a
production-quality linker used by various operating systems and large-scale
build systems.

## References

- [tcc](https://bellard.org/tcc/): A small C compiler written by Fabrice
Bellard. I learned a lot from this compiler, but the designs of tcc and
chibicc are different. In particular, tcc is a one-pass compiler, while
chibicc is a multi-pass one.

- [lcc](https://github.com/drh/lcc): Another small C compiler. The creators
wrote a [book](https://sites.google.com/site/lccretargetablecompiler/)
about the internals of lcc, which I found a good resource to see how a
compiler is implemented.

- [An Incremental Approach to Compiler
Construction](http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf)

- [Rob Pike's 5 Rules of Programming](https://users.ece.utexas.edu/~adnan/pike.html)

[1] https://www.drdobbs.com/cpp/increasing-compiler-speed-by-over-75/240158941

> DMD does memory allocation in a bit of a sneaky way. Since compilers
> are short-lived programs, and speed is of the essence, DMD just
> mallocs away, and never frees.