llvm-doc: initial review notes for index, basic, funcs #1

Merged 1 commit on Apr 2, 2020
23 changes: 12 additions & 11 deletions docs/index.md
@@ -4,7 +4,7 @@

#### Why LLVM?

When creating a compiler, a classical design looks like this:
When creating a compiler, a classical design may look like this:

{% dot attack_plan.svg
digraph hierarchy {
@@ -14,9 +14,11 @@ digraph hierarchy {
}
%}

This is quite good in old days. There has only one input language, and one target machine.
This worked quite well in the old days. There was only one input language, and one target machine.

But there has more and more target machines have to support! Therefore, we need LLVM. Here is the new design:
Today there are many target machines to support, and many input languages. Without a shared representation, many parts of the compiler would have to be reimplemented for every input/output pair.

LLVM offers a solution to this problem by defining such a shared representation, namely LLVM IR. Here is the new design:

{% dot attack_plan.svg
digraph hierarchy {
@@ -28,22 +30,21 @@ digraph hierarchy {
}
%}

Now we only have to focus on our frontend and optimizer! Thanks you, Chris Lattner and who had work for LLVM.
To write a compiler for a new language, we now only have to focus on our frontend. Similarly, to add support for a new target machine, we only have to add a new backend. And to improve the code generation of all input/output pairs, we only have to focus on the middle-end optimizer. Thank you, Chris Lattner and all those who have contributed to LLVM.

#### Why llir/llvm?

The target of [llir/llvm](https://github.com/llir/llvm) is: interact in Go with LLVM IR without binding with LLVM.
Therefore, you don't have to compile LLVM(could take few hours), no fighting with cgo.
Working under pure Go environment and start your journey.
The aim of [llir/llvm](https://github.com/llir/llvm) is to provide a library for interacting with LLVM IR in pure Go. Importantly, `llir/llvm` is not a binding for LLVM.
Therefore, you don't have to compile LLVM (which could take a few hours), and there is no need to fight with Cgo.
Work in a pure Go environment and start your journey.

## Installation

To install [llir/llvm](https://github.com/llir/llvm), all you need to do is: `go get github.com/llir/llvm`.

## Usage

According to packages, [llir/llvm](https://github.com/llir/llvm) can be separated to two main parts:

1. `asm`: This package implements a parser for LLVM IR assembly files. Users can use it for analyzing LLVM IR files.
2. `ir`: This package declares the types used to represent LLVM IR modules. Users can use it for build LLVM IR modules and operating on them.
According to packages, [llir/llvm](https://github.com/llir/llvm) can be separated into two main parts:

1. [asm](https://pkg.go.dev/github.com/llir/llvm/asm?tab=doc): This package implements a parser for LLVM IR assembly files. Users can use it for analyzing LLVM IR files.
2. [ir](https://pkg.go.dev/github.com/llir/llvm/ir?tab=doc): This package declares the types used to represent LLVM IR modules. Users can use it to build LLVM IR modules and operate on them.
65 changes: 32 additions & 33 deletions docs/user-guide/basic.md
@@ -2,16 +2,16 @@

## Module

A LLVM IR file is a module. A module owns many global level components:
An LLVM IR file is a module. A module has many top-level entities:

- global variable
- function
- type
- global variables
- functions
- types
- metadata

In this basic introduction, we don't dig into metadata, but focus on what can we do with global variable, function, and type.
In this basic introduction, we won't dig into metadata, but instead focus on what we can do with global variables, functions, and types.

[llir/llvm](https://github.com/llir/llvm) provides package `ir` for these concepts, let's see what can a C program being translated to LLVM IR using [llir/llvm](https://github.com/llir/llvm).
[llir/llvm](https://github.com/llir/llvm) provides package [ir](https://pkg.go.dev/github.com/llir/llvm/ir?tab=doc) for these concepts. Let's see how a C program can be translated into LLVM IR using [llir/llvm](https://github.com/llir/llvm).

C example:

@@ -22,8 +22,7 @@ int add(int x, int y) {
return x + y;
}
int main() {
add(1, g);
return 0;
return add(1, g);
}
```

@@ -55,8 +54,7 @@ func main() {
types.I32,
) // omit parameters
mb := funcMain.NewBlock("") // llir/llvm assigns a default name to a block created without one
mb.NewCall(funcAdd, constant.NewInt(types.I32, 1), mb.NewLoad(types.I32, globalG))
mb.NewRet(constant.NewInt(types.I32, 0))
mb.NewRet(mb.NewCall(funcAdd, constant.NewInt(types.I32, 1), mb.NewLoad(types.I32, globalG)))

println(m.String())
}
@@ -77,21 +75,21 @@ define i32 @main() {
; <label>:0
%1 = load i32, i32* @g
%2 = call i32 @add(i32 1, i32 %1)
ret i32 0
ret i32 %2
}
```

In this example, we have global variable and function, mapping to C code. Now we dig into global variable.
In this example, we have one global variable and two functions, corresponding to the C code. Now let's dig into global variables.

## Global Variable

Globals prefixed with `@` character.
An important thing is globals in LLVM, is a pointer, so have to `load` for its value,`store` to update its value.
In LLVM IR assembly, the identifiers of global variables are prefixed with an `@` character.
Importantly, global variables are represented in LLVM as pointers, so we have to use [load](https://pkg.go.dev/github.com/llir/llvm/ir?tab=doc#InstLoad) to retrieve the value and [store](https://pkg.go.dev/github.com/llir/llvm/ir?tab=doc#InstStore) to update the value of a global variable.

## Function

As globals, function name prefixed with `@` character. Function composed by prototype and a group of basic blocks.
If there has no basic block, then a function is a declaration, the following code would generate a declaration:
Like globals, in LLVM IR assembly the identifiers of functions are prefixed with an `@` character. Functions are composed of a function prototype and a group of basic blocks.
A function without basic blocks is a function declaration. The following code would generate a function declaration:

```go
m.NewFunc(
@@ -108,53 +106,54 @@ Output:
declare i32 @add(i32, i32)
```

When we want to bind to existed function in others object files, we would create a declaration.
When we want to bind to existing functions defined in other object files, we would create function declarations.

### Prototype
### Function Prototype

Prototype means parameters and return type.
A function prototype or function signature defines the parameters and return type of a function.

### Basic Block

If function is group of basic blocks, then basic blocks is a group of instructions.
An important thing is most high-level expression would break down into few instructions.
If a function is a group of basic blocks, then a basic block is a group of instructions. The basic notion behind a basic block is that if any instruction of a basic block is executed, then all instructions of the basic block are executed. In other words, there may be no branching or terminating instruction in the middle of a basic block, and all incoming branches must transfer control flow to the first instruction of the basic block.

It is worthwhile to note that most high-level expressions would be lowered into a set of instructions, covering one or more basic blocks.

[llir/llvm](https://github.com/llir/llvm) provides an API to create instructions through a basic block.
To get more information, goto [Block API document](https://pkg.go.dev/github.com/llir/llvm@v0.3.0/ir?tab=doc#Block).
For further information, refer to the [Block API documentation](https://pkg.go.dev/github.com/llir/llvm/ir?tab=doc#Block).

### Instruction

Instruction is a set of operations on assembly abstraction level to operate on an abstract machine model.
To get more information, goto [LLVM Language Reference Manual: instruction reference](https://llvm.org/docs/LangRef.html#instruction-reference).
An instruction is an operation at the assembly abstraction level which operates on an abstract machine model, as defined by LLVM.
For further information, refer to the [Instruction Reference section of the LLVM Language Reference Manual](https://llvm.org/docs/LangRef.html#instruction-reference).

## Type

There are many types in LLVM type system, here focus on how to create a new type.
There are many types in the LLVM type system; here we focus on how to create a new type.

```go
m := ir.NewModule()

m.NewTypeDef("foo", types.NewStruct(types.I32))
```

Above code would produce:
The above code would produce the following IR:

```llvm
%foo = type { i32 }
```

It could map to C code:
Which could be mapped to the following C code:

```c
struct foo {
typedef struct {
int x;
};
} foo;
```

Notice in LLVM, structure field has no name.
Notice that in LLVM, structure fields have no name.

## Conclusion

Hope previous sections provide enough information about how to get enough information to dig into details.
We will not dig into the details of each instruction, instead of that, we would provide a whole picture about how to use the library.
Therefore, the next section is a list of common high-level concept and how to map them to IR.
We hope that the previous sections have provided enough information about how to use the documentation to dig into details.
We will not dig into the details of each instruction; instead, we aim to provide an overall picture of how to use the library.
Therefore, the next section is a list of common high-level concepts and how to map them to IR.
26 changes: 13 additions & 13 deletions docs/user-guide/funcs.md
@@ -9,14 +9,14 @@ m := ir.NewModule()

add := m.NewFunc("add", types.I64, ir.NewParam("", types.I64))
add.Linkage = enum.LinkageInternal
add = m.NewFunc("add1", types.I64, ir.NewParam("", types.I64))
add.Linkage = enum.LinkageLinkOnce
add = m.NewFunc("add2", types.I64, ir.NewParam("", types.I64))
add.Linkage = enum.LinkagePrivate
add = m.NewFunc("add3", types.I64, ir.NewParam("", types.I64))
add.Linkage = enum.LinkageWeak
add = m.NewFunc("add4", types.I64, ir.NewParam("", types.I64))
add.Linkage = enum.LinkageExternal
add1 := m.NewFunc("add1", types.I64, ir.NewParam("", types.I64))
add1.Linkage = enum.LinkageLinkOnce
add2 := m.NewFunc("add2", types.I64, ir.NewParam("", types.I64))
add2.Linkage = enum.LinkagePrivate
add3 := m.NewFunc("add3", types.I64, ir.NewParam("", types.I64))
add3.Linkage = enum.LinkageWeak
add4 := m.NewFunc("add4", types.I64, ir.NewParam("", types.I64))
add4.Linkage = enum.LinkageExternal
```

The code would produce:
@@ -33,11 +33,11 @@ declare weak i64 @add3(i64)
declare external i64 @add4(i64)
```

To get more information about linkage, read [llvm doc](https://llvm.org/docs/LangRef.html#linkage-types) and [pkg.go.dev](https://pkg.go.dev/github.com/llir/llvm/ir/enum?tab=doc#Linkage).
For further information about linkage, refer to [LLVM doc](https://llvm.org/docs/LangRef.html#linkage-types) and [pkg.go.dev](https://pkg.go.dev/github.com/llir/llvm/ir/enum?tab=doc#Linkage).

### Variant Argument(a.k.a. VAArg)
### Variadic Argument (a.k.a. VAArg)

One example is `printf`:
One example of a variadic function is `printf`. This is how to create a function prototype for `printf`:

```go
m := ir.NewModule()
@@ -50,12 +50,12 @@ printf := m.NewFunc(
printf.Sig.Variadic = true
```

The code would produce:
The above code would produce the following IR:

```llvm
declare i32 @printf(i8*, ...)
```

### Function Overloading

There has no overloading in IR, therefore solution is creating two functions.
There is no overloading in LLVM IR. One solution is to create one function per function signature, where each LLVM IR function would have a unique name (this is why C++ compilers do name mangling).
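
As a minimal sketch of the "one function per signature" idea, a frontend could derive a unique IR name from the source-level name and the parameter types. The `mangle` helper and its dot-separated scheme below are hypothetical and far simpler than real C++ ABI mangling; the resulting strings could then be used as the names passed to `m.NewFunc`.

```go
package main

import (
	"fmt"
	"strings"
)

// mangle builds a unique symbol name from a source-level function name and
// its parameter type names. The scheme is hypothetical: it simply joins the
// parts with dots, which are valid characters in LLVM IR identifiers.
func mangle(name string, paramTypes ...string) string {
	if len(paramTypes) == 0 {
		return name + ".void"
	}
	return name + "." + strings.Join(paramTypes, ".")
}

func main() {
	// Two overloads of `add` in the source language become two distinct
	// LLVM IR function names.
	fmt.Println(mangle("add", "i32", "i32"))       // prints add.i32.i32
	fmt.Println(mangle("add", "double", "double")) // prints add.double.double
}
```

A call site then resolves the overload in the frontend, based on the argument types, and emits a call to the mangled name.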
2 changes: 1 addition & 1 deletion docs/user-guide/types.md
@@ -115,4 +115,4 @@ case expr of
EString s -> "string"
```

`case` expression would need tag to distinguish variant.
`case` expression would need tag to distinguish variant.