Merge pull request #48 from pedropark99/vectors
Add chapter to talk about SIMD and Vectors
pedropark99 authored Sep 28, 2024
2 parents e1ed191 + a6a9b25 commit d764c72
Showing 54 changed files with 8,095 additions and 32 deletions.
6 changes: 3 additions & 3 deletions Chapters/01-zig-weird.qmd
@@ -39,7 +39,7 @@ Zig as a modern and better version of C.
In the author's personal interpretation, Zig is tightly connected with "less is more".
Instead of trying to become a modern language by adding more and more features,
many of the core improvements that Zig brings to the
-table are actually about removing annoying and evil behaviours/features from C and C++.
+table are actually about removing annoying behaviours/features from C and C++.
In other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.
As a result, analyzing, writing and debugging applications become much easier and simpler in Zig, than it is in C or C++.

@@ -1421,7 +1421,7 @@ details about it. Just as a quick recap:

But, for now, this amount of knowledge is enough for us to continue with this book.
Later, over the next chapters we will still talk more about other parts of
-Zig's syntax that are also equally important as the other parts. Such as:
+Zig's syntax that are also equally important. Such as:


- How Object-Oriented programming can be done in Zig through *struct declarations* at @sec-structs-and-oop.
@@ -1430,7 +1430,7 @@ Zig's syntax that are also equally important as the other parts. Such as:
- Pointers and Optionals at @sec-pointer;
- Error handling with `try` and `catch` at @sec-error-handling;
- Unit tests at @sec-unittests;
-- Vectors;
+- Vectors at @sec-vectors-simd;
- Build System at @sec-build-system;


176 changes: 176 additions & 0 deletions Chapters/15-vectors.qmd
@@ -0,0 +1,176 @@
---
engine: knitr
knitr: true
syntax-definition: "../Assets/zig.xml"
---


```{r}
#| include: false
source("../zig_engine.R")
knitr::opts_chunk$set(
auto_main = FALSE,
build_type = "lib"
)
```




# Introducing Vectors and SIMD {#sec-vectors-simd}

In this chapter, I'm going to discuss vectors in Zig, which are
related to SIMD operations (i.e. they have no relationship with the `std::vector` class
from C++).

## What is SIMD?

SIMD (*Single Instruction/Multiple Data*) is a group of operations that are widely used
in video/audio editing programs, and also in graphics applications. SIMD is not a new technology,
but its widespread use on ordinary desktop computers is somewhat recent. In the old days, SIMD
was only used on supercomputers.

Most modern CPU models (from AMD, Intel, etc.), whether in a desktop or in a
notebook, have support for SIMD operations. But if you have a very old CPU model installed in your
computer, then it is possible that it has no support for SIMD operations.

Why have people started using SIMD in their software? The answer is performance.
But what precisely does SIMD do to achieve better performance? Well, in essence, SIMD operations are a different
strategy to get parallel computing in your program, and therefore, to make calculations faster.

The basic idea behind SIMD is to have a single instruction that operates over multiple pieces of data
at the same time. When you perform normal scalar operations, like, for example, four add instructions,
each addition is performed separately, one after another. But with SIMD, these four add instructions
are translated into a single instruction, and, as a consequence, the four additions are performed
in parallel, at the same time.

Currently, the following groups of operators can be used on vector objects. All of
these operators are applied element-wise and in parallel by default.

- Arithmetic (`+`, `-`, `/`, `*`, `@divFloor()`, `@sqrt()`, `@ceil()`, `@log()`, etc.).
- Bitwise operators (`>>`, `<<`, `&`, `|`, `~`, etc.).
- Comparison operators (`<`, `>`, `==`, etc.).
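
To make the list above more concrete, here is a minimal sketch with one operator from each group.
It uses the `@Vector()` and `@splat()` built-ins, which are presented later in this chapter.
The names `Vec4`, `addThenDouble`, `lowNibble` and `lessThan` are hypothetical, created just for this illustration,
and I use `std.debug.print` so that the snippet is self-contained.

```zig
const std = @import("std");

// Hypothetical alias, just for this illustration.
const Vec4 = @Vector(4, u32);

/// Arithmetic: each element of the result is a[i] + b[i] * 2.
fn addThenDouble(a: Vec4, b: Vec4) Vec4 {
    const two: Vec4 = @splat(2);
    return a + b * two;
}

/// Bitwise: keeps only the lowest 4 bits of each element.
fn lowNibble(v: Vec4) Vec4 {
    const mask: Vec4 = @splat(0x0F);
    return v & mask;
}

/// Comparison: returns a vector of booleans, one per element.
fn lessThan(a: Vec4, b: Vec4) @Vector(4, bool) {
    return a < b;
}

pub fn main() void {
    const a = Vec4{ 4, 12, 37, 9 };
    const b = Vec4{ 10, 22, 5, 12 };
    std.debug.print("{any}\n", .{addThenDouble(a, b)});
    std.debug.print("{any}\n", .{lowNibble(a)});
    std.debug.print("{any}\n", .{lessThan(a, b)});
}
```

Notice that a comparison does not return a single `bool`. It returns a vector of `bool` values,
with one result for each pair of elements.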


## Vectors {#sec-what-vectors}

A SIMD operation is usually performed through a "SIMD intrinsic", which is just a fancy
name for a function that performs a SIMD operation. These SIMD intrinsics (or "SIMD functions")
always operate over a special type of object, which is called a "vector". So,
in order to use SIMD, you have to create a "vector object".

A vector object is usually a fixed-sized block of 128 bits (16 bytes).
As a consequence, most vectors that you find in the wild are essentially arrays that contain 2 values of 8 bytes each,
or, 4 values of 4 bytes each, or, 8 values of 2 bytes each, etc.
However, different CPU models may have different extensions (or, "implementations") of SIMD,
which may offer more types of vector objects that are bigger in size (256 bits or 512 bits)
to accommodate more data into a single vector object.
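
The arithmetic behind these numbers is just a division: the width of the vector in bits, divided by the
width of each element in bits. The sketch below works this out at compile time. The function `lanesFor`
is a hypothetical helper that I wrote for this illustration (it is not part of the Zig Standard Library),
and the register widths are the ones mentioned in the paragraph above.

```zig
const std = @import("std");

/// Hypothetical helper: how many elements of type `T` fit into a
/// vector that is `register_bits` bits wide.
fn lanesFor(comptime T: type, comptime register_bits: comptime_int) comptime_int {
    return register_bits / @bitSizeOf(T);
}

pub fn main() void {
    // 128-bit vectors, as described in the text above.
    std.debug.print("u64 in 128 bits: {d} elements\n", .{lanesFor(u64, 128)});
    std.debug.print("u32 in 128 bits: {d} elements\n", .{lanesFor(u32, 128)});
    std.debug.print("u16 in 128 bits: {d} elements\n", .{lanesFor(u16, 128)});
    // Wider vectors hold more elements of the same type.
    std.debug.print("u32 in 256 bits: {d} elements\n", .{lanesFor(u32, 256)});
    std.debug.print("u32 in 512 bits: {d} elements\n", .{lanesFor(u32, 512)});
}
```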

You can create a new vector object in Zig by using the `@Vector()` built-in function. Inside this function,
you specify the vector length (number of elements in the vector), and the data type of the elements
of the vector. Only primitive data types (booleans, integers, floats and pointers) are supported as elements of these vector objects.
In the example below, I'm creating two vector objects (`v1` and `v2`) of 4 elements of type `u32` each.

Also notice in the example below, that a third vector object (`v3`) is created from the
sum of the previous two vector objects (`v1` plus `v2`). Therefore,
math operations over vector objects take place element-wise by default, because
the same operation (in this case, addition) is transformed into a single instruction
that is replicated in parallel, across all elements of the vectors.


```{zig}
#| auto_main: true
#| build_type: "run"
const v1 = @Vector(4, u32){4, 12, 37, 9};
const v2 = @Vector(4, u32){10, 22, 5, 12};
const v3 = v1 + v2;
try stdout.print("{any}\n", .{v3});
```

This is how SIMD introduces more performance in your program. Instead of using a for loop
to iterate through the elements of `v1` and `v2`, and adding them together, one element at a time,
we enjoy the benefits of SIMD, which performs all 4 additions in parallel, at the same time.
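
To compare the two approaches side by side, the sketch below writes the same addition twice: once as a
scalar for loop, and once as a vector operation. The function names `addScalar` and `addVector` are
hypothetical, chosen just for this illustration. Both functions return the same result; what may differ
is how many instructions the compiler needs to emit for each one.

```zig
const std = @import("std");

/// Scalar version: four separate additions, one per loop iteration.
fn addScalar(a: [4]u32, b: [4]u32) [4]u32 {
    var out: [4]u32 = undefined;
    for (a, b, &out) |x, y, *o| {
        o.* = x + y;
    }
    return out;
}

/// Vector version: a single element-wise addition over all four elements.
fn addVector(a: [4]u32, b: [4]u32) [4]u32 {
    const va: @Vector(4, u32) = a;
    const vb: @Vector(4, u32) = b;
    // The vector result is converted back to an array on return.
    return va + vb;
}

pub fn main() void {
    const a = [4]u32{ 4, 12, 37, 9 };
    const b = [4]u32{ 10, 22, 5, 12 };
    std.debug.print("{any}\n", .{addScalar(a, b)});
    std.debug.print("{any}\n", .{addVector(a, b)});
}
```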

Therefore, the `@Vector` type in Zig is essentially the Zig representation of SIMD vector objects.
But the elements of these vector objects will be operated on in parallel if, and only if, the CPU model that you are compiling for
supports SIMD operations. If this CPU model does not support SIMD, then the `@Vector` type will
likely deliver a performance similar to that of a "for loop solution".


### Transforming arrays into vectors

There are different ways you can transform a normal array into a vector object.
You can either use implicit conversion (which is when you assign the array to
a vector object directly), or, use slices to create a vector object from a normal array.

In the example below, we implicitly convert the array `a1` into a vector object (`v1`)
of length 4. All we had to do was to just explicitly annotate the data type of the vector object,
and then, assign the array object to this vector object.

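Also notice in the example below, that a second vector object (`v2`) is also created
by taking a slice of the array object (`a1`). Because the bounds of this slice are compile-time known,
the slice expression yields a pointer to an array of 2 elements. The `.*` operator then dereferences this
pointer, and the resulting array value is stored into this vector object.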


```{zig}
#| auto_main: true
#| build_type: "run"
const a1 = [4]u32{4, 12, 37, 9};
const v1: @Vector(4, u32) = a1;
const v2: @Vector(2, u32) = a1[1..3].*;
_ = v1; _ = v2;
```


It is worth emphasizing that only arrays and slices whose sizes
are compile-time known can be transformed into vectors. Vectors in general
are structures that work only with compile-time known sizes. Therefore, if
you have an array whose size is runtime known, then, you first need to
copy it into an array with a compile-time known size, before transforming it into a vector.
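
One possible way to follow this advice is to walk through the runtime-sized slice in chunks whose length is
compile-time known. The sketch below sums a slice this way. The names `lanes`, `Vec` and `sumSlice` are
hypothetical, and the chunk length of 4 is just an assumption for this illustration. The expression
`values[i..][0..lanes]` has a compile-time known length, so it can be dereferenced with `.*` and
stored into a vector. The `@reduce()` built-in is then used to add together the elements of the accumulator vector.

```zig
const std = @import("std");

// Hypothetical chunk length, chosen for this illustration.
const lanes = 4;
const Vec = @Vector(lanes, u32);

/// Sums a slice whose length is only known at runtime.
fn sumSlice(values: []const u32) u32 {
    var acc: Vec = @splat(0);
    var i: usize = 0;
    // Full chunks: each one is copied into a vector and added
    // element-wise into the accumulator.
    while (i + lanes <= values.len) : (i += lanes) {
        const chunk: Vec = values[i..][0..lanes].*;
        acc += chunk;
    }
    // Adds the elements of the accumulator into a single scalar.
    var total: u32 = @reduce(.Add, acc);
    // Leftover elements that do not fill a whole vector.
    while (i < values.len) : (i += 1) {
        total += values[i];
    }
    return total;
}

pub fn main() void {
    const values = [_]u32{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    std.debug.print("{d}\n", .{sumSlice(&values)});
}
```

The last loop is needed because the length of the slice is not necessarily a multiple of the chunk length.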



### The `@splat()` function

You can use the `@splat()` built-in function to create a vector object that is filled
with the same value across all of its elements. This function was created to offer a quick
and easy way to directly convert a scalar value (a.k.a. a single value, like a single character, or a single integer, etc.)
into a vector object.

Thus, we can use `@splat()` to convert a single value, like the integer `16` into a vector object
of length 1. But we can also use this function to convert the same integer `16` into a
vector object of length 10, that is filled with 10 `16` values. The example below demonstrates
this idea.

```{zig}
#| auto_main: true
#| build_type: "run"
const v1: @Vector(10, u32) = @splat(16);
try stdout.print("{any}\n", .{v1});
```
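
A common use of `@splat()` is to combine a vector with a scalar value, since the operators listed earlier
work between two vectors of the same length. The sketch below multiplies every element of a vector by the
same factor. The function name `scale` is hypothetical, created just for this illustration.

```zig
const std = @import("std");

/// Multiplies every element of `v` by the same scalar `factor`.
fn scale(v: @Vector(4, f32), factor: f32) @Vector(4, f32) {
    // Broadcasts the scalar into a vector with 4 copies of `factor`.
    const factors: @Vector(4, f32) = @splat(factor);
    return v * factors;
}

pub fn main() void {
    const v = @Vector(4, f32){ 1.0, 2.0, 3.0, 4.0 };
    std.debug.print("{any}\n", .{scale(v, 0.5)});
}
```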



### Careful with vectors that are too big

As I described at @sec-what-vectors, each vector object is usually a small block of 128, 256 or 512 bits.
This means that a vector object is usually small in size, and when you try to go in the opposite direction,
by creating a vector object in Zig that is very big in size (i.e. sizes that are close to $2^{20}$),
you usually end up with crashes and loud errors from the compiler.

For example, if you try to compile the program below, you will likely face segmentation faults, or, LLVM errors during
the build process. Just be careful to not create vector objects that are too big in size.

```{zig}
#| eval: false
const v1: @Vector(1000000, u32) = @splat(16);
_ = v1;
```

```
Segmentation fault (core dumped)
```
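
If you do have a large amount of data to process, one alternative is to keep the data in a normal array or slice,
and to use a small vector object to process a few elements at a time. The sketch below is one possible way of doing this,
not the only one. The names `lanes`, `Vec` and `addToAll` are hypothetical, the chunk length of 8 is an assumption
made for this illustration, and the array of 20 elements is just a stand-in for a much bigger buffer.

```zig
const std = @import("std");

// Hypothetical chunk length, chosen for this illustration.
const lanes = 8;
const Vec = @Vector(lanes, u32);

/// Adds `delta` to every element of `data`, eight elements at a time.
fn addToAll(data: []u32, delta: u32) void {
    const deltas: Vec = @splat(delta);
    var i: usize = 0;
    while (i + lanes <= data.len) : (i += lanes) {
        // Loads 8 elements into a small vector, adds, and writes them back.
        const chunk: Vec = data[i..][0..lanes].*;
        data[i..][0..lanes].* = chunk + deltas;
    }
    // Leftover elements that do not fill a whole vector.
    while (i < data.len) : (i += 1) {
        data[i] += delta;
    }
}

pub fn main() void {
    // A small stand-in for a large buffer.
    var data = [_]u32{16} ** 20;
    addToAll(&data, 1);
    std.debug.print("{any}\n", .{data});
}
```

This way, the vector objects themselves always stay small, no matter how big the buffer is.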




32 changes: 19 additions & 13 deletions ZigExamples/image_filter/src/test.zig
@@ -36,19 +36,27 @@ fn read_data_to_buffer(ctx: *png.spng_ctx, buffer: []u8) !void {

 fn apply_image_filter(buffer: []u8) !void {
     const len = buffer.len;
-    const red_factor: f16 = 0.2126;
-    const green_factor: f16 = 0.7152;
-    const blue_factor: f16 = 0.0722;
-    var index: u64 = 0;
+    var rv: @Vector(1080000, f16) = @splat(0.0);
+    var gv: @Vector(1080000, f16) = @splat(0.0);
+    var bv: @Vector(1080000, f16) = @splat(0.0);
+
+    var index: usize = 0;
+    var vec_index: usize = 0;
     while (index < (len - 4)) : (index += 4) {
-        const rf: f16 = @floatFromInt(buffer[index]);
-        const gf: f16 = @floatFromInt(buffer[index + 1]);
-        const bf: f16 = @floatFromInt(buffer[index + 2]);
-        const y_linear: f16 = ((rf * red_factor) + (gf * green_factor) + (bf * blue_factor));
-        buffer[index] = @intFromFloat(y_linear);
-        buffer[index + 1] = @intFromFloat(y_linear);
-        buffer[index + 2] = @intFromFloat(y_linear);
+        rv[vec_index] = @floatFromInt(buffer[index]);
+        gv[vec_index + 1] = @floatFromInt(buffer[index + 1]);
+        bv[vec_index + 2] = @floatFromInt(buffer[index + 2]);
+        vec_index += 3;
     }
+
+    const rfactor: @Vector(1080000, f16) = @splat(0.2126);
+    const gfactor: @Vector(1080000, f16) = @splat(0.7152);
+    const bfactor: @Vector(1080000, f16) = @splat(0.0722);
+    rv = rv * rfactor;
+    gv = gv * gfactor;
+    bv = bv * bfactor;
+    const result = rv + gv + bv;
+    try stdout.print("{any}\n", .{result});
 }

fn save_png(image_header: *png.spng_ihdr, buffer: []u8) !void {
@@ -82,12 +90,10 @@ pub fn main() !void {

     var gpa = std.heap.GeneralPurposeAllocator(.{}){};
     const allocator = gpa.allocator();
-    var image_header = try get_image_header(ctx);
     const output_size = try calc_output_size(ctx);
     var buffer = try allocator.alloc(u8, output_size);
     @memset(buffer[0..], 0);

     try read_data_to_buffer(ctx, buffer[0..]);
     try apply_image_filter(buffer[0..]);
-    try save_png(&image_header, buffer[0..]);
 }
91 changes: 91 additions & 0 deletions ZigExamples/vectors/build.zig
@@ -0,0 +1,91 @@
const std = @import("std");

// Although this function looks imperative, note that its job is to
// declaratively construct a build graph that will be executed by an external
// runner.
pub fn build(b: *std.Build) void {
    // Standard target options allows the person running `zig build` to choose
    // what target to build for. Here we do not override the defaults, which
    // means any target is allowed, and the default is native. Other options
    // for restricting supported target set are available.
    const target = b.standardTargetOptions(.{});

    // Standard optimization options allow the person running `zig build` to select
    // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall. Here we do not
    // set a preferred release mode, allowing the user to decide how to optimize.
    const optimize = b.standardOptimizeOption(.{});

    const lib = b.addStaticLibrary(.{
        .name = "vectors",
        // In this case the main source file is merely a path, however, in more
        // complicated build scripts, this could be a generated file.
        .root_source_file = b.path("src/root.zig"),
        .target = target,
        .optimize = optimize,
    });

    // This declares intent for the library to be installed into the standard
    // location when the user invokes the "install" step (the default step when
    // running `zig build`).
    b.installArtifact(lib);

    const exe = b.addExecutable(.{
        .name = "vectors",
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    // This declares intent for the executable to be installed into the
    // standard location when the user invokes the "install" step (the default
    // step when running `zig build`).
    b.installArtifact(exe);

    // This *creates* a Run step in the build graph, to be executed when another
    // step is evaluated that depends on it. The next line below will establish
    // such a dependency.
    const run_cmd = b.addRunArtifact(exe);

    // By making the run step depend on the install step, it will be run from the
    // installation directory rather than directly from within the cache directory.
    // This is not necessary, however, if the application depends on other installed
    // files, this ensures they will be present and in the expected location.
    run_cmd.step.dependOn(b.getInstallStep());

    // This allows the user to pass arguments to the application in the build
    // command itself, like this: `zig build run -- arg1 arg2 etc`
    if (b.args) |args| {
        run_cmd.addArgs(args);
    }

    // This creates a build step. It will be visible in the `zig build --help` menu,
    // and can be selected like this: `zig build run`
    // This will evaluate the `run` step rather than the default, which is "install".
    const run_step = b.step("run", "Run the app");
    run_step.dependOn(&run_cmd.step);

    // Creates a step for unit testing. This only builds the test executable
    // but does not run it.
    const lib_unit_tests = b.addTest(.{
        .root_source_file = b.path("src/root.zig"),
        .target = target,
        .optimize = optimize,
    });

    const run_lib_unit_tests = b.addRunArtifact(lib_unit_tests);

    const exe_unit_tests = b.addTest(.{
        .root_source_file = b.path("src/main.zig"),
        .target = target,
        .optimize = optimize,
    });

    const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests);

    // Similar to creating the run step earlier, this exposes a `test` step to
    // the `zig build --help` menu, providing a way for the user to request
    // running the unit tests.
    const test_step = b.step("test", "Run unit tests");
    test_step.dependOn(&run_lib_unit_tests.step);
    test_step.dependOn(&run_exe_unit_tests.step);
}