Merge pull request #48 from pedropark99/vectors

Add chapter to talk about SIMD and Vectors
pedropark99 · Sep 28, 2024 · d764c72
2 parents e1ed191 + a6a9b25
commit d764c72
Showing 54 changed files with 8,095 additions and 32 deletions.
diff --git a/Chapters/01-zig-weird.qmd b/Chapters/01-zig-weird.qmd
@@ -39,7 +39,7 @@ Zig as a modern and better version of C.
 In the author's personal interpretation, Zig is tightly connected with "less is more".
 Instead of trying to become a modern language by adding more and more features,
 many of the core improvements that Zig brings to the
-table are actually about removing annoying and evil behaviours/features from C and C++.
+table are actually about removing annoying behaviours/features from C and C++.
 In other words, Zig tries to be better by simplifying the language, and by having more consistent and robust behaviour.
 As a result, analyzing, writing and debugging applications become much easier and simpler in Zig, than it is in C or C++.
 
@@ -1421,7 +1421,7 @@ details about it. Just as a quick recap:
 
 But, for now, this amount of knowledge is enough for us to continue with this book.
 Later, over the next chapters we will still talk more about other parts of
-Zig's syntax that are also equally important as the other parts. Such as:
+Zig's syntax that are also equally important. Such as:
 
 
 - How Object-Oriented programming can be done in Zig through *struct declarations* at @sec-structs-and-oop.
@@ -1430,7 +1430,7 @@ Zig's syntax that are also equally important as the other parts. Such as:
 - Pointers and Optionals at @sec-pointer;
 - Error handling with `try` and `catch` at @sec-error-handling;
 - Unit tests at @sec-unittests;
-- Vectors;
+- Vectors at @sec-vectors-simd;
 - Build System at @sec-build-system;
 
 

diff --git a/Chapters/15-vectors.qmd b/Chapters/15-vectors.qmd
@@ -0,0 +1,176 @@
+---
+engine: knitr
+knitr: true
+syntax-definition: "../Assets/zig.xml"
+---
+
+
+```{r}
+#| include: false
+source("../zig_engine.R")
+knitr::opts_chunk$set(
+    auto_main = FALSE,
+    build_type = "lib"
+)
+```
+
+
+
+
+# Introducing Vectors and SIMD {#sec-vectors-simd}
+
+In this chapter, I'm going to discuss vectors in Zig, which are
+related to SIMD operations (i.e. they have no relationship with the `std::vector` class
+from C++).
+
+## What is SIMD?
+
+SIMD (*Single Instruction/Multiple Data*) is a group of operations that are widely used
+in video/audio editing programs, and also in graphics applications. SIMD is not a new technology,
+but the massive use of SIMD on normal desktop computers is somewhat recent. In the old days, SIMD
+was only used on supercomputer models.
+
+Most modern CPU models (from AMD, Intel, etc.), whether in a desktop or in a
+notebook, support SIMD operations. However, if you have a very old CPU model installed in your
+computer, then it is possible that you have no support for SIMD operations.
+
+Why have people started using SIMD in their software? The answer is performance.
+But what precisely does SIMD do to achieve better performance? Well, in essence, SIMD operations are a different
+strategy to get parallel computing in your program, and therefore, to make calculations faster.
+
+The basic idea behind SIMD is to have a single instruction that operates over multiple data
+at the same time. When you perform normal scalar operations, like, for example, four add instructions,
+each addition is performed separately, one after another. But with SIMD, these four add instructions
+are translated into a single instruction, and, as a consequence, the four additions are performed
+in parallel, at the same time.
+
+Currently, the following groups of operators can be used on vector objects. All of
+these operators are applied element-wise and in parallel by default.
+
+- Arithmetic (`+`, `-`, `/`, `*`, `@divFloor()`, `@sqrt()`,  `@ceil()`, `@log()`, etc.).
+- Bitwise operators (`>>`, `<<`, `&`, `|`, `~`, etc.).
+- Comparison operators (`<`, `>`, `==`, etc.).
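+
+As a minimal sketch of the operator groups listed above, the program below applies one operator
+from each group to two small vectors. The variable names are my own, chosen only for illustration.
+Note that a comparison between two vectors yields a vector of `bool` values, one per element.
+
+```{zig}
+#| auto_main: false
+#| build_type: "run"
+const std = @import("std");
+
+pub fn main() void {
+    const a = @Vector(4, u32){ 8, 16, 32, 64 };
+    const b = @Vector(4, u32){ 2, 16, 4, 128 };
+
+    // Arithmetic: element-wise division.
+    const quotient = a / b;
+    // Bitwise: element-wise AND.
+    const masked = a & b;
+    // Comparison: the result is a @Vector(4, bool).
+    const is_less = a < b;
+
+    std.debug.print("{any}\n", .{quotient});
+    std.debug.print("{any}\n", .{masked});
+    std.debug.print("{any}\n", .{is_less});
+}
+```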
+
+
+## Vectors {#sec-what-vectors}
+
+A SIMD operation is usually performed through a "SIMD intrinsic", which is just a fancy
+name for a function that performs a SIMD operation. These SIMD intrinsics (or "SIMD functions")
+always operate over a special type of object, which is called a "vector". So,
+in order to use SIMD, you have to create a "vector object".
+
+A vector object is usually a fixed-sized block of 128 bits (16 bytes).
+As a consequence, most vectors that you find in the wild are essentially arrays that contain 2 values of 8 bytes each,
+or 4 values of 4 bytes each, or 8 values of 2 bytes each, etc.
+However, different CPU models may have different extensions (or "implementations") of SIMD,
+which may offer more types of vector objects that are bigger in size (256 bits or 512 bits)
+to accommodate more data in a single vector object.
+
+You can create a new vector object in Zig by using the `@Vector()` built-in function. Inside this function,
+you specify the vector length (number of elements in the vector), and the data type of the elements
+of the vector. Only primitive data types are supported in these vector objects.
+In the example below, I'm creating two vector objects (`v1` and `v2`) of 4 elements of type `u32` each.
+
+Also notice in the example below that a third vector object (`v3`) is created from the
+sum of the previous two vector objects (`v1` plus `v2`). This shows that
+math operations over vector objects take place element-wise by default, because
+the same operation (in this case, addition) is transformed into a single instruction
+that is replicated in parallel, across all elements of the vectors.
+
+
+```{zig}
+#| auto_main: true
+#| build_type: "run"
+const v1 = @Vector(4, u32){4, 12, 37, 9};
+const v2 = @Vector(4, u32){10, 22, 5, 12};
+const v3 = v1 + v2;
+try stdout.print("{any}\n", .{v3});
+```
+
+This is how SIMD introduces more performance in your program. Instead of using a for loop
+to iterate through the elements of `v1` and `v2`, and adding them together, one element at a time,
+we enjoy the benefits of SIMD, which performs all 4 additions in parallel, at the same time.
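+
+To make this comparison concrete, here is a small sketch that computes the same sums twice:
+once with a scalar for loop over arrays, and once with a single vector addition. The function
+names `add_scalar` and `add_vector` are hypothetical, introduced only for this illustration.
+Both versions are expected to produce the same values.
+
+```{zig}
+#| auto_main: false
+#| build_type: "run"
+const std = @import("std");
+
+// Scalar version: one addition per loop iteration.
+fn add_scalar(a: [4]u32, b: [4]u32) [4]u32 {
+    var result: [4]u32 = undefined;
+    for (a, b, 0..) |x, y, i| {
+        result[i] = x + y;
+    }
+    return result;
+}
+
+// Vector version: a single element-wise addition.
+fn add_vector(a: @Vector(4, u32), b: @Vector(4, u32)) @Vector(4, u32) {
+    return a + b;
+}
+
+pub fn main() void {
+    const a = [4]u32{ 4, 12, 37, 9 };
+    const b = [4]u32{ 10, 22, 5, 12 };
+    const r1 = add_scalar(a, b);
+    // Arrays coerce to vectors of the same length and element type,
+    // and the resulting vector coerces back to an array.
+    const r2: [4]u32 = add_vector(a, b);
+    std.debug.print("{any}\n{any}\n", .{ r1, r2 });
+}
+```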
+
+Therefore, the `@Vector` structure in Zig is essentially the Zig representation of SIMD vector objects.
+But the elements of these vector objects will be operated on in parallel if, and only if, your current CPU model
+supports SIMD operations. If your CPU model does not support SIMD, then the `@Vector` structure will
+likely deliver performance similar to that of a "for loop solution".
+
+
+### Transforming arrays into vectors
+
+There are different ways you can transform a normal array into a vector object.
+You can either use implicit conversion (which is when you assign the array to
+a vector object directly), or, use slices to create a vector object from a normal array.
+
+In the example below, we implicitly convert the array `a1` into a vector object (`v1`)
+of length 4. All we had to do was to just explicitly annotate the data type of the vector object,
+and then, assign the array object to this vector object.
+
+Also notice in the example below that a second vector object (`v2`) is created
+by taking a slice of the array object (`a1`), and then dereferencing this
+slice (`.*`), so that the values it points to are copied into the vector object.
+
+
+```{zig}
+#| auto_main: true
+#| build_type: "run"
+const a1 = [4]u32{4, 12, 37, 9};
+const v1: @Vector(4, u32) = a1;
+const v2: @Vector(2, u32) = a1[1..3].*;
+_ = v1; _ = v2;
+```
+
+
+It is worth emphasizing that only arrays and slices whose sizes
+are compile-time known can be transformed into vectors. Vectors in general
+are structures that work only with compile-time known sizes. Therefore, if
+you have an array whose size is runtime known, then, you first need to
+copy it into an array with a compile-time known size, before transforming it into a vector.
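+
+One way to follow this advice is to walk the runtime-sized slice in fixed-size chunks, because
+slicing with compile-time known bounds gives a pointer to an array that can be dereferenced.
+The sketch below is only an illustration under my own assumptions: the helper `sum_slice` and
+the constant `chunk_len` are hypothetical names, and the `@reduce()` built-in function, which
+is not covered in this chapter, is used to add up the elements of the accumulator vector.
+
+```{zig}
+#| auto_main: false
+#| build_type: "run"
+const std = @import("std");
+
+const chunk_len = 4;
+
+// Sums a slice whose length is only known at runtime.
+fn sum_slice(values: []const u32) u32 {
+    var acc: @Vector(chunk_len, u32) = @splat(0);
+    var i: usize = 0;
+    // Full chunks: `[0..chunk_len]` has a compile-time known length,
+    // so `.*` yields a `[chunk_len]u32` that coerces to a vector.
+    while (i + chunk_len <= values.len) : (i += chunk_len) {
+        const chunk: @Vector(chunk_len, u32) = values[i..][0..chunk_len].*;
+        acc += chunk;
+    }
+    var total: u32 = @reduce(.Add, acc);
+    // Leftover elements that do not fill a whole chunk.
+    while (i < values.len) : (i += 1) {
+        total += values[i];
+    }
+    return total;
+}
+
+pub fn main() void {
+    const data = [_]u32{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
+    std.debug.print("{d}\n", .{sum_slice(&data)});
+}
+```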
+
+
+
+### The `@splat()` function
+
+You can use the `@splat()` built-in function to create a vector object that is filled
+with the same value across all of its elements. This function was created to offer a quick
+and easy way to directly convert a scalar value (a.k.a. a single value, like a single character, or a single integer, etc.)
+into a vector object.
+
+Thus, we can use `@splat()` to convert a single value, like the integer `16` into a vector object
+of length 1. But we can also use this function to convert the same integer `16` into a
+vector object of length 10, that is filled with 10 `16` values. The example below demonstrates
+this idea.
+
+```{zig}
+#| auto_main: true
+#| build_type: "run"
+const v1: @Vector(10, u32) = @splat(16);
+try stdout.print("{any}\n", .{v1});
+```
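+
+A common use for `@splat()` is combining a vector with a single scalar, since, as far as I know,
+the arithmetic operators expect both operands to be vectors of the same length. The sketch below
+assumes a hypothetical helper named `scale`, which broadcasts a scalar factor into a vector
+before multiplying.
+
+```{zig}
+#| auto_main: false
+#| build_type: "run"
+const std = @import("std");
+
+// Multiplies every element of `v` by `factor`.
+fn scale(v: @Vector(4, f32), factor: f32) @Vector(4, f32) {
+    // Broadcast the scalar into a vector; the result type of `@splat`
+    // is inferred from the type annotation.
+    const factors: @Vector(4, f32) = @splat(factor);
+    return v * factors;
+}
+
+pub fn main() void {
+    const v = @Vector(4, f32){ 1.0, 2.0, 3.0, 4.0 };
+    std.debug.print("{any}\n", .{scale(v, 0.5)});
+}
+```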
+
+
+
+### Careful with vectors that are too big
+
+As I described at @sec-what-vectors, each vector object is usually a small block of 128, 256 or 512 bits.
+This means that a vector object is usually small in size, and when you try to go in the opposite direction,
+by creating a vector object in Zig that is very big in size (i.e. sizes that are close to $2^{20}$),
+you usually end up with crashes and loud errors from the compiler.
+
+For example, if you try to compile the program below, you will likely face segmentation faults, or, LLVM errors during
+the build process. Just be careful to not create vector objects that are too big in size.
+
+```{zig}
+#| eval: false
+const v1: @Vector(1000000, u32) = @splat(16);
+_ = v1;
+```
+
+```
+Segmentation fault (core dumped)
+```
+
+
+
+
diff --git a/ZigExamples/image_filter/src/test.zig b/ZigExamples/image_filter/src/test.zig
@@ -36,19 +36,27 @@ fn read_data_to_buffer(ctx: *png.spng_ctx, buffer: []u8) !void {
 
 fn apply_image_filter(buffer: []u8) !void {
     const len = buffer.len;
-    const red_factor: f16 = 0.2126;
-    const green_factor: f16 = 0.7152;
-    const blue_factor: f16 = 0.0722;
-    var index: u64 = 0;
+    var rv: @Vector(1080000, f16) = @splat(0.0);
+    var gv: @Vector(1080000, f16) = @splat(0.0);
+    var bv: @Vector(1080000, f16) = @splat(0.0);
+
+    var index: usize = 0;
+    var vec_index: usize = 0;
     while (index < (len - 4)) : (index += 4) {
-        const rf: f16 = @floatFromInt(buffer[index]);
-        const gf: f16 = @floatFromInt(buffer[index + 1]);
-        const bf: f16 = @floatFromInt(buffer[index + 2]);
-        const y_linear: f16 = ((rf * red_factor) + (gf * green_factor) + (bf * blue_factor));
-        buffer[index] = @intFromFloat(y_linear);
-        buffer[index + 1] = @intFromFloat(y_linear);
-        buffer[index + 2] = @intFromFloat(y_linear);
+        // One vector slot per pixel: the R, G and B values of a pixel share the same index.
+        rv[vec_index] = @floatFromInt(buffer[index]);
+        gv[vec_index] = @floatFromInt(buffer[index + 1]);
+        bv[vec_index] = @floatFromInt(buffer[index + 2]);
+        vec_index += 1;
     }
+
+    const rfactor: @Vector(1080000, f16) = @splat(0.2126);
+    const gfactor: @Vector(1080000, f16) = @splat(0.7152);
+    const bfactor: @Vector(1080000, f16) = @splat(0.0722);
+    rv = rv * rfactor;
+    gv = gv * gfactor;
+    bv = bv * bfactor;
+    const result = rv + gv + bv;
+    try stdout.print("{any}\n", .{result});
 }
 
 fn save_png(image_header: *png.spng_ihdr, buffer: []u8) !void {
@@ -82,12 +90,10 @@ pub fn main() !void {
 
     var gpa = std.heap.GeneralPurposeAllocator(.{}){};
     const allocator = gpa.allocator();
-    var image_header = try get_image_header(ctx);
     const output_size = try calc_output_size(ctx);
     var buffer = try allocator.alloc(u8, output_size);
     @memset(buffer[0..], 0);
 
     try read_data_to_buffer(ctx, buffer[0..]);
     try apply_image_filter(buffer[0..]);
-    try save_png(&image_header, buffer[0..]);
 }
diff --git a/ZigExamples/vectors/build.zig b/ZigExamples/vectors/build.zig
@@ -0,0 +1,91 @@
+const std = @import("std");
+
+// Although this function looks imperative, note that its job is to
+// declaratively construct a build graph that will be executed by an external
+// runner.
+pub fn build(b: *std.Build) void {
+    // Standard target options allows the person running `zig build` to choose
+    // what target to build for. Here we do not override the defaults, which
+    // means any target is allowed, and the default is native. Other options
+    // for restricting supported target set are available.
+    const target = b.standardTargetOptions(.{});
+
+    // Standard optimization options allow the person running `zig build` to select
+    // between Debug, ReleaseSafe, ReleaseFast, and ReleaseSmall. Here we do not
+    // set a preferred release mode, allowing the user to decide how to optimize.
+    const optimize = b.standardOptimizeOption(.{});
+
+    const lib = b.addStaticLibrary(.{
+        .name = "vectors",
+        // In this case the main source file is merely a path, however, in more
+        // complicated build scripts, this could be a generated file.
+        .root_source_file = b.path("src/root.zig"),
+        .target = target,
+        .optimize = optimize,
+    });
+
+    // This declares intent for the library to be installed into the standard
+    // location when the user invokes the "install" step (the default step when
+    // running `zig build`).
+    b.installArtifact(lib);
+
+    const exe = b.addExecutable(.{
+        .name = "vectors",
+        .root_source_file = b.path("src/main.zig"),
+        .target = target,
+        .optimize = optimize,
+    });
+
+    // This declares intent for the executable to be installed into the
+    // standard location when the user invokes the "install" step (the default
+    // step when running `zig build`).
+    b.installArtifact(exe);
+
+    // This *creates* a Run step in the build graph, to be executed when another
+    // step is evaluated that depends on it. The next line below will establish
+    // such a dependency.
+    const run_cmd = b.addRunArtifact(exe);
+
+    // By making the run step depend on the install step, it will be run from the
+    // installation directory rather than directly from within the cache directory.
+    // This is not necessary, however, if the application depends on other installed
+    // files, this ensures they will be present and in the expected location.
+    run_cmd.step.dependOn(b.getInstallStep());
+
+    // This allows the user to pass arguments to the application in the build
+    // command itself, like this: `zig build run -- arg1 arg2 etc`
+    if (b.args) |args| {
+        run_cmd.addArgs(args);
+    }
+
+    // This creates a build step. It will be visible in the `zig build --help` menu,
+    // and can be selected like this: `zig build run`
+    // This will evaluate the `run` step rather than the default, which is "install".
+    const run_step = b.step("run", "Run the app");
+    run_step.dependOn(&run_cmd.step);
+
+    // Creates a step for unit testing. This only builds the test executable
+    // but does not run it.
+    const lib_unit_tests = b.addTest(.{
+        .root_source_file = b.path("src/root.zig"),
+        .target = target,
+        .optimize = optimize,
+    });
+
+    const run_lib_unit_tests = b.addRunArtifact(lib_unit_tests);
+
+    const exe_unit_tests = b.addTest(.{
+        .root_source_file = b.path("src/main.zig"),
+        .target = target,
+        .optimize = optimize,
+    });
+
+    const run_exe_unit_tests = b.addRunArtifact(exe_unit_tests);
+
+    // Similar to creating the run step earlier, this exposes a `test` step to
+    // the `zig build --help` menu, providing a way for the user to request
+    // running the unit tests.
+    const test_step = b.step("test", "Run unit tests");
+    test_step.dependOn(&run_lib_unit_tests.step);
+    test_step.dependOn(&run_exe_unit_tests.step);
+}