This tutorial assumes you have wake and git installed and available on your PATH.
Code sections are intended either to be copy-pasted into a terminal or to be
added to a tutorial.wake file. If you're reading the raw file, these are
indicated above the code block; if the Markdown has already been rendered, it
shouldn't be too difficult to guess.
- Invoking wake
- Functions
- Data types and pattern matching
- Dealing with failure
- Executing shell jobs
- Build rules with file inputs
- Building targets from multiple files
- Supplemental file visibility
- Publish/Subscribe
- Downloading and parsing files
- Ignore wake source files
Unlike some other build systems, wake stores data to the filesystem between runs in order to cache and optimize future builds; it therefore needs a (small) bit of setup before working.
$ mkdir ~/tutorial
$ cd ~/tutorial
$ wake --init .
$ wake -x '5 + 6'
11
This sequence of commands creates a new workspace managed by wake.
The --init option is used to create an initial wake.db to record the state
of the build in this workspace.
Whenever you run wake, it searches for a wake.db in parent directories.
The first wake.db found defines what wake considers to be the workspace.
You can thus safely run wake in any sub-directory of tutorial and wake
will be aware of all the relevant dependencies and rules.
The output of wake run on -x 'expression' is the result of evaluating that expression.
In this case, 5 + 6 results in value 11.
Wake will report more information when run in verbose mode: wake -v.
$ wake -vx '5 + 6'
5 + 6: Integer = 11
The verbose output is of the form expression: type = value. The above
will give 5 + 6: Integer = 11. As before, 5 + 6 results in 11,
which is an Integer.
Next, create a file with a .wake extension (the rest of the tutorial assumes
you call it tutorial.wake) containing two lines:
export def hello =
    "Hello World"
The syntax def x = y introduces a new variable x whose value is equal to
y; in this case, y is just a String. The export keyword is involved
with wake's package visibility system, and doesn't do much here (beyond
preempting what would be a warning), but can come into play with more complex
project setups.
With that file in the current directory, you can access anything named in it.
$ wake -x 'hello'
"Hello World"
Wake processes all wake files (*.wake) which are in the workspace,
so our new file tutorial.wake is available to us from the command-line; since
it's the only wake file here, some additional handling happens to make it even
simpler to access, but those details are better described later.
For now, just know that any wake code should be added to this same file.
While wake is first and foremost a build system, it's been written to gracefully handle very complex projects. In service of that goal, wake provides a full programming language for use in writing equally complex build rules. As may be expected, functions play an integral role in that language.
export def increment i =
    i + 1
$ wake -x 'increment 3'
4
Wake uses a syntax more closely related to ML and other functional languages.
Most notably, functions in wake are applied to their arguments with simple
spaces rather than parentheses, so f x y is read as "function f run
on x and y"; in C that would be f(x, y). Looking at the example,
def increment i = ... introduces a function increment which takes a single
argument i, while increment 3 calls that function with i equal to 3.
$ wake -vx increment
increment: (i: Integer) => Integer = <tutorial.wake:5:[5-9]>
Notice that we didn't have to specify anything for wake to know what types
increment accepts and returns. Wake is strongly-typed and will definitely
complain if you try to pass a function the wrong type of object, but it is
pretty good about guessing what things are (technically, using a Hindley-Milner
type system). It's often a good practice to explicitly list types anyway -- to
both help when you're reading the code and to catch any mistakes that might slip
in -- but for brevity most examples in this tutorial will skip the annotations.
export def decrement (i: Integer): Integer =
    i - 1
Hopefully unsurprisingly, wake can handle more complex functions than simple addition and subtraction.
export def countHex (until: Integer): List String =
    def numbers =
        seq until
    map strHex numbers
$ wake -x 'countHex 20'
"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "10", "11", "12", "13", Nil
Note that Nil represents an empty list in wake, and any values are just built
on top of it; x, y, Nil is a List comprising the two objects x and y.
In wake, the last line is always the value of the definition. As countHex is
a function, you could say the last line is the return value. Only one
standalone expression -- that last one -- is allowed in any block; wake does not
allow the imperative (C, Java, etc.) style of listing multiple statements or
expressions in a single function, to be executed one after another. Even so, by
gathering the intermediate calculations into a series of def blocks, wake can
be just as powerful.
Beyond that, seq is a function which takes an Integer and generates a
List Integer containing every number from zero to one below the argument (in
other words, a List starting at zero whose length is determined by the
argument). Wake uses linked lists rather than arrays for most things, which
means that adding items to or taking items from the start of a List is very
fast and we don't have to worry about running out of indices and having to copy
memory, but also means that doing anything to the end of a List can be
expensive.
Finally, map takes another function and applies it to every element in a list;
in this case, strHex will render each number into a (hexadecimal) String.
Functions are just another type of object in wake and can be passed around very
easily, so many tasks which might typically be written as a loop are instead
written using functions like map.
Generally, one should define functions with a name. This makes code more readable for other people. However, sometimes the function is really just an after-thought. For these cases wake makes it possible to define functions inline.
$ wake -vx '\x x^2'
\x x^2: (x: Integer) => Integer = <<command-line>:1:[12-14]>
$ wake -x 'map (\x x^2) (seq 10)'
0, 1, 4, 9, 16, 25, 36, 49, 64, 81, Nil
The backslash syntax is an easy-to-type stand-in for the lambda symbol, λ. While we've not talked about it, the wake language implements a "typed lambda calculus". This syntax for inline functions is wake's homage to its roots.
Multiple arguments can be used by simply prefixing each of them with \, and you
can pattern-match on any of them; the following two functions are equivalent:
export def lambda x (Pair y z) =
    x + y + z
export def expression =
    lambda 1 (Pair 2 3)
$ wake -x '(\x \(Pair y z) x + y + z) 1 (Pair 2 3)'
6
To make inline functions even easier to define, wake also supports a syntax
where one specifies the holes _ in an expression and a function is created
which fills the holes from left to right.
$ wake -vx '(_ + 4)'
(_ + 4): Integer => Integer = <<command-line>:1:[10-14]>
$ wake -x 'map (_ + 4) (seq 8)'
4, 5, 6, 7, 8, 9, 10, 11, Nil
$ wake -x 'seq 1000 | filter (_ % 55 == 0) | map str | catWith " "'
"0 55 110 165 220 275 330 385 440 495 550 605 660 715 770 825 880 935 990"
This hole-based syntax is not as powerful as lambda expressions, because
each argument can only be used once. Furthermore, the functions are created
at block boundaries, which include ()s, which can limit their usefulness.
Nevertheless, this syntax can be convenient.
The last example also demonstrates the syntax x | f 0 | g. This should be read
like the pipe operator in a shell script, feeding the value on the left into the
final argument of the function on the right. Ultimately, it will be evaluated
as g (f 0 x), but when chaining together multiple transformations to data, the
pipe syntax becomes more readable.
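As a small illustration of how the pipe reads, here is a sketch built only from functions already used above (seq, filter, map, strHex and catWith). The names evenHex and evenHexNested are made up for this example; both definitions should evaluate to the same String.

```wake
# Pipe form: each step feeds its result into the final argument of the next.
export def evenHex =
    seq 10
    | filter (_ % 2 == 0)
    | map strHex
    | catWith " "

# The same computation written with nested function application.
export def evenHexNested =
    catWith " " (map strHex (filter (_ % 2 == 0) (seq 10)))
```

With both added to tutorial.wake, wake -x 'evenHex' and wake -x 'evenHexNested' should each print "0 2 4 6 8".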
So far, we've gotten a lot done with primitive types (Integer, String,
...) and Lists. However, wake does allow you to define your own data types.
These can then be analyzed using pattern matching.
export data Animal =
    Cat (name: String)
    Dog (age: Integer)
The data keyword introduces a new type, Animal.
Types are always capitalized, and use a different namespace from variables.
As defined, Animal can either be a Cat or a Dog,
where Cats have names and Dogs have ages.
If we want the new type available to other files, we put an export in
front of the data, just like with values and functions.
$ wake -vx 'Cat'
Cat: (name: String) => Animal = <tutorial.wake:16:[5-7]>
As wake informs us, Cat is a function that takes a String and returns an
Animal. However, unlike normal functions, Cat is also a type constructor.
Type constructors differ from normal variables in that they are capitalized
and can be used in pattern matches.
export def strAnimal a = match a
    Cat x = "a cat called {x}"
    Dog y = "a {str y}-year-old dog"
$ wake -x 'strAnimal (Dog 12)'
"a 12-year-old dog"
$ wake -x 'strAnimal (Cat "Fluffy")'
"a cat called Fluffy"
The function strAnimal is an example of pattern matching. It consists of the
keyword match followed by the value to pattern match. On the following lines
we indent and provide one or more patterns.
In many ways, this is similar to a switch statement in other languages, but
can be used with any type by giving a constructor along with new value names
corresponding to each of its arguments. In match, these values are followed
by = and then an expression that gives the result when the pattern is matched.
For example, in Cat x = "a cat called {x}", the x is a new variable binding
corresponding to name in the declaration, thus it is a String. After = we
have a String that is the result whenever strAnimal is passed a Cat.
Note that when matching, the patterns must exhaustively catch all possibilities.
For example, it would be illegal to match but only provide a pattern for
Cat. A useful pattern is to provide a "default" pattern with _ to match
anything:
export def matchWithDefault a = match a
    Cat x = "a cat called {x}"
    _ = "not a cat!"
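Pattern matching is not limited to types we declare ourselves. The sketch below matches on a List, relying on the same comma syntax used earlier to build lists; describeList is a hypothetical name chosen for this example, and the use of str means it expects a List of Integers.

```wake
# Hypothetical helper: describe a list by looking only at its first cell.
export def describeList list = match list
    Nil = "an empty list"
    first, _ = "a list starting with {str first}"
```

Here wake -x 'describeList (seq 3)' should give "a list starting with 0", while wake -x 'describeList Nil' should give "an empty list".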
When, as above, we only care about a single constructor of a type, we can use
the require keyword instead. If the object on the right-hand side of the =
was built with a different constructor, evaluation of the surrounding block
stops early and the value of the else clause is returned instead (similar to a
return statement in C-derived languages).
export def requireCat pet =
    require Cat x = pet
    else "not a cat!"
    "a cat called {x}"
Note that this destructor pattern (Dog y, Cat x) not only works for
match and require, but also def when the type only has a single
constructor/destructor.
In cases where the value returned from the surrounding block happens to be the
same type as the value in the require, we can leave out the else clause and
anything with a different constructor (in the case below, every Cat) will be
passed up unchanged.
export def addDogYears years pet =
    require Dog age = pet
    Dog (age + years * 7)
$ wake -x 'addDogYears 3 (Dog 12)'
Dog 33
$ wake -x 'addDogYears 3 (Cat "Fluffy")'
Cat "Fluffy"
The previous example's Animal type provides a relatively simple union tagging
one of two potential values. Types can also collect multiple values together,
or both.
export data Customer =
    Customer (name: String) (id: Identifier)
export data Identifier =
    Authenticated (login: String) (pass: String)
    Basic (email: String)
Note also that we can use Identifier before it has been defined. For the most
part, wake allows a definition (type or variable) to happen anywhere at or above
the level of its usage, though there are still a couple of restrictions.
When destructing a multi-parameter type, the number of new variable names needs
to match the number of parameters to the constructor, but _ or something
starting with it can be used for anything which you don't care about using.
export def greetCustomer customer =
    def Customer name _id = customer
    "Hello, {name}!"
When a type is used for structural purposes rather than data itself, it's often better to not specify what type(s) some or all of its values have. However, this abstraction isn't as simple as just leaving off the type annotation in the constructor; wake still needs to be explicitly told everything about the type. Instead, we give a placeholder "type variable" after the type name:
export data UnsortedTree value =
    Node (left: UnsortedTree value) (right: UnsortedTree value)
    Leaf (this: value)
By not referencing any one type, UnsortedTree is able to collect literally any
values, so long as they all have the same type -- note that we need to specify
what inner type is used for both left and right in their annotations; we
can't simply say that something is an UnsortedTree, we have to say it's an
UnsortedTree String or an UnsortedTree Animal (or even an
UnsortedTree someOtherTypeVariable if we want to include it within another
abstract type). Additionally, the capitalization here is important. When
type-checking, wake assumes that everything starting with an uppercase letter is
an actual type and everything starting with a lowercase letter is a type
variable, so both data UnsortedTree Value and data unsortedTree value would
be invalid.
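Because the tree never inspects its values, functions over it can be written once for every inner type. The following is a minimal sketch; countLeaves is a hypothetical helper, not something provided by wake.

```wake
# Hypothetical helper: count the Leaf values stored anywhere in a tree.
# It never looks at the stored value, so it works for any UnsortedTree.
export def countLeaves tree = match tree
    Node left right = countLeaves left + countLeaves right
    Leaf _ = 1
```

For instance, wake -x 'countLeaves (Node (Leaf "a") (Node (Leaf "b") (Leaf "c")))' should give 3.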
$ wake -vx filter
filter: (f: a => Boolean) => List a => List a = </usr/local/share/wake/lib/core/list.wake:577:[12-35]>
We used abstract types previously when looking at Lists. The same rules for
type variables apply to functions, which allows us to work with structures we
might not know everything about; wake is able to implement filter for every
type of List because every a in the function's type annotation must refer to
the same type. No matter how complex or simple an a might be, we know that if
we pass one to f we will get either a True or a False, and we know how to
deal with those.
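We can write type variables in our own annotations as well. In this sketch, applyTwice is a hypothetical function introduced only for illustration; the single type variable a forces the argument and result of f, and the value x, to share one type.

```wake
# Hypothetical helper: `a` is a type variable, so any one type may stand in for it.
export def applyTwice (f: a => a) (x: a): a =
    f (f x)
```

Using the increment function defined earlier, wake -x 'applyTwice increment 3' should give 5, and the same function also accepts Strings: wake -x 'applyTwice (\s "{s}!") "hi"' should give "hi!!".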
As types grow larger, it can become difficult to keep track of what value is in
which position; it is even more difficult to edit or replace one value among
many. To aid with this, wake provides the tuple keyword as a counterpart to
data.
export tuple Module =
    Name: String
    Imports: List Module
    Sources: List Path
$ wake -vx Module
Module: (Name: String) => (Imports: List Module) => (Sources: List Path) => Module = <tutorial.wake:45:[14-19]>
In this case, Name, Imports, and Sources are the fields of the type
constructed by Module rather than three separate constructors of it.
Notably, three functions are created behind the scenes for each field:
$ wake -vx getModuleName
getModuleName: Module => String = <tutorial.wake:46:[5-8]>
$ wake -vx setModuleName
setModuleName: (Name: String) => Module => Module = <tutorial.wake:46:[5-8]>
$ wake -vx editModuleName
editModuleName: (fnName: String => String) => Module => Module = <tutorial.wake:46:[5-8]>
With these, we can easily work with types collecting dozens of values, at least
as far as the raw code goes. It is important to note that all objects in wake
are immutable -- the so-called editModuleName does not change the existing
Module, but instead creates a new object with all values but Name copied
from the original. There is no way to pass-by-reference or by pointer which
modifies objects in-place.
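The generated accessors chain naturally with the pipe syntax. This is only a sketch: the definitions original and renamed, and the module names used, are invented for the example.

```wake
# Build a Module with no imports and no sources.
export def original =
    Module "core" Nil Nil

# Derive modified copies; `original` itself is never changed.
export def renamed =
    original
    | setModuleName "base"
    | editModuleName (\name "lib-{name}")
```

With these in tutorial.wake, wake -x 'getModuleName renamed' should give "lib-base", while wake -x 'getModuleName original' should still give "core".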
Given wake's easily parametric -- and not inheritable -- types, it's generally
considered bad form to represent failures by "magic" values which
could be confused with a success (e.g. null pointers or empty strings) or
to brush them off to a side channel (e.g. exception throw/catch). Instead,
they're explicitly marked in the type system through data types.
The simplest and most common way we deal with failure in wake is with the
Option type:
export def firstOrZeroWhenEmpty list = match (head list)
    Some first = first
    None = 0
As in the above, we can use pattern matching on Option to deal with the
possibility of Some or None. In this case, firstOrZeroWhenEmpty accepts a
List of Integers and returns the first element of the list (its "head", as
opposed to the longer "tail" behind it) or 0 if the list is empty.
$ wake -vx firstOrZeroWhenEmpty
firstOrZeroWhenEmpty: (l: List Integer) => Integer = <tutorial.wake:42:[16-18]>
A rich selection of functions exists for operating on Option values. For
example, a more concise way of accomplishing the same goal is to use getOrElse
which will return the value in a Some or a default if the Option is None.
export def firstOrZeroWhenEmpty2 l =
    getOrElse 0 (head l)
Options are a great way to deal with things that can fail in one obvious way
(e.g. head fails when the List is empty), but sometimes an operation may have
multiple ways of failing (e.g. when reading a file, it may not exist or
permission may be denied). Wake provides a type called Result for dealing with
such cases. Result is similar to Option: it can be either a Pass p, where p
represents the value of correct operation, or a Fail f, where f is a
description of how an operation failed.
$ wake -x 'stringToRegExp ".?xp"'
Pass (RegExp `.?xp`)
$ wake -x 'stringToRegExp "*bad*"'
Fail (Error "no argument for repetition operator: *" Nil)
Don't pay too much attention to the exact strings; we're currently mainly
interested in the output structure of the functions. Namely, the first
command returned a Pass containing an accurate (if verbose) representation of
what we gave it, while the second returned a Fail with some Error describing
what was wrong with the input. We can see those same inner types reflected in
the type of the function:
$ wake -vx stringToRegExp
stringToRegExp: (str: String) => Result RegExp Error = </usr/local/share/wake/lib/core/regexp.wake:43:[12-39]>
Like any other type, we can match on Results:
export def regexOrOnlyEmpty regexString =
    match (stringToRegExp regexString)
        Pass regex = regex
        Fail _ = `^$`
This will attempt to parse a regex from an unrestricted String, and fall back
on a regex representing an empty string if the parse failed for any reason.
Similarly to getOrElse, there is a function getWhenFail that will return the
value in a Pass or a default in the case of Fail. We can rewrite the above
as simply:
export def regexOrOnlyEmpty2 regexString =
    getWhenFail `^$` (stringToRegExp regexString)
You may have noticed that the Results above contain a type called Error.
Error is a type that contains a String "cause" and a List of Strings holding a
stack trace. You might also have noticed that the cause is
"no argument for repetition operator: *", but the stack is just Nil. Wake does
not actually maintain a call stack like traditional languages, so by default
Errors will contain an empty List for the stack. If you run wake with -d (or
--debug), it will simulate a stack:
$ wake -dx 'stringToRegExp "*bad*"'
Fail (Error "no argument for repetition operator: *" ("stringToRegExp@wake: /usr/local/share/wake/lib/core/regexp.wake:45:[12-16]", "top: src/optimizer/tossa.cpp:185:1", Nil))
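Since an Error is built from a cause and a stack, both can be pulled apart with a nested pattern. The sketch below assumes the two-argument Error constructor shown in the output above; describeRegExp is a hypothetical name.

```wake
# Hypothetical helper: report whether a String parses as a regular expression.
# The nested pattern binds the Error's cause and ignores its stack.
export def describeRegExp regexString = match (stringToRegExp regexString)
    Pass _ = "valid"
    Fail (Error cause _stack) = "invalid: {cause}"
```

Running wake -x 'describeRegExp "*bad*"' should then give "invalid: no argument for repetition operator: *".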
You can construct an Error directly, or use makeError which simply takes a
String cause and will record the stack. There is even another function
failWithError which takes that same String and returns a Result a Error
for the very common case where you'd wrap the Error in a Fail immediately
after creating it.
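To see how failWithError combines with the require keyword from earlier, here is a minimal sketch that turns the Option returned by head into a Result. The function firstHex and its error message are invented for this example.

```wake
# Hypothetical helper: render the first element of a List of Integers in
# hexadecimal, or report an Error if there is no first element.
export def firstHex list =
    require Some first = head list
    else failWithError "the list is empty"
    Pass (strHex first)
```

With this, wake -x 'firstHex (seq 3)' should produce a Pass wrapping "0", and wake -x 'firstHex Nil' should produce a Fail whose Error carries the cause "the list is empty".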
In writing build instructions, we frequently want to execute arbitrary commands in the system shell. Wake provides this ability through the job system, in order to allow their execution to be cached.
from wake import _
from plan_scorer import _
export def infoH _args =
    def cmdline =
        "{which "uname"} -sr"
    def plan =
        makePlan "get kernel" Nil cmdline
    def os =
        runJob plan
    def str =
        getWhenFail "" os.getJobStdout
    def body =
        "#define OS {str}\n#define WAKE {version}\n"
    write "{@here}/info.h" body # created with mode: rw-r--r--
This example creates a header file suitable for inclusion in some C/C++ project. To understand what's happening in this example, let's break down all the new methods being leveraged.
As before, def introduces infoH as a new function with a single argument
_args, which it ignores (as before, the leading underscore silences a compiler
warning). Making it a function like this even though it just discards the
argument serves two purposes. First, wake evaluates objects (without arguments)
as soon as they're encountered but functions only when they're fully called, so
if we were to write it export def infoH = then any time we did anything with
wake it would write an info.h file. The second we'll look at soon.
which is a function which searches wake's path for the named program. On
most systems, which "uname" = "/bin/uname", but this may not always be the
case. Using which buys us a bit of indirection and is usually good form to
use with jobs.
makePlan is the main method wake uses to set up calls to external shell
scripts or tools. It takes three arguments: a readable name for what the job is
doing, a list of paths of legal inputs, and a string for the
command-line. We will go into more detail on that and runJob (which is where
the command is actually invoked) in the next section.
The values returned by runJob can be accessed in many ways. In this case, we use
getJobStdout to get the standard output from the command; since an empty
string is a reasonable default for our use, we use getWhenFail "" to recover
in case the job failed. There is other information we can get from a Job
object: getJobStatus is an Integer equal to the job's exit status,
getJobOutputs returns a List of Paths created by the job, among others.
version is just a String with the current wake version, and @here is a
String with the directory of the wake file. We also have an example of a
comment, # ..., reminding us of the default permissions used by write.
$ wake infoH
Pass (Path "info.h")
$ cat info.h
#define OS Linux 5.10.16.3-microsoft-standard-WSL2
#define WAKE 0.25.0
Next, notice that we did not use -x when invoking wake this time.
The default operation of wake invokes the subcommand (in this case
infoH) specified on the command-line, with any additional command-line
arguments passed to that function. This is the second reason for the _args:
this form of invocation requires a function which can take a List String
containing any following command-line arguments. In this case, it was Nil
since nothing followed the wake infoH, but if we had written
wake infoH example then _args would have been "example", Nil.
As a side effect of this feature, any command-line arguments you want to pass to
wake itself -- rather than to the function invoked -- must be given before
the function name: wake -v infoH will increase the verbosity as expected,
while wake infoH -v will not.
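A function that actually uses its arguments makes the List String easier to see. This is a small sketch; greet is a hypothetical command, and it uses only catWith and println, both of which appear elsewhere in this tutorial.

```wake
# Hypothetical command: greet every name given after the function name
# on the command line.
export def greet args =
    def names =
        catWith ", " args
    println "Hello, {names}!"
```

Invoking wake greet Alice Bob should print Hello, Alice, Bob! because args is "Alice", "Bob", Nil.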
Each executed job -- in other words, when wake calls out to an external program -- is recorded, and we can retrieve all the information about them as desired.
$ wake --last
Job 3:
Command-line: /bin/uname -sr
Environment:
PATH=/usr/bin:/bin
...
Notice that the listed environment is much simpler than the actual environment of your system; wake tries to minimize anything which might cause two runs to give different results, and that includes any local environment variables.
Using runJob, we run processes with the default execution plan.
This means that all environment variables are removed and the job is
executed in the root of the workspace. However, it is possible to
customize the environment used.
from wake import _
from plan_scorer import _
export def showEnv _ =
    def plan =
        "echo $HAX $FOO"
        | makePlan "print from environment" Nil
        | editPlanEnvironment ("HAX=peanut", "FOO=bar", _)
    def output =
        runJob plan
        | getJobStdout
        | getWhenFail ""
    println output
$ wake showEnv
peanut bar
Unit
$ wake --last
Job 6 (print from environment):
Command-line: /bin/dash -c 'echo $HAX $FOO'
Environment:
HAX=peanut
FOO=bar
PATH=/usr/bin:/bin
...
The underlying job execution model of wake uses two phases. First, one
constructs a Plan object which describes what should be executed.
The Plan is then transformed into a Job by runJob. Before one
calls runJob, one can change various properties, like the environment
variables in this example.
In this case, we construct the Plan using makePlan which, in addition to the
command to run and the files the command may access, also asks for a string it
can use to label the resulting job; we can see it show up in the first line of
the --last output, which can often be easier to debug than needing to read the
command line itself. Other methods of making Plan objects -- such as
makeExecPlan -- may or may not ask for a label, but one can always be added
with setPlanLabel.
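As a sketch of relabelling a Plan before it runs, the example below mirrors showEnv and swaps the label with setPlanLabel. It assumes setPlanLabel takes the new label followed by the Plan, in the same way the generated set functions for Module did earlier; showLabel and both label strings are made up for illustration.

```wake
# Hypothetical command: build a Plan, replace its label, then run it.
# Assumes setPlanLabel has the shape (Label: String) => Plan => Plan.
export def showLabel _ =
    def plan =
        "echo hello"
        | makePlan "first label" Nil
        | setPlanLabel "say hello"
    def output =
        runJob plan
        | getJobStdout
        | getWhenFail ""
    println output
```

After wake showLabel, the first line of the wake --last output should carry the replacement label rather than the one given to makePlan.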
Finally, println is very useful when debugging wake code. However,
be forewarned that the execution order of wake is not sequential!
This can result in println output that does not appear to follow the
definition order of your build program.
To illustrate wake's use as a build system, we'll use a few simple programs written in C++. This is certainly not the only language which wake can be used for, but it's the only one which the wake standard library provides builtin functions to handle.
from wake import _
from gcc_wake import compileC linkO
def variant = "native-cpp11-release"
export def buildSimple _ =
    require Pass mainSrc =
        source "main.cpp"
    require Pass main =
        compileC variant ("-I.", Nil) Nil mainSrc
    linkO variant Nil main "simple" Nil
Let's ignore the contents of tutorial.wake briefly and instead focus on how
wake behaves.
$ git init .
$ echo 'int main() { return 0; }' > main.cpp
$ git add main.cpp
$ wake buildSimple
Pass (Path "simple.native-cpp11-release", Nil)
(The git commands are truly important; by default wake finds files through the
git index. Additionally, the ./simple.native-cpp11-release the Path refers
to is a complete and executable file, even if it does literally nothing at the
moment.)
As the build instructions become more complex, the --last output becomes
larger and harder to sort through. You can always pipe it through less or your
favorite pager, or you can filter it by the specific files involved:
$ wake --input main.cpp
Job 9 (compile c++ main.native-cpp11-release.o):
...
$ wake --output simple.native-cpp11-release
Job 13 (link c++ simple):
Command-line: /usr/bin/c++ -o simple.native-cpp11-release main.native-cpp11-release.o -std=c++11
Environment:
PATH=/usr/bin:/bin
Directory: .
...
If the resulting file is removed, or main.cpp modified, then wake will,
of course, rebuild the output.
$ rm simple.native-cpp11-release
$ wake buildSimple
Pass (Path "simple.native-cpp11-release", Nil)
$ wake --last
Job 17 (link c++ simple):
...
Because of the underlying database, wake can be a bit smarter about rebuilds when the contents of the file haven't changed:
$ touch main.cpp
$ wake buildSimple
Pass (Path "simple.native-cpp11-release", Nil)
$ wake -o simple.native-cpp11-release
Job 17 (link c++ simple):
...
Notice that wake does not rebuild anything in this case (the job ID of the link job in my output is still 17). It checked that the hash of the input has not changed and concluded that the existing output would have been exactly reproduced by gcc.
Now let's turn our attention back to the wake file:
$ tail -n 10 tutorial.wake
from wake import _
from gcc_wake import compileC linkO
def variant = "native-cpp11-release"
export def buildSimple _ =
    require Pass mainSrc =
        source "main.cpp"
    require Pass main =
        compileC variant ("-I.", Nil) Nil mainSrc
    linkO variant Nil main "simple" Nil
The from ... import ... lines indicate that we want to use something from an
external package. Any file which doesn't specify any imports will
automatically import everything in the standard wake package, but as soon as
we explicitly add an import, we need to explicitly list the wake import
ourselves. The package system is described in more detail in its
own documentation.
While this is not a tutorial about C/C++, it's also worth noting a few things
about how the C/C++ build process works before continuing. First, you'll notice
that the above code has split the build into the two steps of compileC and
linkO; this directly reflects the underlying process where every source file is
first compiled to object code individually, and then afterward the multiple
object files are linked into a single binary. GCC typically hides that
complexity, but you can still see it reflected in some basic Makefiles as well:
OBJ = main.o
main: ${OBJ}
	gcc -o $@ ${OBJ}
.cpp.o:
	gcc -c $<
The def variant = "native-cpp11-release" line similarly refers to a predefined
set of flags that wake provides for C++ code. Since we'll only ever access it
from within functions in this file, we don't need to mark it export. The full
range and implications of the variant system are outside the scope of this
tutorial, but just know that the String will show up in several places through
the output.
The source function verifies that a "source file" exists, returning a
Result Path Error pointing to it if it does. This is also where the git
commands come into play, as source only searches the git index to find files.
Don't forget to add anything you create!
Since we need to use the inner value of the Result rather than the wrapper
itself, we use the common destructor pattern require Pass x = ... to obtain
it. If source does indeed have trouble and returns a Fail, the require
will stop the rest of buildSimple from being run and instead simply return
that Fail at the top level; otherwise we assign the Path to mainSrc.
Now that we have the source file, buildSimple invokes the compileC function:
$ wake --in gcc_wake -vx compileC
compileC: (variant: String) => (extraFlags: List String) => (headers: List Path) => (cfile: Path) => Result (List Path) Error = </usr/local/share/wake/lib/gcc_wake/gcc.wake:78:[12-37]>
The type of compileC is a bit more complex than others we've looked at
previously, but taking one element at a time, it should be read as "a function
that takes a String named variant, then a List of Strings named
extraFlags, then a List of Paths named headers, then a Path named
cfile, and finally returns a List Path if nothing failed but an Error if
something did."
Indeed, we can see in our use of compileC, we passed a String for the
first argument and mainSrc -- which is indeed a Path -- for the last
argument, while the second and third arguments are Lists.
This returns a Result (List Path) Error pointing to the object file, which,
through the same sequence as the source file, gets passed to linkO for
assembly into a binary. As this is then the last command in the def block,
the output is returned as the function result.
Of course, most programs are going to split across several files.
export def buildMultiple _ =
    def mainResult =
        require Pass mainSrc =
            source "main.cpp"
        compileC variant ("-I.", Nil) Nil mainSrc
    def helpResult =
        require Pass helpSrc =
            source "help.cpp"
        compileC variant ("-I.", Nil) Nil helpSrc
    def multipleResult =
        require Pass main = mainResult
        require Pass help = helpResult
        linkO variant ("-lm", Nil) (main ++ help) "multiple" Nil
    multipleResult
$ echo -e '#include "stdio.h"\nvoid helper() { printf("Built with wake\\n"); }' > help.cpp
$ git add help.cpp
While most of the changes are minor, the separation of mainResult and
helpResult into the two steps of def followed by require is the most
obvious. This may initially seem like it creates an unnecessary intermediate
variable, but it is in fact very important for achieving the best performance.
Namely, in order for require to define values or return early based on the
exact constructor it's passed, it enforces a degree of serialization on the
code. def doesn't have the same restriction, so by starting the computation
in def blocks, we allow wake to compile main.cpp and help.cpp in parallel.
(The require unwrapping within multipleResult ensures they're both compiled
before being linked, even though each def block can be started out of order.)
$ wake buildMultiple
Pass (Path "multiple.native-cpp11-release", Nil)
$ wake --last
Job 25 (compile c++ help.native-cpp11-release.o):
Command-line: /usr/bin/c++ -std=c++11 -Wall -O2 -I. -c help.cpp -frandom-seed=help.native-cpp11-release.o -o help.native-cpp11-release.o
...
Job 29 (link c++ multiple):
Command-line: /usr/bin/c++ -o multiple.native-cpp11-release main.native-cpp11-release.o help.native-cpp11-release.o -std=c++11
...
Note that there's no entry for compiling main.cpp
. Because wake invokes the
compilation and linking steps separately, it was able to recognize that
main.cpp
had already been compiled -- even though that compilation happened in
a different function in a previous invocation.
$ rm *.o
$ wake buildMultiple
Pass (Path "multiple.native-cpp11-release", Nil)
$ wake --last
Job 31 (compile c++ main.native-cpp11-release.o):
...
Job 32 (compile c++ help.native-cpp11-release.o):
...
Similarly, wake did NOT re-link the objects of the program despite needing to build them again. That's because wake remembers the hashes of the objects it gave to the linker last time. The rebuilt object files have the same hashes, so there was no need to re-link the program. Similarly, whitespace-only changes to the files will not cause a re-link.
$ echo 'int main() { return 2; }' > main.cpp
$ wake buildMultiple
Pass (Path "multiple.native-cpp11-release", Nil)
$ wake --last
Job 39 (compile c++ main.native-cpp11-release.o):
...
Job 43 (link c++ multiple):
...
Of course, if you change main.cpp
meaningfully, it will be recompiled (without
also recompiling help.cpp
) and the program re-linked.
Having to list all cpp files is cumbersome. You have probably organized your codebase so that all the files in the current directory should be linked together. This example demonstrates how to support that.
export def buildAll _ =
    def compile =
        compileC variant ("-I.", Nil) Nil
    def objectsResult =
        require Pass srcFiles =
            sources @here `.*\.cpp`
        map compile srcFiles
        | findFail
    def allResult =
        require Pass objects = objectsResult
        linkO variant ("-lm", Nil) (flatten objects) "all" Nil
    allResult
Notice that we've defined compile
to be compileC
with every argument
supplied except the Path
of the file to compile.
This is known as "partial application" (often loosely called "currying").
Thus, compile is a function that takes a Path and returns a
Result (List Path) Error.
We could equivalently express compile
as:
def compile src =
    compileC variant ("-I.", Nil) Nil src
The argument src
here is a bit more explicit, but not strictly necessary.
In either case, this allows us to write compile cppFile
to
compile a single cpp file, saving some typing.
However, we can use the map
function to save even more! We use the
sources
function to find all the .cpp
files. That gives us a List
of
Paths
. Recall that compile
is a function that takes one argument, a Path
.
map
applies the function supplied as its first
argument to every element of the List
supplied as its second argument.
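A smaller sketch of the same idea, using integers instead of compiler jobs. The names scaleBy, timesTwo, and doubled are hypothetical and exist only for illustration.

```wake
# Hypothetical two-argument function.
def scaleBy factor x =
    factor * x

# Supplying only the first argument leaves a function that still expects `x`.
def timesTwo =
    scaleBy 2

# `map` applies timesTwo to each element of the list.
export def doubled =
    map timesTwo (1, 2, 3, Nil)
```

Evaluating doubled (for example with wake -x 'doubled') should yield a list holding 2, 4, and 6.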
$ wake buildAll
Pass (Path "all.native-cpp11-release", Nil)
$ wake --last
...
Inputs:
60cde6e2 help.native-cpp11-release.o
31745228 main.native-cpp11-release.o
Outputs:
209de066 all.native-cpp11-release
After the map
, we're left with a List (Result (List Path) Error)
, or several
compilations which would each return some Paths
if they succeed, but which
may individually fail. However, require
can only unwrap the outermost type, so
we need to switch the List
and the Result
. This is exactly what findFail
does: if any of the inner computations failed, then the entire thing fails, but
otherwise it returns the successes as a Result (List (List Path)) Error
.
Once we unwrap that, we can pass the list of lists to flatten
(which has type
List (List a) => List a
, concatenating all of the lists) to get the
List Path
that we need to link.
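To see the two helpers in isolation, here is a sketch that uses small integer lists in place of compiler outputs. The function names and the error message are hypothetical.

```wake
# All results passed: findFail swaps List and Result,
# then flatten joins the inner lists.
export def collectAll _ =
    def results =
        Pass (1, 2, Nil), Pass (3, Nil), Nil
    require Pass lists = findFail results
    Pass (flatten lists)

# One result failed: the `require` returns that failure early,
# so flatten is never reached.
export def collectWithFailure _ =
    def results =
        Pass (1, 2, Nil), Fail (makeError "second step failed"), Pass (3, Nil), Nil
    require Pass lists = findFail results
    Pass (flatten lists)
```

Under these assumptions, collectAll would be expected to produce a Pass wrapping the single list 1, 2, 3, while collectWithFailure produces the Fail.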
Our wake file is now both smaller and will automatically work when new .cpp
files are added.
Recall that the third argument to compileC
is a list of additional legal input
files. Wake forbids jobs from reading files in the workspace that are not
declared inputs. This means that if you include header files, they must be
declared in the list of legal inputs passed to compileC or the compile
will fail.
export def buildHeaders _ =
    require Pass headers =
        sources @here `.*\.h`
    def compile =
        compileC variant ("-I.", Nil) headers
    def objectsResult =
        require Pass srcFiles =
            sources @here `.*\.cpp`
        map compile srcFiles
        | findFail
    def headersResult =
        require Pass objects = objectsResult
        linkO variant ("-lm", Nil) (flatten objects) "headers" Nil
    headersResult
$ echo 'void helper();' > help.h
$ echo -e '#include "help.h"\nint main() { helper(); }' > main.cpp
$ git add help.h
$ wake buildHeaders
Pass (Path "headers.native-cpp11-release", Nil)
Indeed, we can see that failure if we try to run buildSimple
or
buildMultiple
now that main.cpp
depends on a header -- if you run into
trouble with jobs missing files, checking the "Visible" list will give you a
better understanding than your local filesystem:
$ wake buildSimple
main.cpp:1:10: fatal error: help.h: No such file or directory
#include "help.h"
^~~~~~~~
compilation terminated.
Fail (Error "Non-zero exit status (Exited 1) for '/usr/bin/c++ -std=c++11 -Wall -O2 -I. -c main.cpp -frandom-seed=main.native-cpp11-release.o -o main.native-cpp11-release.o'" Nil)
$ wake -v --failed
...
Visible:
0b10435dd8947e57cbad4f4326d65dd0909c026b8e2bcaa2f87c9e6018507451 main.cpp
Inputs:
0b10435dd8947e57cbad4f4326d65dd0909c026b8e2bcaa2f87c9e6018507451 main.cpp
Outputs:
Stderr:
main.cpp:1:10: fatal error: help.h: No such file or directory
#include "help.h"
^~~~~~~~
compilation terminated.
In buildHeaders
, we've used the sources
command to find all the header
files in the same directory and pass them as legal inputs to gcc -- the
keyword @here
expands to the directory of the .wake
file. The second
argument to sources
is a regular expression to select which files to
return. We've used backticks (`) here, which define regular expression literals
with the standard syntax.
In addition, the parser verifies that regular expression literals are legal.
Note that "source files" are those files tracked by git. Wake will never
return built files from a call to sources
, helping repeatability. We can see
this through the fact that the info.h
file we generated is not listed in the
visible list, despite the regex supposedly matching all .h
files in the
current directory. If we did want to use it, we'd need to reference the Path
returned when it is created (or retrieved from the cache):
def infoResult =
    infoH Nil
def compile =
    require Pass info = infoResult
    def visible =
        info, headers
    compileC variant ("-I.", Nil) visible
In a classic Makefile it would be considered bad form to list all header files as dependencies for all cpp files. That's because make would recompile every cpp file whenever any header file changes. In wake, we don't have this problem. Wake monitors jobs to see which files they actually used and remembers this for later builds. Therefore, it's best in wake to err on the side of caution (and convenience) by just listing all the headers in directories that are interesting to the cpp files.
$ wake -o main.native-cpp11-release.o
...
Inputs:
d2a7bfde help.h
9d220741 main.cpp
Outputs:
810e17cb main.native-cpp11-release.o
For this file, wake recorded that it needed both main.cpp
and help.h
.
$ wake -o help.native-cpp11-release.o
...
Inputs:
5b8beead help.cpp
Outputs:
13350319 help.native-cpp11-release.o
For this file, wake recorded that it only needed help.cpp
, despite
help.h
being a legal input.
Wake includes a publish/subscribe interface to support accumulating information
between multiple files. publish x = y
adds y
to the List
of things which
will be returned by a subscribe x
expression.
topic animal: String
publish animal =
    "Cat", Nil
publish animal =
    "Dog", "Wolf", Nil
publish animal =
    replace `u` "o" "Mouse", Nil
export def animals =
    subscribe animal
$ wake -x 'animals'
"Cat", "Dog", "Wolf", "Moose", Nil
Note that animal
is not a variable; it is a topic, which is in a different
namespace than normal variables. Note also that y
must be a List
.
For example, this API can be used to accumulate all the unit tests in the
workspace into a single location that runs them all at once. Keep in mind that
the published List
can be of any type (including functions and data types), so
the kinds of workspace-wide information that can be accumulated this way are
wide open. However, all publishes to a particular topic must agree to use
the same type in the List
, or the files will not type check.
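As a sketch of how the accumulated list behaves like any other List, the example below counts the entries published to a topic. The topic greeting and the name greetingCount are hypothetical; the two publish blocks could just as well live in different .wake files.

```wake
# Hypothetical topic; every publish must supply a List String.
topic greeting: String

publish greeting =
    "hello", Nil

publish greeting =
    "bonjour", "hola", Nil

# The subscription is an ordinary List String, so list functions apply to it.
export def greetingCount =
    len (subscribe greeting)
```

With only these publishes in the workspace, greetingCount should evaluate to 3.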
Consider a build setup where the local project depends on the sources of some external one. Especially for security-sensitive dependencies, where using the most recent version (no matter what version that may be) is important, this is the sort of example that becomes very difficult very quickly in a system like make, but is fairly straightforward in wake:
from wake import _
from plan_scorer import _
def curl url extension =
    def outputFile =
        # This construction is somewhat specific to GitHub's url scheme, which
        # names release tarballs according to the tag version, but doesn't
        # include the `.tar.gz` (or `.json`) extension.
        "{@here}/{basename url}.{extension}"
    def cmdline =
        "{which "curl"} -o {outputFile} {url}"
    def curl =
        makePlan "download {url}" Nil cmdline
        | runJob
    curl.getJobOutput
export def downloadGithubRelease projectList =
    require (project, Nil) = projectList
    else failWithError "Exactly one project must be downloaded per call"
    def releasesResult =
        # Query the GitHub microservice for a structured list of releases.
        # Note that this is rate-limited, so you might want a different setup if
        # including this as part of CI for a large project.
        curl "https://api.github.com/repos/{project}/releases" "json"
    def releaseTarballsResult =
        require Pass releases = releasesResult
        require Pass releasesData =
            parseJSONFile releases
        def releases =
            # Search through the list of all releases to retrieve the values of
            # the `tarball_url` field. Don't spend too much time memorizing
            # this particular JSON interface; it will be replaced soon.
            releasesData // `tarball_url`
            | getJArray
            | getOrElse Nil
        Pass (mapPartial getJString releases)
    def mostRecentUrlResult =
        require Pass releaseTarballs = releaseTarballsResult
        # The returned JSON results are ordered newest-first.
        head releaseTarballs
        | getOrFail "No GitHub releases found for {project}".makeError
    def mostRecentTarballResult =
        require Pass mostRecentUrl = mostRecentUrlResult
        curl mostRecentUrl "tar.gz"
    mostRecentTarballResult
$ wake downloadGithubRelease sifive/wake
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 385k 0 385k 0 0 501k 0 --:--:-- --:--:-- --:--:-- 501k
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Pass (Path "v0.25.2.tar.gz")
As we've covered before, makePlan and runJob are used to launch curl to download the file.
The curl function also uses basename to split the filename out of the URL, and
string interpolation to combine it with @here and the requested extension into
the output path. Finally, getJobOutput returns the Path of the downloaded file,
wrapped in a Result.
The require (project, Nil) =
line is helpful for restricting a list to contain
exactly one item. We could also use head
and require Some
or
getOrElse
/getOrFail
to discard any arguments after the first, but in this
case we have chosen to fail loudly.
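For comparison, the more lenient alternative could be sketched as follows. The helper name firstProject and its error message are hypothetical; head, getOrFail, and makeError are used exactly as they are in downloadGithubRelease.

```wake
# Hypothetical helper: keep the first project and ignore the rest,
# failing only when the list is empty.
def firstProject projectList =
    head projectList
    | getOrFail "At least one project must be given".makeError
```

A line such as require Pass project = firstProject projectList would then stand in for the stricter pattern match, at the cost of silently dropping any extra arguments.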
The GitHub API returns a JSON document, for which wake has built-in handling.
The full library will be described elsewhere, but for a brief overview of what
is used in this example, parseJSONFile
does as the name suggests and reads a
file (here retrieved by curl
, but it could also be from source
or any other
call) into an internal representation which we then search through for all
tarball_url
keys.
By passing the resulting URL to a second call of curl
, we're able to retrieve
the required files determined by the output of a previous job. This ability
for the complex interweaving of programming language features with build-system
output caching is where the full power of wake truly shines.
By default, wake will recursively search the entire workspace for build files
(*.wake
). However, there are occasions where you want it to ignore some of
them: for example, during testing where you could exclude some files from
regular usage, or a directory structure that includes duplicate repository
checkouts, where duplicate symbol definitions would raise an error.
To allow these, wake looks for files named .wakeignore
containing patterns.
The pattern language is shell filename globbing.
One pattern per line, each relative to the path of the .wakeignore
file.
The concrete syntax is:
- empty lines are ignored
- lines starting with a # are comments and are ignored
- ? matches a single non-slash character
- * matches any number (including zero) of non-slash characters
- [a-z] matches a single lower-case character
- /**/ in an expression like foo/**/bar stands in for any number of directories (including zero)
- foo/** recursively matches all contents of the directory foo
- **/bar matches all files bar contained in this directory or any subdirectory
An example:
workspace
├── wake.db
├── repo1
│ └── foo.wake
└── repo2
├── .wakeignore
└── repo1
└── foo.wake
If repo1 is checked out twice as above, then a .wakeignore file
at path workspace/repo2 containing repo1/** ensures that foo.wake
is not read twice.
The following patterns in workspace/repo2/.wakeignore
would all match
the above foo.wake
:
repo1/foo.wake
repo1/**
repo1/**/foo.wake
**/[a-z]?[o].wake
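Putting the rules together, a complete workspace/repo2/.wakeignore for the layout above might look like the following sketch. The scratch directory is a hypothetical addition, included only to show a second pattern alongside a comment and a blank line.

```text
# Skip the duplicate checkout nested inside repo2.
repo1/**

# Skip build files in a hypothetical scratch directory.
scratch/**
```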