Update comment on cost model tests (IntersectMBO#4975)
* Update comment

* Add a sentence about plutus-tx

* Rearrange order of steps

* Rearrange order of steps

* Rearrange order of steps

* Rearrange order of steps

* Clarification

* Wrong step number
kwxm authored Nov 25, 2022
1 parent 79779bd commit a645d1e
Showing 2 changed files with 106 additions and 87 deletions.
185 changes: 100 additions & 85 deletions plutus-core/cost-model/CostModelGeneration.md
@@ -115,7 +115,11 @@
details of how to add a new built-in function see the extensive notes on "How to
add a built-in function" in
[`PlutusCore.Default.Builtins`](../plutus-core/src/PlutusCore/Default/Builtins.hs).
For documentation on how to add a new built-in type, see
[`Universe.Core`](../plutus-core/src/Universe/Core.hs). Note that the procedure
described here only adds a new built-in function to Plutus Core: to make the
new function available from Haskell, further work will be required in the
[`plutus-tx`](https://github.com/input-output-hk/plutus/tree/master/plutus-tx)
codebase.


### Adding a new function
@@ -284,69 +288,28 @@
run the appropriate `param<builtin-name>` function:
Data.ByteString.pack $ zipWith (Data.Bits.xor) (Data.ByteString.unpack a) (Data.ByteString.unpack b)
```
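
For reference, here is a self-contained version of that denotation (a sketch
with the imports made explicit; `xorByteString` is the running example used
throughout this document, not an existing builtin):

```
import qualified Data.Bits
import qualified Data.ByteString

-- XOR two bytestrings pointwise, stopping at the end of the shorter argument.
xorByteString :: Data.ByteString.ByteString
              -> Data.ByteString.ByteString
              -> Data.ByteString.ByteString
xorByteString a b =
  Data.ByteString.pack $
    zipWith Data.Bits.xor (Data.ByteString.unpack a) (Data.ByteString.unpack b)
```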

#### Step 5: add a benchmark for the new builtin and run it

Now a CPU usage benchmark for the function will have to be added in
[`plutus-core/cost-model/budgeting-bench`](./budgeting-bench) and new R code
will have to be added in [`models.R`](./data/models.R) to process the results of
the benchmark (see Step 6 below). The benchmark should aim to cover a wide
range of inputs in order to get a good idea of the worst-case behaviour of the
function: experimentation may be needed to achieve this.
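
For illustration, a benchmark for our running example might look roughly like
the following (a minimal Criterion sketch: the real benchmarks in
[`plutus-core/cost-model/budgeting-bench`](./budgeting-bench) use shared
helpers and run the builtin through the evaluator, and the input sizes below
are only illustrative):

```
import Criterion.Main (Benchmark, bench, bgroup, defaultMain, whnf)
import Data.Bits (xor)
import qualified Data.ByteString as BS

-- The denotation from earlier, repeated to keep the sketch self-contained.
xorByteString :: BS.ByteString -> BS.ByteString -> BS.ByteString
xorByteString a b = BS.pack (zipWith xor (BS.unpack a) (BS.unpack b))

-- Benchmark over a grid of argument sizes so that the R code can later see
-- how the execution time varies with the sizes of both inputs.
benchXorByteString :: Benchmark
benchXorByteString =
  bgroup "XorByteString"
    [ bench (show m ++ "/" ++ show n) (whnf (xorByteString x) y)
    | m <- sizes
    , n <- sizes
    , let x = BS.replicate m 0x55
    , let y = BS.replicate n 0xAA
    ]
  where sizes = [100, 1000, 10000, 100000]

main :: IO ()
main = defaultMain [benchXorByteString]
```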

Once the benchmark is in its final form, run it using `cabal run
plutus-core:cost-model-budgeting-bench -- --csv <file>` as described in the
first section of this document. Either run the full set of benchmarks and save
the full output in a CSV file, or run the new benchmark on its own using `cabal
run plutus-core:cost-model-budgeting-bench -- --csv <file> <benchmark name>`
and then append the results in `<file>` to a CSV file (such as `benching.csv`)
containing earlier benchmark results for the rest of the builtin functions. If
the latter method (which will be much faster) is used, it is advisable to run
some other costing benchmarks as well to check that the results are at least
approximately consistent with the previous ones.
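
For example (the file and benchmark names here are purely illustrative):

```
# Run the full set of benchmarks
cabal run plutus-core:cost-model-budgeting-bench -- --csv benching.csv
# Run only the new benchmark and save its results separately
cabal run plutus-core:cost-model-budgeting-bench -- --csv xor.csv XorByteString
```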

#### Step 6: update the R code

We now have to extend the R code in [`models.R`](./data/models.R). Firstly, add
an entry for the arity of the builtin in the `arity` function:
@@ -362,21 +325,35 @@
```

Now add a function to infer coefficients for the CPU costing function from
benchmarking data. The exact form of the R code will depend on the behaviour of
the function being added and will probably be based on the expected time
complexity of the function, backed up by examination of the initial benchmark
results. In simpler cases it may be possible to re-use existing R code, but
sometimes more complex code may be required to obtain a good model of the
behaviour of the function. Ideally the R model should be accurate over a wide
range of inputs, so that charges for "typical" inputs are reasonable but
worst-case inputs which require large computation times incur large charges
which penalise excessive computation. Developing the model may involve some
experimentation, and it may not always be possible to satisfy both goals
simultaneously. In such cases it may be necessary to sacrifice some accuracy in
order to guarantee security.


In the case of `xorByteString` we assume that the time taken will be linear in
the minimum of the sizes of the arguments (i.e., the arguments of the new
builtin). It is often worthwhile to plot the benchmark data and experiment with
it in order to check that it has the form expected when the basic shape of the
costing function was selected (Steps 1, 3 and 6). For example, we have assumed
that the execution time of `xorByteString` is linear in the _minimum_ of the
argument sizes since the function stops when it gets to the end of the smaller
argument, but note that we call `unpack` on both arguments and that this takes
linear time. Examination of benchmark results might reveal that if one input is
very large then the unpacking step will dominate the execution time, and if this
is the case it might be more sensible to use a model linear in the _maximum_ of
the input sizes. In general, think carefully about the structure of the model
and issues such as whether the raw data might need to have outliers discarded or
whether only some subset of the data should be used to arrive at an accurate
worst-case model.

```
xorByteStringModel <- {
Expand All @@ -389,7 +366,8 @@ to arrive at an accurate worst-case model.
}
```

Finally, add an entry to the list which is returned by `modelFun` (at the very
end of the file):

```
xorByteStringModel = xorByteStringModel,
@@ -401,7 +379,49 @@
object. (That's what gets read in by the code in Step 7: `paramXorByteString`
contains the string "xorByteStringModel" and that lets the Haskell code retrieve
the correct thing from R.)

### Step 8: test the Haskell versions of the costing functions
#### Step 7: add code to read the costing function from R into Haskell

Next we have to update the code which converts benchmarking results into JSON
models. Go to
[`CreateBuiltinCostModel`](./create-cost-model/CreateBuiltinCostModel.hs) and add
an entry for the new builtin in `builtinCostModelNames`:

```
, paramXorByteString = "xorByteStringModel"
```
(Getting the string wrong here, for example putting "xorByteString" instead,
will give `parse error (not enough input) at ""`. Errors will occur whenever
the Haskell code attempts to read something from an R object that doesn't
actually occur in the object, and they can sometimes be quite cryptic.)

Also add a new clause in [`CreateBuiltinCostModel`](./create-cost-model/CreateBuiltinCostModel.hs):

```
paramXorByteString <- getParams xorByteString paramXorByteString
```

and a function to extract the cost parameters from the R model. This should be
modelled on the existing functions at the end of the file:

```
xorByteString :: MonadR m => SomeSEXP (Region m) -> m (CostingFun ModelTwoArguments)
xorByteString cpuModelR = do
  cpuModel <- ModelTwoArgumentsMinSize <$> readModelMinSize cpuModelR
  let memModel = ModelTwoArgumentsMinSize $ ModelMinSize 0 1
  pure $ CostingFun cpuModel memModel
```

The CPU costing function is obtained by running the R code, but the memory usage
costing function is defined statically here. Memory usage costing functions
only account for memory retained after the function has returned and not for any
working memory that may be allocated during its execution. Typically this means
that the memory costing function should measure the size of the object returned
by the builtin. For our `xorByteString` implementation, if the arguments have
sizes `m` and `n` then the result will have size `min(m,n)` so we define the memory
costing function to be `(m,n) -> 0 + 1*min(m,n)`.
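
To make the shape of that model concrete, here is a sketch of how a min-size
model is evaluated (the type and function names below are illustrative rather
than the actual definitions in the plutus-core costing code):

```
-- Illustrative only: a linear model in the minimum of the two argument sizes.
data MinSizeModel = MinSizeModel
  { msIntercept :: Integer
  , msSlope     :: Integer
  }

-- The cost charged for arguments of (memory) sizes m and n.
runMinSizeModel :: MinSizeModel -> Integer -> Integer -> Integer
runMinSizeModel (MinSizeModel intercept slope) m n =
  intercept + slope * min m n

-- With intercept 0 and slope 1, as in the memory model for xorByteString,
-- the charge for sizes 100 and 250 is 0 + 1 * min 100 250 = 100.
example :: Integer
example = runMinSizeModel (MinSizeModel 0 1) 100 250
```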


#### Step 8: test the Haskell versions of the costing functions

The code in [`CreateCostModel`](./create-cost-model/CreateBuiltinCostModel.hs)
converts the cost modelling functions fitted by R into Haskell functions. As
@@ -416,15 +436,14 @@
how to do this) and then run the tests with `cabal bench
plutus-core:cost-model-test`.
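
The following sketch shows the general shape of such a test (hedged: `rModel`
stands in for the inline-r machinery that the real tests use to evaluate the R
model, and the tolerance is illustrative):

```
import Hedgehog (Property, assert, forAll, property)
import Hedgehog.Gen qualified as Gen
import Hedgehog.Range qualified as Range

-- Check that the Haskell version of a two-argument costing function agrees
-- with the R version on randomly generated input sizes, up to a small
-- relative error (the two implementations take different floating-point
-- paths, so exact equality would be too strong).
propModelsAgree
  :: (Integer -> Integer -> Integer)  -- Haskell costing function
  -> (Integer -> Integer -> Double)   -- hypothetical wrapper around the R model
  -> Property
propModelsAgree haskellModel rModel = property $ do
  m <- forAll $ Gen.integral (Range.linear 1 1000000)
  n <- forAll $ Gen.integral (Range.linear 1 1000000)
  let h = fromInteger (haskellModel m n) :: Double
      r = rModel m n
  assert (abs (h - r) <= 0.01 * max 1 (max (abs h) (abs r)))
```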


#### Step 9: update the cost model JSON file

Once the previous steps have been carried out, proceed as described in the first
section: feed the results of the costing benchmarks to `generate-cost-model` to
produce a new JSON cost model file (which will contain sensible coefficients for
the costing functions for the new builtin in place of the arbitrary ones we added
in Step 3), and check it in along with a CSV file containing a full set of
benchmark results which can be used to reproduce it.

If you're confident that the evaluator hasn't changed too much since
the cost model was last fully updated, it may be possible to save time
@@ -440,7 +459,3 @@
can run benchmarks on their own machine and have the results re-scaled
to be compatible with our reference machine, thereby removing (or at
least lessening) the necessity for Cardano developers to do the
benchmarking).




8 changes: 6 additions & 2 deletions plutus-core/cost-model/test/TestCostModels.hs
@@ -60,8 +60,12 @@
import Hedgehog.Range qualified as Range
be any problem because the functions should be the same as the ones we
construct from R here (they're essentially the contents of 'costModelsR'
converted to JSON), but it wouldn't do any harm to include any possible loss
of accuracy due to serialisation/deserialisation in the tests as well. Doing
the tests the way they're done here is arguably better because it may reveal
problems in the costing interface before the cost model file is updated, and
we want to be sure that we don't include an incorrect costing function in the
JSON. Maybe it would be sensible to have some separate tests that check that
converting to JSON and then back is the identity.
-}

-- How many tests to run for each costing function
