Update comment on cost model tests (IntersectMBO#4975)
* Update comment

* Add a sentence about plutus-tx

* Rearrange order of steps

* Rearrange order of steps

* Rearrange order of steps

* Rearrange order of steps

* Clarification

* Wrong step number
kwxm authored Nov 25, 2022
1 parent 79779bd commit a645d1e
Showing 2 changed files with 106 additions and 87 deletions.
185 changes: 100 additions & 85 deletions plutus-core/cost-model/CostModelGeneration.md
@@ -115,7 +115,11 @@
details of how to add a new built-in function see the extensive notes on "How to
add a built-in function" in
[`PlutusCore.Default.Builtins`](../plutus-core/src/PlutusCore/Default/Builtins.hs).
For documentation on how to add a new built-in type, see
[`Universe.Core`](../plutus-core/src/Universe/Core.hs). Note that the procedure
described here only adds a new built-in function to Plutus Core: to make the
new function available from Haskell, further work will be required in the
[`plutus-tx`](https://github.com/input-output-hk/plutus/tree/master/plutus-tx)
codebase.


### Adding a new function
@@ -284,69 +288,28 @@
run the appropriate `param<builtin-name>` function:
Data.ByteString.pack $ zipWith (Data.Bits.xor) (Data.ByteString.unpack a) (Data.ByteString.unpack b)
```
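
For reference, here is a self-contained version of that denotation (a sketch
with the imports made explicit; `xorByteString` is the running example used
throughout this document, not an existing builtin):

```
import qualified Data.Bits
import qualified Data.ByteString

-- XOR two bytestrings pointwise, stopping at the end of the shorter argument.
xorByteString :: Data.ByteString.ByteString
              -> Data.ByteString.ByteString
              -> Data.ByteString.ByteString
xorByteString a b =
  Data.ByteString.pack $
    zipWith Data.Bits.xor (Data.ByteString.unpack a) (Data.ByteString.unpack b)
```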

#### Step 5: add a benchmark for the new builtin and run it

Now a CPU usage benchmark for the function will have to be added in
[`plutus-core/cost-model/budgeting-bench`](./budgeting-bench) and new R code
will have to be added in [`models.R`](./data/models.R) to process the results of
the benchmark (see Step 6 below). The benchmark should aim to cover a wide
range of inputs in order to get a good idea of the worst-case behaviour of the
function: experimentation may be needed to achieve this.
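
For illustration, a benchmark for our running example might look roughly like
the following (a minimal Criterion sketch: the real benchmarks in
[`plutus-core/cost-model/budgeting-bench`](./budgeting-bench) use shared
helpers and run the builtin through the evaluator, and the input sizes below
are only illustrative):

```
import Criterion.Main (Benchmark, bench, bgroup, defaultMain, whnf)
import Data.Bits (xor)
import qualified Data.ByteString as BS

-- The denotation from earlier, repeated to keep the sketch self-contained.
xorByteString :: BS.ByteString -> BS.ByteString -> BS.ByteString
xorByteString a b = BS.pack (zipWith xor (BS.unpack a) (BS.unpack b))

-- Benchmark over a grid of argument sizes so that the R code can later see
-- how the execution time varies with the sizes of both inputs.
benchXorByteString :: Benchmark
benchXorByteString =
  bgroup "XorByteString"
    [ bench (show m ++ "/" ++ show n) (whnf (xorByteString x) y)
    | m <- sizes
    , n <- sizes
    , let x = BS.replicate m 0x55
    , let y = BS.replicate n 0xAA
    ]
  where sizes = [100, 1000, 10000, 100000]

main :: IO ()
main = defaultMain [benchXorByteString]
```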

Once the benchmark is in its final form, run it using `cabal run
plutus-core:cost-model-budgeting-bench -- --csv <file>` as described in the
first section of this document. Either run the full set of benchmarks and save
the full output in a CSV file, or run the new benchmark on its own using `cabal
run plutus-core:cost-model-budgeting-bench -- --csv <file> <benchmark name>`
and then append the results in `<file>` to a CSV file (such as `benching.csv`)
containing earlier benchmark results for the rest of the builtin functions. If
the latter method (which will be much faster) is used, it is advisable to run
some other costing benchmarks as well to check that the results are at least
approximately consistent with the previous ones.
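
For example (the file and benchmark names here are purely illustrative):

```
# Run the full set of benchmarks
cabal run plutus-core:cost-model-budgeting-bench -- --csv benching.csv
# Run only the new benchmark and save its results separately
cabal run plutus-core:cost-model-budgeting-bench -- --csv xor.csv XorByteString
```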

#### Step 6: update the R code

We now have to extend the R code in [`models.R`](./data/models.R). Firstly, add
an entry for the arity of the builtin in the `arity` function:
@@ -362,21 +325,35 @@
```

Now add a function to infer coefficients for the CPU costing function from
benchmarking data. The exact form of the R code will depend on the behaviour of
the function being added and will probably be based on the expected time
complexity of the function, backed up by examination of the initial benchmark
results. In simpler cases it may be possible to re-use existing R code, but
sometimes more complex code may be required to obtain a good model of the
behaviour of the function. Ideally the R model should be accurate over a wide
range of inputs, so that charges for "typical" inputs are reasonable but
worst-case inputs which require large computation times incur large charges
which penalise excessive computation. Developing the model may involve some
experimentation, and it may not always be possible to satisfy both goals
simultaneously. In such cases it may be necessary to sacrifice some accuracy in
order to guarantee security.


In the case of `xorByteString` we assume that the time taken will be linear in
the minimum of the sizes of the arguments (i.e., the arguments of the new
builtin). It is often worthwhile to plot the benchmark data and experiment with
it in order to check that it has the form expected when the basic shape of the
costing function was selected (Steps 1, 3 and 6). For example, we have assumed
that the execution time of `xorByteString` is linear in the _minimum_ of the
argument sizes since the function stops when it gets to the end of the smaller
argument, but note that we call `unpack` on both arguments and that this takes
linear time. Examination of benchmark results might reveal that if one input is
very large then the unpacking step will dominate the execution time, and if this
is the case it might be more sensible to use a model linear in the _maximum_ of
the input sizes. In general, think carefully about the structure of the model
and issues such as whether the raw data might need to have outliers discarded or
whether only some subset of the data should be used to arrive at an accurate
worst-case model.

```
xorByteStringModel <- {
Expand All @@ -389,7 +366,8 @@ to arrive at an accurate worst-case model.
}
```

Finally, add an entry to the list which is returned by `modelFun` (at the very
end of the file):

```
xorByteStringModel = xorByteStringModel,
@@ -401,7 +379,49 @@
object. (That's what gets read in by the code in Step 7: `paramXorByteString`
contains the string "xorByteStringModel" and that lets the Haskell code retrieve
the correct thing from R.)

### Step 8: test the Haskell versions of the costing functions
#### Step 7: add code to read the costing function from R into Haskell

Next we have to update the code which converts benchmarking results into JSON
models. Go to
[`CreateBuiltinCostModel`](./create-cost-model/CreateBuiltinCostModel.hs) and add
an entry for the new builtin in `builtinCostModelNames`:

```
, paramXorByteString = "xorByteStringModel"
```
(Getting the string wrong here, for example putting "xorByteString" instead,
will give `parse error (not enough input) at ""`. Errors will occur whenever
the Haskell code attempts to read something from an R object that doesn't
actually occur in the object, and they can sometimes be quite cryptic.)

Also add a new clause in [`CreateBuiltinCostModel`](./create-cost-model/CreateBuiltinCostModel.hs):

```
paramXorByteString <- getParams xorByteString paramXorByteString
```

and a function to extract the cost parameters from the R model. This should be
modelled on the existing functions at the end of the file:

```
xorByteString :: MonadR m => SomeSEXP (Region m) -> m (CostingFun ModelTwoArguments)
xorByteString cpuModelR = do
  cpuModel <- ModelTwoArgumentsMinSize <$> readModelMinSize cpuModelR
  let memModel = ModelTwoArgumentsMinSize $ ModelMinSize 0 1
  pure $ CostingFun cpuModel memModel
```

The CPU costing function is obtained by running the R code, but the memory usage
costing function is defined statically here. Memory usage costing functions
only account for memory retained after the function has returned and not for any
working memory that may be allocated during its execution. Typically this means
that the memory costing function should measure the size of the object returned
by the builtin. For our `xorByteString` implementation, if the arguments have
sizes `m` and `n` then the result will have size `min(m,n)` so we define the memory
costing function to be `(m,n) -> 0 + 1*min(m,n)`.
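
To make the shape of that model concrete, here is a sketch of how a min-size
model is evaluated (the type and function names below are illustrative rather
than the actual definitions in the plutus-core costing code):

```
-- Illustrative only: a linear model in the minimum of the two argument sizes.
data MinSizeModel = MinSizeModel
  { msIntercept :: Integer
  , msSlope     :: Integer
  }

-- The cost charged for arguments of (memory) sizes m and n.
runMinSizeModel :: MinSizeModel -> Integer -> Integer -> Integer
runMinSizeModel (MinSizeModel intercept slope) m n =
  intercept + slope * min m n

-- With intercept 0 and slope 1, as in the memory model for xorByteString,
-- the charge for sizes 100 and 250 is 0 + 1 * min 100 250 = 100.
example :: Integer
example = runMinSizeModel (MinSizeModel 0 1) 100 250
```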


#### Step 8: test the Haskell versions of the costing functions

The code in [`CreateCostModel`](./create-cost-model/CreateBuiltinCostModel.hs)
converts the cost modelling functions fitted by R into Haskell functions. As
@@ -416,15 +436,14 @@
how to do this) and then run the tests with `cabal bench
plutus-core:cost-model-test`.
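
The following sketch shows the general shape of such a test (hedged: `rModel`
stands in for the inline-r machinery that the real tests use to evaluate the R
model, and the tolerance is illustrative):

```
import Hedgehog (Property, assert, forAll, property)
import Hedgehog.Gen qualified as Gen
import Hedgehog.Range qualified as Range

-- Check that the Haskell version of a two-argument costing function agrees
-- with the R version on randomly generated input sizes, up to a small
-- relative error (the two implementations take different floating-point
-- paths, so exact equality would be too strong).
propModelsAgree
  :: (Integer -> Integer -> Integer)  -- Haskell costing function
  -> (Integer -> Integer -> Double)   -- hypothetical wrapper around the R model
  -> Property
propModelsAgree haskellModel rModel = property $ do
  m <- forAll $ Gen.integral (Range.linear 1 1000000)
  n <- forAll $ Gen.integral (Range.linear 1 1000000)
  let h = fromInteger (haskellModel m n) :: Double
      r = rModel m n
  assert (abs (h - r) <= 0.01 * max 1 (max (abs h) (abs r)))
```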


#### Step 9: update the cost model JSON file

Once the previous steps have been carried out, proceed as described in the first
section: feed the results of the costing benchmarks to `generate-cost-model` to
produce a new JSON cost model file (which will contain sensible coefficients for
the costing functions for the new builtin in place of the arbitrary ones we added
in Step 3), and check it in along with a CSV file containing a full set of
benchmark results which can be used to reproduce it.

If you're confident that the evaluator hasn't changed too much since
the cost model was last fully updated, it may be possible to save time
@@ -440,7 +459,3 @@
can run benchmarks on their own machine and have the results re-scaled
to be compatible with our reference machine, thereby removing (or at
least lessening) the necessity for Cardano developers to do the
benchmarking).




8 changes: 6 additions & 2 deletions plutus-core/cost-model/test/TestCostModels.hs
@@ -60,8 +60,12 @@
import Hedgehog.Range qualified as Range
be any problem because the functions should be the same as the ones we
construct from R here (they're essentially the contents of 'costModelsR'
converted to JSON), but it wouldn't do any harm to include any possible loss
of accuracy due to serialisation/deserialisation in the tests as well. Doing
the tests the way they're done here is arguably better because it may reveal
problems in the costing interface before the cost model file is updated, and
we want to be sure that we don't include an incorrect costing function in the
JSON. Maybe it would be sensible to have some separate tests that check that
converting to JSON and then back is the identity.
-}

-- How many tests to run for each costing function
