Improvements to DynamicPPLBenchmarks #346
base: master
Conversation
… for downstream tasks
This might be helpful for running benchmarks via CI - https://github.com/tkf/BenchmarkCI.jl
@torfjelde should we improve this PR by incorporating it? Also, https://github.com/TuringLang/TuringExamples contains some very old benchmarking code.
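As a rough sketch of how BenchmarkCI.jl could be wired up, assuming the usual PkgBenchmark convention of a `benchmark/benchmarks.jl` file defining a `SUITE`; the `demo` model and benchmark entries below are placeholders, not the suite proposed in this PR:

```julia
# benchmark/benchmarks.jl — BenchmarkCI.jl (via PkgBenchmark) expects a `SUITE` defined here.
using BenchmarkTools
using DynamicPPL, Distributions

# Placeholder model for illustration only; the real suite would use whatever
# models are settled on below.
@model function demo(x)
    m ~ Normal()
    x ~ Normal(m, 1)
end

const SUITE = BenchmarkGroup()
SUITE["demo"] = BenchmarkGroup()

model = demo(1.0)
# Benchmark a full model call (prior sampling + evaluation).
SUITE["demo"]["model_call"] = @benchmarkable $(model)()
```

A CI step would then run something along the lines of `julia -e 'using BenchmarkCI; BenchmarkCI.judge(); BenchmarkCI.postjudge()'` to benchmark the PR branch against the base branch and post the comparison as a comment.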
Codecov Report: patch and project coverage have no change.

@@           Coverage Diff           @@
##          master     #346   +/-  ##
=======================================
  Coverage   76.40%   76.40%
=======================================
  Files          21       21
  Lines        2522     2522
=======================================
  Hits         1927     1927
  Misses        595      595
We could implement a setup similar to EnzymeAD/Reactant.jl#105 (comment)
I will look into this soon!
I think there are a few different things we need to address:
IMO, the CI stuff is not really that crucial. The most important things are: a) choosing a suite of models that answers all the questions we want, e.g. how do the changes we make affect different implementations of a model, how is scaling wrt. the number of parameters affected, how are compilation times affected, etc., and b) deciding what the output format for all of this should be.
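To illustrate the scaling question, here is a rough sketch of a suite parameterized over the number of observations, assuming the three-argument `DynamicPPL.evaluate!!(model, varinfo, context)` entry point; the `gdemo` model and the sizes are made up for the example:

```julia
using BenchmarkTools
using DynamicPPL, Distributions

# Hypothetical model whose evaluation cost grows with the number of observations N.
@model function gdemo(x)
    μ ~ Normal()
    σ ~ truncated(Normal(), 0, Inf)
    for i in eachindex(x)
        x[i] ~ Normal(μ, σ)
    end
end

suite = BenchmarkGroup()
suite["gdemo"] = BenchmarkGroup()
for N in (10, 100, 1_000)  # placeholder sizes
    model = gdemo(randn(N))
    vi = VarInfo(model)    # build a VarInfo by running the model once
    # Benchmark a single model evaluation with a fixed VarInfo (assumed entry point).
    suite["gdemo"]["N=$N"] =
        @benchmarkable DynamicPPL.evaluate!!($model, $vi, DefaultContext())
end

run(suite; verbose=true)
```

Compilation times would need to be captured separately, e.g. by timing the first evaluation of a freshly defined model; that is left out of this sketch.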
Some further notes on this. IMO we're mainly interested in a few different "experiments". We don't want to be testing every model out there, and so there are specific things we want to "answer" with our benchmarks. As a result, I'm leaning more towards a Weave approach, with each notebook answering a distinct question, e.g. "how does the model scale with the number of observations", and subsequently producing outputs that can somehow be compared across versions. That is, I think the overall approach taken in this PR is "correct", but we need to make it much nicer and update how the benchmarks are performed. But then the question is: what are the "questions" we want to answer? Here are a few I can think of:
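Whatever the final list of questions ends up being, a minimal sketch of how such a notebook-per-question layout could be turned into comparable HTML reports with Weave might look like this; the `benchmarks/` directory and the notebook names are hypothetical:

```julia
using Weave

# Hypothetical layout: one Weave notebook per question, e.g.
#   benchmarks/scaling_observations.jmd
#   benchmarks/compilation_times.jmd
# Each notebook is woven to self-contained HTML so its output can be stored
# and compared across DynamicPPL versions.
for nb in filter(f -> endswith(f, ".jmd"), readdir("benchmarks"; join=true))
    weave(nb; doctype="md2html", out_path="results/")
end
```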
We can store the HTML output of the Weave notebooks.
The Weave approach looks fine, as each notebook could address a specific question!
It took a lot of time to run the benchmarks from this PR locally, so I guess a GH Action is not preferred for this! Let me know what to do next, and I will proceed as you say!
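If the full suite really is too slow for GitHub Actions, one option is to weave the notebooks locally and store the HTML keyed by the DynamicPPL version being benchmarked, so reports can still be compared across versions. A rough sketch, with hypothetical paths:

```julia
using DynamicPPL, Weave

# Store the woven HTML under results/<DynamicPPL version>/ so reports generated
# locally can be compared across versions without running benchmarks on CI.
version = string(pkgversion(DynamicPPL))  # requires Julia ≥ 1.9
outdir = joinpath("results", version)
mkpath(outdir)
weave("benchmarks/scaling_observations.jmd"; doctype="md2html", out_path=outdir)
```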
I have looked into this; there are many models, and we must figure out which ones to benchmark.
It produces results such as those that can be seen here: #309 (comment)