perf: move data out of `Scanner.Token`

By storing the token data in a separate `Scanner` field and having `Token` be merely the token type, we can avoid a decent amount of copying when tokens are passed around. This leads to considerable speedups for the `TokenReader` and `Reader` benchmarks (the `Scanner` benchmark is slightly slower, but that probably has more to do with how that particular benchmark is written, since the token data was previously discarded).

```
Benchmark 1 (120 runs): zig-out/bin-old/scanner Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time          41.6ms ± 470us    40.7ms … 43.5ms      1 ( 1%)   0%
  peak_rss           7.27MB ± 88.0KB   7.08MB … 7.34MB      0 ( 0%)   0%
  cpu_cycles          152M  ± 839K      151M  … 158M        3 ( 3%)   0%
  instructions        472M  ± 20.8      472M  … 472M        0 ( 0%)   0%
  cache_references    270K  ± 625K      206K  … 7.03M      10 ( 8%)   0%
  cache_misses       7.95K  ± 260      7.61K  … 9.81K       3 ( 3%)   0%
  branch_misses       511K  ± 631       510K  … 512K       18 (15%)   0%
Benchmark 2 (116 runs): zig-out/bin/scanner Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time          43.0ms ± 452us    41.9ms … 44.4ms      4 ( 3%)   💩+  3.3% ±  0.3%
  peak_rss           7.28MB ± 77.9KB   7.08MB … 7.34MB      0 ( 0%)     +  0.2% ±  0.3%
  cpu_cycles          158M  ± 694K      156M  … 159M        0 ( 0%)   💩+  4.0% ±  0.1%
  instructions        527M  ± 19.4      527M  … 527M       27 (23%)   💩+ 11.7% ±  0.0%
  cache_references    234K  ± 265K      207K  … 3.06M      10 ( 9%)     - 13.5% ± 45.6%
  cache_misses       7.93K  ± 435      7.49K  … 11.8K       5 ( 4%)     -  0.3% ±  1.1%
  branch_misses       514K  ± 335       513K  … 515K        1 ( 1%)     +  0.7% ±  0.0%
```

```
Benchmark 1 (44 runs): zig-out/bin-old/token_reader Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time           116ms ± 631us     115ms … 117ms       0 ( 0%)   0%
  peak_rss           7.30MB ± 59.0KB   7.21MB … 7.34MB      0 ( 0%)   0%
  cpu_cycles          462M  ± 1.91M     459M  … 466M        0 ( 0%)   0%
  instructions       1.14G  ± 21.9     1.14G  … 1.14G       0 ( 0%)   0%
  cache_references    233K  ± 6.77K     226K  … 253K        3 ( 7%)   0%
  cache_misses       9.69K  ± 1.48K    8.05K  … 13.6K       0 ( 0%)   0%
  branch_misses       815K  ± 1.16K     813K  … 817K        0 ( 0%)   0%
Benchmark 2 (72 runs): zig-out/bin/token_reader Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time          70.2ms ± 782us    68.9ms … 75.3ms      2 ( 3%)   ⚡- 39.4% ±  0.2%
  peak_rss           7.29MB ± 63.4KB   7.21MB … 7.34MB      0 ( 0%)     -  0.2% ±  0.3%
  cpu_cycles          271M  ± 2.75M     268M  … 291M        7 (10%)   ⚡- 41.3% ±  0.2%
  instructions        885M  ± 19.2      885M  … 885M       17 (24%)   ⚡- 22.6% ±  0.0%
  cache_references    224K  ± 7.03K     219K  … 263K        7 (10%)   ⚡-  3.9% ±  1.1%
  cache_misses       8.32K  ± 909      7.80K  … 14.4K       6 ( 8%)   ⚡- 14.1% ±  4.5%
  branch_misses       671K  ± 42.3K     664K  … 1.03M       3 ( 4%)   ⚡- 17.6% ±  1.6%
```

```
Benchmark 1 (35 runs): zig-out/bin-old/reader Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time           145ms ± 857us     143ms … 148ms       2 ( 6%)   0%
  peak_rss           7.29MB ± 65.1KB   7.21MB … 7.34MB      0 ( 0%)   0%
  cpu_cycles          582M  ± 3.06M     578M  … 596M        1 ( 3%)   0%
  instructions       1.38G  ± 24.7     1.38G  … 1.38G       0 ( 0%)   0%
  cache_references    758K  ± 196K      513K  … 1.59M       2 ( 6%)   0%
  cache_misses       14.3K  ± 6.84K    11.4K  … 49.2K       4 (11%)   0%
  branch_misses      1.06M  ± 14.0K    1.05M  … 1.11M       3 ( 9%)   0%
Benchmark 2 (48 runs): zig-out/bin/reader Gtk-4.0.gir
  measurement          mean ± σ          min … max         outliers   delta
  wall_time           105ms ± 1.55ms    104ms … 113ms       1 ( 2%)   ⚡- 27.2% ±  0.4%
  peak_rss           7.27MB ± 93.6KB   7.08MB … 7.34MB      0 ( 0%)     -  0.2% ±  0.5%
  cpu_cycles          419M  ± 6.29M     414M  … 450M        1 ( 2%)   ⚡- 28.1% ±  0.4%
  instructions       1.13G  ± 19.6     1.13G  … 1.13G      11 (23%)   ⚡- 18.1% ±  0.0%
  cache_references    575K  ± 59.7K     490K  … 797K        1 ( 2%)   ⚡- 24.2% ±  7.9%
  cache_misses       12.5K  ± 876      11.4K  … 15.3K       5 (10%)     - 12.3% ± 13.9%
  branch_misses      1.07M  ± 4.22K    1.07M  … 1.09M       8 (17%)     +  1.2% ±  0.4%
```
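
For readers without the diff at hand, the shape of the change described at the top of this message can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the field name `token_data`, the `OldToken` type, and the toy two-token grammar are all hypothetical and exist only to show a payload-free `Token` next to a scanner that owns the data.

```zig
const std = @import("std");

// Hypothetical "before" shape: each token carries its own payload, so every
// time a token is returned or passed along, the payload slice is copied too.
const OldToken = union(enum) {
    element_start: []const u8,
    text: []const u8,
    eof,
};

// "After" shape: the token is only a tag.
const Token = enum { element_start, text, eof };

const Scanner = struct {
    input: []const u8,
    pos: usize = 0,
    // Hypothetical field: data belonging to the most recently returned token.
    // It is overwritten by the following call to `next`.
    token_data: []const u8 = "",

    // Advances until `delimiter` or end of input; returns the bytes skipped.
    fn scanUntil(self: *Scanner, delimiter: u8) []const u8 {
        const start = self.pos;
        while (self.pos < self.input.len and self.input[self.pos] != delimiter) {
            self.pos += 1;
        }
        return self.input[start..self.pos];
    }

    // Toy grammar: `<name>` yields element_start, anything else yields text.
    pub fn next(self: *Scanner) Token {
        if (self.pos >= self.input.len) {
            self.token_data = "";
            return .eof;
        }
        if (self.input[self.pos] == '<') {
            self.pos += 1; // skip '<'
            self.token_data = self.scanUntil('>');
            if (self.pos < self.input.len) self.pos += 1; // skip '>'
            return .element_start;
        }
        self.token_data = self.scanUntil('<');
        return .text;
    }
};

pub fn main() void {
    var scanner = Scanner{ .input = "<a>hi" };
    while (true) {
        const token = scanner.next();
        if (token == .eof) break;
        // Callers read the payload from the scanner only when they need it.
        std.debug.print("{s}: {s}\n", .{ @tagName(token), scanner.token_data });
    }
}
```

The trade-off in a design like this is that the data is only valid until the next call to `next`, so a caller that wants to keep it has to copy it explicitly. In exchange, layers that merely forward tokens move a small tag instead of a tag plus a slice.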