Bad NEON performance #20

Open
chriselrod opened this issue Aug 26, 2022 · 1 comment
@chriselrod
Member

julia> using VectorizedRNG, Random, BenchmarkTools

julia> x = Vector{Float64}(undef, 1024);

julia> @benchmark randn!(local_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.838 μs … 4.028 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.852 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.862 μs ± 72.118 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    █▄                                                        
  ▃███▅▄▂▂▂▂▁▁▁▁▂▁▁▁▁▂▁▁▁▁▁▁▂▁▂▁▁▂▁▂▂▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂ ▂
  2.84 μs        Histogram: frequency by time        3.16 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark randn!(Random.default_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.533 μs … 6.983 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.688 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.693 μs ± 77.624 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                      ▂   ▇▂▂▂▇▅▂▂▁█▁▁ ▄                      
  ▂▁▁▁▂▂▂▂▂▂▂▂▃▃▃▃▆▄▅▅█▇████████████████▆▆▅▇▄▄▃▄▃▃▃▃▂▂▂▂▂▂▂▂ ▄
  1.53 μs        Histogram: frequency by time        1.83 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark rand!(local_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 146 evaluations.
 Range (min … max):  698.918 ns …   8.839 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     700.630 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   707.420 ns ± 120.934 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄█▇▄                 ▄▄▁                                      ▂
  █████▇▆▆▆▄▅▄▆▅▆▆▆▆▅▅▅████▆▆▇█▅▆▆▄▆▆▆▆▇▆▅▄▁▁▄▅▅▄▅▄▄▅▆▅▅▄▃▆▅▅▄▅ █
  699 ns        Histogram: log(frequency) by time        755 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark rand!(Random.default_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 152 evaluations.
 Range (min … max):  682.566 ns … 949.836 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     683.664 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   687.544 ns ±  11.115 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ██▅▃▁                ▃▄     ▁▁                                ▁
  ██████▇▇▇▇▇▆▅▆▆▅▆▄▅▄▄████▅▅▅███▇▆▅▆▆▆▆▆▅▅▃▄▄▄▄▄▅▂▄▅▃▅▂▄▅▄▄▄▅▅ █
  683 ns        Histogram: log(frequency) by time        735 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> versioninfo()
Julia Version 1.9.0-DEV.1073
Commit 0b9eda116d* (2022-08-01 14:27 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.5.0)
  CPU: 8 × Apple M1

@chriselrod
Member Author

For comparison, on Cascade Lake (Intel Core i9-10980XE):

julia> using VectorizedRNG, Random, BenchmarkTools

julia> x = Vector{Float64}(undef, 1024);

julia> @benchmark randn!(local_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.183 μs … 2.638 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.227 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.229 μs ± 27.961 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▃▅▆███▇▅▃▁
  ▂▁▂▂▂▃▃▄▅▇███████████▇▆▄▃▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂ ▃
  1.18 μs        Histogram: frequency by time        1.34 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark randn!(Random.default_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.594 μs … 4.573 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.742 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.744 μs ± 49.598 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                        ▁▂▃▄▅▆████▆▆▇▆▅▂▂▁
  ▂▁▁▂▂▁▂▂▂▂▂▂▃▃▃▄▄▄▅▆▇████████████████████▇▆▅▄▄▄▃▃▃▃▃▂▂▂▂▂▂ ▅
  1.59 μs        Histogram: frequency by time        1.87 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark rand!(local_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 732 evaluations.
 Range (min … max):  173.176 ns … 229.518 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     180.137 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   180.299 ns ±   1.007 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                                ▅█
  ▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▁▁▂▁▂▁▁▁▂▁▁▁▂▂▁▂▂▂▂▁▂▁▁▂▂▂▅▆██▆▄▂▂▂▂▂▂▃▃▄▃▂ ▂
  173 ns           Histogram: frequency by time          182 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark rand!(Random.default_rng(), $x)
BenchmarkTools.Trial: 10000 samples with 323 evaluations.
 Range (min … max):  266.056 ns … 382.669 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     266.514 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   266.989 ns ±   1.768 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▅▇███▆▅▄▃▁       ▁▂▂▂▂▂▁▂▂▁             ▁▁▂▃▃▂▂▁             ▂
  ███████████▅▅▃▅▆▅▇██████████▇▆▇▇▅▃▁▁▃▃▆▇███████████▇▇▇▆▆▆▅▅▅▆ █
  266 ns        Histogram: log(frequency) by time        272 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> versioninfo()
Julia Version 1.9.0-DEV.1172
Commit 18fa3835a7* (2022-08-23 13:44 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: 36 × Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
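
The median timings in the two runs above can be turned into relative timings with a few lines of Julia. This is only a sketch: the numbers are copied by hand from the benchmark output in this issue, and the `medians` table and the `ratio` helper are illustrative names, not part of VectorizedRNG or BenchmarkTools.

```julia
# Median timings in nanoseconds, copied by hand from the benchmark output above.
# Keys are (machine, function, rng); the layout is an assumption made for this sketch.
medians = Dict(
    ("M1", :randn!, :local_rng)            => 2852.0,
    ("M1", :randn!, :default_rng)          => 1688.0,
    ("M1", :rand!, :local_rng)             => 700.630,
    ("M1", :rand!, :default_rng)           => 683.664,
    ("Cascadelake", :randn!, :local_rng)   => 1227.0,
    ("Cascadelake", :randn!, :default_rng) => 1742.0,
    ("Cascadelake", :rand!, :local_rng)    => 180.137,
    ("Cascadelake", :rand!, :default_rng)  => 266.514,
)

# Hypothetical helper: median time with local_rng() divided by median time with
# Random.default_rng(). Values above 1 mean local_rng() was slower.
ratio(cpu, f) = medians[(cpu, f, :local_rng)] / medians[(cpu, f, :default_rng)]

for cpu in ("M1", "Cascadelake"), f in (:randn!, :rand!)
    println(rpad(cpu, 12), rpad(string(f), 8), round(ratio(cpu, f); digits = 2))
end
```

On these numbers, `randn!` with `local_rng()` takes about 1.69 times as long as `Random.default_rng()` on the M1, while on Cascade Lake the same ratio is about 0.70. For `rand!` the two generators are roughly on par on the M1 (about 1.02) and `local_rng()` is ahead on Cascade Lake (about 0.68).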
