10x performance regression on v1.11 #57028

Open
mhauru opened this issue Jan 13, 2025 · 4 comments
Labels: performance, regression, regression 1.11, types and dispatch

Comments

@mhauru
Contributor

mhauru commented Jan 13, 2025

module MWE

using Turing
using Turing: DynamicPPL
using Random

Random.seed!(42)

num_iterations = 10_000
adbackend = AutoForwardDiff()

@model function m(x=1.5)
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    x ~ Normal(m, s)
    return nothing
end

model = m()
initial_params = [0.5, 0.5]

component_sampler = HMC(0.1, 32; adtype=adbackend)
sampler = Turing.Gibbs(@varname(s) => component_sampler, @varname(m) => component_sampler)

@info "Starting sampling"
sample(model, sampler, num_iterations; initial_params=initial_params)

end

The above code runs in about 4s on v1.10.6 and in about 30s on v1.11.2 (recording second runs, so excluding compilation time). This is using the latest master from Turing.jl.
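The "second run" measurement described above can be sketched roughly as follows. `time_second_run` and `work` are hypothetical names used only for illustration, and the stand-in workload is an assumption that replaces the `sample(...)` call from the MWE.

```julia
# Hypothetical helper: run `f` once to trigger compilation, then time a second run.
function time_second_run(f)
    f()                  # first call: includes compilation, result discarded
    return @elapsed f()  # second call: steady-state runtime in seconds
end

# Stand-in workload (assumption): swap in a closure around the `sample(...)` call.
work() = sum(sqrt(i) for i in 1:1_000_000)

t = time_second_run(work)
println("second run took ", round(t; digits=4), " s")
```

Running the same script under each Julia version then gives numbers that are comparable in the sense used in this report.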

I would need to minimise the example to find the cause, but does anyone have clues as to where to look? A type inference failure seems like a possibility to me. Are there any known regressions there on v1.11?
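One way to look for an inference failure, sketched here on toy functions rather than on the Turing code, is to ask inference what it concludes and to use `@inferred` from the `Test` stdlib. `unstable` and `stable` are hypothetical functions and not part of the MWE.

```julia
using Test  # stdlib, provides @inferred

# Hypothetical type-unstable function: the return type depends on a runtime value.
unstable(flag::Bool) = flag ? 1 : 1.0
# Hypothetical type-stable counterpart.
stable(flag::Bool) = flag ? 1.0 : 2.0

# Base.return_types reports what inference concludes for the given argument types.
rt_unstable = only(Base.return_types(unstable, (Bool,)))
rt_stable = only(Base.return_types(stable, (Bool,)))
println("unstable: ", rt_unstable, ", concrete: ", isconcretetype(rt_unstable))
println("stable: ", rt_stable, ", concrete: ", isconcretetype(rt_stable))

# @inferred throws if the actual return type differs from the inferred one.
@inferred stable(true)
```

Comparing the result of such a check for the same call on v1.10 and v1.11 would be one way to see whether inference changed between the two versions.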

@KristofferC
Member

KristofferC commented Jan 13, 2025

FWIW, basically all the time seems to be spent in subtyping.

  ╎ 9983  …rc/logging.jl:11; macro expansion
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9983  [unknown stackframe]
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9983  …c/logging.jl:36; with_progresslogger(f::Func…
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9983  [unknown stackframe]
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9983  …/logging.jl:632; with_logger(f::Function, …
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9983  [unknown stackframe]
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +1 9983  …/logging.jl:522; with_logstate(f::Abstrac…
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +2 9982  …/logging.jl:12; (::AbstractMCMC.var"#24#2…
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +3 9982  …sLogging.jl:328; macro expansion
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +4 9973  …c/sample.jl:217; macro expansion
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +5 9969  [unknown stackframe]
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +6 9969  …mc/gibbs.jl:460; kwcall(::@NamedTuple{ini…
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +7 7390  …mc/gibbs.jl:480; step(rng::TaskLocalRNG, …
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +8 6854  …c/subtype.c:2148; ijl_subtype_env
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   +9 6854  …c/subtype.c:1698; forall_exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +10 6854  …c/subtype.c:1684; _forall_exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +11 6854  …c/subtype.c:1653; exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +12 6854  …c/subtype.c:1462; subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +13 6854  …c/subtype.c:1302; subtype_tuple
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +14 6847  …c/subtype.c:1220; subtype_tuple_tail
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +15 6820  …c/subtype.c:2148; ijl_subtype_env
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +16 6820  …c/subtype.c:1698; forall_exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +17 6820  …c/subtype.c:1684; _forall_exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +18 6820  …c/subtype.c:1653; exists_subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +19 6819  …c/subtype.c:1425; subtype
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +20 6818  …c/subtype.c:913; subtype_unionall
  ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +21 6818  …c/subtype.c:1425; subtype
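A tree-style profile like the one above can be collected with the `Profile` stdlib. The following is a minimal sketch on a stand-in workload; `workload` is a hypothetical function, and the real measurement would wrap the `sample(...)` call instead.

```julia
using Profile  # stdlib

# Stand-in workload (assumption): replace with the `sample(...)` call from the MWE.
function workload(n)
    acc = 0.0
    for i in 1:n
        acc += sin(i) * cos(i)
    end
    return acc
end

workload(10)                   # compile first so the profile reflects steady state
Profile.clear()                # drop samples from earlier runs
@profile workload(10_000_000)

# Tree view; C=true also shows frames from the C runtime (such as src/subtype.c),
# and mincount hides frames with few samples.
Profile.print(; format=:tree, C=true, mincount=10)
```

Without `C=true` the frames from the C runtime are hidden, so time spent in subtyping would be attributed to the calling Julia frame only.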

With "Ctrl-C" profiling it seems to consistently stop at

ERROR: InterruptException:
Stacktrace:
  [1] indexed_iterate
    @ ./tuple.jl:159 [inlined]
  [2] step(rng::TaskLocalRNG, model::DynamicPPL.Model{…}, spl::DynamicPPL.Sampler{…}, state::Turing.Inference.GibbsState{…}; kwargs::@Kwargs{…})
    @ Turing.Inference ~/.julia/packages/Turing/4O8UF/src/mcmc/gibbs.jl:480
  [3] macro expansion
    @ ~/.julia/packages/AbstractMCMC/FSyVk/src/sample.jl:217 [inlined]

which points to https://github.com/TuringLang/Turing.jl/blob/7d6f8ed53b59047c86fb1c09c9d593ca30250a60/src/mcmc/gibbs.jl#L480-L482

The indexed_iterate in the stacktrace could indicate that the return value of gibbs_step_inner is no longer inferred?
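For context on why `indexed_iterate` appears there: unpacking a returned tuple lowers to calls to `Base.indexed_iterate`. A small sketch on hypothetical functions (`inner` and `outer` are illustrative stand-ins, not the Turing functions):

```julia
# Hypothetical stand-in for a function that returns a tuple.
inner() = (1, 2.0)

function outer()
    a, b = inner()   # tuple destructuring
    return a + b
end

# Destructuring lowers to calls to Base.indexed_iterate, which is why that function
# can show up at the top of a stacktrace for a line that unpacks a return value.
lowered = only(code_lowered(outer, Tuple{}))
uses_indexed_iterate = any(occursin("indexed_iterate", string(stmt)) for stmt in lowered.code)
println("destructuring uses indexed_iterate: ", uses_indexed_iterate)

# When inference knows the return type, the unpacking is resolved statically;
# a non-concrete result here would mean the unpacking needs dynamic dispatch.
println("inferred return type of inner(): ", only(Base.return_types(inner, ())))
```

The same `Base.return_types` query applied to the real inner step function, with the concrete argument types from the MWE, would show whether its return value is inferred on a given Julia version.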

@oscardssmith added the types and dispatch and regression labels Jan 13, 2025
@N5N3
Member

N5N3 commented Jan 13, 2025

A local profile shows that this MWE follows a similar pattern to #56606, which should have been fixed in #56640.
We can close this once 1.11.3 is released.
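A quick way to tell whether a given Julia build falls in the range discussed here is to compare `VERSION` against the release numbers mentioned in the thread. This is only a sketch: `possibly_affected` is a hypothetical helper, and the lower bound of 1.11.0 is an assumption, since the thread only reports 1.11.2 as slow and 1.10.6 as fast.

```julia
# Assumption: releases from 1.11.0 up to (not including) 1.11.3 are treated as
# possibly affected; the thread only confirms 1.11.2 as slow and 1.10.6 as fast.
possibly_affected(v::VersionNumber) = v"1.11.0" <= v < v"1.11.3"

println("running Julia ", VERSION, ", possibly affected: ", possibly_affected(VERSION))
```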

@giordano
Contributor

Could someone test this on #56741? With juliaup, you can install that PR build using juliaup add pr56741

@jakobnissen
Contributor

jakobnissen commented Jan 14, 2025

I can verify it runs >10x faster on #56741 compared to 1.11.2.

@ViralBShah added the regression 1.11 label Jan 16, 2025
8 participants