Bounds error for Flux.reset! in loss function #2057
Looks like the same bug as FluxML/Zygote.jl#1297. Can you try with

Thanks for replying so quickly. As far as I can tell that's a PR on docs for CUDA, not sure how it relates?

Oh sorry, I mean FluxML/Zygote.jl#1297 on Zygote, not here.

Beautiful, that works perfectly. Thank you
For future reference, "contradict" on Slack suggested this alternative solution:

```julia
function ℓ(xs, ys)
    ChainRules.ignore_derivatives() do
        Flux.reset!(rnn)
    end
    mse.([rnn(x) for x in xs], ys) |> sum
end
```

which has the nice feature of being very explicit about what gets fixed where and why.
Hey,
I'm trying to implement an RNN in Flux but I'm having some problems. Here's the MWE.
Starting by generating some data
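(The original code block was lost in extraction; a minimal sketch of what the toy data might look like. The sequence length, sample count, and running-sum target are assumptions, not the author's actual code.)

```julia
using Random

Random.seed!(0)
T = 10           # sequence length (assumed)
n_samples = 100  # number of sequences (assumed)

# Each sample: a vector of T one-element input vectors, plus per-step
# scalar targets (here, hypothetically, the running sum of the inputs).
data = [
    begin
        xs = [rand(Float32, 1) for _ in 1:T]
        ys = cumsum(first.(xs))
        (xs, ys)
    end
    for _ in 1:n_samples
]
```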
then we define the model
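(Also lost in extraction; a hypothetical model with layer sizes as assumptions. A `Chain` whose second layer is the `RNN` is consistent with the `rnn.layers[2].state` indexing mentioned below.)

```julia
using Flux

# Small Dense → RNN → Dense chain; rnn.layers[2] is the recurrent layer.
rnn = Chain(Dense(1 => 8), RNN(8 => 8), Dense(8 => 1))
```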
and the loss function
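(Sketch of the lost loss function, assuming the `rnn` model from the previous step; the body matches the shape of the `ℓ` quoted in the comments above, with `Flux.reset!` called inside the differentiated function, which is what triggers the error.)

```julia
using Flux
using Flux.Losses: mse

function ℓ(xs, ys)
    Flux.reset!(rnn)                         # reset hidden state each call
    mse.([rnn(x) for x in xs], ys) |> sum    # per-step MSE, summed
end
```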
and finally we train:
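(Sketch of the lost training loop in the implicit-parameters style of Flux 0.13, assuming `data` is a collection of `(xs, ys)` pairs and `ℓ` is the loss above; optimiser choice and epoch count are assumptions.)

```julia
using Flux

opt = Adam(1e-3)
ps = Flux.params(rnn)
for epoch in 1:10
    for (xs, ys) in data
        gs = gradient(() -> ℓ(xs, ys), ps)   # error surfaces here
        Flux.update!(opt, ps, gs)
    end
end
```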
This gives an error:


As far as I can tell, `Flux.reset!(rnn)` turns `rnn.layers[2].state` (the `RNN` state) from a `Vector` into a `Tuple`, but ONLY when called within the loss function. If I call `Flux.reset!` outside of training, nothing happens. If I replace the loss function with
I don't get any errors, but it feels like this should be unnecessary. It also seems like the network is not learning at all, but that's for another discussion.
Now, I'm just getting started with Flux so maybe I'm missing something obvious, but I've based the example code on examples/tutorials found online, including in the docs, so I'm not sure what's going on here.
Thank you,
Fede