
How to set svd_alg? #133

Open
Confusio opened this issue Feb 16, 2025 · 6 comments

@Confusio

I'm trying to optimize a PEPS with SU(2) symmetry using the following settings:

using PEPSKit, TensorKit, OptimKit  # LBFGS comes from OptimKit

gradtol = 1e-3  # shared tolerance for the optimizer and the gradient solver

ctm_alg = SimultaneousCTMRG(; tol=1e-6, verbosity=2, trscheme=truncdim(36))
opt_alg = PEPSOptimize(;
    boundary_alg=ctm_alg,
    optimizer=LBFGS(4; maxiter=100, gradtol=gradtol, verbosity=3),
    gradient_alg=ManualIter(; tol=gradtol, iterscheme=:fixed),
    reuse_env=false,
)

However, during the optimization, I encountered an error from svdsolve:

┌ Warning: `svdsolve` cotangent linear problem (3) returns unexpected result: error = 0.00018388099781627167 vs tol = 1.0e-8
└ @ KrylovKitChainRulesCoreExt ~/.julia/packages/KrylovKit/i0JjG/ext/KrylovKitChainRulesCoreExt/svdsolve.jl:248
ERROR: TaskFailedException

Since I don't require an iterative method for computing the SVD (I believe the tsvd function in TensorKit is sufficient), I would like to disable the iterative SVD. I've tried looking into the code to adjust the settings accordingly, but I haven't been successful so far.

@lkdvos
Member

lkdvos commented Feb 16, 2025

The SVD is actually not being computed by a Krylov algorithm here, since we default to using SDD. The warning you are seeing is from the AD implementation, which needs to invert some operator, hence the linear problem.
I think you can avoid that by using SVDAdjoint(; fwd_alg=SDD(), rrule_alg=nothing) as the actual svd algorithm in the projector_alg of the ctm_alg.
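A minimal sketch of what that could look like, assuming SimultaneousCTMRG accepts an svd_alg keyword that it forwards to its projector algorithm (the exact keyword path may differ between PEPSKit versions):

using PEPSKit, TensorKit

# Dense divide-and-conquer SVD (LAPACK's gesdd) for the forward pass; setting
# rrule_alg=nothing disables the iterative adjoint, presumably falling back to
# TensorKit's tsvd rrule (see the discussion below):
svd_alg = SVDAdjoint(; fwd_alg=TensorKit.SDD(), rrule_alg=nothing)

ctm_alg = SimultaneousCTMRG(; tol=1e-6, verbosity=2, trscheme=truncdim(36), svd_alg=svd_alg)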

We are aware of how messy the settings are, by the way, and how hard it is to figure out what to control and how. Hopefully we'll be able to improve on this via #130; it's definitely something we are actively trying to improve.

@pbrehmer
Collaborator

Note that using rrule_alg=nothing will possibly use a lot of memory, depending on how many CTMRG iterations are required (since the AD framework will differentiate through all iterations, needing to cache intermediate results).

You could also try gradient_alg=ManualIter(; tol=gradtol, iterscheme=:diffgauge), where the :diffgauge mode typically yields more stable differentiation than the :fixed mode. However, it is also slower, since the gauge-fixing step is differentiated as well.
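For example, adapting the snippet from the issue description (a sketch, reusing ctm_alg and gradtol from above):

opt_alg = PEPSOptimize(;
    boundary_alg=ctm_alg,
    optimizer=LBFGS(4; maxiter=100, gradtol=gradtol, verbosity=3),
    gradient_alg=ManualIter(; tol=gradtol, iterscheme=:diffgauge),  # differentiate through the gauge fixing
    reuse_env=false,
)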

In case you encounter more difficulties, you could also provide a minimal working example so we can look further into the settings. But hopefully #130 will simplify the interface; it should get merged soon :-)

@lkdvos
Member

lkdvos commented Feb 17, 2025

@pbrehmer I don't think that's true; I'm talking about the rrule_alg for the SVD, which will then presumably use the TensorKit rrule for the SVD? I think?

@pbrehmer
Collaborator

Oh sorry, indeed, I misread that! Then the :diffgauge mode is actually necessary: the TensorKit SVD rrule is incompatible with :fixed mode, since it needs access to the full (untruncated) SVD, while in :fixed mode the FixedSVD only has access to the truncated decomposition. (There is an @assert which forbids this combination, but I'll make that more explicit in #130.)

@Confusio
Author

Thank you for the useful suggestions. I recently realized that PEPSKit is nearly fully functional, so I decided to set aside my own code and focus on further studies with PEPSKit. I do have one minor suggestion: during the projection process, would it be possible to normalize the singular values by the largest one? This might help mitigate issues with the save_inv function during AD of tsvd, especially when the largest singular value is quite small.

@pbrehmer
Collaborator

I do have one minor suggestion: during the projection process, would it be possible to normalize the singular values by the largest one?

Yes, this is possible; we just have to make sure to use norm(S, Inf) for this to be differentiable. Perhaps this could simply replace the normalization of the corners and edges at the end of each CTMRG iteration? In any case, normalizing by the largest singular value seems generically better than normalizing by the norm of the entire tensor: for a decaying singular-value spectrum the full norm is close to the largest singular value anyway, while for a flat spectrum it might be too large and make the normalized tensor values too small.
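For illustration, a minimal sketch of that normalization on a standalone tsvd call (the tensor A and the truncation dimension are hypothetical placeholders):

using TensorKit, LinearAlgebra

A = randn(Float64, ℂ^8 ← ℂ^8)            # placeholder tensor
U, S, V, ϵ = tsvd(A; trunc=truncdim(4))   # truncated SVD

# norm(S, Inf) is the largest singular value; unlike indexing into S directly,
# it has a well-defined reverse rule, so the normalization stays differentiable:
S = S / norm(S, Inf)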

@lkdvos What do you think? I could give this a go together with #137
