
When depth > 3, variables (invariant and coors_out) become NaN #2

JunDecNo opened this issue Jan 1, 2025 · 7 comments
JunDecNo commented Jan 1, 2025

I encountered this issue when using the example and setting the depth to 4.

lucidrains added a commit that referenced this issue Jan 1, 2025
lucidrains (Owner) commented Jan 1, 2025

@JunDecNo indeed it does

i threw in an extra norm in a place that makes sense (on the aggregated inner product across the higher-order degrees, before it is projected into the update), and that stabilized it
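A minimal sketch of that kind of fix, assuming the invariants are per-channel self inner products (the function name, shapes, and RMS choice here are illustrative assumptions, not the repo's actual code): compute the inner products across the higher-order degrees, then normalize those invariants before they feed the update projection, so they stay bounded as depth grows.

```python
import math

def stabilized_invariants(feats, eps=1e-8):
    """Aggregate inner products across higher-order degrees, then normalize
    before they are projected into the update.

    feats: dict mapping degree l -> list of channel rows, each 2l+1 wide
    (a hypothetical stand-in for a dict of (channels, 2l+1) tensors).
    """
    # per-channel self inner products are rotation-invariant scalars
    inv = [sum(v * v for v in row) for rows in feats.values() for row in rows]
    # RMS-normalize so the invariants stay O(1) regardless of depth
    rms = math.sqrt(sum(s * s for s in inv) / len(inv))
    return [s / (rms + eps) for s in inv]

# one degree-1 channel (3 components) and one degree-2 channel (5 components)
feats = {1: [[3.0, 0.0, 0.0]], 2: [[0.0, 4.0, 0.0, 0.0, 0.0]]}
inv = stabilized_invariants(feats)  # two O(1) invariants, one per channel
```

Without the normalization, the raw inner products grow multiplicatively with each layer, which is consistent with the NaN only appearing past a certain depth.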

do you want to perhaps send the author of this paper an email and see if he/she is missing normalization in the hierarchical tensor refinement module?


anine09 commented Jan 6, 2025

@lucidrains Have you looked at the code in this repository? Everything seems to work fine when I use it.

@lucidrains (Owner)

@anine09 oh nice, the code was released after all! why did I bother 🤔🤣

@lucidrains (Owner)

I'll reconcile the differences later this week and wrap up the project


anine09 commented Jan 6, 2025

@lucidrains After you answered my question about lens in version 0.2.4, I was still getting NaN outputs, so I found this repository and recently switched to their implementation. Right after I commented on this issue, my loss immediately became NaN, so please be careful! Good luck! 🤣

@lucidrains (Owner)

@anine09 lol oh no, ok


cenjc commented Jan 13, 2025

Let me offer an explanation. High-degree steerable features can be understood as electron clouds. The inner product describes the direct phase relationship between two electron clouds; the modulus component carries relatively little meaning, so it can be separated out entirely.

The NaN may also stem from the multi-channel setting. This was corrected in GMN [1] by using the Frobenius norm, and Section 3.4 of EquiformerV2 [2] describes a similar correction, which also appears in the HEGNN [3] code (https://github.com/GLAD-RUC/HEGNN/blob/main/models/HEGNN.py#L94). That said, the approach of directly normalizing the high-degree steerable features, used in the GotenNet implementation anine09 mentioned, is also very interesting.
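As a hedged sketch of the Frobenius-norm correction (the function name and list-based shapes are assumptions for illustration, not any of the cited repos' actual code): treat the multi-channel degree-l feature as a (channels, 2l+1) matrix and divide by its Frobenius norm, with a small epsilon to avoid dividing by zero.

```python
import math

EPS = 1e-8  # guard against division by zero for all-zero features

def frobenius_normalize(x):
    """Normalize one multi-channel degree-l feature by its Frobenius norm.

    x is a list of channel rows, each 2l+1 components wide (a hypothetical
    stand-in for a (channels, 2l+1) tensor).
    """
    norm = math.sqrt(sum(v * v for row in x for v in row))
    return [[v / (norm + EPS) for v in row] for row in x]

# two channels of a degree-1 feature (3 components each); ||feat||_F = 5
feat = [[3.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
normed = frobenius_normalize(feat)  # same feature rescaled by 1/5
```

Because the norm is taken jointly over channels and spherical components, it rescales the whole feature uniformly and so preserves equivariance.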

Finally, I also recommend our HEGNN project (https://github.com/GLAD-RUC/HEGNN). You could consider using e3nn.o3.FullyConnectedTensorProduct to implement multi-channel high-degree steerable features (e.g., Eqs. (10)-(11) in the current version of the GotenNet paper).

[1] Equivariant Graph Mechanics Networks with Constraints
[2] EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
[3] Are High-Degree Representations Really Unnecessary in Equivariant Graph Neural Networks?
