Erase dtype and device #166

E-Rum · 2025-02-03T16:53:48Z

A couple of PRs ago, we decided to include dtype and device as explicit and obligatory parameters for both calculators and potentials.

Unfortunately, after thorough consideration of how typical pipelines are built, I concluded that we should abandon this design choice.

The main reason is that, in most cases, when working with an NN model, the preferred strategy is to first initialize the model and then move it to the desired device using model.to(device).

Since torch-pme is designed to be an internal part of the model, this creates a conflict. We set dtype and device once at initialization, but moving the model to a different device afterwards invalidates the device checks we made at that point.

Luckily, since our entire pipeline is built from torch.nn.Module subclasses, we can integrate it smoothly with models that change their device and dtype. The key idea is to rewrite the pipeline so that all tensors stored on the modules are registered as buffers using self.register_buffer, and tensors created during calculations take their device and dtype from those buffers.

This PR aims to achieve exactly that.
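The buffer-based idea described above can be illustrated with a minimal sketch. `ToyPotential` and its `unit_cell` helper are hypothetical stand-ins invented for this example, not the torch-pme classes; only `register_buffer` and `Module.to` are real PyTorch API.

```python
import torch


class ToyPotential(torch.nn.Module):
    """Hypothetical stand-in for a potential; not the torch-pme class."""

    def __init__(self, smearing: float):
        super().__init__()
        # Python floats are double precision, so store the value as float64.
        self.register_buffer("smearing", torch.tensor(smearing, dtype=torch.float64))

    def unit_cell(self) -> torch.Tensor:
        # Tensors created on the fly take dtype/device from the buffer,
        # not from arguments fixed at construction time.
        return torch.eye(3, dtype=self.smearing.dtype, device=self.smearing.device)

    def forward(self, dist: torch.Tensor) -> torch.Tensor:
        return torch.exp(-0.5 * (dist / self.smearing) ** 2)


pot = ToyPotential(smearing=1.0)
pot = pot.to(torch.float32)  # the registered buffer follows the module

dist = torch.tensor([0.5, 1.0], dtype=torch.float32)
out = pot(dist)
```

Because the buffer is cast together with the module, no dtype or device needs to be stored or checked at initialization.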

📚 Documentation preview 📚: https://torch-pme--166.org.readthedocs.build/en/166/

E-Rum · 2025-02-10T10:56:32Z

Done! I removed the "dtype" and "device" arguments from the initializers of the "Potential" and "Calculator" classes and rewrote all the tests and examples accordingly. By default, every torch.Tensor passed at initialization is now registered as a buffer with the "device" and "dtype" it was passed with. All floats that we register as buffers are explicitly registered as torch.float64, since Python floats are double precision.

Since I was not involved in the tuning code development, I would kindly ask you to pay special attention to the changes in that part to ensure I didn’t break anything.

PicoCentauri · 2025-02-10T14:54:59Z

docs/src/references/changelog.rst

@@ -32,6 +32,11 @@ Added
 * Require consistent ``dtype`` between ``positions`` and ``neighbor_distances`` in
  ``Calculator`` classes and tuning functions.

+Changed


We can probably change it to

Suggested change

-Changed
+Removed

and also remove our statements under "Changed" about using dtypes everywhere...

PicoCentauri · 2025-02-10T14:55:36Z

examples/08-combined-potential.py

+pot_1 = pot_1.to(dtype=dtype)
+pot_2 = pot_2.to(dtype=dtype)


Maybe add a comment explaining why you need this here.
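One plausible reason for the explicit cast, sketched under the assumption stated earlier in the thread that float values are registered as float64 buffers: mixing such a buffer with float32 inputs silently promotes the result to float64. `ToyWeights` is a hypothetical module used only for illustration.

```python
import torch


class ToyWeights(torch.nn.Module):
    """Hypothetical module whose buffer defaults to float64."""

    def __init__(self, weights):
        super().__init__()
        self.register_buffer("weights", torch.tensor(weights, dtype=torch.float64))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x


dtype = torch.float32
x = torch.ones(2, dtype=dtype)

pot = ToyWeights([0.5, 2.0])
promoted = pot(x)  # float64 buffer * float32 input gives a float64 result

pot = pot.to(dtype=dtype)  # cast the buffer to match the inputs
matched = pot(x)  # now everything stays in float32
```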

PicoCentauri · 2025-02-10T15:18:15Z

src/torchpme/calculators/p3m.py

+        cell = torch.eye(
+            3,
+            device=self.potential.smearing.device,
+            dtype=self.potential.smearing.dtype,
+        )
+        ns_mesh = torch.ones(3, dtype=int, device=cell.device)
+


When I apply .to to the class, is it correctly propagated to self.kspace_filter and self.mesh_interpolator?
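As a general PyTorch behavior (not a statement about the torch-pme implementation), `Module.to` recurses into registered submodules and casts their buffers, while tensors stored as plain attributes are left untouched. `Inner` and `Outer` below are hypothetical classes; the attribute name `kspace_filter` is borrowed from the question only for readability.

```python
import torch


class Inner(torch.nn.Module):
    """Hypothetical submodule holding a buffer."""

    def __init__(self):
        super().__init__()
        self.register_buffer("kernel", torch.ones(3, dtype=torch.float64))


class Outer(torch.nn.Module):
    """Hypothetical calculator-like module."""

    def __init__(self):
        super().__init__()
        # Registered submodule: its buffers follow .to() on the parent.
        self.kspace_filter = Inner()
        # Registered buffer: follows .to().
        self.register_buffer("cell", torch.eye(3, dtype=torch.float64))
        # Plain tensor attribute: NOT tracked, so .to() ignores it.
        self.scratch = torch.zeros(3, dtype=torch.float64)


calc = Outer().to(torch.float32)

print(calc.kspace_filter.kernel.dtype)
print(calc.cell.dtype)
print(calc.scratch.dtype)
```

So the propagation works as long as the helper objects are `torch.nn.Module` instances assigned as attributes and their tensors are buffers or parameters.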

PicoCentauri · 2025-02-10T15:24:55Z

src/torchpme/lib/splines.py


    # Calculate intervals
    intervals = x[1:] - x[:-1]
    dy = (y[1:] - y[:-1]) / intervals

    # Create zero boundary conditions (natural spline)
-    d2y = torch.zeros_like(x, dtype=torch.float64)
+    torch.zeros_like(x)


I think you can remove it?

PicoCentauri · 2025-02-10T15:31:01Z

src/torchpme/calculators/p3m.py

+        if potential.smearing is None:
+            raise ValueError(
+                "Must specify smearing to use a potential with P3MCalculator"
+            )


I think there is no test for it. It should go into the workflow tests.

Also, I think more and more that we need an Ewald base class and another "Base" class for the direct calculator.
In fact, the EwaldCalculator can serve as a base class, with PME and P3M only overriding the k-space method. But this is something for later...

E-Rum added 8 commits February 3, 2025 16:44

Remove unused dtype and device parameters from potential classes

a680e9b

Refactor parameter validation in _validate_parameters and remove unused dtype and device parameters from Calculator

c9111df

Remove unused dtype and device parameters from Calculator and potential classes

0cd2e80

Continue changing

973c9e8

Fix tests

c763a09

Examples and lint

e8bd8f6

Remove debug print statements from CoulombPotential class

c63e026

Changelog update

5fd40aa

E-Rum marked this pull request as ready for review February 10, 2025 10:51

E-Rum requested review from PicoCentauri and GardevoirX February 10, 2025 11:02

PicoCentauri reviewed Feb 10, 2025

Remove unnecessary zero boundary condition and add test for smearing incompatibility

9285cda

E-Rum requested a review from PicoCentauri February 10, 2025 17:40

GardevoirX approved these changes Feb 11, 2025

PicoCentauri approved these changes Feb 11, 2025

update changelog

69a0f51

PicoCentauri merged commit fb760cd into main Feb 11, 2025
13 checks passed

PicoCentauri deleted the fix_device_dtype branch February 11, 2025 08:33
