Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i-PI integration #307

Merged
merged 28 commits into from
Apr 12, 2021
Merged

i-PI integration #307

merged 28 commits into from
Apr 12, 2021

Conversation

max-veit
Copy link
Contributor

@max-veit max-veit commented Dec 11, 2020

Add support for i-PI using a dedicated calculator, which is also usable as a "generic" MD interface for any other MD code. Works with the branch https://github.com/cosmo-epfl/i-pi/tree/feat/librascal

The new calculator is implemented in bindings/rascal/models/genericmd.py.

Still to be worked out:

  • Confirm sign convention of virial Confirmed experimentally
  • Make units handling more robust (fewer assumptions) moved to i-PI
  • Does i-PI run non-periodic simulations? If so, how is this signalled to the force driver? No

Tasks before review:

  • Add tests, examples, and complete documentation of any added functionality
    • Make sure the autogenerated documentation appears in Sphinx and is
      formatted correctly (ask @max-veit if you need help with this task).
    • Write additional documentation in the Sphinx (not docstrings!) to
      explain the feature and its usage in plain English
    • Make sure the examples run (!) and produce the expected output
    • For bugfix pull requests, list here which tests have been added or re-enabled
      to verify the fix and catch future regressions as well as similar bugs
      elsewhere in the code.
  • Run make lint on the project, ensure it passes
    • Run make pretty-cpp and make pretty-python, check that the
      auto-formatting is sensible
    • Review variable names, make sure they're descriptive (ask a friend)
    • Review all TODOs, convert to issues wherever possible
    • Make sure draft, in-progress, and commented-out code is moved to its
      own branch
  • Merge with master and resolve any conflicts

Co-written with Lorenzo (@lgigli)

@max-veit max-veit marked this pull request as ready for review February 25, 2021 15:42
def compute_KNM(frames, X_sparse, kernel, soap):
Nstructures, Ngrads, Ngrad_stride = get_strides(frames)
KNM = np.zeros((Nstructures + Ngrads, X_sparse.size()))
pbar = tqdm(frames, desc="compute KNM", leave=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I'm not sure how I feel about including tqdm at such a deep level -- it seems wrong for the code to need to know about something that's in principle unrelated, but I'm not sure I see a better solution.

One idea would instead be to wrap frames in a tqdm before passing it to this function in whatever notebook uses these - will look into this later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will look into this later.

Did you try this? I would find it much more elegant, and it gives the user more control over the tqdm display name/positions/style etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet; I ended up just using the fix in cf0e660.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I found a way that works -- it's not pretty, because the function iterates through the frames twice and I can't find a way to reset a tqdm bar once it's finished (you'd think that would be a pretty basic feature, but whatever).

@Luthaf
Copy link
Contributor

Luthaf commented Mar 25, 2021

A few high level comments before I dive into this for a review

  • could you fix conflicts with master?
  • do we really need an additional 11Mb file reference_data/tests_only/simple_gap_fit_params.json just for the tests? Could this be made much smaller by using less sparse points when training? We don't need an accurate potential just to check that the machinery around it works. Related to this, could you then remove the file completely from history by rebasing this branch or squashing the corresponding commits?
  • where is the format for reference_data/tests_only/simple_gap_fit_params.json defined and documented? My current understanding is that this file contains the raw JSON serialization of all C++ classes involved in the model, plus the weights and sparse points, am I right? I'm afraid to start distributing / encourage users to generate and distribute models with an under-defined and un-documented file format. Do we want to guarantee support for this format? Or should people re-train their models whenever we change librascal internals? Overall I feel that the format we want to use to store trained models warrant a larger discussion than on the side of semi-related PR.

@max-veit
Copy link
Contributor Author

max-veit commented Mar 25, 2021

  • could you fix conflicts with master?

I'll do that when I rebase this branch. The (tiny) conflicts here were caused by a rebase/squash of yesterday's bugfix branch (that was extracted from this one).

  • do we really need an additional 11Mb file reference_data/tests_only/simple_gap_fit_params.json just for the tests? Could this be made much smaller by using less sparse points when training? We don't need an accurate potential just to check that the machinery around it works. Related to this, could you then remove the file completely from history by rebasing this branch or squashing the corresponding commits?

Didn't realize it was that large... I was already using what felt like a very small number of sparse points, but I'll try to cut it down further.

  • where is the format for reference_data/tests_only/simple_gap_fit_params.json defined and documented? My current understanding is that this file contains the raw JSON serialization of all C++ classes involved in the model, plus the weights and sparse points, am I right? I'm afraid to start distributing / encourage users to generate and distribute models with an under-defined and un-documented file format. Do we want to guarantee support for this format? Or should people re-train their models whenever we change librascal internals? Overall I feel that the format we want to use to store trained models warrant a larger discussion than on the side of semi-related PR.

That's really a discussion for #305 (which is going to be my main "project" after this PR; it was also the branch used to generate this model file). To summarize for now, the model JSON is the serialization of a Python KRR object, which can be restored (via the librascal deserialization mechanism) and used to evaluate the model via the public KRR interface.
It does contain the serialized representation and kernel parameters, as well as the weights and sparse points -- but as to the concern about model files changing every time we change something in the internals of librascal, that seems unlikely as the serialization happens on a relatively high level and only uses the Python interface.

And anyway, the file lives in reference_data/*tests_only*, not examples/, so I hope that would be enough of an indication for users not to take this as a definitive model specification. The only point to having this model file right now is to be able to test the i-PI interface.

ceriottm and others added 19 commits March 26, 2021 12:37
(still need to work out some kinks in the return format, depending on
what i-PI expects)
also update description and autoformat
Simplify and test IPI tutorial notebook; replace BaTiO3 example with
Zundel
Remove i-PI specific unit conversions and input/output formatting; this
should be taken care of on the i-PI side.
Edit: reduce the size of simple testing GAP model file
Cherry-picked from feat/gaptools for compatibility with
gaptools-generated model
(although probably they should be moved to something like
a notebook-utils folder, or eventually just merged into gaptools)

also, remove unused imports in zundel i-PI example
README.rst Show resolved Hide resolved
bindings/rascal/models/IP_generic_md.py Outdated Show resolved Hide resolved
bindings/rascal/models/IP_generic_md.py Outdated Show resolved Hide resolved
bindings/rascal/models/IP_generic_md.py Outdated Show resolved Hide resolved
bindings/rascal/models/IP_generic_md.py Outdated Show resolved Hide resolved
examples/iPi/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
examples/iPi/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
examples/iPi/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
examples/iPi/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
examples/iPi/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
@Luthaf
Copy link
Contributor

Luthaf commented Mar 26, 2021

The updated example model file is much better, and I'm fine with leaving the discussion of the serialization format for models for a later PR!

- Remove 'assume_pbc' option; user must now specify PBC explicitly
  on initialization

- Allow specifying bare 'atomic_numbers' in lieu of structure template

- Check for number-of-atoms consistency
max-veit added a commit to lab-cosmo/i-pi that referenced this pull request Mar 29, 2021
@max-veit
Copy link
Contributor Author

I'm pretty much done on the bindings side; perhaps @lgigli can have a look at the notebook comments since he's the one who wrote it?

@max-veit max-veit requested a review from Luthaf April 8, 2021 15:54
@max-veit
Copy link
Contributor Author

max-veit commented Apr 8, 2021

not sure what's happening with the doc build... Sphinx is complaining about some utf-8 characters in a file it's trying to interpret in ASCII mode (why??? why won't ASCII die already??) -- but I'm pretty sure I didn't add any funny characters or change file encodings or the build configuration since the last successful doc build. And it runs just fine on my machine.

@max-veit
Copy link
Contributor Author

max-veit commented Apr 9, 2021

not sure what's happening with the doc build... Sphinx is complaining about some utf-8 characters in a file it's trying to interpret in ASCII mode (why??? why won't ASCII die already??) -- but I'm pretty sure I didn't add any funny characters or change file encodings or the build configuration since the last successful doc build. And it runs just fine on my machine.

This is full (!) traceback of the error, looks like it happens when trying to load the ExtBabel extension for some reason. Why this is failing now and not in earlier builds, I have no idea.

# Sphinx version: 3.5.3
# Python version: 3.6.9 (CPython)
# Docutils version: 0.17 release
# Jinja2 version: 2.11.3
# Last messages:

# Loaded extensions:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/sphinx/cmd/build.py", line 279, in build_main
    args.tags, args.verbosity, args.jobs, args.keep_going)
  File "/usr/local/lib/python3.6/dist-packages/sphinx/application.py", line 241, in __init__
    self.setup_extension(extension)
  File "/usr/local/lib/python3.6/dist-packages/sphinx/application.py", line 402, in setup_extension
    self.registry.load_extension(self, extname)
  File "/usr/local/lib/python3.6/dist-packages/sphinx/registry.py", line 417, in load_extension
    mod = import_module(extname)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/dist-packages/sphinx/builders/latex/__init__.py", line 25, in <module>
    from sphinx.builders.latex.util import ExtBabel
  File "/usr/local/lib/python3.6/dist-packages/sphinx/builders/latex/util.py", line 13, in <module>
    from docutils.writers.latex2e import Babel
  File "/usr/local/lib/python3.6/dist-packages/docutils/writers/latex2e/__init__.py", line 575, in <module>
    for line in fp:
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 69: ordinal not in range(128)

@max-veit
Copy link
Contributor Author

max-veit commented Apr 9, 2021

omg no way, the build is failing due to a bug in the latest version (0.17) of docutils. Specifically, the '©' character in the copyright notice of docutils.sty... already reported here and fixed in the latest Sourceforge version, but for us I think we can just force the previous version in requirements.txt until a hotfix is pushed to pypi.

Copy link
Contributor

@Luthaf Luthaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, feel free to merge after squashing related commits together!

examples/i-PI/zundel/zundel_IP.ipynb Outdated Show resolved Hide resolved
max-veit and others added 2 commits April 12, 2021 16:15
Note also: i-PI examples dir has been renamed

Co-authored-by: Guillaume Fraux <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants