Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIF parsing assumes particular format, breaks for pymatgen-generated CIF files #24

Open
johannbrehmer opened this issue Oct 14, 2024 · 9 comments · May be fixed by #25
Open

CIF parsing assumes particular format, breaks for pymatgen-generated CIF files #24

johannbrehmer opened this issue Oct 14, 2024 · 9 comments · May be fixed by #25

Comments

@johannbrehmer
Copy link

I believe there is a bug in the parsing of CIF files.

The LoadCIFData() function assumes that atoms are described in lines of the following form:

atom_label	atom_type	[entries that do not involve a dot]	x	y	z	[...]

However, the CIF format is extremely generic, and allows several other forms. For instance, CIF files generated by pymatgen can look like this:

...
loop_
 _atom_site_type_symbol
 _atom_site_label
 _atom_site_symmetry_multiplicity
 _atom_site_fract_x
 _atom_site_fract_y
 _atom_site_fract_z
 _atom_site_occupancy
  C  C0  1  0.92693230  0.09413810  0.20146000  1
  C  C1  1  0.88283400  0.00951251  0.09697130  1
  C  C2  1  0.87383020  0.05616420  0.08434580  1
  C  C3  1  0.91268250  0.07610070  0.05408850  1
  C  C4  1  0.96251430  0.05028320  0.03666980  1
...

Running EQeq on such a CIF file leads to EQeq taking the atom labels for the types, and the types for the labels. This is visible for instance from its std output:

==================================================
========= Atom types - X & J values used =========
==================================================
Co	Z: 27	Ch. Cent: 2	X: -7.6	J: 16.44	
H2	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
H3	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C0	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C1	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C2	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C3	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C4	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C5	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C6	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C7	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C8	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
C9	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
N3	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
O4	Z: 1	Ch. Cent: 0	X: 7.1761	J: 12.8438	
==================================================
===== Calculating charges... please wait. ========
...

If I'm not mistaken, this affects for instance mofchecker (@kjappelbaum), since that uses pymatgen to store CIF files and then EQeq to analyze the charges.

@kjappelbaum
Copy link
Collaborator

seems like a good catch!

@kjappelbaum
Copy link
Collaborator

seems like there should be a check or at least a clearer warning about this in the docs

@kjappelbaum
Copy link
Collaborator

I think https://github.com/danieleongari/manage_crystal was quite commonly used in our group to "clean" CIF files

@johannbrehmer
Copy link
Author

Thanks, I didn't know this one yet!

Plugging a call to manage_crystal into mofchecker (after creating the CIF file here) or pyeqeq (in the beginning of run_on_cif()) should indeed solve my issue. The latter seems like the right thing to do to me.

Would you be interested in me opening a merge request for this?

@johannbrehmer
Copy link
Author

I can confirm that including a call to manage_crystal in EQeq fixes the problem.

@kjappelbaum
Copy link
Collaborator

A PR for that would be very much appreciated! 🙏🏼

@johannbrehmer johannbrehmer linked a pull request Oct 22, 2024 that will close this issue
@johannbrehmer
Copy link
Author

See #25

@kjappelbaum
Copy link
Collaborator

oh thanks! I did not get (or did not see) the notification for this PR.

@kjappelbaum
Copy link
Collaborator

I'll give this and your other questions (e.g. oximachine) a look asap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants