A Partial Atomic Charge Predicter for Porous Materials based on Graph Convolutional Neural Network (PACMAN)
- DDEC6 (1, 2, 3, 4), Bader, Charge Model 5 (CM5), REPEAT for metal-organic frameworks (MOFs)
- DDEC6 for covalent-organic frameworks (COFs)
Developed by: Guobin Zhao
pip install PACMAN-charge
Git clone
git clone https://github.com/mtap-research/PACMAN-charge.git
cd PACMAN-charge
pip install -r requirements.txt
Jupyter notebook (using pip)
from PACMANCharge import pmcharge
pmcharge.predict(cif_file="./test/Cu-BTC.cif",charge_type="DDEC6",digits=10,atom_type=True,neutral=True,keep_connect=True)
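To process several CIF files from a notebook, the call above can be wrapped in a loop. The sketch below is only an illustration: `predict_folder` is a hypothetical helper, not part of PACMAN-charge. It takes the prediction function as an argument, so the only thing assumed about the package is the `pmcharge.predict` keyword arguments shown above.

```python
from pathlib import Path


def predict_folder(folder, predict_fn, **options):
    """Call predict_fn once for every .cif file in folder.

    predict_fn is expected to accept cif_file=... plus keyword options,
    as pmcharge.predict does in the example above. Returns the names of
    the files that were processed, in sorted order.
    """
    processed = []
    for cif in sorted(Path(folder).glob("*.cif")):
        predict_fn(cif_file=str(cif), **options)
        processed.append(cif.name)
    return processed


# Usage (assumes PACMAN-charge is installed and ./test holds CIF files):
# from PACMANCharge import pmcharge
# predict_folder("./test", pmcharge.predict, charge_type="DDEC6", digits=6)
```

Passing the function in keeps the helper independent of the package, which also makes it easy to try out with a stand-in function first.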
Terminal
python pmcharge.py folder-name[path] --charge_type[DDEC6/Bader/CM5/REPEAT] --digits[int] --atom_type[bool] --neutral[bool] --keep_connect[bool]
Example command: python pmcharge.py test_file/test-1/ --charge_type DDEC6 --digits 10
Help usage information: python pmcharge.py -h
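When the command line is driven from another script, it can help to assemble the arguments programmatically. The following is a minimal sketch, assuming the script is run from the repository root; `build_command` is a hypothetical helper, and it only uses the `--charge_type` and `--digits` flags that appear in the example command above.

```python
import sys

# Charge types listed in the usage line above.
ALLOWED_CHARGE_TYPES = ("DDEC6", "Bader", "CM5", "REPEAT")


def build_command(folder, charge_type="DDEC6", digits=6):
    """Return the argument list for one pmcharge.py run."""
    if charge_type not in ALLOWED_CHARGE_TYPES:
        raise ValueError(f"unknown charge type: {charge_type!r}")
    return [
        sys.executable, "pmcharge.py", folder,
        "--charge_type", charge_type,
        "--digits", str(digits),
    ]


cmd = build_command("test_file/test-1/", charge_type="DDEC6", digits=10)
print(" ".join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```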
- folder-name: relative path to a folder with cif files without partial atomic charges
- charge_type (default: DDEC6): DDEC6, Bader, CM5 or REPEAT
- digits (default: 6): number of decimal places to print for partial atomic charges. The ML models were trained on a 6-digit dataset
- atom_type (default: True): assign the same partial atomic charge to atoms of the same type (atom types are identified by the similarity of partial atomic charges up to 3 decimal places)
- neutral (default: True): keep the net charge at zero. The system is neutralized with the "mean" method, in which the excess charge is distributed equally across all atoms
- keep_connect (default: True): retain the atomic and connection information (such as _atom_site_adp_type, bond) for the structure.
- Predict partial atomic charges using an online app: link
- Full code and dataset can be downloaded from: link
- Note: All future releases will be uploaded to GitHub and pip only
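The `atom_type` and `neutral` options in the parameter list can be illustrated with a short sketch. This is not the package's code: the function names are hypothetical, the charges are made-up numbers, and the grouping rule (same element and same charge after rounding to 3 decimal places) is one assumed reading of the `atom_type` description. The "mean" neutralization follows the description directly: the excess charge is split equally over all atoms.

```python
from collections import defaultdict
from math import fsum


def average_by_atom_type(elements, charges, decimals=3):
    """Give atoms that share an element and a rounded charge one common charge.

    Assumption: an "atom type" is (element, charge rounded to `decimals`).
    """
    groups = defaultdict(list)
    for i, (element, q) in enumerate(zip(elements, charges)):
        groups[(element, round(q, decimals))].append(i)
    result = list(charges)
    for indices in groups.values():
        mean_q = fsum(charges[i] for i in indices) / len(indices)
        for i in indices:
            result[i] = mean_q
    return result


def neutralize_mean(charges):
    """Subtract the same per-atom share of the net charge from every atom."""
    shift = fsum(charges) / len(charges)
    return [q - shift for q in charges]


elements = ["Cu", "Cu", "O", "O"]
predicted = [0.9121, 0.9123, -0.4551, -0.4601]  # made-up values
typed = average_by_atom_type(elements, predicted)
neutral = neutralize_mean(typed)
print(neutral)
```

Because the same shift is applied to every atom, atoms that share a charge before neutralization still share one afterwards.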
If you use PACMAN charge, please consider citing this paper:
@article{,
title={PACMAN: A Robust Partial Atomic Charge Predicter for Nanoporous Materials based on Crystal Graph Convolution Network},
DOI={10.1021/acs.jctc.4c00434},
journal={Journal of Chemical Theory and Computation},
author={Zhao, Guobin and Chung, Yongchul},
year={2024},
volume = {20},
number = {12},
pages={5368-5380}
}
Databases with partial atomic charges | url | size |
---|---|---|
QMOF | link | 16,779 |
CoRE MOF 2014 DDEC | link | 2,932 |
CoRE MOF 2014 DFT-optimized | link | 502 |
CURATED-COFs | link | 612 |
ARC-MOF | link | 279,118 |
If you encounter any problem while using PACMAN, please email [email protected] or open an issue.
.
├── ..
├── figs                              # Figures used for introduction
│   ├── toc.jpg                       # Table of Contents
│   └── workflow.png                  # Workflow of this project
│
├── model                             # Python files used for dataset preparation & GCN training
│   ├── GCN_E.py                      # Network model for energy/bandgap training
│   ├── GCN_charge.py                 # Network model for atomic charge training
│   ├── cif2data.py                   # Convert QMOF database to dataset
│   ├── data_E.py                     # Convert cif to graph & target (energy/bandgap)
│   ├── data_charge.py                # Convert cif to graph & target (atomic charge)
│   └── utils.py                      # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── model4pre                         # Python files used for prediction
│   ├── GCN_E.py                      # Network model for energy/bandgap prediction
│   ├── GCN_charge.py                 # Network model for atomic charge prediction
│   ├── atom_init.json                # JSON file that stores the initialization vector for each element
│   ├── cif2data.py                   # Read/write cif file
│   ├── data.py                       # Convert cif to graph & target (energy/bandgap)
│   ├── data_charge.py                # Convert cif to graph & target (atomic charge)
│   └── utils.py                      # Normalizer, sampling, AverageMeter, save_checkpoint
│
├── pth                               # Models of this project
│   ├── best_bader                    # Bader
│   │   ├── bader.pth                 # Bader charge model
│   │   └── normalizer-bader.pkl      # Normalizer of Bader charge
│   ├── best_bandgap                  # Bandgap
│   │   ├── bandgap.pth               # Bandgap model
│   │   └── normalizer-bandgap.pkl    # Normalizer of bandgap
│   ├── best_cm5                      # CM5
│   │   ├── bandgap.pth               # ///
│   │   └── normalizer-bandgap.pkl    # ///
│   ├── best_ddec                     # ///
│   │   ├── ddec.pth                  # ///
│   │   └── normalizer-ddec.pkl       # ///
│   ├── best_pbe                      # ///
│   │   ├── pbe-atom.pth              # ///
│   │   └── normalizer-pbe.pkl        # ///
│   ├── best_repeat                   # ///
│   │   ├── repeat.pth                # ///
│   │   └── normalizer-repeat.pkl     # ///
│   ├── chk_bader                     # Bader
│   │   └── checkpoint.pth            # Checkpoint of bader
│   ├── chk_bandgap                   # Bandgap
│   │   └── checkpoint.pth            # Checkpoint of bandgap
│   ├── chk_cm5                       # CM5
│   │   └── checkpoint.pth            # ///
│   ├── chk_ddec                      # ///
│   │   └── checkpoint.pth            # ///
│   ├── chk_pbe                       # ///
│   │   └── checkpoint.pth            # ///
│   └── chk_repeat                    # ///
│       └── checkpoint.pth            # ///
│
├── pmcharge.py                       # main python file for atomic charge assignment by command line
├── LICENSE.txt                       # MIT license
├── README.md                         # Usage/Source
├── requirements.txt                  # packages that need to be installed
├── train_E.py                        # main python file for energy/bandgap training
└── train_charge.py                   # main python file for atomic charge training
(Elements used in model training; this does not include every element contained in the databases.)
- DDEC6/CM5/Bader Charges
- REPEAT Charges
- DDEC6 Charges
Parity plot of partial atomic charges from DDEC6 and PACMAN on the test set (QMOF).
- CM5 Charges
Parity plot of partial atomic charges from CM5 and PACMAN on the test set (QMOF).
- Bader Charges
Parity plot of partial atomic charges from Bader and PACMAN on the test set (QMOF).
For the Bader model, use caution with predictions for Th-containing MOFs, because only 2 data points were in the training set. The large error shown in the figure below is for Th.
- REPEAT Charges
Parity plot of partial atomic charges from REPEAT and PACMAN on the test set (ARC-MOF).
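A parity plot compares reference charges with predicted ones, and the same comparison can be summarized numerically. The sketch below shows one common way to do that, using the mean absolute error and the coefficient of determination; the function name is hypothetical and the numbers are made up for illustration, not taken from the PACMAN test sets.

```python
def parity_metrics(reference, predicted):
    """Return (MAE, R^2) for paired reference and predicted charges."""
    n = len(reference)
    pairs = list(zip(reference, predicted))
    mae = sum(abs(r - p) for r, p in pairs) / n
    mean_ref = sum(reference) / n
    ss_res = sum((r - p) ** 2 for r, p in pairs)        # residual sum of squares
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)  # spread of the reference
    return mae, 1.0 - ss_res / ss_tot


reference = [0.90, -0.45, -0.45, 0.10]  # made-up reference charges
predicted = [0.88, -0.44, -0.47, 0.12]  # made-up model predictions
mae, r2 = parity_metrics(reference, predicted)
print(f"MAE = {mae:.4f} e, R^2 = {r2:.4f}")
```

Points on the diagonal of a parity plot correspond to zero error, so a perfect model would give an MAE of 0 and an R^2 of 1.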
- Guobin Zhao ([email protected])
- Guobin Zhao ([email protected]): models, training, data preparation
- Yongchul G. Chung ([email protected]): supervising