Silicon-Metal Alloy Potentials Project

Overview

This project investigates the use of machine learning (ML) techniques to model and predict interatomic potentials in silicon-metal alloy systems. It combines Density Functional Theory (DFT) calculations with ML to achieve accurate and computationally efficient potential predictions. Three ML models are trained on the generated datasets: Support Vector Regression (SVR), a Gaussian Mixture Model (GMM), and a Fully Connected Neural Network (FCNN).

Workflow

1. Install Requirements

Install the required Python libraries using the requirements.txt file:

pip install -r requirements.txt

2. Data Collection and Preprocessing

Run data_collection.ipynb to:

Collect material properties using libraries like pymatgen and matminer.
Perform DFT calculations to generate potential energy surface data.
Process and structure data into CSV files for training.

Key Datasets Generated:

Material Properties: Includes energy per atom, formation energy per atom, band gap, etc.
Categorical Data: Material classifications and labels.
Featurized Data: Includes density features, XRD powder patterns, orbital field matrices, and DFT-generated data.
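
The "process and structure data into CSV files" step could look roughly like the sketch below. It is not taken from data_collection.ipynb: the function name write_training_csv, the column names, and the numeric values are all placeholders chosen for illustration. It writes the target column last and skips rows with no target value.

```python
import csv
import io

# Placeholder records: the values are made up for illustration and are NOT
# results from this project. Column names are assumptions as well.
records = [
    {"formula": "Mg2Si", "energy_per_atom": -1.0, "band_gap": 0.5,
     "formation_energy_per_atom": -0.1},
    {"formula": "NiSi", "energy_per_atom": -2.0, "band_gap": 0.0,
     "formation_energy_per_atom": -0.2},
    {"formula": "FeSi", "energy_per_atom": -3.0, "band_gap": 0.1,
     "formation_energy_per_atom": None},  # missing target, will be skipped
]


def write_training_csv(rows, handle, target="formation_energy_per_atom"):
    """Write rows as CSV with the target column last.

    Rows whose target value is missing are dropped. Returns the number of
    rows written.
    """
    feature_cols = [key for key in rows[0] if key != target]
    writer = csv.DictWriter(handle, fieldnames=feature_cols + [target])
    writer.writeheader()
    kept = 0
    for row in rows:
        if row.get(target) is None:
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Write to an in-memory buffer; pass an open file handle to write to disk.
buffer = io.StringIO()
rows_written = write_training_csv(records, buffer)
header = buffer.getvalue().splitlines()[0]
print(rows_written, header)
```

Keeping the target in the last column lets a training script split features and labels by position, without hard-coding column names.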

3. Model Training

Train three ML models on the generated datasets:

nn.py: Trains a Fully Connected Neural Network (FCNN) for predicting energy-related properties.
svr.py: Trains a Support Vector Regression (SVR) model to predict formation and potential energies.
gmm.py: Fits a Gaussian Mixture Model (GMM) to probabilistically model energy distributions.
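
As a rough illustration of what an SVR training script of this kind does, the sketch below fits scikit-learn's SVR on synthetic data and reports train and test RMSE. It is not the contents of svr.py: the synthetic features, the RBF kernel, the values of C and epsilon, the feature scaling, and the 75/25 split are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for a featurized dataset (assumption: 5 numeric
# features and one continuous energy-like target).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_weights = np.array([1.0, -0.5, 0.3, 0.0, 0.2])
y = X @ true_weights + 0.05 * rng.normal(size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVR is sensitive to feature scale, so standardize inside the pipeline;
# the scaler is then fitted on the training split only.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X_train, y_train)


def rmse(y_true, y_pred):
    """Root mean square error between two 1-D arrays."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


train_rmse = rmse(y_train, model.predict(X_train))
test_rmse = rmse(y_test, model.predict(X_test))
print(f"train RMSE = {train_rmse:.3f}, test RMSE = {test_rmse:.3f}")
```

Reporting both numbers side by side is what makes the train/test comparison in the Results section possible.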

Each Script:

Outputs model performance metrics (e.g., RMSE).

4. Plot Results

Run plot.ipynb to visualize:

Actual vs. Predicted Potentials for each ML model.
RMSE performance across models and datasets.
Comparisons of training and testing RMSE for each technique.

The notebook reproduces the figures shown in the project report.
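
An actual-vs-predicted plot of the kind listed above can be sketched with matplotlib as follows. This is not the code in plot.ipynb: the helper name parity_plot, the axis labels, and the synthetic data are assumptions for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def parity_plot(y_true, y_pred, title, path=None):
    """Scatter predicted against actual values with a y = x reference line."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    fig, ax = plt.subplots(figsize=(4, 4))
    ax.scatter(y_true, y_pred, s=12, alpha=0.7)

    # Points on the dashed diagonal are perfect predictions.
    lo = min(y_true.min(), y_pred.min())
    hi = max(y_true.max(), y_pred.max())
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray", label="ideal")

    ax.set_xlabel("Actual potential")
    ax.set_ylabel("Predicted potential")
    ax.set_title(title)
    ax.legend()

    if path is not None:
        fig.savefig(path, dpi=150, bbox_inches="tight")
    return fig, ax


# Synthetic example data (not project results).
rng = np.random.default_rng(0)
actual = rng.uniform(-1.0, 1.0, size=50)
predicted = actual + rng.normal(scale=0.1, size=50)
fig, ax = parity_plot(actual, predicted, "SVR (synthetic data)")
```

Passing a path such as a .png filename saves the figure to disk.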

Results

Performance Metrics

The models were evaluated using Root Mean Square Error (RMSE):

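Support Vector Regression (SVR): Achieved the lowest RMSE. On the XRD dataset its test RMSE (0.087) is close to its training RMSE (0.095), indicating good generalization.
Gaussian Mixture Models (GMM): Moderate RMSE values but weak generalization. On XRD the test RMSE (1.323) is nearly three times the training RMSE (0.476).
Neural Networks (NN): Highest RMSE. On XRD the test RMSE (1.874) is about 2.6 times the training RMSE (0.731), which points to overfitting, and the high training error suggests the fit itself is also weak.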
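
For reference, the standard definition of RMSE over n samples is given below. The symbols are introduced here for illustration and are not taken from the project report: y_i is the reference value and \hat{y}_i is the model prediction for sample i.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```

RMSE has the same units as the target quantity, so values can be compared across models on the same dataset, but not necessarily across targets with different scales.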

Dataset Observations

XRD Dataset: Best performance for all models, particularly SVR.
Orbital and Sine Datasets: SVR still outperformed other models, but with slightly higher RMSE.
DFT Dataset: Most challenging for all models, with NN showing the poorest performance.

Comparative Metrics

Dataset	Model	Train RMSE	Test RMSE
XRD	SVR	0.095	0.087
XRD	GMM	0.476	1.323
XRD	NN	0.731	1.874
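
One way to read the table is through the gap between test and train RMSE. The short sketch below computes that gap and the test/train ratio from the numbers above. The helper name generalization_summary is hypothetical.

```python
# (dataset, model, train RMSE, test RMSE), copied from the table above.
rows = [
    ("XRD", "SVR", 0.095, 0.087),
    ("XRD", "GMM", 0.476, 1.323),
    ("XRD", "NN", 0.731, 1.874),
]


def generalization_summary(table):
    """Map (dataset, model) to (test - train, test / train)."""
    return {
        (dataset, model): (test - train, test / train)
        for dataset, model, train, test in table
    }


summary = generalization_summary(rows)

# Model with the lowest test RMSE.
best_model = min(rows, key=lambda row: row[3])[1]

for (dataset, model), (gap, ratio) in summary.items():
    print(f"{dataset} {model}: gap = {gap:+.3f}, test/train = {ratio:.2f}")
print("lowest test RMSE:", best_model)
```

On these numbers, the GMM has the largest test/train ratio and the NN the largest absolute gap. SVR is the only model whose test error does not exceed its training error.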

Acknowledgment

This project was developed as part of the ME438 course at IIT Bombay, under the guidance of Prof. Amit Singh.
