Jonathanlyj/ALIGNN-BERT-TL

ALIGNN-BERT-TL

This repository contains the code for the Hybrid-LLM-GNN transfer learning framework, which predicts materials properties from concatenated embeddings extracted from ALIGNN and BERT models. The code provides the following functions:

  • Generate string representations of the provided crystal samples using the NLP tools Robocrystallographer and ChemNLP
  • Use a pre-trained BERT/MatBERT language model to extract context-aware word embeddings from text descriptions of a given crystal structure
  • Concatenate LLM embeddings with ALIGNN embeddings obtained from feature-extraction-based ALIGNNTL
  • Analyze model performance and produce text-based model explanations with a post-analysis script
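
As a rough illustration of the embedding steps above, the sketch below averages token embeddings into one vector per description and appends an ALIGNN vector to it. Mean pooling, the function names `mean_pool` and `concat_embeddings`, and the toy dimensions are assumptions made for illustration; the scripts in this repository may pool or combine embeddings differently.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over real (non-padding) tokens.

    token_embeddings: array-like of shape (num_tokens, hidden_dim)
    attention_mask:   array-like of shape (num_tokens,), 1 = real token, 0 = padding
    """
    mask = np.asarray(attention_mask, dtype=float)[:, None]
    summed = (np.asarray(token_embeddings, dtype=float) * mask).sum(axis=0)
    # Guard against an all-padding input to avoid dividing by zero.
    return summed / max(mask.sum(), 1.0)


def concat_embeddings(llm_vector, alignn_vector):
    """Join the text-based and graph-based vectors into one feature vector."""
    return np.concatenate([llm_vector, alignn_vector])


# Toy stand-ins (assumed values): three tokens with hidden size 2, the last
# one being padding, and an ALIGNN embedding of size 3.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
alignn = np.array([0.5, 0.5, 0.5])

combined = concat_embeddings(mean_pool(tokens, mask), alignn)
print(combined.shape)
```

In a real run the token embeddings would come from the BERT/MatBERT model and the ALIGNN vector from ALIGNNTL feature extraction, so the combined length is the sum of the two embedding sizes.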

Installation

The basic requirements for using the files are Python 3.8 and the packages listed in requirements.txt. It is advisable to create a virtual environment with the correct dependencies.
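
One possible way to check an environment against a requirements file is sketched below using only the standard library. `requirement_names` and `missing_packages` are hypothetical helpers, not part of this repository, and the parser handles only simple lines such as `name` or `name==version`; the example lines are placeholders, not the actual contents of requirements.txt.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Return the distribution names found in requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name ends at the first version, extras, or marker character.
        names.append(re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Placeholder lines (assumed), standing in for the real requirements.txt.
example_lines = ["some-package==1.0", "# a comment", "", "other-package>=2.0"]
print(requirement_names(example_lines))
```

To check a real file, pass an open file object, for example `missing_packages(requirement_names(open("requirements.txt")))`, from inside the activated virtual environment.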

Source Files

  • generator.py: code to generate text descriptions for crystal samples using the Robocrystallographer and ChemNLP tools.
  • preprocess.py: code to extract context-aware word embeddings from text representations.
  • feature.py: code to combine LLM-based and GNN-based embeddings and construct datasets for the predictor model.
  • analysis: code to parse predictions and create visualizations for model performance analysis and text-based model explanation.
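
To give a flavor of the prediction-parsing step, here is a minimal sketch that reads predictions from CSV text and computes the mean absolute error. The column names `id`, `target`, and `prediction`, the sample values, and the choice of MAE as the metric are all assumptions for illustration; the actual analysis code may use a different file layout and different metrics.

```python
import csv
import io


def mean_absolute_error(rows, target_col="target", pred_col="prediction"):
    """Average absolute difference between target and predicted values."""
    errors = [abs(float(r[target_col]) - float(r[pred_col])) for r in rows]
    return sum(errors) / len(errors)


# Hypothetical predictions file content (assumed layout and values).
sample_csv = """id,target,prediction
sample-1,1.0,1.5
sample-2,2.0,1.0
sample-3,0.0,0.0
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
mae = mean_absolute_error(rows)
print(f"MAE over {len(rows)} samples: {mae:.3f}")
```

Reading from a file instead only requires replacing `io.StringIO(sample_csv)` with an open file handle.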
