shrrynsh/BYOP-25-HireSense

BYOP-25-HireSense

HireSense ✨

This repository contains HireSense, a candidate selection model that parses resumes and job descriptions (JDs) into JSON using fine-tuned NER models, then scores how well they match by comparing embeddings generated by a T5 model. 🌐


🔢 Project Structure

The project includes the following Python notebooks:

  1. 📄 jdnertrainnotebooknew

    • Contains the training code for the Job Description Parser Model.
  2. 📄 resumenertrainnotebooknew

    • Contains the training code for the Resume Parser Model.
  3. 📄 finalbyopnew

    • Loads the trained models and demonstrates the process of parsing resumes and JDs into JSON format.
    • Generates embeddings using the T5 model.
    • Calculates the similarity of an Electrical Engineer JD with the resumes of an Electrical Engineer, an Animator, and a Bartender.

📂 Input Formats (to be supplied in the third notebook)

🔑 Resumes:

To process resumes, use the following format:

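import torch

# Load the resume PDF and clean its text (helpers defined in the third notebook).
text = text_preprocess(pdf_load("path of the resume"))

# Load the fine-tuned resume NER model, using the GPU when one is available.
ensemble = EnsembleNERResume(
    pretrained_model_path="/kaggle/input/vvvvvvvvvvvvv/resume_ner_model_pickle.pkl",
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Predict entities, then group them into a JSON-style structure.
predictions = ensemble.predict(text)

parsed_resume = group_entities_unique_resume(predictions)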

Here, parsed_resume contains the parsed resume in JSON format. 🔐
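
To make the shape of that output more concrete, here is a minimal sketch of how entity predictions might be grouped into a JSON-ready dictionary. It is not the notebook's group_entities_unique_resume: the (text, label) prediction format, the label names, and the helper group_entities_sketch are all assumptions made for illustration.

```python
import json


def group_entities_sketch(predictions):
    """Group (entity_text, label) pairs into {label: [unique texts]}.

    Hypothetical helper: the real notebook function may use a different
    input format and different de-duplication rules.
    """
    grouped = {}
    for text, label in predictions:
        cleaned = text.strip()
        if not cleaned:
            continue  # skip empty spans
        values = grouped.setdefault(label, [])
        # De-duplicate case-insensitively, keeping the first spelling seen.
        if cleaned.lower() not in (v.lower() for v in values):
            values.append(cleaned)
    return grouped


# Made-up predictions; the labels are illustrative, not the model's tag set.
sample_predictions = [
    ("Python", "SKILL"),
    ("python", "SKILL"),
    ("Circuit Design", "SKILL"),
    ("B.Tech Electrical Engineering", "DEGREE"),
]

parsed = group_entities_sketch(sample_predictions)
print(json.dumps(parsed, indent=2))
```

Keeping one list per label makes the result easy to serialize with json.dumps and easy to compare field by field against a parsed JD.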

🔑 Job Descriptions:

To process JDs, use the following format:

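import torch

# Clean the raw JD text (helper defined in the third notebook).
jd = text_preprocess("type your jd here")

# Load the fine-tuned JD NER model, using the GPU when one is available.
ensemble = EnsembleNERJd(
    pretrained_model_path="/kaggle/input/vvvvvvvvvvvvv/jd_ner_model_pickle.pkl",
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Predict entities, then group them into a JSON-style structure.
predictionsjd = ensemble.predict(jd)

parsed_jd = group_entities_unique_jd(predictionsjd)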

Here, parsed_jd contains the parsed JD in JSON format. 🔐

🔍 Similarity Calculation:

Pass parsed_resume and parsed_jd to the eval() function in the third notebook to calculate the similarity score. 🔎
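
For intuition about what a similarity score over embeddings can look like, the sketch below ranks a few resumes against a JD using cosine similarity. This is an assumption for illustration: the README does not state which metric eval() uses, and the toy three-dimensional vectors stand in for real T5 embeddings. The names cosine_similarity and rank_resumes are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_resumes(jd_embedding, resume_embeddings):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {
        name: cosine_similarity(jd_embedding, vec)
        for name, vec in resume_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy vectors, not real model output.
jd_vec = [0.9, 0.1, 0.0]
resumes = {
    "electrical_engineer": [0.8, 0.2, 0.1],
    "animator": [0.1, 0.9, 0.2],
    "bartender": [0.0, 0.2, 0.9],
}

ranking = rank_resumes(jd_vec, resumes)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Cosine similarity ignores vector length and compares only direction, which is a common choice when embeddings of texts with different lengths are compared.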


📚 Training Data

  • The repository includes the training data, which can also be used to test the model.
  • Resumes are provided in PDF format. 📄
  • JDs are provided in TXT format. 📄

📝 How to Run the Notebooks

  1. Clone this repository: 🔧

    git clone <repository_url>
    cd <repository_folder>
  2. Open the desired notebook in Jupyter Notebook, JupyterLab, or any compatible IDE. 🌐

  3. Follow the instructions provided in each notebook to:

    • Load and parse resumes and JDs. 🔑
    • Calculate similarity scores. 🔍
  4. Ensure that the required dependencies are installed before running any notebook cells:

    pip install -r requirements.txt
  5. For the third notebook, upload the resumes and JDs for which embeddings need to be calculated, and follow the prescribed formats to process them. 🔄


🎮 Notes

  • The embeddings are generated using a T5 model. ✨
  • Ensure that the pretrained model paths for resume_ner_model_pickle.pkl and jd_ner_model_pickle.pkl are correctly specified.
  • Use the provided training data to test the model's functionality. 📚
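
A quick way to confirm the model paths before loading anything is to check that each file exists. The sketch below assumes the two .pkl files sit in the current working directory; adjust the paths to wherever you placed them. The helper find_missing_files is hypothetical and not part of the notebooks.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Assumed locations; replace with the actual paths used in your environment.
model_paths = [
    "resume_ner_model_pickle.pkl",
    "jd_ner_model_pickle.pkl",
]

missing = find_missing_files(model_paths)
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files found.")
```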

This project simplifies the candidate selection process by automating the parsing of resumes and job descriptions and computing their similarity effectively. 🚀

Enjoy exploring and enhancing the model! 😊

NOTE: The two parser models are uploaded in the Releases section (due to their large size).
