
VQA V1 PyTorch Implementation

This repository contains the code needed to replicate the experiments from the paper VQA: Visual Question Answering (Antol et al., 2015). The aim is to train a model that answers natural-language questions about images, similar to the system presented in the paper.

Model architecture

(model architecture diagram)
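The architecture is the paper's "deeper LSTM Q + norm I" model: l2-normalized VGG fc7 image features and a two-layer LSTM question encoding are each projected into a common 1024-d space, fused by point-wise multiplication, and classified over the answer vocabulary. Below is a minimal PyTorch sketch of that design, assuming the dimensions from the paper; the class and argument names are illustrative, not this repository's actual code.

```python
import torch
import torch.nn as nn

class VqaModel(nn.Module):
    """Sketch of the paper's "deeper LSTM Q + norm I" model.

    Dimensions (4096-d VGG fc7 features, a 2-layer LSTM with 512 hidden
    units, a 1024-d joint space, 1000 answer classes) follow the paper;
    everything else here is illustrative, not the repository's API.
    """

    def __init__(self, vocab_size, num_answers=1000,
                 embed_dim=300, hidden_dim=512, joint_dim=1024):
        super().__init__()
        # Image channel: l2-normalized VGG fc7 features -> joint space.
        self.img_fc = nn.Linear(4096, joint_dim)
        # Question channel: word embeddings -> 2-layer LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2,
                            batch_first=True)
        # Final hidden and cell states of both layers: 4 * hidden_dim.
        self.qst_fc = nn.Linear(4 * hidden_dim, joint_dim)
        self.classifier = nn.Sequential(
            nn.Tanh(), nn.Dropout(0.5),
            nn.Linear(joint_dim, joint_dim),
            nn.Tanh(), nn.Dropout(0.5),
            nn.Linear(joint_dim, num_answers),
        )

    def forward(self, img_feat, question):
        # img_feat: (batch, 4096) fc7 features; question: (batch, seq) ids.
        img = self.img_fc(nn.functional.normalize(img_feat, p=2, dim=1))
        emb = torch.tanh(self.embed(question))
        _, (h, c) = self.lstm(emb)            # h, c: (2, batch, 512)
        q = torch.cat([h, c], dim=0)          # (4, batch, 512)
        q = q.transpose(0, 1).reshape(img.size(0), -1)  # (batch, 2048)
        return self.classifier(img * self.qst_fc(q))    # point-wise fusion
```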

Steps to Run

  1. Download Data

     ```bash
     python download_data.py
     ```

  2. Preprocess Images

     ```bash
     python preprocess_image.py
     ```

  3. Create Vocabulary (a sketch of this step follows the list)

     ```bash
     python make_vocabulary.py
     ```

  4. Prepare VQA Inputs

     ```bash
     python make_vqa_inputs.py
     ```
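make_vocabulary.py produces the vocab_questions.txt and vocab_answers.txt files used by the later steps. As a rough illustration of what such a pass involves, here is a minimal sketch for the question side, assuming a simple regex tokenizer and a one-token-per-line file layout; both are assumptions, not a description of the script's actual behavior.

```python
import re
from collections import Counter

def build_question_vocab(questions, out_path="vocab_questions.txt"):
    """Count question tokens and write one token per line.

    `questions` is an iterable of raw question strings. The tokenizer,
    the special tokens, and the file layout are assumptions about what
    make_vocabulary.py does, not its documented behavior.
    """
    counter = Counter()
    for q in questions:
        counter.update(re.findall(r"[a-z0-9']+", q.lower()))
    vocab = ["<pad>", "<unk>"] + sorted(counter)
    with open(out_path, "w") as f:
        f.write("\n".join(vocab))
    return {word: idx for idx, word in enumerate(vocab)}
```

The answer side is usually handled differently: rather than keeping every answer string, the vocabulary is truncated to the most frequent answers (the paper classifies over the top 1000).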
    

This is how the datasets directory should look after the final preprocessing step:

```
datasets
├── Annotations
├── Images
├── Questions
├── Resized_Images
├── test-dev.npy
├── test.npy
├── train_valid.npy
├── train.npy
├── valid.npy
├── vocab_answers.txt
└── vocab_questions.txt
```
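Once these files exist, a quick sanity check is to load them directly. The snippet below assumes the .npy files hold pickled Python records (hence `allow_pickle=True`); the exact field layout written by make_vqa_inputs.py is not documented here, so treat this as a sketch.

```python
import numpy as np

# Inspect the preprocessed training split. allow_pickle=True is an
# assumption about how make_vqa_inputs.py saved the records.
train = np.load("datasets/train.npy", allow_pickle=True)
print(len(train), "training samples")
print(train[0])  # e.g. image path, question token ids, answer label

# The vocabulary files are plain text, one entry per line.
with open("datasets/vocab_answers.txt") as f:
    answers = f.read().splitlines()
print(len(answers), "answer classes")
```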

Train the Model

```bash
python train.py
```
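Internally, this is a standard cross-entropy classification loop. The sketch below shows the shape of such a loop, reusing the VqaModel sketch from above and random stand-in tensors so it runs end to end; the batch size, learning rate, and epoch count are illustrative, not train.py's actual hyperparameters.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Random stand-ins for the real preprocessed inputs.
loader = DataLoader(
    TensorDataset(
        torch.randn(64, 4096),              # stand-in VGG fc7 features
        torch.randint(0, 10000, (64, 14)),  # stand-in question token ids
        torch.randint(0, 1000, (64,)),      # stand-in answer labels
    ),
    batch_size=16, shuffle=True,
)

model = VqaModel(vocab_size=10000)          # class from the sketch above
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(2):
    for img_feat, question, answer in loader:
        loss = criterion(model(img_feat, question), answer)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```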

(training loss and accuracy curves)

Acknowledgements
