KICNLE

This repository contains the official PyTorch implementation of the paper "Knowledge-Augmented Visual Question Answering with Natural Language Explanation", IEEE Transactions on Image Processing (TIP), 2024.

Overview

The KICNLE model improves visual question answering through iterative refinement: each answer is refined using the explanation generated in the previous iteration. A knowledge retrieval module supplies relevant and accurate information to the model. The result is high-quality answers and explanations that are consistent with each other and closely tied to the visual content.
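The iterative scheme described above can be illustrated with a minimal sketch. It is not the repository's implementation: `iterative_refine`, the three callables, and the toy stand-ins are hypothetical names introduced here, and the image input is omitted to keep the example self-contained. The only thing it aims to show is the control flow, in which each round's answer is conditioned on the previous round's explanation.

```python
from typing import Callable, List, Tuple

# Hypothetical component signatures (assumptions, not the repository's API).
Retriever = Callable[[str], List[str]]            # question -> knowledge snippets
AnswerFn = Callable[[str, List[str], str], str]   # (question, knowledge, previous explanation) -> answer
ExplainFn = Callable[[str, List[str], str], str]  # (question, knowledge, answer) -> explanation


def iterative_refine(
    question: str,
    retrieve: Retriever,
    answer_fn: AnswerFn,
    explain_fn: ExplainFn,
    num_rounds: int = 3,
) -> List[Tuple[str, str]]:
    """Return the (answer, explanation) pair produced in each round."""
    knowledge = retrieve(question)  # retrieved once, reused in every round
    explanation = ""                # no explanation exists before the first round
    history: List[Tuple[str, str]] = []
    for _ in range(num_rounds):
        # The answer is conditioned on the explanation from the previous round.
        answer = answer_fn(question, knowledge, explanation)
        # The new explanation is generated for the refined answer.
        explanation = explain_fn(question, knowledge, answer)
        history.append((answer, explanation))
    return history


# Toy stand-ins so the sketch runs without a trained model.
def toy_retrieve(question: str) -> List[str]:
    return ["bananas are yellow when ripe"]


def toy_answer(question: str, knowledge: List[str], prev_explanation: str) -> str:
    # Without a prior explanation the toy model cannot commit to an answer.
    return "yellow" if prev_explanation else "unknown"


def toy_explain(question: str, knowledge: List[str], answer: str) -> str:
    return f"because {knowledge[0]}"


history = iterative_refine(
    "What color is the fruit?", toy_retrieve, toy_answer, toy_explain, num_rounds=2
)
for round_idx, (answer, explanation) in enumerate(history, start=1):
    print(f"round {round_idx}: answer={answer!r}, explanation={explanation!r}")
```

In this toy run the first round has no explanation to draw on, and the second round changes its answer once one is available. In the real model the callables would be neural components that also take the image as input.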

Installation

Install the Anaconda or Miniconda distribution with Python 3.8.
Main packages: PyTorch 1.12, transformers 4.30
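To confirm that an environment matches the versions listed above, a small check such as the following may help. It is a sketch and not part of the repository: `check_environment` and `installed_version` are hypothetical helper names, and it assumes the packages are installed under their usual PyPI distribution names (`torch` for PyTorch).

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Version pins taken from the package list above.
# "torch" is assumed to be the installed distribution name for PyTorch.
EXPECTED: Dict[str, str] = {"torch": "1.12", "transformers": "4.30"}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(
    expected: Dict[str, str] = EXPECTED,
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, str]:
    """Compare installed versions against the expected major.minor pins."""
    report: Dict[str, str] = {}
    for name, pin in expected.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif found == pin or found.startswith(pin + "."):
            # Matches e.g. "1.12.1" or "1.12.1+cu113" for the pin "1.12".
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found}, expected {pin}.x)"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages installed. In normal use, calling `check_environment()` with no arguments is enough.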

Pre-trained Model

CLIP ViT-based model

pip install git+https://github.com/openai/CLIP.git

Training & Evaluation

For the VQA-X dataset

python vqaX.py

For the A-OKVQA dataset

python a_okvqa.py

