Image Caption

This is a simple implementation of image captioning, trained on the MS COCO dataset.

The project is based on these repos:

LemonATsu/Keras-Image-Caption

DeepRNN/image_captioning

environments

system: Windows 10 x64

cuda version: 8.0

cudnn version: 5.1

You also need a GPU with at least 4 GB of graphics memory. An NVIDIA GTX 1060 or better is recommended.

requirements

joblib==0.11
numpy==1.12.1
tensorflow-gpu==0.12.0
keras==1.2.2
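
Because these pins are old, it can be worth confirming that the active environment actually matches them before running anything. The following is a minimal sketch, not part of the repository: the helper name `check_pins` and the `PINNED` table are hypothetical, and it assumes each package exposes a `__version__` attribute on its top-level module.

```python
import importlib

# Pins copied from the requirements list. Each entry maps the pip name to
# (import name, pinned version); tensorflow-gpu is imported as "tensorflow".
PINNED = {
    "joblib": ("joblib", "0.11"),
    "numpy": ("numpy", "1.12.1"),
    "tensorflow-gpu": ("tensorflow", "0.12.0"),
    "keras": ("keras", "1.2.2"),
}


def check_pins(pinned):
    """Return {pip_name: (expected, found)} for every pin that does not match.

    `found` is None when the package cannot be imported or has no
    __version__ attribute (an assumption of this sketch).
    """
    mismatches = {}
    for pip_name, (module_name, expected) in pinned.items():
        try:
            module = importlib.import_module(module_name)
            found = getattr(module, "__version__", None)
        except ImportError:
            found = None  # package is not installed
        if found != expected:
            mismatches[pip_name] = (expected, found)
    return mismatches
```

Calling `check_pins(PINNED)` returns an empty dict when every installed version matches its pin.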

How to use

prepare inception_v3 model

Download inception_v3_2016_08_28_frozen.pb and unpack it to model/inception_v3_2016_08_28_frozen.pb

prepare COCO training data

Download the COCO 2014 Training images [80K/13GB] dataset and unpack all training .jpg files to train/images/
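
Before moving on, it may help to confirm that both downloads landed where the steps above expect them. This is a small sketch under that assumption only; `check_layout` is a hypothetical helper, not something the repository provides, and it checks nothing beyond the two paths named above.

```python
import glob
import os

# Paths taken from the preparation steps above, relative to the repo root.
MODEL_PATH = os.path.join("model", "inception_v3_2016_08_28_frozen.pb")
IMAGE_DIR = os.path.join("train", "images")


def check_layout(root="."):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not os.path.isfile(os.path.join(root, MODEL_PATH)):
        problems.append("missing model file: " + MODEL_PATH)
    # Only looks for .jpg files directly inside train/images/.
    images = glob.glob(os.path.join(root, IMAGE_DIR, "*.jpg"))
    if not images:
        problems.append("no .jpg files found in: " + IMAGE_DIR)
    return problems
```

Running `check_layout()` from the repository root and printing the result gives a quick list of anything still missing.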

prepare anns.csv

anns.csv is a table that contains the training images' paths and their captions. During training, ONLY the captions in anns.csv are used.

We provide a default anns.csv that contains about 56K captions. You can also generate this file on your own.
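
For generating the file yourself, one possible starting point is the official COCO captions annotation file, which stores an `images` list (with `id` and `file_name`) and an `annotations` list (with `image_id` and `caption`). The sketch below joins the two. Note the assumptions: the two-column layout (image path, caption), the absence of a header row, and the function names `build_rows` and `write_anns` are all guesses made for illustration, so compare against the provided default anns.csv before relying on it.

```python
import csv
import json


def build_rows(coco, image_dir="train/images"):
    """Turn a loaded COCO captions dict into (image_path, caption) rows."""
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    rows = []
    for ann in coco["annotations"]:
        path = "{}/{}".format(image_dir, file_names[ann["image_id"]])
        rows.append((path, ann["caption"].strip()))
    return rows


def write_anns(captions_json, out_csv="anns.csv", image_dir="train/images"):
    """Read a COCO captions JSON file and write the rows as CSV.

    ASSUMPTION: column order and the lack of a header are not confirmed
    by the project; adjust them to match the default anns.csv.
    """
    with open(captions_json, encoding="utf-8") as f:
        coco = json.load(f)
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(build_rows(coco, image_dir))
```

A call such as `write_anns("captions_train2014.json")` would then write every caption in that annotation file; filtering down to a subset would need an extra step.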

extract image features

Run

python extractor.py

to generate image features.

Warning: Python's pickle.dump uses a large amount of memory when writing the extracted features.
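
If memory becomes a problem, one general way to lower the peak is to pickle records one at a time into the same file instead of serializing a single large object. This is a sketch of that technique only, not what extractor.py does; `dump_stream` and `load_stream` are hypothetical names, and it assumes the features can be handled as independent records such as (image name, feature vector) pairs.

```python
import pickle


def dump_stream(records, path):
    """Pickle records one at a time so the full set is never serialized at once."""
    with open(path, "wb") as f:
        for record in records:
            pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_stream(path):
    """Yield records back in the order they were written."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return  # reached the end of the file
```

Because `dump_stream` accepts any iterable, the records can come from a generator, so features can be written as they are produced rather than collected in a list first.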

fix a keras bug

When using TensorFlow as the Keras backend, you may need to modify keras/optimizers.py like this.

train

Run

python train.py

to train the model. Checkpoint files will be saved to weights/.

test

Set model_path to the checkpoint file you obtained from training and run

python test.py path/to/test/image.jpg

to get the result.

TODO

add val data

MaticsL/image_caption

Folders and files

Latest commit

History

Repository files navigation

Image Caption

environments

requirements

How to use

prepare inception_v3 model

prepare COCO training data

prepare anns.csv

extract image features

fix a keras bug

train

test

TODO

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages