Gaussian RAM: Lightweight Image Classification via Stochastic Retina Inspired Glimpse and Reinforcement Learning
This repo is an official PyTorch implementation of Gaussian-RAM (https://arxiv.org/abs/2011.06190).
Previous studies on image classification have mainly focused on network accuracy rather than real-time operation or model compression. We propose the Gaussian Deep Recurrent visual Attention Model (GDRAM), a reinforcement-learning-based lightweight deep neural network for large-scale image classification that outperforms a conventional CNN (Convolutional Neural Network) using the entire image as input. Strongly inspired by the biological visual recognition process, our model mimics the stochastic fixation of the retina with a Gaussian distribution. We evaluate the model on the Large Cluttered MNIST, Large CIFAR-10, and Large CIFAR-100 datasets, in which every image is resized to 128 x 128.
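The core mechanism can be illustrated with a short sketch. The snippet below is a minimal, self-contained example of a Gaussian location policy and is not the repository's implementation; the hidden size, the fixed standard deviation, and the class name are assumptions made for illustration. The network predicts the mean of the next glimpse location, a fixation is sampled from a Gaussian around that mean, and the sample's log-probability is kept so the location policy can be trained with REINFORCE.

```python
# Minimal sketch (not the authors' code) of a stochastic, retina-like glimpse:
# the glimpse location is sampled from a Gaussian whose mean is predicted by
# the network, and the log-probability of the sample is returned for a
# REINFORCE-style policy-gradient update.
import torch
from torch import nn
from torch.distributions import Normal

class GaussianLocationPolicy(nn.Module):
    """Maps a hidden state to a stochastic 2-D glimpse location in [-1, 1]."""

    def __init__(self, hidden_dim: int = 256, std: float = 0.15):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, 2)  # predicts the mean (x, y)
        self.std = std                      # fixed exploration noise (assumed value)

    def forward(self, h: torch.Tensor):
        mu = torch.tanh(self.fc(h))                # mean location in [-1, 1]
        dist = Normal(mu, self.std)
        loc = dist.sample()                        # stochastic retina-like fixation
        log_prob = dist.log_prob(loc).sum(dim=-1)  # used for the REINFORCE update
        return loc.clamp(-1.0, 1.0), log_prob

# Example: one glimpse location per image for a batch of 8 hidden states.
policy = GaussianLocationPolicy()
h = torch.randn(8, 256)                            # recurrent hidden state (placeholder)
loc, log_prob = policy(h)
print(loc.shape, log_prob.shape)                   # torch.Size([8, 2]) torch.Size([8])
```

Sampling the fixation instead of computing it deterministically is what makes the glimpse stochastic, in analogy to the retina described above.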
Cluttered MNIST (download), CIFAR-10, and CIFAR-100 are used for training and evaluation. All images are resized to 128 x 128 to generate large-scale images.
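As a hedged illustration of this preprocessing step (the dataset root and the choice of CIFAR-10 are placeholders, not necessarily how the repository's data loaders are wired):

```python
# Illustrative preprocessing only: upscale CIFAR-10 images to 128 x 128.
import torchvision
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((128, 128)),   # enlarge every image to 128 x 128
    T.ToTensor(),
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform
)
print(train_set[0][0].shape)  # torch.Size([3, 128, 128])
```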
Requirements:
- Python 3
- PyTorch (> 1.0)
- torchvision (> 0.2)
- PIL
- NumPy
To train the model, run train.py with the options below:
python train.py --data_path --dataset --batch_size --lr --epochs --random_seed --log_interval --resume --checkpoint
To evaluate a trained model, run inference.py:
python inference.py --data_path --dataset --random_seed --fast
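As an illustration only, a run might look like the lines below; the paths and dataset identifiers are assumptions, so check the argument parsers in train.py and inference.py for the exact values they accept.
python train.py --data_path ./data --dataset cifar10 --batch_size 128 --lr 1e-4 --epochs 100 --random_seed 0
python inference.py --data_path ./data --dataset cifar10 --random_seed 0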
This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-01367, Infant-Mimic Neurocognitive Developmental Machine Learning from Interaction Experience with Real World (BabyMind)).