This is the source code of a technical report using the Food-11 dataset. The following sections present all the information needed to reproduce the experiments.
- Alan C. Neves
- Caio C. V. da Silva
- Jéssica Soares
- Sérgio J. de Sousa
The experiments were built using Python 2.7.12 and the following libraries: scikit-learn, Theano, Keras, NumPy, SciPy, matplotlib, Pillow and h5py. These libraries must be installed to run the code.
- scikit-learn - Machine Learning library for Python
- Theano - Deep Learning toolkit
- Keras - High-level API for Deep learning toolkits
- numpy - Scientific computing library for Python
- scipy - Ecosystem of open-source software for mathematics, science, and engineering
- matplotlib - 2D plotting library
- Pillow - Python Image Library Fork
- h5py - HDF5 for Python
All of these libraries can be installed using pip (the Python package installer):
sudo -H pip install scikit-learn theano keras numpy scipy matplotlib Pillow h5py
Python-Tk must also be installed so that the GUI (matplotlib plots) can be rendered. To install it, just execute
sudo apt-get install python-tk
After installing these dependencies, the experiments can be run.
The dataset used in our experiments is the Food-11 dataset. Food-11 contains 16,643 food images grouped into 11 major food categories: Bread, Dairy product, Dessert, Egg, Fried food, Meat, Noodles/Pasta, Rice, Seafood, Soup, and Vegetable/Fruit. The dataset is divided into training, validation and testing splits.
The total file size of the Food-11 dataset is about 1.16 GB.
Download it and place it in the same directory as the source code.
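For reference, here is a minimal sketch of how the class label of each image can be recovered from its file name, assuming the standard Food-11 layout (training/, validation/ and evaluation/ folders, each image named <class>_<index>.jpg). The helper name is hypothetical, not part of this repository.

```python
# Hypothetical helper: list (path, label) pairs for one Food-11 split folder.
# Assumes images are named "<class>_<index>.jpg", e.g. "3_120.jpg" for class 3 (Egg).
import os

def list_images(folder):
    samples = []
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(".jpg"):
            label = int(name.split("_")[0])  # the class index comes before "_"
            samples.append((os.path.join(folder, name), label))
    return samples

print("%d training images" % len(list_images("training")))
```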
We modified a TensorFlow version of AlexNet found on the Heuritech GitHub. For feature extraction to work, the ImageNet pre-trained weights must be downloaded. Place the weights file in the same directory as the source code.
The experiments consist of three sequential but separate steps:
- Mount subsampled dataset
- Extract features using pre-trained AlexNet
- Choose and execute one of available experiments
The first two steps need to be executed only once. The third step comprises a set of experiments that can be executed as many times as needed.
Just run the build_dataset.py script. In the script, the folder variable selects where the images are read from and max_size defines the number of images used.
An array in NumPy format (trab2_dataset.npz) will be saved in the data folder.
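For illustration, a minimal sketch of what this subsampling step might look like (this is not the authors' exact build_dataset.py; the 227x227 resize, which matches AlexNet's input size, the interpretation of max_size as a per-class limit, and the array key names are assumptions):

```python
# Hypothetical sketch of the subsampling step (not the original build_dataset.py).
import os
import numpy as np
from PIL import Image

folder = "training"   # where to get the images from
max_size = 200        # assumed here to be the number of images kept per class

images, labels, counts = [], [], {}
for name in sorted(os.listdir(folder)):
    if not name.lower().endswith(".jpg"):
        continue
    label = int(name.split("_")[0])
    if counts.get(label, 0) >= max_size:
        continue                      # this class already has enough samples
    counts[label] = counts.get(label, 0) + 1
    img = Image.open(os.path.join(folder, name)).convert("RGB").resize((227, 227))
    images.append(np.asarray(img, dtype=np.uint8))
    labels.append(label)

np.savez("data/trab2_dataset.npz", X=np.array(images), y=np.array(labels))
```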
Just run the feature_extraction.py script. Three NumPy arrays will be created in the data folder (trab2_conv1.npz, trab2_conv5.npz and trab2_dense2.npz).
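A hedged sketch of how such intermediate activations can be extracted with Keras follows. The layer names ("conv_1", "conv_5", "dense_2") and the .npz key names are assumptions about the Heuritech AlexNet definition, not the authors' exact code:

```python
# Hypothetical sketch: extract activations of intermediate AlexNet layers.
import numpy as np
from keras.models import Model

def extract(model, layer_name, X, y, out_file):
    # Build a sub-model that stops at the requested layer and run it.
    sub_model = Model(model.input, model.get_layer(layer_name).output)
    features = sub_model.predict(X, batch_size=32)
    # Flatten convolutional maps so that classical classifiers can consume them.
    np.savez(out_file, X=features.reshape(len(features), -1), y=y)

data = np.load("data/trab2_dataset.npz")
X, y = data["X"].astype("float32"), data["y"]
# alexnet = ...  # load the modified Heuritech AlexNet with the ImageNet weights here
# extract(alexnet, "conv_1", X, y, "data/trab2_conv1.npz")
# extract(alexnet, "conv_5", X, y, "data/trab2_conv5.npz")
# extract(alexnet, "dense_2", X, y, "data/trab2_dense2.npz")
```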
Three experiments are available:
- Deep Features: Classification over the C1, C5 and FC2 layers of AlexNet (feat_experiment.py)
- Early and Late Fusion: Classification using the early and late fusion approaches (early_fusion_experiment.py and late_fusion_experiment.py)
- Ensemble: Diversity tests using Random Forest, Majority Vote and Bagging (ensemble_experiment.py); a minimal Bagging sketch is given after this list
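As an illustration of the ensemble experiment, here is a minimal Bagging and majority-vote sketch over the FC2 features using scikit-learn. This is not the authors' ensemble_experiment.py; the base classifiers, the train/test split and the hyper-parameters are arbitrary choices for the sketch:

```python
# Hypothetical ensemble sketch over the FC2 deep features (trab2_dense2.npz).
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

data = np.load("data/trab2_dense2.npz")
X, y = data["X"], data["y"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "Bagging": BaggingClassifier(LogisticRegression(), n_estimators=25),
    "Majority Vote": VotingClassifier([("lr", LogisticRegression()),
                                       ("svm", LinearSVC()),
                                       ("rf", RandomForestClassifier(n_estimators=100))],
                                      voting="hard"),
}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    print("%s accuracy: %.3f" % (name, clf.score(X_te, y_te)))
```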
To see the final results and analysis, check the report file: report.pdf
The best deep representation level is FC2, which contains the most relevant features for classification. Early fusion and late fusion could not outperform the FC2 deep features alone. Lastly, the Bagging approach achieved the best results among all the experiments executed.
The food domain is hard to classify, even for humans, because dishes made with the same ingredients can look very similar yet belong to different classes. Even so, the FC2 deep features proved to be a good option, and the Bagging technique produced the best model.