This practical explores machine learning techniques for classifying different types of seal pups in cropped aerial imagery obtained during seasonal surveys of islands. Several data visualisation techniques, data processing steps and classification models are tried out to design a final pipeline that predicts the type of seal observed in each image. Multiple classification models are trained, and each one is evaluated to compare their performance.
You can read the full report here.
The data comes in two flavours, binary and multi:
- the "binary" data contains two labels, one for images of backgrounds and the other for images of seals
- the "multi" data contains five labels, one for images of backgrounds and the four others for types of seals (whitecoat, moulted pup, dead pup and juvenile).
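The relationship between the two label sets can be illustrated with a small sketch. The label strings and the `to_binary_label` helper below are assumptions made for illustration; the dataset ships its binary and multi labels separately, and its actual label names may differ.

```python
# Hypothetical label names; the real dataset may spell them differently.
MULTI_LABELS = ["background", "whitecoat", "moulted pup", "dead pup", "juvenile"]


def to_binary_label(multi_label: str) -> str:
    """Collapse one of the five multi labels into 'background' or 'seal'."""
    if multi_label not in MULTI_LABELS:
        raise ValueError(f"unknown label: {multi_label!r}")
    # Every non-background class is a type of seal.
    return "background" if multi_label == "background" else "seal"


if __name__ == "__main__":
    for label in MULTI_LABELS:
        print(f"{label} -> {to_binary_label(label)}")
```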
Screenshot of the 5 different types of classification labels (top: HoG features, bottom: actual images):
The dataset is not provided in this repository due to its large size. If you wish to experiment with the dataset on your own, please contact me and I will provide you with a link to download the data.
These results were obtained on the testing dataset, which was used only once and remained unseen until this point:
- Binary: 98.21% accuracy
- Multi: 97.58% accuracy
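For reference, accuracy is the fraction of test images whose predicted label matches the true label. The sketch below is a plain-Python illustration of that definition, not the evaluation code used in this project; the `accuracy` function and the sample labels are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = ["seal", "background", "seal", "seal"]
    preds = ["seal", "background", "background", "seal"]
    print(f"{accuracy(truth, preds):.2%}")
```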
Create a new virtual environment and install the Python libraries used in the code by running the following command:
pip install -r requirements.txt
To run the program, move to the “src” directory and run the following command:
python3 main.py -s section -d dataset [-m model] [-gs] [-rs] [-v]
where:
- "-s section": selects which part of the program to execute. It must be one of the following: ‘datavis’, ‘train’ or ‘test’.
- "-d dataset": selects the dataset to use. It must be set to either ‘binary’ or ‘multi’.
- "-m model": is an optional setting that selects the classification model to use for training. It must be one of the following: ‘sgd’, ‘logistic’, ‘svc_lin’, ‘svc_poly’.
- "-gs" and "-rs": are optional flags to run the hyperparameter tuning algorithms (either grid search or randomised search) for the selected classification model. These flags only take effect when using multi-layer perceptron classifiers (neural networks).
- "-v": is an optional flag that enters verbose (debugging) mode, printing additional statements on the command line.
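The options above could be parsed with Python's standard `argparse` module. This is a minimal sketch, not the actual contents of `main.py`: the `build_parser` function, the long option names (`--section`, `--gridsearch`, and so on) and the help strings are assumptions introduced for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Seal pup classification")
    # Required settings; the long option names are assumed, not taken from main.py.
    parser.add_argument("-s", "--section", required=True,
                        choices=["datavis", "train", "test"],
                        help="part of the program to execute")
    parser.add_argument("-d", "--dataset", required=True,
                        choices=["binary", "multi"],
                        help="dataset to use")
    # Optional model selection; defaults to None when omitted.
    parser.add_argument("-m", "--model",
                        choices=["sgd", "logistic", "svc_lin", "svc_poly"],
                        help="classification model to train")
    # Boolean flags are False unless passed on the command line.
    parser.add_argument("-gs", "--gridsearch", action="store_true",
                        help="run grid search hyperparameter tuning")
    parser.add_argument("-rs", "--randomsearch", action="store_true",
                        help="run randomised search hyperparameter tuning")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print additional debugging statements")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, `python3 main.py -s train -d binary -m sgd -v` would yield `section="train"`, `dataset="binary"`, `model="sgd"` and `verbose=True`, and an invalid value such as `-d other` would be rejected by `argparse` with a usage message.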
- see LICENSE file.
- Website: www.adam.jaamour.com
- LinkedIn: linkedin.com/in/adamjaamour
- Twitter: @Adamouization