This repository contains code for training a multimodal gender classification model using audio together with visual (RGB) and thermal images.
The baseline model is built using the SpeakingFaces database, which contains over 100 subjects, each uttering around 100 short commands.