A dataset was collected to train the neural network. It is based on the Latin names of birds found in the European part of Russia and covers 350 species in total.
Each class contains at least 100 images.
Bird dataset from Google Drive
- The Latin names of the birds were taken from here
- Images were collected from Google Images by Latin name using the libraries selenium, urllib, user_agent, and logging
- The dataset was cleaned manually
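
The collection step can be sketched roughly as follows. This is a minimal illustration, not the scraper used for this project: the search endpoint with its `tbm=isch` parameter and the helper names `image_search_url` and `classes_below_minimum` are assumptions introduced here, and the selenium-driven download itself is left out.

```python
from typing import Dict, List
from urllib.parse import urlencode

# Assumption: the Google Images search endpoint and its `tbm=isch` parameter;
# the exact URL used by the original scraper is not shown in this README.
GOOGLE_SEARCH_URL = "https://www.google.com/search"


def image_search_url(latin_name: str) -> str:
    """Build an image-search URL for one species, queried by its Latin name."""
    query = urlencode({"tbm": "isch", "q": latin_name})
    return f"{GOOGLE_SEARCH_URL}?{query}"


def classes_below_minimum(image_counts: Dict[str, int], minimum: int = 100) -> List[str]:
    """Return the species that still have fewer than `minimum` images."""
    return sorted(name for name, count in image_counts.items() if count < minimum)


print(image_search_url("Parus major"))
print(classes_below_minimum({"Parus major": 140, "Pica pica": 87}))
```

A check like `classes_below_minimum` is one way to confirm, after manual cleaning, that every class still has at least 100 images.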
A pretrained EfficientNet-b0 was used as the model.
The last layer outputs 350 classes.
Loss function: CrossEntropyLoss.
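
A setup like the one described might look as follows with the `efficientnet-pytorch` package listed in the requirements. This is a sketch under assumptions: the function name `build_model_and_loss` is hypothetical, and the project's actual training code (optimizer, schedule, augmentation) is not shown in this README.

```python
NUM_CLASSES = 350  # one output per bird species in the dataset


def build_model_and_loss():
    """Create a pretrained EfficientNet-b0 with a 350-way head and its loss."""
    # Imported lazily so this module can be loaded without the heavy dependencies.
    import torch
    from efficientnet_pytorch import EfficientNet

    # from_pretrained downloads ImageNet weights; because num_classes differs
    # from 1000, the final fully connected layer is newly initialised.
    model = EfficientNet.from_pretrained("efficientnet-b0", num_classes=NUM_CLASSES)
    criterion = torch.nn.CrossEntropyLoss()
    return model, criterion
```

Calling `build_model_and_loss()` requires network access the first time, since the pretrained weights are downloaded.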
| Dataset part | CrossEntropyLoss | Accuracy |
|---|---|---|
| train | 0.665 | 82.559685% |
| validation | 0.693 | 81.350510% |
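
For readers unfamiliar with the two metrics in the table, here is a dependency-free sketch of how a mean cross-entropy loss and a top-1 accuracy are computed from raw model outputs (logits). It assumes the reported loss is the mean over samples, which is the default reduction of PyTorch's `CrossEntropyLoss`. The function names are illustrative and not taken from the project.

```python
import math
from typing import List, Sequence, Tuple


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-softmax probability of the target class."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def evaluate(batch: List[Tuple[Sequence[float], int]]) -> Tuple[float, float]:
    """Return (mean loss, accuracy in percent) over (logits, target) pairs."""
    losses = [cross_entropy(logits, target) for logits, target in batch]
    correct = sum(
        1
        for logits, target in batch
        if max(range(len(logits)), key=logits.__getitem__) == target
    )
    return sum(losses) / len(batch), 100.0 * correct / len(batch)


# Two toy samples with two classes each; the second one is misclassified.
mean_loss, accuracy = evaluate([([2.0, 0.0], 0), ([0.0, 1.0], 0)])
print(round(mean_loss, 3), accuracy)
```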
@bird_species_bot link
- aiogram==2.17.1
- torch==1.10.0+cpu
- torchvision==0.11.1+cpu
- efficientnet-pytorch==0.7.1
- wikipedia==1.4.0
- 1 The Telegram bot (bot.py) receives the image
- 2 Saves it to a folder
- 3 app.py loads the image
- 3.1 Applies the transforms
- 3.2 The pretrained efficientnet-b0 predicts labels and probabilities
- 4 The bot sends the user the top-3 predicted classes with their probabilities
- 5 Deletes the image
- 6 wiki_parser.py looks up the Wikipedia page for the species' Latin name and sends it to the user
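
Steps 2 to 5 above can be sketched without any of the bot or model dependencies. This is an illustration under assumptions, not the code in bot.py or app.py: `handle_image` and `top_k` are hypothetical names, and `predict_logits` stands in for the transform plus EfficientNet forward pass.

```python
import math
import os
import tempfile
from typing import Callable, List, Sequence, Tuple


def top_k(logits: Sequence[float], labels: Sequence[str], k: int = 3) -> List[Tuple[str, float]]:
    """Turn logits into softmax probabilities and return the k most likely labels."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    ranked = sorted(zip(labels, (e / total for e in exps)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def handle_image(image_bytes: bytes,
                 predict_logits: Callable[[str], Sequence[float]],
                 labels: Sequence[str],
                 folder: str) -> List[Tuple[str, float]]:
    """Save the image, classify it, and delete the file afterwards."""
    fd, path = tempfile.mkstemp(suffix=".jpg", dir=folder)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        return top_k(predict_logits(path), labels)
    finally:
        os.remove(path)  # step 5: the image is removed even if prediction fails


# Usage with a stand-in classifier that ignores the file and returns fixed logits.
with tempfile.TemporaryDirectory() as folder:
    species = ["Parus major", "Pica pica", "Passer domesticus", "Corvus cornix"]
    result = handle_image(b"fake image", lambda path: [0.1, 2.0, 1.0, -1.0], species, folder)
    for name, probability in result:
        print(f"{name}: {probability:.1%}")
```

Deleting the file in a `finally` block keeps the folder from filling up when the model raises on a corrupt upload.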