Dog Breed Classification

This project builds a classification system that identifies dog breeds from images using vision transformer models: ViT, Swin, BEiT, DeiT, and LeViT.

Models Used

This project leverages several transformer-based models known for their capabilities in image recognition tasks:
  • ViT (Vision Transformer): A pioneering model that applies transformers directly to image patches.
  • Swin Transformer: A hierarchical vision transformer that computes self-attention within local windows and shifts those windows between layers so that information flows across them.
  • BEiT (Bidirectional Encoder representation from Image Transformers): Pre-trained in a self-supervised way with masked image modeling, then fine-tuned for image tasks.
  • DeiT (Data-efficient Image Transformer): A data-efficient variant of ViT trained with knowledge distillation from a teacher network.
  • LeViT: A hybrid of convolutional and transformer layers optimized for low-latency inference, well suited to smaller devices.
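
As a rough illustration of how the five model families listed above could be set up for breed classification, the sketch below uses the Hugging Face `transformers` library. This is an assumption: the README does not say which library or checkpoints the project uses. The checkpoint names in `CHECKPOINTS`, the helper names `build_label_maps` and `load_classifier`, and the example breed list are all hypothetical choices made for illustration.

```python
# Hypothetical mapping from model family to a Hugging Face Hub checkpoint.
# These names are assumptions; the project may use different variants.
CHECKPOINTS = {
    "vit": "google/vit-base-patch16-224",
    "swin": "microsoft/swin-tiny-patch4-window7-224",
    "beit": "microsoft/beit-base-patch16-224",
    "deit": "facebook/deit-base-patch16-224",
    "levit": "facebook/levit-128S",
}


def build_label_maps(breeds):
    """Return (id2label, label2id) for a sorted, de-duplicated breed list."""
    names = sorted(set(breeds))
    id2label = dict(enumerate(names))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def load_classifier(model_key, breeds):
    """Load a pretrained backbone with a classification head sized for `breeds`."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForImageClassification

    id2label, label2id = build_label_maps(breeds)
    return AutoModelForImageClassification.from_pretrained(
        CHECKPOINTS[model_key],
        num_labels=len(id2label),
        id2label=id2label,
        label2id=label2id,
        # Re-initialise the head when its size differs from the checkpoint's.
        ignore_mismatched_sizes=True,
    )
```

Calling `load_classifier("swin", ["beagle", "pug", "husky"])` would download the checkpoint and return a model whose head has three outputs. Keeping the checkpoints in one dictionary makes it straightforward to fine-tune and compare all five families with the same training loop.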

Dataset

The dataset for this project was compiled by web scraping dog breed images.
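
Scraped image collections often contain exact duplicate files, which can leak between training and validation splits. The sketch below shows one way to find them by hashing file contents. It is not taken from the project: the function name `find_exact_duplicates`, the set of image suffixes, and the idea that the images sit under a single root directory are assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image file extensions; adjust to match the scraped data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_exact_duplicates(root):
    """Group image files under `root` that have byte-identical contents."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

This only catches byte-identical copies. Resized or re-encoded versions of the same photo would need a perceptual hash or an embedding-based comparison instead.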


Contributions are welcome! If you have suggestions, improvements, or bug fixes, feel free to create a pull request or open an issue.