SHI Labs

Computer Vision, Machine Learning, and AI Systems & Applications

Pinned

  1. Neighborhood-Attention-Transformer Public

    Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022

    Python · 1.1k stars · 86 forks

  2. Versatile-Diffusion Public

    Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023

    Python · 1.3k stars · 85 forks

  3. OneFormer Public

    [CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation

    Jupyter Notebook · 1.6k stars · 136 forks

  4. Prompt-Free-Diffusion Public

    Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arXiv 2023 / CVPR 2024

    Python · 745 stars · 37 forks

  5. Smooth-Diffusion Public

    Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models, arXiv 2023 / CVPR 2024

    Python · 330 stars · 9 forks

  6. VCoder Public

    [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models

    Python · 272 stars · 17 forks

Repositories

Showing 10 of 59 repositories
  • NATTEN Public

    Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

    Cuda · 406 stars · 33 forks · 19 open issues · 2 open pull requests · Updated Jan 30, 2025
  • OLA-VLM Public

    OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024

    Python · 49 stars · 2 forks · 1 open issue · 1 open pull request · Updated Jan 27, 2025
  • Compact-Transformers Public

    Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)

    Python · 510 stars · Apache-2.0 · 81 forks · 8 open issues · 2 open pull requests · Updated Nov 5, 2024
  • Diffusion-Driven-Test-Time-Adaptation-via-Synthetic-Domain-Alignment Public

    Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

    Python · 22 stars · 2 forks · 0 open issues · 0 open pull requests · Updated Oct 29, 2024
  • OneFormer Public

    [CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation

    Jupyter Notebook · 1,558 stars · MIT · 136 forks · 36 open issues · 4 open pull requests · Updated Oct 3, 2024
  • Smooth-Diffusion Public

    Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models, arXiv 2023 / CVPR 2024

    Python · 330 stars · MIT · 9 forks · 11 open issues · 0 open pull requests · Updated Sep 24, 2024
  • FineStyle Public

    FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models

    6 stars · MIT · 0 forks · 0 open issues · 0 open pull requests · Updated Sep 4, 2024
  • Agriculture-Vision Public

    [CVPR 2020 & 2021 & 2022 & 2023] Agriculture-Vision Dataset, Prize Challenge and Workshop: A joint effort with many great collaborators to bring Agriculture and Computer Vision/AI communities together to benefit humanity!

    211 stars · 34 forks · 2 open issues · 1 open pull request · Updated Jul 27, 2024
  • CuMo Public

    CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

    Python · 141 stars · Apache-2.0 · 9 forks · 0 open issues · 0 open pull requests · Updated Jun 8, 2024
  • StyleNAT Public

    New flexible and efficient image generation framework that sets new SOTA on FFHQ-256 with FID 2.05, 2022

    Python · 100 stars · MIT · 12 forks · 0 open issues · 0 open pull requests · Updated Jun 4, 2024