GitHub - YuzheZhang-1999/DiffTSR: [CVPR2024] Diffusion-based Blind Text Image Super-Resolution (Official)

Diffusion-based Blind Text Image Super-Resolution (CVPR2024)

¹Beijing Institute of Technology, ²SenseTime Research, ³The University of Hong Kong

📢 News

2024.05 🚀Inference code has been released, enjoy.
2024.04 🚀Official repository of DiffTSR.
2024.03 🌟The implementation code will be released shortly.
2024.03 ❤️Accepted by CVPR2024.

🔥 TODO

Attach the detailed implementation and supplementary material.
Add inference code and checkpoints for blind text image SR.
Add training code and scripts.

👁️ Gallery

🛠️ Try

Dependencies and Installation

PyTorch >= 1.7.0
CUDA >= 11.0

# git clone this repository
git clone https://github.com/YuzheZhang-1999/DiffTSR
cd DiffTSR

# create new anaconda env
conda env create -f environment.yaml
conda activate DiffTSR
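After activating the environment, it can be useful to confirm that the installed versions satisfy the minimums listed above. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and treating a missing GPU as a problem is an assumption, since this README does not say whether CPU-only inference is supported.

```python
import importlib.util
from itertools import takewhile

# Minimums taken from the requirements listed in this README.
MIN_TORCH = (1, 7, 0)
MIN_CUDA = (11, 0)


def parse_version(text):
    """Turn a string such as '2.1.0+cu118' into a tuple such as (2, 1, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if the version string `found` is at least `minimum`."""
    parsed = parse_version(found)
    # Pad short versions with zeros so that '1.7' compares equal to (1, 7, 0).
    padded = parsed + (0,) * (len(minimum) - len(parsed))
    return padded >= minimum


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if importlib.util.find_spec("torch") is None:
        return ["PyTorch is not installed in this environment"]
    import torch

    problems = []
    if not meets_minimum(torch.__version__, MIN_TORCH):
        problems.append(f"PyTorch {torch.__version__} is older than 1.7.0")
    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        problems.append("this PyTorch build has no CUDA support")
    elif not meets_minimum(cuda_version, MIN_CUDA):
        problems.append(f"CUDA {cuda_version} is older than 11.0")
    if not torch.cuda.is_available():
        # Assumption: a GPU is wanted; the README does not state this explicitly.
        problems.append("no CUDA device is visible to PyTorch")
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["environment looks fine"]:
        print(problem)
```

Run it inside the activated `DiffTSR` environment; any printed problem points at the requirement that is not met.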

Download the checkpoint

Please download the checkpoint file from one of the links below and place it in the ./ckpt/ folder.

[GoogleDrive]
[BaiduDisk] [Password: vk9n]
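To confirm that the download landed in the right place before running inference, a quick listing of the folder is enough. This is a minimal sketch, assuming only that the checkpoint lives directly under `./ckpt/`; the helper name `list_checkpoint_files` is hypothetical, and no checkpoint file name is assumed because this README does not give one.

```python
from pathlib import Path


def list_checkpoint_files(ckpt_dir="./ckpt"):
    """Return the sorted names of non-hidden files directly inside ckpt_dir."""
    folder = Path(ckpt_dir)
    if not folder.is_dir():
        return []  # folder missing: nothing has been downloaded yet
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if entry.is_file() and not entry.name.startswith(".")
    )


if __name__ == "__main__":
    files = list_checkpoint_files()
    if files:
        print("Found in ./ckpt/:", ", ".join(files))
    else:
        print("No checkpoint files found in ./ckpt/")
```

Run it from the repository root so that the relative path `./ckpt` resolves correctly.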

Inference

python inference_DiffTSR.py
# see inference_DiffTSR.py for more details

🔎 Overview of DiffTSR

Abstract

Recovering degraded low-resolution text images is challenging, especially for Chinese text images with complex strokes and severe degradation in real-world scenarios. Ensuring both text fidelity and style realness is crucial for high-quality text image super-resolution. Recently, diffusion models have achieved great success in natural image synthesis and restoration due to their powerful data distribution modeling abilities and data generation capabilities. In this work, we propose an Image Diffusion Model (IDM) to restore text images with realistic styles. Diffusion models are not only suitable for modeling realistic image distributions but also appropriate for learning text distributions. Since, according to prior work, a text prior is important for guaranteeing the correctness of the restored text structure, we also propose a Text Diffusion Model (TDM) for text recognition, which can guide the IDM to generate text images with correct structures. We further propose a Mixture of Multi-modality module (MoM) to make these two diffusion models cooperate with each other in all the diffusion steps. Extensive experiments on synthetic and real-world datasets demonstrate that our Diffusion-based Blind Text Image Super-Resolution (DiffTSR) can restore text images with more accurate text structures as well as more realistic appearances simultaneously.
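The cooperation described in the abstract, with two diffusion models exchanging information at every step, can be pictured with a toy control-flow sketch. This is not the repository's code and does not reproduce the paper's architecture: `idm_step`, `tdm_step`, and `mom` are hypothetical stand-ins that use simple arithmetic in place of neural networks, the direction of the exchanged conditions is an assumption, and the conditioning on the low-resolution input image is omitted entirely.

```python
import random


def mom(image, text, t):
    """Hypothetical stand-in for MoM: derive one condition for each model from both states."""
    image_cond = sum(image) / len(image)  # summary of the image state, passed to the TDM
    text_cond = sum(text) / len(text)     # summary of the text state, passed to the IDM
    return image_cond, text_cond


def idm_step(image, text_cond, t):
    """Hypothetical stand-in for one Image Diffusion Model denoising step."""
    return [0.9 * value + 0.1 * text_cond for value in image]


def tdm_step(text, image_cond, t):
    """Hypothetical stand-in for one Text Diffusion Model denoising step."""
    return [0.9 * value + 0.1 * image_cond for value in text]


def joint_sampling(num_steps=10, seed=0):
    """Run both toy models side by side, exchanging conditions at every step."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in range(8)]  # toy image state, starts as noise
    text = [rng.gauss(0.0, 1.0) for _ in range(4)]   # toy text state, starts as noise
    for t in reversed(range(num_steps)):
        # Both conditions are computed from the current states before either is updated.
        image_cond, text_cond = mom(image, text, t)
        image = idm_step(image, text_cond, t)
        text = tdm_step(text, image_cond, t)
    return image, text


if __name__ == "__main__":
    final_image, final_text = joint_sampling()
    print(len(final_image), len(final_text))
```

The point of the sketch is only the loop structure: neither model runs to completion on its own, and the fusion step is called once per diffusion step rather than once at the start.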

Visual performance comparison overview

Blind text image super-resolution results between different methods on synthetic and real-world text images. Our method can restore text images with high text fidelity and style realness under complex strokes, severe degradation, and various text styles.

📷 More Visual Results

🎓 Citations

@inproceedings{zhang2024diffusion,
  title={Diffusion-based Blind Text Image Super-Resolution},
  author={Zhang, Yuzhe and Zhang, Jiawei and Li, Hao and Wang, Zhouxia and Hou, Luwei and Zou, Dongqing and Bian, Liheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={25827--25836},
  year={2024}
}

🎫 License

This project is released under the Apache 2.0 license.

Acknowledgement

Thanks to these awesome works:

Statistics
