Hongming Luo, Fei Zhou, Kin-Man Lam, Guoping Qiu
This repository is the official PyTorch implementation of Restoration of User Videos Shared on Social Media (arXiv). The paper has been accepted by ACMMM 2022.
User videos shared on social media platforms usually suffer from degradations caused by unknown proprietary processing procedures, which means that their visual quality is poorer than that of the originals. This paper presents a new general video restoration framework for the restoration of user videos shared on social media platforms. In contrast to most deep learning-based video restoration methods that perform end-to-end mapping, where feature extraction is mostly treated as a black box in the sense that what role a feature plays is often unknown, our new method, termed Video restOration through adapTive dEgradation Sensing (VOTES), introduces the concept of a degradation feature map (DFM) to explicitly guide the video restoration process. Specifically, for each video frame, we first adaptively estimate its DFM to extract features representing the difficulty of restoring its different regions. We then feed the DFM to a convolutional neural network (CNN) to compute hierarchical degradation features that modulate an end-to-end video restoration backbone network, such that more attention is paid explicitly to potentially more difficult-to-restore areas, which in turn leads to enhanced restoration performance. We will explain the design rationale of the VOTES framework and present extensive experimental results to show that the new VOTES method outperforms various state-of-the-art techniques both quantitatively and qualitatively. In addition, we contribute a large-scale real-world database of user videos shared on different social media platforms.
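To make the idea of degradation-aware modulation concrete, here is a minimal, illustrative PyTorch sketch. It is not the actual VOTES architecture: the module name, channel sizes, and the scale-and-shift style of modulation are assumptions, meant only to show how features computed from a degradation map could modulate a restoration backbone's features.

```python
import torch
import torch.nn as nn

class DegradationModulation(nn.Module):
    """Illustrative sketch: modulate backbone features with features
    derived from a degradation map (names and shapes are assumptions)."""

    def __init__(self, feat_channels=64, dfm_channels=1):
        super().__init__()
        # Small CNN that turns the degradation map into per-pixel
        # scale and shift parameters.
        self.condition = nn.Sequential(
            nn.Conv2d(dfm_channels, feat_channels, 3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(feat_channels, feat_channels * 2, 3, padding=1),
        )

    def forward(self, backbone_feat, degradation_map):
        # backbone_feat: (N, C, H, W), degradation_map: (N, 1, H, W)
        scale, shift = self.condition(degradation_map).chunk(2, dim=1)
        # Regions flagged as harder to restore receive stronger modulation.
        return backbone_feat * (1 + scale) + shift

# Quick check with random tensors
feat = torch.randn(2, 64, 32, 32)
dfm = torch.rand(2, 1, 32, 32)
out = DegradationModulation()(feat, dfm)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```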
- Python 3.7
- PyTorch >= 1.8.0
- NVIDIA GPU + CUDA
- Clone repo
  git clone https://github.com/luohongming/VOTES.git
- Install dependent packages
  cd VOTES
  pip install -r requirements.txt
- Run the following command in the VOTES root path
  python setup.py develop
You need to download the UVSSM dataset or the REDS dataset.
The UVSSM dataset can be downloaded from Baidu Netdisk (access code: rsqw). The dataset folder should be organized as follows:
--UVSSM
  --WeChat
    --HQ_frames
      --001
        --00000000.png
    --LQ_frames
      --001
  --twitter
  --bilibili
The HQ_frames folders in the twitter, bilibili and Youtube folders are identical; therefore, you can download the HQ_frames folder in the twitter folder and copy it to the bilibili and Youtube folders (see the sketch below). However, the LQ_frames folders are different, so you still need to download LQ_frames for all three folders.
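For convenience, here is a small helper sketch for the copy step. The dataset root path is an assumption; adjust it to wherever you placed the UVSSM folder.

```python
import shutil
from pathlib import Path

# Assumed dataset root; adjust to your own setup.
uvssm_root = Path("UVSSM")
src = uvssm_root / "twitter" / "HQ_frames"

# Copy the shared HQ_frames folder to the other platform folders.
for platform in ["bilibili", "Youtube"]:
    dst = uvssm_root / platform / "HQ_frames"
    if not dst.exists():
        shutil.copytree(src, dst)
        print(f"Copied {src} -> {dst}")
```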
The REDS dataset can be downloaded from the REDS website. We only use two subsets of REDS, i.e., train_sharp and val_sharp.
We move the 30 folders of val_sharp into train_sharp by renaming them (from 000-029 to 240-269). Then we compress and downscale the frames into videos using FFmpeg:
ffmpeg -f image2 -i $input_dir/$folder/%08d.png -vcodec libx264 -r 25 -qp 33 -y -s 640x360 -pix_fmt yuv420p $folder.mp4
Then we extract the compressed videos back into frames:
ffmpeg -i $folder -f image2 -start_number 0 $output_dir/${folder: 0:3}/%08d.png
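The steps above can be wrapped in a small script. Below is a sketch in Python (the paths, the output layout, and the helper itself are assumptions, not part of the official code) that renames the val_sharp folders, compresses each sequence with the FFmpeg settings above, and extracts the frames again:

```python
import subprocess
from pathlib import Path

# Assumed locations; adjust to your own setup.
train_sharp = Path("REDS/train_sharp")
val_sharp = Path("REDS/val_sharp")
output_root = Path("REDS_compressed/LQ_frames")

# 1. Move the 30 val_sharp folders into train_sharp, renamed 240-269.
for i in range(30):
    src = val_sharp / f"{i:03d}"
    if src.exists():
        src.rename(train_sharp / f"{i + 240:03d}")

# 2. Compress and downscale each sequence, then extract frames again.
for folder in sorted(train_sharp.iterdir()):
    if not folder.is_dir():
        continue
    video = folder.with_suffix(".mp4")
    subprocess.run([
        "ffmpeg", "-f", "image2", "-i", str(folder / "%08d.png"),
        "-vcodec", "libx264", "-r", "25", "-qp", "33", "-y",
        "-s", "640x360", "-pix_fmt", "yuv420p", str(video),
    ], check=True)

    out_dir = output_root / folder.name
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run([
        "ffmpeg", "-i", str(video), "-f", "image2",
        "-start_number", "0", str(out_dir / "%08d.png"),
    ], check=True)
```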
The dataset folder should be:
--REDS_compressed
  --HQ_frames
    --000
      --00000000.png
    --001
  --LQ_frames
Our pretrained models can be downloaded via Google Drive or Baidu Netdisk (access code: qb5v). After downloading the pretrained models, please put them into the $ROOT/experiments folder. The pretrained bin model (bin_100000.pth) is necessary for training VOTES.
The training settings used in the paper can be found in $ROOT/options/train/VOTES/xxx.yml.
You can train VOTES using the following commands:
(Modify the file train_sh.sh)
bash train_sh.sh
or (using only one GPU)
bash dist_train.sh 1 ./options/train/VOTES/train_VOTES_L_x2_SR_REDS_QP33.yml 4321
bash dist_train.sh 1 ./options/train/VOTES/train_VOTES_L_x2_SR_UVSSMWeChat.yml 4321
or (using several GPUs)
bash dist_train.sh 2 ./options/train/VOTES/train_VOTES_L_x2_SR_REDS_QP33.yml 4321
bash dist_train.sh 3 ./options/train/VOTES/train_VOTES_L_x2_SR_UVSSMWeChat.yml 4321
You can test VOTES using the following commands:
(Modify the file test_sh.sh)
bash test_sh.sh
or (using only one GPU)
bash dist_train.sh 1 ./options/train/VOTES/train_VOTES_L_x2_SR_REDS_QP33.yml 4321
bash dist_train.sh 1 ./options/train/VOTES/train_VOTES_L_x2_SR_UVSSMWeChat.yml 4321
or (using several GPUs)
bash dist_train.sh 2 ./options/train/VOTES/train_VOTES_L_x2_SR_REDS_QP33.yml 4321
bash dist_train.sh 3 ./options/train/VOTES/train_VOTES_L_x2_SR_UVSSMWeChat.yml 4321
We achieve the best performance compared with other SOTA methods.
Visual comparisons on the UVSSM dataset. The upscale factor is x2.
Visual comparisons on the REDS dataset. The upscale factor is x2 and the QP is 28.
Visual comparisons on the REDS dataset. The upscale factor is x4 and the QP is 33.
More results can be found in the supplementary materials.
@InProceedings{Luo2022VOTES,
  author    = {Hongming Luo and Fei Zhou and Kin-Man Lam and Guoping Qiu},
  title     = {Restoration of User Videos Shared on Social Media},
  booktitle = {ACMMM},
  year      = {2022},
}
Our code is built on EDVR. We thank the authors for sharing their code.
The code and the UVSSM dataset are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for non-commercial use only. Any commercial use requires formal permission in advance.