This repository contains the implementation of the PixelWarrior model, a deep learning model designed for enhancing low-light images. This project combines the strengths of the UNet architecture with Multi-Layer Perceptron (MLP) blocks to improve image quality and visibility in low-light conditions.
The PixelWarrior architecture consists of the following components:
- UNet Architecture: A symmetric encoder-decoder structure with skip connections to capture both spatial and contextual information.
- MLP Blocks: Enhance feature representation with convolutional layers followed by GELU and Sigmoid activations.
- Downsampler and Upsampler: The downsampler reduces the spatial dimensions of the input to facilitate high-level feature extraction; the upsampler restores the original resolution so that those features can be applied at full size.
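The components listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class names `MLPBlock` and `TinyUNet`, the channel widths, the use of 1x1 convolutions, the sigmoid gating, and the single encoder/decoder level are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class MLPBlock(nn.Module):
    """Hypothetical MLP-style block: 1x1 convolutions with GELU, then a Sigmoid."""

    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.body = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(hidden, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumption: the sigmoid output gates the input feature map element-wise.
        return x * self.body(x)


class TinyUNet(nn.Module):
    """One-level encoder-decoder with a skip connection (illustrative only)."""

    def __init__(self, in_ch: int = 3, base: int = 16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), MLPBlock(base))
        # Downsampler: a strided convolution halves height and width.
        self.down = nn.Conv2d(base, base * 2, kernel_size=3, stride=2, padding=1)
        self.bottleneck = MLPBlock(base * 2)
        # Upsampler: a transposed convolution doubles height and width.
        self.up = nn.ConvTranspose2d(base * 2, base, kernel_size=2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(base * 2, base, 3, padding=1), MLPBlock(base))
        self.out = nn.Conv2d(base, in_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        u = self.up(b)
        # Skip connection: concatenate encoder features with upsampled features.
        d = self.dec(torch.cat([u, e], dim=1))
        # Assumption: images are normalized to [0, 1], so the output is squashed.
        return torch.sigmoid(self.out(d))


if __name__ == "__main__":
    x = torch.rand(1, 3, 64, 64)  # dummy low-light image batch
    print(TinyUNet()(x).shape)
```

With this layout the input height and width must be even, otherwise the upsampled tensor will not match the skip connection in size.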
The model is trained with the following setup:
- Loss Function: Mean Squared Error (MSE)
- Optimizer: Adam with an initial learning rate of 0.0001
- Learning Rate Scheduler: CosineAnnealingLR
- Mixed Precision Training: Utilizes PyTorch’s autocast and GradScaler for faster computations and reduced memory usage.
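A training step with this setup might look like the sketch below. The stand-in single-layer model, the random tensors, `num_epochs`, and the choice of `T_max` are placeholders invented for the example; only the loss, optimizer, learning rate, scheduler type, and the use of autocast with GradScaler come from the list above. Mixed precision is enabled only when a CUDA device is present, so the snippet also runs on CPU.

```python
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # autocast/GradScaler are only enabled on GPU here

# Stand-in model (assumption); replace with the actual PixelWarrior model.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1).to(device)
criterion = nn.MSELoss()
optimizer = Adam(model.parameters(), lr=1e-4)

num_epochs = 2  # placeholder value
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)
# Newer PyTorch releases prefer torch.amp.GradScaler; this form may emit a warning.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

# Random tensors standing in for one batch of 4 (low-light, target) pairs.
low = torch.rand(4, 3, 32, 32, device=device)
target = torch.rand(4, 3, 32, 32, device=device)

losses = []
for epoch in range(num_epochs):
    optimizer.zero_grad(set_to_none=True)
    # The forward pass and loss run in reduced precision where it is safe.
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = criterion(model(low), target)
    # Scale the loss to avoid float16 gradient underflow, then step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()  # one scheduler step per epoch in this sketch
    losses.append(loss.item())
```

Whether the scheduler is stepped per epoch or per iteration is a design choice; this sketch steps it once per epoch.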
The PixelWarrior model enhances low-light images, producing outputs with improved visibility. The combination of the UNet architecture and MLP blocks allows the model to capture fine details. It achieved an average PSNR of 19 dB on the training dataset with a batch size of 4.
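For reference, PSNR is derived directly from the mean squared error as 10 · log10(MAX² / MSE). The helper below is a dependency-free sketch of that formula. It assumes pixel values normalized to [0, 1] (`max_val=1.0`) and takes flattened pixel sequences; the function name and signature are illustrative and not taken from the repository.

```python
import math
from typing import Sequence


def psnr(pred: Sequence[float], target: Sequence[float], max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    # A uniform error of 0.1 on [0, 1] images gives an MSE of 0.01.
    print(round(psnr([0.5] * 16, [0.6] * 16), 3))
```

Under the same [0, 1] assumption, a PSNR of 19 dB corresponds to an MSE of roughly 0.0126.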
The trained model weights are available on the following drive:
The model architecture and training methodology are based on the research presented in the following paper: