This repository contains solutions for the "Deep Learning Foundations" course project, focusing on implementing and training Deep Convolutional Generative Adversarial Networks (DCGANs) and Variational Autoencoders (VAEs) for image synthesis tasks. The project is structured as follows:
- `DC_GAN.ipynb`: Jupyter Notebook implementing the DCGAN architecture, training process, and evaluation on the Fashion-MNIST dataset.
- `VAE.ipynb`: Jupyter Notebook implementing the VAE architecture, training process, and evaluation on the CIFAR-10 dataset.
- `tensorboard.sh`: Shell script to launch TensorBoard for monitoring training progress.
- `DC-GAN-LOSS.png`, `GAN.png`, `VAE-loss-function.png`: Images depicting the loss functions and model architectures used in the project.
- `data/`: Directory containing the datasets used for training and evaluation.
- `runs/`: Directory storing TensorBoard logs for training visualization.
- **Data Loading and Preprocessing:**
  - Utilized the Fashion-MNIST dataset, a collection of Zalando's article images.
  - Implemented a data loading pipeline for DCGAN training, including normalization and preparation steps (see the sketch after this list).
  - Visualized samples from different classes to understand the data format and dimensions.
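A minimal sketch of such a loading pipeline in PyTorch; the batch size, normalization constants, and transform chain are illustrative and may differ from what `DC_GAN.ipynb` actually uses:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Scale the 28x28 grayscale images to [-1, 1] so they match a tanh generator output.
transform = transforms.Compose([
    transforms.ToTensor(),                 # pixels in [0, 1]
    transforms.Normalize((0.5,), (0.5,)),  # pixels in [-1, 1]
])

train_set = datasets.FashionMNIST(root="data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([128, 1, 28, 28])
```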
- **DCGAN Architecture:**
  - Designed and implemented the generator and discriminator architectures for image synthesis.
  - Discussed design choices, including layer configurations and activation functions (a simplified sketch follows below).
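As a rough illustration, a DCGAN-style generator/discriminator pair for 28x28 grayscale images could look like the following; the layer widths and latent size `nz` are assumptions, not necessarily the notebook's choices:

```python
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, nz=100):
        super().__init__()
        self.net = nn.Sequential(
            # latent vector (nz, 1, 1) -> (128, 7, 7)
            nn.ConvTranspose2d(nz, 128, kernel_size=7, stride=1, padding=0, bias=False),
            nn.BatchNorm2d(128), nn.ReLU(inplace=True),
            # -> (64, 14, 14)
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            # -> (1, 28, 28); tanh matches inputs normalized to [-1, 1]
            nn.ConvTranspose2d(64, 1, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),  # 28 -> 14
            nn.Conv2d(64, 128, 4, 2, 1, bias=False),                                 # 14 -> 7
            nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 7, 1, 0, bias=False), nn.Sigmoid(),                    # 7 -> 1
        )

    def forward(self, x):
        return self.net(x).view(-1)  # one real/fake probability per image
```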
- **Training:**
  - Experimented with different weight initialization methods to assess their impact on training.
  - Implemented the training loop for DCGAN and trained the model (a condensed sketch of the loop is shown after this list).
  - Explained the adversarial loss function used for GAN training.
  - Monitored training progress using TensorBoard:
    - Plotted generator and discriminator losses over training iterations.
    - Visualized generator outputs on a fixed noise batch for each epoch.
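A condensed sketch of one possible training loop, assuming the `Generator`/`Discriminator` and `train_loader` sketches above are in scope; the hyperparameters, weight initialization, and TensorBoard tags are illustrative rather than the notebook's exact values:

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

device = "cuda" if torch.cuda.is_available() else "cpu"
G, D = Generator().to(device), Discriminator().to(device)

def weights_init(m):
    # DCGAN-paper-style initialization: N(0, 0.02) for (de)conv layers, N(1, 0.02) for batch norm.
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, 0.0, 0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, 1.0, 0.02)
        nn.init.zeros_(m.bias)

G.apply(weights_init); D.apply(weights_init)

criterion = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
writer = SummaryWriter("runs/dcgan")
fixed_noise = torch.randn(64, 100, 1, 1, device=device)

step = 0
for epoch in range(25):
    for real, _ in train_loader:
        real = real.to(device)
        b = real.size(0)
        fake = G(torch.randn(b, 100, 1, 1, device=device))

        # Discriminator step: push real images towards label 1 and generated ones towards 0.
        loss_d = criterion(D(real), torch.ones(b, device=device)) + \
                 criterion(D(fake.detach()), torch.zeros(b, device=device))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # Generator step (non-saturating loss): make the discriminator call the fakes real.
        loss_g = criterion(D(fake), torch.ones(b, device=device))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

        writer.add_scalar("loss/discriminator", loss_d.item(), step)
        writer.add_scalar("loss/generator", loss_g.item(), step)
        step += 1

    # Log images generated from the same fixed noise batch once per epoch.
    with torch.no_grad():
        writer.add_images("samples/fixed_noise", (G(fixed_noise) + 1) / 2, epoch)
```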
- **Evaluation:**
  - Qualitatively evaluated the generative performance by visualizing newly generated samples (a brief sampling sketch follows).
  - Addressed challenges encountered during training, such as mode collapse and training instability.
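One way to produce such a qualitative check, assuming the trained generator `G` and `device` from the sketches above:

```python
import torch
import matplotlib.pyplot as plt
from torchvision.utils import make_grid

G.eval()
with torch.no_grad():
    samples = G(torch.randn(64, 100, 1, 1, device=device)).cpu()

# Map tanh outputs from [-1, 1] back to [0, 1] and tile them into an 8x8 grid.
grid = make_grid((samples + 1) / 2, nrow=8)
plt.imshow(grid.permute(1, 2, 0))
plt.axis("off")
plt.show()
```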
- **Data Loading and Visualization:**
  - Loaded the CIFAR-10 dataset, comprising 32x32 color images across 10 classes (a short loading sketch follows).
  - Displayed sample images to understand the data characteristics.
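The CIFAR-10 loader differs from the Fashion-MNIST one mainly in its three color channels; a minimal sketch (the batch size is illustrative, and keeping pixels in [0, 1] is an assumption that pairs with the sigmoid decoder sketched later):

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# ToTensor keeps pixel values in [0, 1]; each image has shape (3, 32, 32).
cifar_train = datasets.CIFAR10(root="data", train=True, download=True,
                               transform=transforms.ToTensor())
cifar_loader = DataLoader(cifar_train, batch_size=128, shuffle=True)
print(cifar_train.classes)  # the 10 class names
```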
- **VAE Architecture:**
  - Implemented encoder and decoder networks to form the VAE, adapting the architectures to accommodate CIFAR-10's image size and color channels.
  - Selected an appropriate number of latent dimensions and justified the design choices (a simplified sketch follows below).
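A simplified convolutional VAE for 3x32x32 inputs; the layer widths and `latent_dim=128` are assumptions for illustration, not the notebook's exact configuration:

```python
import torch
import torch.nn as nn

class ConvVAE(nn.Module):
    """Sketch of a convolutional VAE for 3x32x32 images."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),    # 32 -> 16
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),   # 16 -> 8
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),  # 8 -> 4
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(128 * 4 * 4, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, 128 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (128, 4, 4)),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # 4 -> 8
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),    # 8 -> 16
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),  # 16 -> 32, outputs in [0, 1]
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(self.fc_dec(z)), mu, logvar
```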
- **Loss Function:**
  - Explained the combined reconstruction and KL divergence loss function used for VAE training (see the sketch below).
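The training objective is the negative evidence lower bound: a reconstruction term plus the KL divergence between the approximate posterior N(mu, sigma^2) and the standard normal prior. A sketch, using binary cross-entropy as the reconstruction term (one common choice; the notebook may use MSE instead):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar):
    # Reconstruction term: how closely the decoder output matches the input (summed over pixels).
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # KL divergence of N(mu, sigma^2) from N(0, I):
    # D_KL = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kld
```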
- **Training and Evaluation:**
  - Trained the VAE over multiple epochs (a condensed loop is sketched after this list).
  - Compared input and reconstructed images after each epoch to assess model performance.
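A condensed loop under the assumptions above (the hypothetical `ConvVAE`, `vae_loss`, `cifar_loader`, and `device` names come from the earlier sketches; the epoch count and learning rate are illustrative):

```python
import torch
from torchvision.utils import make_grid

model = ConvVAE().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):
    model.train()
    for x, _ in cifar_loader:
        x = x.to(device)
        recon, mu, logvar = model(x)
        loss = vae_loss(recon, x, mu, logvar)
        optimizer.zero_grad(); loss.backward(); optimizer.step()

    # Build a side-by-side grid of inputs (top row) and reconstructions (bottom row).
    model.eval()
    with torch.no_grad():
        x, _ = next(iter(cifar_loader))
        recon, _, _ = model(x[:8].to(device))
        comparison = make_grid(torch.cat([x[:8], recon.cpu()]), nrow=8)
```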
- **Reparameterization Trick:**
  - Explained the reparameterization trick and its role in enabling gradient-based optimization in VAEs (see the note below).
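The trick rewrites sampling z ~ N(mu, sigma^2) as a deterministic, differentiable function of mu and log(sigma^2) plus external noise, so gradients can flow back through the encoder; this is exactly the `reparameterize` method in the VAE sketch above:

```python
import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)   # sigma = exp(log(sigma^2) / 2)
    eps = torch.randn_like(std)     # eps ~ N(0, I), independent of the model parameters
    return mu + eps * std           # z = mu + sigma * eps, differentiable w.r.t. mu and logvar
```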
- **Challenges in GAN Optimization:**
  - Discussed difficulties in GAN training, such as convergence issues and mode collapse.
  - Explored common techniques to improve training stability, including learning rate adjustments and architectural modifications (two generic examples are sketched below).
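Two widely used stabilization tricks, shown purely as generic examples rather than the exact measures taken in the notebook; `G`, `D`, and `device` are assumed from the DCGAN sketches above:

```python
import torch

batch_size = 128  # illustrative

# Use different learning rates for the generator and discriminator instead of one shared value.
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))

# One-sided label smoothing: train the discriminator against 0.9 instead of 1.0 for real images.
real_labels = torch.full((batch_size,), 0.9, device=device)
fake_labels = torch.zeros(batch_size, device=device)
```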
- **Comparison of GANs and VAEs:**
  - Compared the strengths and weaknesses of GANs and VAEs.
  - Discussed scenarios where one model may be preferred over the other.
Requirements:

- Python 3.x
- PyTorch
- torchvision
- TensorBoard
- Jupyter Notebook
- **Clone the Repository:**

  ```bash
  git clone https://github.com/DevNerds2020/Generator_Deep_Learning_Models.git
  cd Generator_Deep_Learning_Models
  ```

- **Set Up the Environment:**

  Ensure all required packages are installed:

  ```bash
  pip install torch torchvision tensorboard jupyter
  ```

- **Run the Jupyter Notebooks:**

  Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```

  Open and execute the `DC_GAN.ipynb` and `VAE.ipynb` notebooks to explore the implementations.

- **Monitor Training with TensorBoard:**

  Start TensorBoard to visualize training progress:

  ```bash
  ./tensorboard.sh
  ```

  Access TensorBoard at `http://localhost:6006/`.
Winter Semester 2024/25
This project is licensed under the MIT License. See the LICENSE file for details.