Paper | Video | Project Page
Tobias Kirschstein, Simon Giebenhain, Jiapeng Tang, Markos Georgopoulos,
and Matthias Nießner
SIGGRAPH Asia 2024
- Create conda environment `gghead` with the newest PyTorch and CUDA 11.8:
  ```shell
  conda env create -f environment.yml
  ```
- Ensure that `nvcc` is taken from the conda environment and that the CUDA include files can be found:
  - [Linux]
    ```shell
    conda activate gghead
    conda env config vars set CUDA_HOME=$CONDA_PREFIX
    conda activate base
    conda activate gghead
    ```
  - [Windows]
    ```shell
    conda activate gghead
    conda env config vars set CUDA_HOME=$Env:CONDA_PREFIX
    conda env config vars set NVCC_PREPEND_FLAGS="-I$Env:CONDA_PREFIX\Library\include"
    conda activate base
    conda activate gghead
    ```
- Check whether the correct `nvcc` can be found on the path via:
  ```shell
  nvcc --version
  ```
  which should say something like `release 11.8`.
- Install Gaussian Splatting (which upon installation will compile CUDA kernels with `nvcc`):
  ```shell
  pip install gaussian_splatting@git+https://github.com/tobias-kirschstein/gaussian-splatting.git
  ```
- [Optional] If you compile the CUDA kernels on a different machine than the one you use for running the code, you may need to manually specify the target GPU compute architecture for the compilation process via the `TORCH_CUDA_ARCH_LIST` environment variable:
  ```shell
  TORCH_CUDA_ARCH_LIST="8.0" pip install gaussian_splatting@git+https://github.com/tobias-kirschstein/gaussian-splatting.git
  ```
  Choose the correct compute architecture(s) that match your setup. Consult this website if unsure about the compute architecture of your graphics card, or query it directly with PyTorch as shown in the sketch after this list.
- [Troubleshooting] On a Linux machine, if you run into
  ```
  gcc: fatal error: cannot execute 'cc1plus': posix_spawnp: No such file or directory
  ```
  or
  ```
  x86_64-conda_cos6-linux-gnu-cc: error trying to exec 'cc1plus': execvp: No such file or directory
  ```
  try
  ```shell
  conda install gxx_linux-64 gcc_linux-64
  ```
- Finally, install the `gghead` module via:
  ```shell
  pip install -e .
  ```
All paths to data / models / renderings are defined by environment variables.
Please create a file at `~/.config/gghead/.env` with the following content:
```shell
GGHEAD_DATA_PATH = "..."
GGHEAD_MODELS_PATH = "..."
GGHEAD_RENDERINGS_PATH = "..."
```
Replace the `...` with the locations where data / models / renderings should be located on your machine.
- `GGHEAD_DATA_PATH`: Location of the FFHQ dataset and foreground masks. Only needed for training. See Section 2 for how to obtain the datasets.
- `GGHEAD_MODELS_PATH`: During training, model checkpoints and configs will be saved here. See Section 4 for downloading pre-trained models.
- `GGHEAD_RENDERINGS_PATH`: Video renderings of trained models will be stored here.
If you do not like creating a config file in your home directory, you can instead hard-code the paths in `env.py`.
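For reference, here is a minimal sketch of how such a `.env` file can be read in Python with the `python-dotenv` package. It is illustrative only; the repository's own `env.py` may resolve the paths differently:

```python
import os
from pathlib import Path

from dotenv import load_dotenv  # pip install python-dotenv

# Load the variables defined in ~/.config/gghead/.env into os.environ.
load_dotenv(Path.home() / ".config" / "gghead" / ".env")

GGHEAD_DATA_PATH = os.environ["GGHEAD_DATA_PATH"]
GGHEAD_MODELS_PATH = os.environ["GGHEAD_MODELS_PATH"]
GGHEAD_RENDERINGS_PATH = os.environ["GGHEAD_RENDERINGS_PATH"]
```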
Only needed for training. Download the zip files of the respective datasets and put them into `${GGHEAD_DATA_PATH}`.
| Dataset   | Images + Cameras      | Masks                              |
|-----------|-----------------------|------------------------------------|
| FFHQ-512  | FFHQ_png_512.zip      | FFHQ_png_512_masks_modnet.zip      |
| FFHQ-1024 | FFHQ_png_1024.zip     | FFHQ_png_1024_masks_modnet.zip     |
| AFHQ-512  | afhq_v2_processed.zip | afhq_v2_processed_masks_modnet.zip |
The `.zip` files for "Images + Cameras" were created with the dataset creation script of EG3D at the respective resolution. The `.zip` files for "Masks" were obtained by running the background matting module MODNet on each image.
The dataset files are under a Creative Commons BY-NC-SA 4.0 license, being derivatives of the FFHQ dataset. This means you can use, redistribute, and adapt them for non-commercial purposes, as long as you (a) give appropriate credit by citing the StyleGAN paper, (b) indicate any changes that you have made, and (c) distribute any derivative works under the same license (https://creativecommons.org/licenses/by-nc-sa/4.0/).
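To illustrate how such foreground mattes are typically applied, here is a small compositing sketch. The file names are hypothetical placeholders and the code is not part of this repository:

```python
import numpy as np
from PIL import Image

# Hypothetical file names; the actual layout inside the zips may differ.
image = np.asarray(Image.open("00000.png").convert("RGB"), dtype=np.float32) / 255.0
alpha = np.asarray(Image.open("00000_mask.png").convert("L"), dtype=np.float32) / 255.0

# Alpha-composite the head onto a plain white background using the matte.
background = np.ones_like(image)
composite = alpha[..., None] * image + (1.0 - alpha[..., None]) * background

Image.fromarray((composite * 255).astype(np.uint8)).save("00000_composited.png")
```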
```shell
BW_IMPLEMENTATION=1 python scripts/train_gghead.py ffhq FFHQ_png_512.zip 1 32 --kimg 6400
```
will start training GGHead on 1 GPU with a batch size of 32 for 6400k images. To speed up training, you can use more GPUs. E.g.,
```shell
python scripts/train_gghead.py ffhq FFHQ_png_512.zip 4 32
```
will train on 4 GPUs with a batch size of 8 per GPU instead.
Assets produced during training:
- Loss curves will be logged to Weights & Biases into a project called `generative-gaussian-heads`
- Generated images will be logged periodically to `${GGHEAD_MODELS_PATH}/gghead/GGHEAD-xxx` every 200k train images
- Checkpoints are stored in `${GGHEAD_MODELS_PATH}/gghead/GGHEAD-xxx/checkpoints` every 200k train images
- Evaluation results with FID scores are stored in `${GGHEAD_MODELS_PATH}/gghead/GGHEAD-xxx/evaluations` every 200k train images
To continue training (stage 2), run:
```shell
BW_IMPLEMENTATION=1 python scripts/train_gghead.py ffhq FFHQ_png_512.zip 1 32 --kimg 25000 --resume_run GGHEAD-xxx --overwrite_resolution 512 --overwrite_n_uniform_flame_vertices 512 --overwrite_lambda_tv_uv_rendering 100 --overwrite_lambda_beta_loss 1
```
Replace `GGHEAD-xxx` with the name of the run from stage 1, e.g., `GGHEAD-4` (everything after the number can be omitted).
Noteworthy training options:
- `use_vis_window`: For local debugging. Opens a dearpygui window showing live training progress.
- `image_snapshot_ticks`: How often debug images will be stored during training. Default: every 50 ticks = every 200k train images (1 tick = 4k images).
- `metrics`: Which FID scores to compute during training. Default: `fid100,fid1k,fid10k`. Always computing the FID score with 50k generated samples is expensive during training; in our experience, generating only 10k images is already enough to assess which run performs better.
To continue training at resolution 1024 on FFHQ-1024, run:
```shell
BW_IMPLEMENTATION=1 python scripts/train_gghead.py ffhq FFHQ_png_1024.zip 1 32 --kimg 27000 --resume_run GGHEAD-xxx --overwrite_resolution 1024
```
Replace `GGHEAD-xxx` with the name of the run from stage 2.
To train on AFHQ, run:
```shell
BW_IMPLEMENTATION=1 python scripts/train_gghead.py ffhq afhq_v2_processed.zip 1 32 --kimg 30000 --resume_run GGHEAD-xxx
```
Replace `GGHEAD-xxx` with the name of the run from stage 2.
From a trained model `GGHEAD-xxx`, render short videos of randomly sampled 3D heads via:
```shell
python scripts/sample_heads.py GGHEAD-xxx
```
Replace `xxx` with the actual ID of the model.
The generated videos will be placed into `${GGHEAD_RENDERINGS_PATH}/sampled_heads/`.
From a trained model `GGHEAD-xxx`, render interpolation videos that morph between randomly sampled 3D heads via:
```shell
python scripts/render_interpolation.py GGHEAD-xxx
```
Replace `xxx` with the actual ID of the model.
The generated videos will be placed into `${GGHEAD_RENDERINGS_PATH}/interpolations/`.
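For intuition, interpolation videos like these are commonly produced by moving between two latent codes and decoding every intermediate point. Below is a generic spherical linear interpolation (slerp) sketch, a common choice for Gaussian latent spaces; it is not the repository's actual implementation:

```python
import numpy as np

def slerp(z0: np.ndarray, z1: np.ndarray, t: float) -> np.ndarray:
    """Spherical linear interpolation between two latent vectors."""
    omega = np.arccos(np.clip(
        np.dot(z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z0  # vectors are (nearly) parallel, nothing to interpolate
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

# Example: 10 latents on the path between two random 512-dim codes.
rng = np.random.default_rng(0)
z_a, z_b = rng.standard_normal(512), rng.standard_normal(512)
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 10)]
```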
```shell
python scripts/evaluate_fid.py GGHEAD-xxx
```
calculates the FID score between generated images and the dataset images for the model `GGHEAD-xxx` (replace `xxx` with the specific run ID that you want to evaluate).
The default number of generated samples for FID calculation is 50000, which can be changed via `--fid`.
The evaluation result will be printed in the terminal and also stored as a JSON file in `${GGHEAD_MODELS_PATH}/gghead/GGHEAD-xxx/evaluations`.
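As background: FID compares Inception-feature statistics of real and generated images. With feature means and covariances (μ_r, Σ_r) for real and (μ_g, Σ_g) for generated images, it is defined as

```math
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)
```

Lower is better. Because the statistics are estimated from samples, FID computed from fewer generated images (e.g., `fid10k` during training) is noisier but considerably cheaper than the full 50k evaluation.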
The `notebooks/` folder contains minimal examples of how to:
- Load a trained model, generate a 3D head, and render it from an arbitrary viewpoint (`inference.ipynb`)
You can start the excellent GUI from EG3D and StyleGAN by running:
```shell
python visualizer.py
```
In the visualizer, you can select all checkpoints found in `${GGHEAD_MODELS_PATH}/gghead` and freely explore the generated heads in 3D.
Put pre-trained models into `${GGHEAD_MODELS_PATH}/gghead`.
| Dataset   | GGHead model      |
|-----------|-------------------|
| FFHQ-512  | GGHEAD-1_ffhq512  |
| FFHQ-1024 | GGHEAD-2_ffhq1024 |
| AFHQ-512  | GGHEAD-3-afhq512  |
```bibtex
@article{kirschstein2024gghead,
  title={GGHead: Fast and Generalizable 3D Gaussian Heads},
  author={Kirschstein, Tobias and Giebenhain, Simon and Tang, Jiapeng and Georgopoulos, Markos and Nie{\ss}ner, Matthias},
  journal={arXiv preprint arXiv:2406.09377},
  year={2024}
}
```
Contact Tobias Kirschstein for questions, comments, and bug reports, or open a GitHub issue.