(Demo video: example.mp4)
- Support 2D-GS
- Long sequence cross window alignment
- Support Mip-Splatting
To make your life a lot easier, follow along with the tutorial video I created.
Watch it here
- Clone InstantSplat and download the pre-trained model.
git clone --recursive https://github.com/NVlabs/InstantSplat.git
cd InstantSplat
if not exist "mast3r\checkpoints" mkdir "mast3r\checkpoints"
curl -o mast3r\checkpoints\MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth ^
https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
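If curl is unavailable on your system, a minimal Python sketch like the following downloads the same checkpoint to the same location; the URL and destination path simply mirror the curl command above:
# fallback_download.py -- alternative to the curl command above
import os
import urllib.request

URL = ("https://download.europe.naverlabs.com/ComputerVision/MASt3R/"
       "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth")
DEST = os.path.join("mast3r", "checkpoints",
                    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth")

os.makedirs(os.path.dirname(DEST), exist_ok=True)
if not os.path.exists(DEST):
    print(f"Downloading {URL} ...")
    urllib.request.urlretrieve(URL, DEST)
print(f"Checkpoint present: {DEST} ({os.path.getsize(DEST) / 1e6:.1f} MB)")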
- Create the environment (or use the pre-built Docker image); here we show an example using conda.
conda create -n instantsplat python=3.10.13 cmake=3.14.0 -y
conda activate instantsplat
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia # use the correct version of cuda for your system
pip install -r requirements.txt
pip install submodules/simple-knn
pip install submodules/diff-gaussian-rasterization
pip install submodules/fused-ssim
pip install plyfile
pip install open3d
pip install "imageio[ffmpeg]"
- Optional but highly recommended: compile the CUDA kernels for RoPE (as in CroCo v2).
# DUSt3R relies on RoPE positional embeddings, for which you can compile CUDA kernels for faster runtime.
cd croco/models/curope/
python setup.py build_ext --inplace
cd ../../..
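To confirm the kernels actually built, you can try importing the compiled extension from the repo root. Treat the module name and path below as my reading of the curope setup.py, not a documented API:
# verify the curope build; module name `curope` is my reading of setup.py (verify locally)
import sys
sys.path.insert(0, "croco/models/curope")
import curope  # import fails if the CUDA kernels did not compile
print("curope CUDA kernels OK")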
- Download run_infer.py and instantsplat_gradio.py and place them in the root folder:
C:\Users\<username>\InstantSplat
Note: with CUDA Toolkit 12.6 installed, I ran into issues running:
conda install pytorch torchvision pytorch-cuda=12.6 -c pytorch -c nvidia
As a workaround, I downloaded and installed CUDA Toolkit 11.8, then pointed the command session at CUDA 11.8 using:
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
set PATH=%CUDA_HOME%\bin;%PATH%
set LD_LIBRARY_PATH=%CUDA_HOME%\lib64;%LD_LIBRARY_PATH%
Then run:
conda install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia
You can check which version of the CUDA Toolkit you are running with nvcc --version.
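If you suspect a mismatch, a small sketch like this compares the toolkit version nvcc reports against the CUDA version PyTorch was built with (purely illustrative):
# cuda_match.py -- compare the system nvcc against PyTorch's CUDA build
import re
import subprocess
import torch

out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
match = re.search(r"release (\d+\.\d+)", out)
print("nvcc toolkit:", match.group(1) if match else "not found")
print("torch CUDA build:", torch.version.cuda)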
The original project provides a few examples to try; you can also download their pre-processed data: link
Place 3, 6, or 12 photos in an images folder nested inside a project folder. Here is an example of what it should look like (a small validation sketch follows the tree):
Projects/
└── Scene/
    └── images/
        ├── image1.jpg
        ├── image2.jpg
        └── image3.jpg
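Before running, you can sanity-check a scene folder with a minimal sketch like this one. The script and its name are hypothetical (not part of the repo); it only assumes the layout shown above:
# check_scene.py -- hypothetical helper to validate a scene folder, not part of the repo
import sys
from pathlib import Path

scene = Path(sys.argv[1])            # e.g. Projects/Scene
images = scene / "images"
if not images.is_dir():
    sys.exit(f"missing images/ folder under {scene}")
files = [p for p in images.iterdir() if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]
print(f"{len(files)} images found in {images}")
if len(files) not in (3, 6, 12):
    print("warning: the counts quoted above are 3, 6, or 12")
Usage: python check_scene.py Projects/Scene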
InstantSplat comes with example data to use as a test, located at:
assets/
└── sora/
    ├── Santorini/
    │   └── images/
    │       ├── image1.jpg
    │       ├── image2.jpg
    │       └── image3.jpg
    └── Art/
        └── images/
            ├── image1.jpg
            ├── image2.jpg
            └── image3.jpg
The Windows implementation currently supports inference only. If you are looking to run evaluation, refer to the original project page.
Run python instantsplat_gradio.py
Once launched, navigate to http://127.0.0.1:7860/ in your browser.
Run python run_infer.py /path/to/input/images /path/to/output --n_views 3 --iterations 1000
Command-line arguments:
--n_views      Number of input views. Must be 3, 6, or 9.
--iterations   Number of training iterations; can be set from 1000 to 30000. Increasing in increments of 1000 is suggested.
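If you want to batch several scenes rather than typing the command each time, a wrapper sketch along these lines works; the scene and output paths are just examples based on the assets folder above:
# batch_infer.py -- illustrative wrapper around the run_infer.py command above
import subprocess
import sys

SCENES = [
    ("assets/sora/Santorini/images", "output/Santorini"),
    ("assets/sora/Art/images", "output/Art"),
]

for images, out in SCENES:
    cmd = [sys.executable, "run_infer.py", images, out,
           "--n_views", "3", "--iterations", "1000"]
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)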
This work is built on many amazing research works and open-source projects, thanks a lot to all the authors for sharing!
If you find our work useful in your research, please consider giving a star ⭐ and citing the following paper 📝.
@misc{fan2024instantsplat,
title={InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds},
author={Zhiwen Fan and Wenyan Cong and Kairun Wen and Kevin Wang and Jian Zhang and Xinghao Ding and Danfei Xu and Boris Ivanovic and Marco Pavone and Georgios Pavlakos and Zhangyang Wang and Yue Wang},
year={2024},
eprint={2403.20309},
archivePrefix={arXiv},
primaryClass={cs.CV}
}