INSTALL.md

Installing Faiss via conda

The supported way to install Faiss is through conda. Stable releases, as well as pre-release nightly builds, are pushed regularly to the pytorch conda channel.

  • The CPU-only faiss-cpu conda package is currently available on Linux (x86-64 and aarch64), OSX (arm64 only), and Windows (x86-64)
  • faiss-gpu, containing both CPU and GPU indices, is available on Linux (x86-64 only) for CUDA 11.4 and 12.1
  • faiss-gpu-raft, containing both CPU and GPU indices provided by NVIDIA RAFT, is available on Linux (x86-64 only) for CUDA 11.8 and 12.1.

To install the latest stable release:

# CPU-only version
$ conda install -c pytorch faiss-cpu=1.9.0

# GPU(+CPU) version
$ conda install -c pytorch -c nvidia faiss-gpu=1.9.0

# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.9.0

# GPU(+CPU) version using AMD ROCm not yet available

For faiss-gpu, the nvidia channel is required for CUDA, which is not published in the main anaconda channel.

For faiss-gpu-raft, the nvidia, rapidsai and conda-forge channels are required.

Nightly pre-release packages can be installed as follows:

# CPU-only version
$ conda install -c pytorch/label/nightly faiss-cpu

# GPU(+CPU) version
$ conda install -c pytorch/label/nightly -c nvidia faiss-gpu=1.9.0

# GPU(+CPU) version with NVIDIA RAFT
$ conda install -c pytorch -c nvidia -c rapidsai -c conda-forge faiss-gpu-raft=1.9.0 pytorch pytorch-cuda numpy

# GPU(+CPU) version using AMD ROCm not yet available

In the above commands, pytorch-cuda=11 or pytorch-cuda=12 would select a specific CUDA version, if it’s required.

A combination of versions that installs GPU Faiss with CUDA and PyTorch (as of 2024-05-15):

conda create --name faiss_1.8.0
conda activate faiss_1.8.0
conda install -c pytorch -c nvidia faiss-gpu=1.8.0 pytorch=*=*cuda* pytorch-cuda=11 numpy

Installing from conda-forge

Faiss is also being packaged by conda-forge, the community-driven packaging ecosystem for conda. The packaging effort is collaborating with the Faiss team to ensure high-quality package builds.

Because of the comprehensive infrastructure of conda-forge, certain build combinations may be supported in conda-forge that are not available through the pytorch channel. To install, use:

# CPU version
$ conda install -c conda-forge faiss-cpu

# GPU version
$ conda install -c conda-forge faiss-gpu

# AMD ROCm version not yet available

You can tell which channel your conda packages come from by using conda list. If you are having problems using a package built by conda-forge, please raise an issue on the conda-forge package "feedstock".
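As a minimal sketch of that check, the helper below filters `conda list` output down to the Faiss packages and their channel. The function name `faiss_channels` is hypothetical, and the sketch assumes the usual four-column layout (name, version, build, channel), where an empty channel column means the package came from the default channel.

```shell
# Hypothetical helper: read `conda list` output on stdin and print
# "<package> <channel>" for every package whose name contains "faiss".
# Assumes the four-column layout: Name, Version, Build, Channel.
faiss_channels() {
  awk '!/^#/ && $1 ~ /faiss/ { print $1, (NF >= 4 ? $4 : "(default channel)") }'
}

# Typical use inside an activated environment:
#   conda list | faiss_channels
```

Running `conda list faiss` gives the same information without the helper, since `conda list` accepts a pattern to filter package names.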

Building from source

Faiss can be built from source using CMake.

Faiss is supported on x86-64 machines on Linux, OSX, and Windows. It has been found to run on other platforms as well; see other platforms.

The basic requirements are:

  • a C++17 compiler (with support for OpenMP version 2 or higher),
  • a BLAS implementation (on Intel machines we strongly recommend using Intel MKL for best performance).

The optional requirements are:

  • for GPU indices:
    • nvcc,
    • the CUDA toolkit,
  • for Intel®-AMX/oneDNN acceleration:
    • oneDNN,
    • 4th+ Gen Intel® Xeon® Scalable processor.
  • for AMD GPUs:
    • AMD ROCm,
  • for the python bindings:
    • python 3,
    • numpy,
    • and swig.

Indications for specific configurations are available in the troubleshooting section of the wiki.

Step 1: invoking CMake

$ cmake -B build .

This generates the system-dependent configuration/build files in the build/ subdirectory.

Several options can be passed to CMake, among which:

  • general options:
    • -DFAISS_ENABLE_GPU=OFF in order to disable building GPU indices (possible values are ON and OFF),
    • -DFAISS_ENABLE_DNNL=OFF in order to disable support for Intel®-AMX/oneDNN acceleration of IndexFlat inner-product search (possible values are ON and OFF; before invoking CMake with this option set to ON, you can refer to this link for installing oneDNN),
    • -DFAISS_ENABLE_PYTHON=OFF in order to disable building python bindings (possible values are ON and OFF),
    • -DFAISS_ENABLE_RAFT=ON in order to enable building the RAFT implementations of the IVF-Flat and IVF-PQ GPU-accelerated indices (default is OFF, possible values are ON and OFF),
    • -DBUILD_TESTING=OFF in order to disable building C++ tests,
    • -DBUILD_SHARED_LIBS=ON in order to build a shared library (possible values are ON and OFF),
    • -DFAISS_ENABLE_C_API=ON in order to enable building C API (possible values are ON and OFF),
  • optimization-related options:
    • -DCMAKE_BUILD_TYPE=Release in order to enable generic compiler optimization options (enables -O3 on gcc for instance),
    • -DFAISS_OPT_LEVEL=avx2 in order to enable the required compiler flags to generate code using optimized SIMD/Vector instructions. Possible values are below:
      • On x86-64, generic, avx2 and avx512, by increasing order of optimization,
      • On aarch64, generic and sve, by increasing order of optimization,
    • -DFAISS_USE_LTO=ON in order to enable Link-Time Optimization (default is OFF, possible values are ON and OFF).
  • BLAS-related options:
    • -DBLA_VENDOR=Intel10_64_dyn -DMKL_LIBRARIES=/path/to/mkl/libs to use the Intel MKL BLAS implementation, which is significantly faster than OpenBLAS (more information about the values for the BLA_VENDOR option can be found in the CMake docs),
  • GPU-related options:
    • -DCUDAToolkit_ROOT=/path/to/cuda-10.1 in order to hint to the path of the CUDA toolkit (for more information, see CMake docs),
    • -DCMAKE_CUDA_ARCHITECTURES="75;72" for specifying which GPU architectures to build against (see CUDA docs to determine which architecture(s) you should pick),
    • -DFAISS_ENABLE_ROCM=ON in order to enable building GPU indices for AMD GPUs. -DFAISS_ENABLE_GPU must be ON when using this option. (possible values are ON and OFF),
  • python-related options:
    • -DPython_EXECUTABLE=/path/to/python3.7 in order to build a python interface for a different python than the default one (see CMake docs).
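When choosing a value for -DFAISS_OPT_LEVEL on x86-64, it can help to look at the CPU flags of the build machine. The sketch below is only a rough aid: the function name `pick_opt_level` is hypothetical, and it treats the `avx512f` flag as a proxy for AVX-512 support, which is a simplification (a given Faiss build may expect further AVX-512 subsets).

```shell
# Hypothetical helper: read a whitespace-separated list of CPU flags on
# stdin and print a candidate value for -DFAISS_OPT_LEVEL (x86-64 only).
# Assumption: "avx512f" alone is used as a proxy for AVX-512 support.
pick_opt_level() {
  # Normalize newlines and tabs to spaces and pad both ends so that
  # each flag can be matched as a whole word.
  flags=" $(tr '\n\t' '  ') "
  case "$flags" in
    *" avx512f "*) echo avx512 ;;
    *" avx2 "*)    echo avx2 ;;
    *)             echo generic ;;
  esac
}

# On Linux, the flags can be taken from /proc/cpuinfo, for example:
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 | pick_opt_level
```

The printed value can then be passed to CMake, for instance `cmake -B build . -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2`.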

Step 2: invoking Make

$ make -C build -j faiss

This builds the C++ library (libfaiss.a by default, and libfaiss.so if -DBUILD_SHARED_LIBS=ON was passed to CMake).

The -j option enables parallel compilation of multiple units, which speeds up the build but increases the chance of running out of memory. If that happens, set the -j option to a fixed value (such as -j4).
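One way to pick a fixed value is to cap the job count at the number of online CPUs. The following is a sketch under stated assumptions: `capped_jobs` is a hypothetical helper, the default cap of 4 simply mirrors the -j4 example above, and the CPU count is read with `getconf _NPROCESSORS_ONLN`, which is available on Linux and OSX.

```shell
# Hypothetical helper: print min(cap, cpus).
#   $1 = upper bound on parallel jobs (default 4)
#   $2 = CPU count (default: detected via getconf, falling back to 1)
capped_jobs() {
  cap="${1:-4}"
  cpus="${2:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
  if [ "$cpus" -lt "$cap" ]; then
    echo "$cpus"
  else
    echo "$cap"
  fi
}

# Example use with the build command from this step:
#   make -C build -j "$(capped_jobs 4)" faiss
```

A lower cap trades build time for memory headroom, since each parallel compiler process holds its own working set.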

If making use of optimization options, build the correct target before swigfaiss.

For AVX2:

$ make -C build -j faiss_avx2

For AVX512:

$ make -C build -j faiss_avx512

This will ensure the creation of necessary files when building and installing the python package.

Step 3: Building the python bindings (optional)

$ make -C build -j swigfaiss
$ (cd build/faiss/python && python setup.py install)

The first command builds the python bindings for Faiss, while the second one generates and installs the python package.
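After installing, a quick way to confirm that the package is visible to the interpreter is to import it and print its version. This is a minimal sketch: `check_faiss_import` is a hypothetical helper, and it assumes the interpreter passed to it (default `python`) is the same one used for `setup.py install`.

```shell
# Hypothetical helper: try to import faiss with the given interpreter.
#   $1 = python executable (default: python)
# Prints "faiss <version>" and returns 0 on success; prints a message
# on stderr and returns 1 otherwise.
check_faiss_import() {
  py="${1:-python}"
  if "$py" -c 'import faiss; print("faiss", faiss.__version__)' 2>/dev/null; then
    return 0
  fi
  echo "faiss is not importable with $py" >&2
  return 1
}

# Example:
#   check_faiss_import python3
```

If the import fails, a common cause is running the check from a different environment than the one the package was installed into.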

Step 4: Installing the C++ library and headers (optional)

$ make -C build install

This will make the compiled library (either libfaiss.a or libfaiss.so on Linux) available system-wide, as well as the C++ headers. This step is not needed to install the python package only.

Step 5: Testing (optional)

Running the C++ test suite

To run the whole test suite, make sure that cmake was invoked with -DBUILD_TESTING=ON, and run:

$ make -C build test

Running the python test suite

$ (cd build/faiss/python && python setup.py build)
$ PYTHONPATH="$(ls -d ./build/faiss/python/build/lib*/)" pytest tests/test_*.py

Basic example

A basic usage example is available in demos/demo_ivfpq_indexing.cpp.

It creates a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s.

It can be built with

$ make -C build demo_ivfpq_indexing

and subsequently run with

$ ./build/demos/demo_ivfpq_indexing

Basic GPU example

$ make -C build demo_ivfpq_indexing_gpu
$ ./build/demos/demo_ivfpq_indexing_gpu

This produces the GPU equivalent of the CPU demo_ivfpq_indexing. It also shows how to transfer indexes to and from a GPU.

A real-life benchmark

A longer example runs and evaluates Faiss on the SIFT1M dataset. To run it, please download the ANN_SIFT1M dataset from http://corpus-texmex.irisa.fr/ and unzip it to the subdirectory sift1M at the root of the source directory for this repository.
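Before building the demo, it may be worth checking that the dataset landed where expected. The sketch below lists any missing files; `missing_sift1m_files` is a hypothetical helper, and the four file names are an assumption based on the usual contents of the ANN_SIFT1M archive, so adjust them if your copy differs.

```shell
# Hypothetical helper: print the names of expected SIFT1M files that are
# missing from the given directory (default: sift1M in the current dir).
# The file names below are assumed from the usual ANN_SIFT1M layout.
missing_sift1m_files() {
  dir="${1:-sift1M}"
  for f in sift_base.fvecs sift_learn.fvecs sift_query.fvecs sift_groundtruth.ivecs; do
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Example, run from the root of the source directory:
#   missing_sift1m_files sift1M
```

Empty output means all four files were found. Note that the archive may unpack into a directory with a different name, in which case it needs to be renamed to sift1M.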

Then compile and run the following (after ensuring you have installed faiss):

$ make -C build demo_sift1M
$ ./build/demos/demo_sift1M

This is a demonstration of the high-level auto-tuning API. You can try setting a different index_key to find the indexing structure that gives the best performance.

Real-life test

The following script extends the demo_sift1M test to several types of indexes. This must be run from the root of the source directory for this repository:

$ mkdir tmp  # graphs of the output will be written here
$ python demos/demo_auto_tune.py

It will cycle through a few types of indexes and find optimal operating points. You can play around with the types of indexes.

Real-life test on GPU

The example above also runs on GPU. Edit demos/demo_auto_tune.py at line 100 with the values

keys_to_test = keys_gpu
use_gpu = True

and you can run

$ python demos/demo_auto_tune.py

to test the GPU code.