- Python == 3.10
- GPU with Ampere or Hopper architecture (such as H100, A100)
- Linux OS
It is recommended to create a Python 3.10 virtual environment using conda; the commands are as follows:
conda create --name internevo python=3.10 -y
conda activate internevo
First, install the specified versions of torch, torchvision, torchaudio, and torch-scatter:
pip install --extra-index-url https://download.pytorch.org/whl/cu118 torch==2.1.0+cu118 torchvision==0.16.0+cu118 torchaudio==2.1.0+cu118
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
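After installation, you can confirm that the CUDA-enabled wheel was picked up by inspecting the local version tag (e.g. `2.1.0+cu118`). A minimal sketch; the helper below is illustrative and not part of InternEvo:

```python
# Sketch: check that a torch version string carries the expected CUDA build tag.
def has_cuda_tag(version: str, tag: str = "cu118") -> bool:
    """Return True if a version string like '2.1.0+cu118' ends with the given tag."""
    _, _, local = version.partition("+")
    return local == tag

try:
    import torch  # only available inside the activated environment
    print(torch.__version__, has_cuda_tag(torch.__version__))
except ImportError:
    print("torch is not installed in this environment")
```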
Install InternEvo:
pip install InternEvo
Install flash-attention (version v2.2.1):
If you need to use flash-attention to accelerate training, and it is supported in your environment, install as follows:
pip install flash-attn==2.2.1
Install Apex (version 23.05):
Apex is an optional package; if you choose to install it, follow the instructions in Install through source code.
Install megablocks (version 0.3.2):
If the model you are training is MoE-related, install as follows:
pip install git+https://github.com/databricks/[email protected] # MoE-related
The required packages and their corresponding versions are as follows:
- GCC == 10.2.0
- MPFR == 4.1.0
- CUDA >= 11.8
- PyTorch >= 2.1.0
- Transformers >= 4.28.0
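The version constraints above can be checked programmatically before building. A minimal sketch, assuming simple dotted version strings; the requirement table and comparison logic below are illustrative, not part of InternEvo:

```python
# Sketch: compare dotted version strings against the constraints listed above.
def parse(version: str) -> tuple:
    """Turn '2.1.0' (or '2.1.0+cu118') into (2, 1, 0); trailing non-numeric parts are dropped."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

# name -> (operator, required version), mirroring the list above
REQUIREMENTS = {
    "gcc": ("==", "10.2.0"),
    "mpfr": ("==", "4.1.0"),
    "cuda": (">=", "11.8"),
    "torch": (">=", "2.1.0"),
    "transformers": (">=", "4.28.0"),
}

def satisfies(installed: str, op: str, required: str) -> bool:
    """True if the installed version satisfies the '==' or '>=' constraint."""
    a, b = parse(installed), parse(required)
    return a == b if op == "==" else a >= b
```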
After installing the above dependencies, some system environment variables need to be updated:
export CUDA_PATH={path_of_cuda_11.8}
export GCC_HOME={path_of_gcc_10.2.0}
export MPFR_HOME={path_of_mpfr_4.1.0}
export LD_LIBRARY_PATH=${GCC_HOME}/lib64:${MPFR_HOME}/lib:${CUDA_PATH}/lib64:$LD_LIBRARY_PATH
export PATH=${GCC_HOME}/bin:${CUDA_PATH}/bin:$PATH
export CC=${GCC_HOME}/bin/gcc
export CXX=${GCC_HOME}/bin/c++
Clone the InternEvo project and its submodules from the GitHub repository, as follows:
git clone [email protected]:InternLM/InternEvo.git --recurse-submodules
It is recommended to create a Python 3.10 virtual environment using conda and install the required dependencies from the files under requirements/:
conda create --name internevo python=3.10 -y
conda activate internevo
cd InternEvo
pip install -r requirements/torch.txt
pip install -r requirements/runtime.txt
Install flash-attention (version v2.2.1):
cd ./third_party/flash-attention
python setup.py install
cd ./csrc
cd xentropy && pip install -v .
cd ../rotary && pip install -v .
cd ../../../../
Install Apex (version 23.05):
cd ./third_party/apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key...
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ../../
Install megablocks (version 0.3.2):
pip install git+https://github.com/databricks/[email protected] # MoE-related
Users can build their own images with the provided Dockerfile combined with docker.Makefile, or pull images with the InternEvo runtime environment pre-installed from https://hub.docker.com/r/internlm/internevo/tags.
The configuration and build of the Dockerfile are implemented through the docker.Makefile. To build the image, execute the following command in the root directory of InternEvo:
make -f docker.Makefile BASE_OS=centos7
In docker.Makefile you can customize the base image, environment versions, etc., and the corresponding parameters can be passed directly on the command line; the defaults are the recommended environment versions. BASE_OS supports both ubuntu20.04 and centos7.
Standard images based on Ubuntu and CentOS have already been built and can be pulled directly:
# ubuntu20.04
docker pull internlm/internevo:torch2.1.0-cuda11.8.0-flashatten2.2.1-ubuntu20.04
# centos7
docker pull internlm/internevo:torch2.1.0-cuda11.8.0-flashatten2.2.1-centos7
For a standard image built locally from the Dockerfile or pulled from the hub, use the following command to run it and enter the container:
docker run --gpus all -it -m 500g --cap-add=SYS_PTRACE --cap-add=IPC_LOCK --shm-size 20g --network=host --name internevo_centos internlm/internevo:torch2.1.0-cuda11.8.0-flashatten2.2.1-centos7 bash
The default working directory in the container is /InternEvo; start training according to the Usage documentation. An example command for single-node, 8-GPU training of the default 7B model:
torchrun --nproc_per_node=8 --nnodes=1 train.py --config configs/7B_sft.py --launcher torch
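As a reference when scaling the command above to multiple nodes, the global rank that torchrun-style launchers assign to each worker follows this mapping (a minimal sketch of the general convention, not InternEvo code):

```python
def global_rank(node_rank: int, local_rank: int, nproc_per_node: int) -> int:
    """Global rank of a worker given its node index and local GPU index."""
    return node_rank * nproc_per_node + local_rank

def world_size(nnodes: int, nproc_per_node: int) -> int:
    """Total number of worker processes across all nodes."""
    return nnodes * nproc_per_node
```

With the single-node command above (`--nnodes=1 --nproc_per_node=8`), the world size is 8 and global ranks 0 through 7 map one-to-one to the local GPUs.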
For NPU machines, the installation environment versions can follow those for GPU. On NPU, use Ascend's torch_npu instead of torch. Flash-Attention and Apex are not supported for installation on NPU; their functionality has been implemented internally in the InternEvo codebase. The following tutorial covers only the installation of torch_npu.
Official documentation for torch_npu: https://gitee.com/ascend/pytorch
- Linux OS
- torch_npu: v2.1.0-6.0.rc1
- NPU card: 910B
Refer to the documentation: https://gitee.com/ascend/pytorch/tree/v2.1.0-6.0.rc1/
You can install it following the methods in that documentation, or download the specified version of the torch_npu wheel from https://gitee.com/ascend/pytorch/releases and install it, as shown below:
pip3 install torch==2.1.0+cpu --index-url https://download.pytorch.org/whl/cpu
pip3 install pyyaml
pip3 install setuptools
wget https://gitee.com/ascend/pytorch/releases/download/v6.0.rc1-pytorch2.1.0/torch_npu-2.1.0.post3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
pip install torch_npu-2.1.0.post3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl