@wk989898
Last active January 21, 2024 10:54
deepspeed install solution

Install CUDA with conda

This fixes deepspeed build failures caused by the CUDA version PyTorch was built for not matching the system's CUDA.

e.g.

ld: cannot find -lcurand: No such file or directory
collect2: error: ld returned 1 exit status
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

1. install cuda-toolkit and cudnn with conda

  1. go to https://anaconda.org/nvidia/cuda-toolkit, pick the specific version you need, and install it
  2. install cudnn, leaving it at the default version

2. verify that the CUDA library exists

ls -l xxx/miniconda/lib/libcurand.so (base environment) or ls -l xxx/miniconda/envs/xxx/lib/libcurand.so (named environment)

3. export the library paths so the linker and the loader can find the library

export LIBRARY_PATH=xxx/miniconda/envs/xxx/lib/
export LD_LIBRARY_PATH=xxx/miniconda/envs/xxx/lib/
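
Steps 2 and 3 can be combined into one small shell helper. This is a minimal sketch and not part of the original note: the function name use_conda_cuda_libs is hypothetical, and it assumes you pass the conda lib directory as its only argument. It exports nothing if libcurand.so is missing, and it prepends to any existing value rather than overwriting it.

```shell
# Hypothetical helper (not provided by conda or deepspeed):
# check that libcurand.so is present, then export both search paths.
use_conda_cuda_libs() {
  lib_dir="${1%/}"   # drop a trailing slash, if any
  if [ ! -e "$lib_dir/libcurand.so" ]; then
    echo "libcurand.so not found in $lib_dir" >&2
    return 1
  fi
  # Prepend so that paths which were already set are kept.
  export LIBRARY_PATH="$lib_dir${LIBRARY_PATH:+:$LIBRARY_PATH}"
  export LD_LIBRARY_PATH="$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

Call it as use_conda_cuda_libs xxx/miniconda/envs/xxx/lib in the same shell session where you then run pip, since exported variables do not carry over to other terminals.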

Note

If installing deepspeed still fails, try CUDA_HOME=xxx/miniconda3 pip install deepspeed
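
Before retrying with CUDA_HOME, it can help to confirm that the chosen prefix actually contains the CUDA compiler, since PyTorch's extension builder looks for it at $CUDA_HOME/bin/nvcc. The sketch below is only an illustration: has_nvcc is a hypothetical helper, and defaulting to CONDA_PREFIX (the variable conda sets for the active environment) assumes the toolkit was installed into that environment.

```shell
# Hypothetical helper: succeeds only if <prefix>/bin/nvcc exists and is executable.
has_nvcc() {
  [ -x "${1%/}/bin/nvcc" ]
}

# Assumption: CUDA was installed into the active conda environment;
# otherwise fall back to the placeholder path used in this note.
cuda_prefix="${CONDA_PREFIX:-xxx/miniconda3}"

if has_nvcc "$cuda_prefix"; then
  echo "nvcc found; run: CUDA_HOME=$cuda_prefix pip install deepspeed"
else
  echo "no nvcc under $cuda_prefix/bin; check the cuda-toolkit install" >&2
fi
```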
