#!/bin/bash
### steps ####
# Verify the system has a CUDA-capable GPU
# Download and install the NVIDIA CUDA toolkit and cuDNN
# Set up environment variables
# Verify the installation
###
### To verify that your GPU is CUDA-capable, check
lspci | grep -i nvidia
### If you have a previous installation, remove it first.
# The patterns are quoted so that apt, not the shell, expands them.
sudo apt-get purge 'nvidia*'
sudo apt remove 'nvidia-*'
sudo rm -f /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*
# system update
sudo apt-get update
sudo apt-get upgrade
# install other important packages
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
# first add the graphics-drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
# install nvidia driver with dependencies
sudo apt install libnvidia-common-470
sudo apt install libnvidia-gl-470
sudo apt install nvidia-driver-470
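# installing CUDA-11.8
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
# the unversioned "cuda" meta-package installs the newest release in the repo;
# pin the 11.8 meta-package so it matches the /usr/local/cuda-11.8 paths below
sudo apt-get -y install cuda-11-8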
# set up your paths
echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
sudo ldconfig
# install cuDNN v8.9.7
# First register here: https://developer.nvidia.com/developer-program/signup
CUDNN_TAR_FILE="cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz"
wget https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.7/local_installers/11.x/cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz
tar -xvf "${CUDNN_TAR_FILE}"
# copy the following files into the CUDA toolkit directory
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
# Finally, to verify the installation, check
nvidia-smi
nvcc -V
# install PyTorch (an open source machine learning framework)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
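The two verification commands report different things: nvidia-smi prints the driver version and the highest CUDA version that driver supports, while nvcc -V prints the version of the installed toolkit. The following is a minimal sketch of checking the toolkit version from a script, assuming nvcc prints its usual `release X.Y` line. The function `parse_nvcc_release` and the variable `EXPECTED_CUDA` are hypothetical names, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: extract "X.Y" from nvcc's "release X.Y" line on stdin.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Assumption: the toolkit version this guide installs.
EXPECTED_CUDA="11.8"

# Only run the check when nvcc is actually on PATH.
if command -v nvcc >/dev/null 2>&1; then
    installed="$(nvcc -V | parse_nvcc_release)"
    if [ "$installed" = "$EXPECTED_CUDA" ]; then
        echo "CUDA toolkit $installed found"
    else
        echo "expected CUDA $EXPECTED_CUDA but nvcc reports '$installed'" >&2
    fi
else
    echo "nvcc not found on PATH; check the PATH export in ~/.bashrc" >&2
fi
```

If nvidia-smi shows a higher CUDA version than nvcc does, that by itself is not an error; it only means the driver can run newer toolkits as well.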
Can someone tell me whether sudo ubuntu-drivers autoinstall
does the same job as the following three commands?
sudo apt install libnvidia-common-470
sudo apt install libnvidia-gl-470
sudo apt install nvidia-driver-470
After installing this, I was getting the following (non-fatal) warning:
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-11-24 09:01:58.877869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.899255: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.900051: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Num GPUs Available: 1
I resolved it by following tensorflow/tensorflow#53184:
for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a $a/numa_node; done
Thanks for the nice repo.
I installed everything following your instructions, but when I run nvidia-smi it shows 11.5. Why is that, and how can I install 11.2?
Works great up until the cuDNN step, where I get the following:
$ wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
--2022-02-13 13:24:40-- https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
Resolving developer.nvidia.com (developer.nvidia.com)... 152.195.19.142
Connecting to developer.nvidia.com (developer.nvidia.com)|152.195.19.142|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-02-13 13:24:41 ERROR 403: Forbidden.
EDIT: This link worked: wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.1.1/cudnn-11.2-linux-x64-v8.1.1.33.tgz
I want to install CUDA 11.3 or higher on Ubuntu 18.04 (running in a virtual machine). Which instructions should I follow?
> Thanks for the nice repo. I installed everything following your instructions, but when I run nvidia-smi it shows 11.5. Why is that, and how can I install 11.2?
I had to follow the last part of this tutorial:
https://towardsdatascience.com/installing-multiple-cuda-cudnn-versions-in-ubuntu-fcb6aa5194e2
I used its edit to the bash configuration so that TensorFlow (in my case) can choose which CUDA toolkit to use, and it worked.
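One way to let a shell session pick among several installed toolkits is a small function in ~/.bashrc. The sketch below is not the linked tutorial's exact edit; it assumes each toolkit lives in a `cuda-<version>` directory under /usr/local, as in the script above. The names `use_cuda` and `CUDA_BASE` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: point the current shell at one installed toolkit.
# CUDA_BASE is an assumed override for the install prefix (default /usr/local).
use_cuda() {
    local version="$1"
    local root="${CUDA_BASE:-/usr/local}/cuda-${version}"
    if [ ! -d "$root" ]; then
        echo "no toolkit found at $root" >&2
        return 1
    fi
    export CUDA_HOME="$root"
    export PATH="$root/bin:$PATH"
    # Append the old LD_LIBRARY_PATH only if it was already set.
    export LD_LIBRARY_PATH="$root/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}
```

After sourcing it, `use_cuda 11.8` followed by `nvcc -V` should report the selected toolkit. Each call prepends to PATH again, so it is meant to be run once per shell session.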
Thank you very much @Mahedi-61, much appreciated
The RTX 3090 requires driver version 515 (not 470).
# install nvidia driver with dependencies
sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515
I am wondering whether these steps also work for installing CUDA 11.3 on Ubuntu 22.04?
Will it work for the NVIDIA server drivers on Ubuntu 20.04 Server?
# install nvidia driver with dependencies
sudo apt install libnvidia-common-470-server
sudo apt install libnvidia-gl-470-server
sudo apt install nvidia-driver-470-server
@saravananpsg It works for the server variant; I tested it. I also changed 470 to 515 to support the 3090.
I also had to change the version from 470 to 515 for a 1070 Ti.
sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515
After installing, if nvidia-smi gives a kernel/client version mismatch error, reboot.
This helped A LOT! Thanks!
Thank you! It was veeery helpful!
Thank you very much. You just forgot a star character after cudnn here:
sudo cp -P cuda/include/cudnn*.h /usr/local/cuda-11.3/include
This is very important, because otherwise compiling PyTorch, for example, can fail with a "cudnn_version.h" not found error.
Regards
tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz
xz: (stdin): File format not recognized
tar: Child returned status 1
tar: Error is not recoverable: exiting now
I get this error, even though I have tar and xz installed.
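The message `xz: (stdin): File format not recognized` means the file that was downloaded is not an xz archive. Since the cuDNN download sits behind a login, one likely cause (an assumption, not confirmed in this thread) is that wget saved an HTML page under the archive's name. The sketch below checks the first six bytes against the xz magic number `FD 37 7A 58 5A 00` before extracting; `is_xz_archive` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the file starts with the xz magic bytes.
is_xz_archive() {
    local magic
    magic="$(head -c 6 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')"
    [ "$magic" = "fd377a585a00" ]
}

CUDNN_TAR_FILE="cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz"
if [ -f "$CUDNN_TAR_FILE" ]; then
    if is_xz_archive "$CUDNN_TAR_FILE"; then
        tar -xvf "$CUDNN_TAR_FILE"
    else
        echo "$CUDNN_TAR_FILE is not an xz archive; inspect it with: head -c 200 $CUDNN_TAR_FILE" >&2
    fi
fi
```

If the check fails, downloading the archive in a browser while logged in to the NVIDIA developer site and copying it next to the script is a reasonable fallback.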
With this command:
$ sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
Error message:
cp: cannot stat 'cudnn-*-archive/include/cudnn*.h': No such file or directory
The same happens with the other commands:
$ sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
====> cp: cannot stat 'cudnn-*-archive/include/cudnn*.h': No such file or directory
$ sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
====> cp: cannot stat 'cudnn-*-archive/include/cudnn*.h': No such file or directory
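The copy commands in the script rely on the glob pattern `cudnn-*-archive` (the asterisks are part of the command), and `cp: cannot stat` means nothing matched it, for example because the archive was never extracted in the current directory. The sketch below resolves the extracted directory first and fails with a clear message otherwise; `find_cudnn_dir` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the single extracted cuDNN directory, or fail.
find_cudnn_dir() {
    local search_dir="${1:-.}"
    # Expand the glob into the function's positional parameters.
    set -- "$search_dir"/cudnn-*-archive
    if [ "$#" -ne 1 ] || [ ! -d "$1" ]; then
        echo "expected exactly one cudnn-*-archive directory in $search_dir" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

if cudnn_dir="$(find_cudnn_dir .)"; then
    echo "would copy headers from $cudnn_dir/include"
fi
```

With the directory resolved, the copy step becomes `sudo cp "$cudnn_dir"/include/cudnn*.h /usr/local/cuda/include`.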
@yummyKnight Thanks for your correction.