0. Remove CUDA files

$ sudo apt-get remove --purge '^nvidia-.*'

$ sudo apt-get remove --purge 'cuda*'
$ sudo apt-get autoremove --purge 'cuda*'

$ sudo rm -rf /usr/local/cuda
$ sudo rm -rf /usr/local/cuda-#.#
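To confirm nothing was left behind, a quick check of remaining packages and directories (output will vary by system):

$ dpkg -l | grep -i -E 'nvidia|cuda'
$ ls /usr/local | grep cuda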

 

 

1. Setting up the CUDA Toolkit on WSL2

$ wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
$ sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600

$ wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
$ sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub

$ sudo apt-get update
$ sudo apt-get -y install cuda
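Before building the samples, a quick check that the toolkit landed in the expected location (the path assumes the default 11.4 install; on WSL2, nvidia-smi comes from the Windows driver and should already be on the PATH):

$ /usr/local/cuda-11.4/bin/nvcc --version
$ nvidia-smi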

Verify the installation

$ cd /usr/local/cuda-11.4/samples/4_Finance/BlackScholes
$ sudo make BlackScholes
$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Pascal" with compute capability 6.1

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.

Executing Black-Scholes GPU kernel (512 iterations)...
Options count             : 8000000
BlackScholesGPU() time    : 0.227898 msec
Effective memory bandwidth: 351.033566 GB/s
Gigaoptions per second    : 35.103357

BlackScholes, Throughput = 35.1034 GOptions/s, Time = 0.00023 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

Reading back GPU results...
Checking the results...
...running CPU calculations.

Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05

Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.

[BlackScholes] - Test Summary

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Test passed
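Another common sanity check is the deviceQuery sample from the same samples tree; it should report the GPU and end with Result = PASS:

$ cd /usr/local/cuda-11.4/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery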

Change Repository (apt mirror)

$ sudo nano /etc/apt/sources.list

Replace: Ctrl + \

Search (to replace): archive.ubuntu.com

Replace with: mirror.kakao.com

Save: Ctrl + s

Exit: Ctrl + x
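If editing in nano is not preferred, the same replacement can be done with a single sed command (this keeps a backup as sources.list.bak):

$ sudo sed -i.bak 's/archive.ubuntu.com/mirror.kakao.com/g' /etc/apt/sources.list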

Verify

$ sudo apt update

 

Development Environment Setup

1. PIP install

$ sudo apt-get install python3-pip
$ pip3 install --upgrade pip
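A quick check of the resulting pip and Python versions (the exact versions will differ per system):

$ pip3 --version
$ python3 --version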

 

2. PyTorch, Torchvision install

$ pip3 install torch torchvision torchaudio
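To confirm PyTorch can see the GPU through WSL2 (this assumes pip picked a CUDA-enabled wheel):

$ python3 -c "import torch; print(torch.__version__); print('CUDA available:', torch.cuda.is_available())"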

 

3. OpenCV

$ pip install opencv-python
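A quick import check for OpenCV:

$ python3 -c "import cv2; print(cv2.__version__)"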

 

4. TensorRT

 - CUDA toolkit, PyCUDA

$ pip install numpy cupy

A GPU with the Kepler architecture or newer is required.
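A short check that CuPy can run a computation on the GPU. Note that a plain pip install cupy builds from source, which can take a long time; a prebuilt wheel such as cupy-cuda114 may be faster, though the exact wheel name depends on the CuPy release:

$ python3 -c "import cupy as cp; print((cp.arange(5) * 2).get())"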

 

- TensorRT C++

https://developer.nvidia.com/tensorrt

 

$ wget https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.4.3/local_repos/nv-tensorrt-repo-ubuntu1804-cuda11.6-trt8.4.3.1-ga-20220813_1-1_amd64.deb
$ sudo dpkg -i nv-tensorrt-repo-ubuntu1804-cuda11.6-trt8.4.3.1-ga-20220813_1-1_amd64.deb
(Reading database ... 71998 files and directories currently installed.)
Preparing to unpack nv-tensorrt-repo-ubuntu1804-cuda11.6-trt8.4.3.1-ga-20220813_1-1_amd64.deb ...
Unpacking nv-tensorrt-repo-ubuntu1804-cuda11.6-trt8.4.3.1-ga-20220813 (1-1) over (1-1) ...
Setting up nv-tensorrt-repo-ubuntu1804-cuda11.6-trt8.4.3.1-ga-20220813 (1-1) ...

When installed this way (from the deb repo), the TensorRT header and library files cannot be found.

Download the TAR package from the link below instead:

https://developer.nvidia.com/compute/machine-learning/tensorrt/secure/8.4.3/tars/tensorrt-8.4.3.1.linux.x86_64-gnu.cuda-11.6.cudnn8.4.tar.gz

$ tar -xvf TensorRT-8.4.3.1.Linux.x86_64-gnu.cuda-11.6.cudnn8.4.tar.gz
$ mv TensorRT-8.4.3.1 ~/dev/
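After extraction, the headers and libraries that were missing from the deb layout should now be present:

$ ls ~/dev/TensorRT-8.4.3.1/include/NvInfer.h
$ ls ~/dev/TensorRT-8.4.3.1/lib/libnvinfer*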

Register the include and lib paths in environment variables:

$ nano ~/.bashrc

...

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/lib64:$LD_LIBRARY_PATH:~/dev/TensorRT-8.4.3.1/lib

Apply the .bashrc changes (or restart the shell):

$ source ~/.bashrc
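To confirm the new paths are picked up in the current shell:

$ which nvcc
$ echo $LD_LIBRARY_PATH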

From the TensorRT-8.4.3.1/python/ directory, install the wheel that matches the current Python version:

$ pip install tensorrt-8.4.3.1-cp36-none-linux_x86_64.whl
Defaulting to user installation because normal site-packages is not writeable
Processing ./tensorrt-8.4.3.1-cp36-none-linux_x86_64.whl
Installing collected packages: tensorrt
Successfully installed tensorrt-8.4.3.1
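A quick import check; this only works if LD_LIBRARY_PATH includes the TensorRT lib directory set above, otherwise the import fails with a missing libnvinfer error:

$ python3 -c "import tensorrt; print(tensorrt.__version__)"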

Install from the TensorRT-8.4.3.1/uff/ directory as well:

$ cd ../uff/
$ pip install uff-0.6.9-py2.py3-none-any.whl
Defaulting to user installation because normal site-packages is not writeable
Processing ./uff-0.6.9-py2.py3-none-any.whl
Requirement already satisfied: protobuf>=3.3.0 in /home/ym/.local/lib/python3.6/site-packages (from uff==0.6.9) (3.17.3)
Requirement already satisfied: numpy>=1.11.0 in /home/ym/.local/lib/python3.6/site-packages (from uff==0.6.9) (1.19.5)
Requirement already satisfied: six>=1.9 in /home/ym/.local/lib/python3.6/site-packages (from protobuf>=3.3.0->uff==0.6.9) (1.15.0)
Installing collected packages: uff
Successfully installed uff-0.6.9

$ cd ../graphsurgeon/
$ pip install graphsurgeon-0.4.6-py2.py3-none-any.whl
Defaulting to user installation because normal site-packages is not writeable
Processing ./graphsurgeon-0.4.6-py2.py3-none-any.whl
Installing collected packages: graphsurgeon
Successfully installed graphsurgeon-0.4.6

$ cd ../onnx_graphsurgeon/
$ pip install onnx_graphsurgeon-0.3.12-py2.py3-none-any.whl
Defaulting to user installation because normal site-packages is not writeable
Processing ./onnx_graphsurgeon-0.3.12-py2.py3-none-any.whl
Collecting onnx
  Using cached onnx-1.12.0.tar.gz (10.1 MB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in /home/ym/.local/lib/python3.6/site-packages (from onnx-graphsurgeon==0.3.12) (1.19.5)
Requirement already satisfied: protobuf<=3.20.1,>=3.12.2 in /home/ym/.local/lib/python3.6/site-packages (from onnx->onnx-graphsurgeon==0.3.12) (3.17.3)
Requirement already satisfied: typing-extensions>=3.6.2.1 in /home/ym/.local/lib/python3.6/site-packages (from onnx->onnx-graphsurgeon==0.3.12) (3.7.4.3)
Requirement already satisfied: six>=1.9 in /home/ym/.local/lib/python3.6/site-packages (from protobuf<=3.20.1,>=3.12.2->onnx->onnx-graphsurgeon==0.3.12) (1.15.0)
Building wheels for collected packages: onnx
  Building wheel for onnx (setup.py) ... error

The last error (the onnx wheel build) still needs to be resolved...

- Install Protobuf

$ pip3 install "protobuf>=3.11.0,<=3.20.1"

The onnx build error still occurs, but import tensorrt works without problems in Python.
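A slightly fuller check that the runtime actually initializes, as a minimal sketch using standard TensorRT 8.x Python API calls (Logger and Builder):

$ python3 -c "import tensorrt as trt; logger = trt.Logger(trt.Logger.WARNING); builder = trt.Builder(logger); print('TensorRT', trt.__version__, '| fast fp16:', builder.platform_has_fast_fp16)"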
