How to use CUDA/GPU with pytorch/vision on the Jetson Xavier
updated 24.7.7
Let's not make this harder than it has to be. I have re-posted an easier way to install.
----------------------------------------------------------------------------------------------------
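I installed Ubuntu 20.04 on the Xavier, with JetPack 5.1.3. I had assumed that installing JetPack would naturally make the GPU usable, but many things were not set up.
OpenCV had no CUDA support and neither did torch, so the GPU could not actually be used.
I tried installing the required packages, but that did not work properly, so I solved it by downloading the sources and compiling them. I record the process below. I wrote the steps down from memory after resolving the problem, so some steps are missing here and there.
For things that produced warnings during compilation, I think I just installed them even when they may not have been necessary.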
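Before rebuilding anything, it helps to record which libraries actually see the GPU. The following is a minimal sketch, assuming the Python bindings are importable as `torch` and `cv2`. The helper name `cuda_status` is my own and not part of either library. A library that cannot be imported is reported as None instead of raising.

```python
def cuda_status():
    """Return {library: True/False/None}; None means the library could not be imported."""
    status = {}

    try:
        import torch
        status["torch"] = bool(torch.cuda.is_available())
    except (ImportError, OSError):
        status["torch"] = None

    try:
        import cv2
        # getCudaEnabledDeviceCount() returns 0 when OpenCV was built without CUDA.
        status["opencv"] = cv2.cuda.getCudaEnabledDeviceCount() > 0
    except (ImportError, OSError):
        status["opencv"] = None
    except AttributeError:
        # Builds that do not expose the cv2.cuda module at all.
        status["opencv"] = False

    return status


if __name__ == "__main__":
    for name, ok in cuda_status().items():
        print(f"{name}: {ok}")
```

On the setup described above, both entries would be expected to print False before the rebuild.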
0. First things first
$ sudo apt-get update
$ sudo apt-get upgrade
1. Installing cmake
The preinstalled cmake was an old version, so I installed a newer one. It did not seem to be available as a package, so I downloaded it from the cmake website ( https://cmake.org/download/ ).
I went with version 3.29.6, listed there as Unix/Linux Source. The releases page ( https://github.com/Kitware/CMake/releases/ ) offers various files for installation.
From that list I downloaded the prebuilt cmake-3.29.6-linux-aarch64.tar.gz and installed it.
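$ tar xvf cmake-3.29.6-linux-aarch64.tar.gz
$ sudo cp -rf cmake-3.29.6-linux-aarch64/* /usr/local/
That is all it takes. The archive's bin/ and share/ directories land directly under /usr/local, so a separate /usr/local/cmake directory is not needed. You can do it the easy way as above, or take your time and compile it yourself as below. Compiling takes quite a while.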
$ tar xvf cmake-3.29.6.tar.gz
$ cd cmake-3.29.6
$ mkdir build
$ cd build
$ ../bootstrap
$ make
$ sudo make install
$ cmake --version
cmake version 3.29.6
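If you want to check the installed cmake version from a script rather than by eye, the output shown above can be parsed. This is a rough sketch. The function names are mine, and the threshold (3, 18, 0) in the usage line is only an example value, so check the minimum version required by whatever you are building.

```python
import re
import subprocess


def parse_cmake_version(text):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized cmake output: {text!r}")
    return tuple(int(part) for part in match.groups())


def cmake_is_at_least(minimum, text=None):
    """Compare the cmake version against `minimum`, e.g. (3, 18, 0).

    When `text` is None, `cmake --version` is run to obtain the output.
    """
    if text is None:
        text = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True, check=True
        ).stdout
    return parse_cmake_version(text) >= tuple(minimum)


if __name__ == "__main__":
    # Example threshold only; passing the text avoids needing cmake on PATH here.
    print(cmake_is_at_least((3, 18, 0), "cmake version 3.29.6"))
```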
2. What I did before installing pytorch
The existing torch packages have to be removed before installing. I recall also tracking down and removing packages that depended on them; in any case, I removed the torch and torchvision packages.
$ sudo apt-get remove torch
$ sudo apt-get remove torchvision
I am not sure whether pytorch actually needs all of these, but here are the packages I installed (only the ones I remember).
$ sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev
$ sudo apt-get install python3-pip libjpeg-dev libopenblas-dev libopenmpi-dev libomp-dev
$ sudo apt-get install python3-pip pip python-wheel-common python-setuptools
$ sudo apt-get install ccache
$ sudo apt-get install libjpeg-dev zlib1g-dev
$ sudo apt-get install -y build-essential libssl-dev
$ sudo pip3 uninstall numpy
$ sudo pip3 uninstall onnx-graphsurgeon
$ sudo pip3 install numpy
$ sudo pip3 install onnx-graphsurgeon
3. Installing UCX
While installing torch, I noticed that the official page says to install this. I first tried the packaged version, but it did not work well, so I built this one from source too.
$ git clone https://github.com/openucx/ucx.git
$ cd ucx
$ export PYTHON_EXECUTABLE=$(which python3)
$ ./autogen.sh
$ ./contrib/configure-release --prefix=/usr/local/ --with-cuda=/usr/local/cuda --with-xpmem=/usr/local
$ make -j8
$ sudo make install
4. Packages that were reported as missing during the build, so I just installed them
$ sudo apt install openjdk-21-*
$ sudo apt install golang-go
$ sudo apt-get install libfuse3-dev
$ vi ~/.bashrc
export PATH=/usr/local/cuda/bin:/home/igi/.local/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/cuda/lib64/:$LD_LIBRARY_PATH
export PYTHONPATH=/usr/local/lib/python3.8/site-packages/:$PYTHONPATH
export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1:/usr/local/cuda/lib64/libcudart.so
export PKG_CONFIG_PATH=/usr/lib/aarch64-linux-gnu/pkgconfig:$PKG_CONFIG_PATH
export CPATH=/usr/local/cuda/include:$CPATH
export CUDA_HOME=/usr/local/cuda
$ source ~/.bashrc
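A typo in one of these exports is easy to miss, because the shell accepts entries that do not exist. Here is a small sketch for listing entries that point at nothing on disk. The function names are mine, and the default list of variables simply mirrors the exports above (LD_PRELOAD is left out because it lists files to load rather than search directories).

```python
import os


def missing_entries(value, sep=os.pathsep):
    """Return the entries of a PATH-style string that do not exist on disk."""
    return [entry for entry in value.split(sep) if entry and not os.path.exists(entry)]


def check_environment(
    names=("PATH", "LD_LIBRARY_PATH", "PYTHONPATH", "PKG_CONFIG_PATH", "CPATH", "CUDA_HOME"),
    environ=None,
):
    """Map each variable to its missing entries, or to None if the variable is unset."""
    if environ is None:
        environ = os.environ
    report = {}
    for name in names:
        value = environ.get(name)
        report[name] = None if value is None else missing_entries(value)
    return report


if __name__ == "__main__":
    for name, missing in check_environment().items():
        if missing is None:
            print(f"{name}: not set")
        elif missing:
            print(f"{name}: missing {missing}")
        else:
            print(f"{name}: ok")
```

Run it in a new shell after `source ~/.bashrc` so that it sees the updated values.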
5. Installing openmpi
$ wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.3.tar.gz
$ tar xvf openmpi-5.0.3.tar.gz
$ cd openmpi-5.0.3
$ ./configure --with-cuda=/usr/local/cuda
$ make -j$(nproc)
$ sudo make install
$ sudo ldconfig
6. Installing knem
$ sudo apt-get install -y autoconf automake libtool pkg-config
$ git clone https://gitlab.inria.fr/knem/knem.git
$ cd knem
$ ./autogen.sh
$ ./configure --prefix=/usr/local
$ make -j8
$ sudo make install
$ pkg-config --cflags knem
-I/usr/local/include
7. Installing pytorch
The guide on pytorch's GitHub lists which torch and vision versions go together.
torch | torchvision | python
2.3   | 0.18        | >=3.8, <=3.12
2.2   | 0.17        | >=3.8, <=3.11
2.1   | 0.16        | >=3.8, <=3.11
2.0   | 0.15        | >=3.8, <=3.11
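The table can be kept as a small lookup so that the torchvision tag is derived from the torch version rather than remembered. This is only a sketch. The dictionary holds just the four rows listed above, and `torchvision_for` is a name I made up for illustration.

```python
# torch major.minor -> matching torchvision major.minor (rows copied from the table above)
COMPATIBLE = {
    "2.3": "0.18",
    "2.2": "0.17",
    "2.1": "0.16",
    "2.0": "0.15",
}


def torchvision_for(torch_version):
    """Return the torchvision series matching a torch version string.

    Local build suffixes such as "+cu118" or "a0+gitabc" are ignored.
    """
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    if major_minor not in COMPATIBLE:
        raise ValueError(f"torch {torch_version} is not in the table")
    return COMPATIBLE[major_minor]


if __name__ == "__main__":
    print(torchvision_for("2.3.0"))
```

For torch 2.3.0 this gives the 0.18 series, which corresponds to the v0.18.0 tag.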
Download the pytorch source.
$ git clone --recursive https://github.com/pytorch/pytorch
$ cd pytorch
$ git submodule sync
$ git submodule update --init --recursive
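Compile it. (These commands are written down from memory, so I am not sure which of the cmake invocations below was the one I actually used.)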
$ tar xvf pytorch-v2.3.0.tar.gz
$ cd pytorch
$ ./configure --with-cuda=/usr/local/cuda --prefix=/usr/local/
$ mkdir build
$ cd build
Was it this one?
$ cmake ../ -DPYTHON_EXECUTABLE=$(which python3)
Or this one?
$ cmake -DBLAS=OpenBLAS -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=./install -DUSE_NNPACK=0 -DUSE_FBGEMM=0 -DUSE_FAKELOWP=0 -Wno-dev ..
Or perhaps this one?
$ cmake .. -DWITH_CUDA=on
I checked that the installation worked. What used to print False now prints True.
$ python3
Python 3.8.10 (default, Nov 22 2023, 10:22:35)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
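`torch.cuda.is_available()` only says that a device was found. To confirm that kernels actually run, a small computation can be compared between CPU and GPU. The following is a sketch. The function name, the matrix size and the tolerances are arbitrary choices of mine.

```python
def gpu_smoke_test(size=256):
    """Return None if torch or CUDA is unavailable, else whether GPU and CPU matmul agree."""
    try:
        import torch
    except (ImportError, OSError):
        return None
    if not torch.cuda.is_available():
        return None

    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    print("device:", torch.cuda.get_device_name(0))

    a = torch.randn(size, size)
    b = torch.randn(size, size)
    expected = a @ b                          # computed on the CPU
    actual = (a.cuda() @ b.cuda()).cpu()      # computed on the GPU, copied back
    # Loose tolerances: float32 results differ slightly between CPU and GPU.
    return torch.allclose(expected, actual, rtol=1e-3, atol=1e-3)


if __name__ == "__main__":
    print("GPU matmul matches CPU:", gpu_smoke_test())
```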
8. Installing torchvision
Download the source. I took version 0.18.0 to match the torch version.
$ git clone --branch v0.18.0 https://github.com/pytorch/vision torchvision
Cloning into 'torchvision'...
remote: Enumerating objects: 531477, done.
remote: Counting objects: 100% (45087/45087), done.
remote: Compressing objects: 100% (2079/2079), done.
remote: Total 531477 (delta 43041), reused 44913 (delta 42935), pack-reused 486390
Receiving objects: 100% (531477/531477), 1023.04 MiB | 20.76 MiB/s, done.
Resolving deltas: 100% (496288/496288), done.
Note: switching to '6043bc250768b129e90a5321e318c1d51ee48a5c'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
I installed these as well.
$ sudo pip install expecttest flake8 typing mypy pytest pytest-mock scipy requests
Requirement already satisfied: expecttest in /usr/local/lib/python3.8/dist-packages (0.2.1)
Collecting flake8
Downloading flake8-7.1.0-py2.py3-none-any.whl.metadata (3.8 kB)
Collecting typing
Downloading typing-3.7.4.3.tar.gz (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.6/78.6 kB 3.9 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting mypy
Downloading mypy-1.10.0-py3-none-any.whl.metadata (1.9 kB)
Collecting pytest
Downloading pytest-8.2.2-py3-none-any.whl.metadata (7.6 kB)
Collecting pytest-mock
Downloading pytest_mock-3.14.0-py3-none-any.whl.metadata (3.8 kB)
Requirement already satisfied: scipy in /usr/lib/python3/dist-packages (1.3.3)
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (2.22.0)
Collecting mccabe<0.8.0,>=0.7.0 (from flake8)
Downloading mccabe-0.7.0-py2.py3-none-any.whl.metadata (5.0 kB)
Collecting pycodestyle<2.13.0,>=2.12.0 (from flake8)
Downloading pycodestyle-2.12.0-py2.py3-none-any.whl.metadata (4.5 kB)
Collecting pyflakes<3.3.0,>=3.2.0 (from flake8)
Downloading pyflakes-3.2.0-py2.py3-none-any.whl.metadata (3.5 kB)
Requirement already satisfied: typing-extensions>=4.1.0 in /usr/local/lib/python3.8/dist-packages (from mypy) (4.12.2)
Collecting mypy-extensions>=1.0.0 (from mypy)
Downloading mypy_extensions-1.0.0-py3-none-any.whl.metadata (1.1 kB)
Collecting tomli>=1.1.0 (from mypy)
Downloading tomli-2.0.1-py3-none-any.whl.metadata (8.9 kB)
Collecting iniconfig (from pytest)
Downloading iniconfig-2.0.0-py3-none-any.whl.metadata (2.6 kB)
Requirement already satisfied: packaging in /usr/local/lib/python3.8/dist-packages (from pytest) (24.1)
Collecting pluggy<2.0,>=1.5 (from pytest)
Downloading pluggy-1.5.0-py3-none-any.whl.metadata (4.8 kB)
Requirement already satisfied: exceptiongroup>=1.0.0rc8 in /usr/local/lib/python3.8/dist-packages (from pytest) (1.2.1)
Downloading flake8-7.1.0-py2.py3-none-any.whl (57 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 3.8 MB/s eta 0:00:00
Downloading mypy-1.10.0-py3-none-any.whl (2.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 21.8 MB/s eta 0:00:00
Downloading pytest-8.2.2-py3-none-any.whl (339 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 339.9/339.9 kB 16.9 MB/s eta 0:00:00
Downloading pytest_mock-3.14.0-py3-none-any.whl (9.9 kB)
Downloading mccabe-0.7.0-py2.py3-none-any.whl (7.3 kB)
Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Downloading pluggy-1.5.0-py3-none-any.whl (20 kB)
Downloading pycodestyle-2.12.0-py2.py3-none-any.whl (31 kB)
Downloading pyflakes-3.2.0-py2.py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.7/62.7 kB 4.3 MB/s eta 0:00:00
Downloading tomli-2.0.1-py3-none-any.whl (12 kB)
Downloading iniconfig-2.0.0-py3-none-any.whl (5.9 kB)
Building wheels for collected packages: typing
Building wheel for typing (setup.py) ... done
Created wheel for typing: filename=typing-3.7.4.3-py3-none-any.whl size=26306 sha256=51b6f4540eef52b20d982d6c22d56e32bd71a1770e9d8d0ac4128f7e432625e5
Stored in directory: /root/.cache/pip/wheels/5e/5d/01/3083e091b57809dad979ea543def62d9d878950e3e74f0c930
Successfully built typing
Installing collected packages: typing, tomli, pyflakes, pycodestyle, pluggy, mypy-extensions, mccabe, iniconfig, pytest, mypy, flake8, pytest-mock
Successfully installed flake8-7.1.0 iniconfig-2.0.0 mccabe-0.7.0 mypy-1.10.0 mypy-extensions-1.0.0 pluggy-1.5.0 pycodestyle-2.12.0 pyflakes-3.2.0 pytest-8.2.2 pytest-mock-3.14.0 tomli-2.0.1 typing-3.7.4.3
Now build and install it.
$ cd torchvision
$ mkdir build
$ cd build
$ cmake .. -DWITH_CUDA=on
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 11.4.315
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
CMake Error at CMakeLists.txt:24 (find_package):
By not providing "FindTorch.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "Torch", but
CMake did not find one.
Could not find a package configuration file provided by "Torch" with any of
the following names:
TorchConfig.cmake
torch-config.cmake
Add the installation prefix of "Torch" to CMAKE_PREFIX_PATH or set
"Torch_DIR" to a directory containing one of the above files. If "Torch"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
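An error occurred. I pasted the error above into ChatGPT, which told me to handle it as follows. First, look up the path where torch's CMake configuration files are installed.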
$ python3 -c "import torch; print(torch.utils.cmake_prefix_path)"
/usr/local/lib/python3.8/dist-packages/torch/share/cmake
Register the path found above in CMAKE_PREFIX_PATH, which cmake reads when it runs.
$ vi ~/.bashrc
export CMAKE_PREFIX_PATH=/usr/local/lib/python3.8/dist-packages/torch/share/cmake:$CMAKE_PREFIX_PATH
$ source ~/.bashrc
$ cmake .. -DWITH_CUDA=on
-- Found CUDA: /usr/local/cuda (found version "11.4")
-- Found CUDAToolkit: /usr/local/cuda/include (found version "11.4.315")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Caffe2: CUDA detected: 11.4
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 11.4
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 7c7201d5
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- Autodetected CUDA architecture(s): 7.2
-- Added CUDA NVCC flags for: -gencode;arch=compute_72,code=sm_72
-- Found Torch: /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch.so
-- Found ZLIB: /usr/lib/aarch64-linux-gnu/libz.so (found version "1.2.11")
-- Found PNG: /usr/lib/aarch64-linux-gnu/libpng.so (found version "1.6.37")
-- Found JPEG: /usr/lib/aarch64-linux-gnu/libjpeg.so (found version "80")
-- Configuring done (5.2s)
-- Generating done (0.1s)
-- Build files have been written to: /home/igi/Downloads/torchvision/build
$ make -j8
[ 5%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_image.cpp.o
[ 5%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/common_jpeg.cpp.o
[ 7%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_png.cpp.o
[ 10%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_jpeg.cpp.o
[ 12%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/encode_jpeg.cpp.o
[ 15%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/encode_png.cpp.o
[ 17%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cuda/decode_jpeg_cuda.cpp.o
[ 20%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/read_write_file.cpp.o
[ 22%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/image.cpp.o
[ 25%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/deform_conv2d_kernel.cpp.o
[ 27%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/nms_kernel.cpp.o
[ 30%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/ps_roi_align_kernel.cpp.o
[ 32%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/ps_roi_pool_kernel.cpp.o
[ 35%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/roi_align_kernel.cpp.o
[ 37%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autocast/roi_pool_kernel.cpp.o
[ 40%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp.o
[ 42%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/ps_roi_align_kernel.cpp.o
...
[100%] Linking CXX shared library libtorchvision.so
[100%] Built target torchvision
$ sudo make install
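As an alternative to exporting CMAKE_PREFIX_PATH in ~/.bashrc, the same path can be passed to cmake as a -D option for this one build. The sketch below only assembles and prints the command. `torchvision_cmake_command` is a hypothetical helper of mine, and the fallback path is simply the value printed earlier on my machine.

```python
import shlex


def torchvision_cmake_command(prefix_path, with_cuda=True, source_dir=".."):
    """Build the cmake argument list for configuring torchvision from build/."""
    return [
        "cmake",
        source_dir,
        f"-DCMAKE_PREFIX_PATH={prefix_path}",
        f"-DWITH_CUDA={'on' if with_cuda else 'off'}",
    ]


if __name__ == "__main__":
    try:
        import torch
        prefix = torch.utils.cmake_prefix_path
    except (ImportError, OSError):
        # Fallback: the path reported on the machine described in this post.
        prefix = "/usr/local/lib/python3.8/dist-packages/torch/share/cmake"
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(torchvision_cmake_command(prefix)))
```

Passing the path per build keeps ~/.bashrc untouched, which may be preferable if several torch installations exist on the same machine.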
Let's test torchvision as well.
import torch
import torchvision.models as models

# Check if CUDA is available
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Using CUDA")
else:
    device = torch.device("cpu")
    print("Using CPU")

# Load a pretrained model from torchvision
# (the weights= argument replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Move the model to the appropriate device (GPU or CPU) and switch to inference mode
model.to(device)
model.eval()

# Create a dummy input tensor and move it to the appropriate device
dummy_input = torch.randn(1, 3, 224, 224).to(device)

# Run the model on the dummy input without tracking gradients
with torch.no_grad():
    output = model(dummy_input)
print("Model output:", output)
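The ResNet test mostly exercises torch itself. To check torchvision's own compiled CUDA operators, one of them can be called directly on GPU tensors. Below is a sketch using `torchvision.ops.nms`. The helper name and the three boxes are made up for illustration.

```python
def nms_on_gpu():
    """Run torchvision's NMS operator on CUDA tensors.

    Returns None if torch/torchvision cannot be imported or CUDA is unavailable,
    otherwise the list of kept box indices.
    """
    try:
        import torch
        from torchvision.ops import nms
    except (ImportError, OSError):
        return None
    if not torch.cuda.is_available():
        return None

    # Boxes are (x1, y1, x2, y2). Boxes 0 and 1 overlap heavily (IoU about 0.68);
    # box 2 is far away from both.
    boxes = torch.tensor(
        [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0], [50.0, 50.0, 60.0, 60.0]],
        device="cuda",
    )
    scores = torch.tensor([0.9, 0.8, 0.7], device="cuda")
    keep = nms(boxes, scores, iou_threshold=0.5)
    return keep.cpu().tolist()


if __name__ == "__main__":
    print("kept boxes:", nms_on_gpu())
```

With a 0.5 threshold, box 1 is suppressed by the higher-scoring box 0, so boxes 0 and 2 remain. If torchvision was built without its CUDA operators, the `nms` call is expected to raise an error instead of returning.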