Source

First, find out the exact version of rocm you have installed. For 6 that's

cat /opt/rocm/.info/version

which for me is 6.0.0-91.
Then clone the githubs for rocBLAS, Tensile, and rocSPARSE. Then checkout the matching release for each folder.

git checkout release/rocm-rel-6.0

Since the installed rocBLAS has to be matched exactly accidentally checking out 6.0.2 for example would be bad.

I copied the rocSPARSE build commands from arch's PKGBUILD. This will overwrite the installed rocSPARSE with one compiled only for gfx1010.

CXXFLAGS="-fcf-protection=none"
cmake-Wno-dev-B build-DCMAKE_BUILD_TYPE=Release-DCMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc-DCMAKE_INSTALL_PREFIX=/opt/rocm-DAMDGPU_TARGETS=gfx1010
cmake --build build -j 8
sudo cmake --install build

Then for Tensile grab and apply this patch https://aur.archlinux.org/cgit/aur.git/tree/Tensile-fix-fallback-arch-build.patch?h=rocblas-gfx1010-backend
Then create a python environment and install requirements.txt. Then

python setup.py build

python setup.py install

Now run

TensileCreateLibrary --jobs 8 --code-object-version default --library-format msgpack --architecture gfx1010 --merge-files --separate-architectures --lazy-library-loading /rocBLAS/library/src/blas3/Tensile/Logic/asm_full ~/result HIP

with the first part of the first path replaced with where you cloned rocblas, and the second path is any folder to write the results to.
Finally in the ~/result/library folder there is a TensileLibrary_lazy_gfx1010.dat file. sudo cp that into /opt/rocm/lib/rocblas/library

Now llama.cpp should compile & run. Maybe make clean first, and then LLAMA_HIPBLAS=ON GPU_TARGETS=gfx1010 make main

Edit
Pub: 24 Feb 2024 22:10 UTC
Edit: 08 Apr 2024 00:12 UTC
Views: 2856