
I am one of those miserable creatures who own an AMD GPU (RX 5700, Navi10). I want to use up-to-date PyTorch libraries to do some deep learning on my local machine and stop using cloud instances.

I have seen posts all over the internet, written 1-2 years ago, saying that AMD was promising Navi10 support within the next 2-4 months. However, I do not think they have released any "official" support.

I installed ROCm on my local machine. It detects my GPU and everything looks fine. Here is the rocminfo output:

rocminfo output

I installed the necessary PyTorch ROCm version, but when I try to run any code, I get the following error message: "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"

I suppose this is because ROCm still does not support gfx1010, but I am lost at this point.

I would be happy if someone could provide a way to make ROCm work (preferably without recompiling the whole package for gfx1010) OR a way to use an AMD GPU just like a CUDA user would.

makesense
  • Take a look here for AMD's guide on what to download, how to install it, and how to set up and configure ROCm: https://docs.amd.com/bundle/Deep_learning_Guide_5.2/page/Frameworks_Installation_Guide.html See especially the section on testing PyTorch to make sure it can see the GPU. That tells you whether ROCm was set up correctly before you run your project, so you can see whether the issue is the project or the library. If you followed that guide, did you encounter any issues or oddities? – sorifiend Aug 04 '22 at 01:44
  • @sorifiend Link dead. – Thegerdfather Oct 28 '22 at 05:28
  • @Thegerdfather Here is an updated link: https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.3/page/Frameworks_Installation.html You can also find more info on the PyTorch page: https://pytorch.org/get-started/locally/ Just select the version, OS, package, and ROCm, and it will give you all the info you need to get it installed and working. Note that it currently only seems to be supported on Linux. – sorifiend Oct 29 '22 at 05:14

1 Answer


Add 'HSA_OVERRIDE_GFX_VERSION=10.3.0' before 'python'

For example, in a terminal, enter:

HSA_OVERRIDE_GFX_VERSION=10.3.0 python launch.py

I have used a 5700 XT to run Stable Diffusion this way for months, and it works.
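
If you would rather not type the variable every time, the same idea can be sketched as a small Python wrapper. This is only an illustration: the helper names `rocm_env` and `run_with_override` are made up here, and 10.3.0 is simply the value used above. The wrapper starts a child Python process with the variable already set, so the ROCm runtime in that process sees it from startup.

```python
import os
import subprocess
import sys


def rocm_env(gfx_version="10.3.0", base=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    `base` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


def run_with_override(script_args, gfx_version="10.3.0"):
    """Run `python <script_args...>` in a child process with the override.

    Returns the child's exit code. The value 10.3.0 is an assumption
    taken from the answer above; adjust it for your own setup.
    """
    result = subprocess.run(
        [sys.executable, *script_args],
        env=rocm_env(gfx_version),
        check=False,
    )
    return result.returncode
```

Under these assumptions, `run_with_override(["launch.py"])` should behave like the command line above. Inside the launched script, `torch.cuda.is_available()` is one way to check whether PyTorch can see the GPU, since ROCm builds of PyTorch reuse the `torch.cuda` API.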

kiron111
  • Wow, thanks, that works without any pain! Do you have any idea what that environment variable does and why we set it to 10.3.0? – makesense Nov 19 '22 at 19:23
  • Because GFX1030 is the series name for RDNA2 (i.e. the series containing the 6700 XT, 6800 XT and 6900 XT), and as of my last comment only the 6900 XT in that series was officially supported by ROCm. HSA_OVERRIDE_GFX_VERSION=10.3.0 means you pretend to have a 6900 XT, so that ROCm allows you to run the Python file that follows. Furthermore, for rocm_tensorflow (at least when using OpenCV), pretending to be an RX 580 is sometimes useful (HSA_OVERRIDE_GFX_VERSION=8.3.0); otherwise my 5700 XT gets a memory error! – kiron111 Dec 24 '22 at 16:14
  • Sorry for replying to your comment so late. I seldom work with ROCm (or even Python) these days, so I haven't been browsing Stack Overflow (believe it or not, I only created my account last month to answer your question). – kiron111 Dec 24 '22 at 16:22
  • Thanks for the explanation, and no worries about the late answer. You really saved me from spending 500€ on a new Nvidia GPU ;) – makesense Dec 26 '22 at 13:19
  • You're welcome! Hope you enjoy your time with ROCm! – kiron111 Dec 29 '22 at 17:51