Installing Pytorch with AMD ROCm on GNU/Linux

24 Mar 2026

Quickly sharing my notes on how to install drivers for ROCm and Pytorch for machine learning on AMD GPUs:

First install ROCm and the AMD GPU driver:

# Install ROCm
wget https://repo.radeon.com/amdgpu-install/7.2.1/ubuntu/noble/amdgpu-install_7.2.1.70201-1_all.deb
sudo apt install ./amdgpu-install_7.2.1.70201-1_all.deb
# Install AMD driver
wget https://repo.radeon.com/amdgpu-install/7.2.1/ubuntu/noble/amdgpu-install_7.2.1.70201-1_all.deb
sudo apt install ./amdgpu-install_7.2.1.70201-1_all.deb
sudo apt update
sudo apt install "linux-headers-$(uname -r)"
sudo apt install amdgpu-dkms

Then as shown in this Reddit post, I installed install triton, torch, torchvision, and torchaudio from https://repo.radeon.com/rocm/manylinux/.

Then I tried the following program to train a neural network to imitate an XOR gate.

import torch
import torch.nn as nn
import torch.optim as optim

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# XOR data
X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32).to(device)
Y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32).to(device)

# Define the neural network
class XORNet(nn.Module):
    def __init__(self):
        super(XORNet, self).__init__()
        self.fc1 = nn.Linear(2, 5)
        self.fc2 = nn.Linear(5, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.sigmoid(self.fc2(x))
        return x

# Initialize the network, loss function and optimizer
model = XORNet().to(device)
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Training loop
for epoch in range(10000):
    model.train()
    optimizer.zero_grad()
    outputs = model(X)
    loss = criterion(outputs, Y)
    loss.backward()
    optimizer.step()

    if (epoch+1) % 1000 == 0:
        print(f'Epoch [{epoch+1}/10000], Loss: {loss.item():.4f}')

# Test the model
model.eval()
with torch.no_grad():
    predictions = model(X)
    print("Predictions:", predictions.round())

However I got the following error (using Torch 2.9.1 and ROCm 7.2.0).

RuntimeError: CUDA error: HIPBLAS_STATUS_INVALID_VALUE when calling `hipblasLtMatmulAlgoGetHeuristic( ltHandle, computeDesc.descriptor(), Adesc.descriptor(), Bdesc.descriptor(), Cdesc.descriptor(), Cdesc.descriptor(), preference.descriptor(), 1, &heuristicResult, &returnedResult)`

Then I found AMD’s information on how to install Pytorch with ROCm support. Basically you need to install the nightly build:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm7.2

Now the XOR test works!

python3 xor.py
# Epoch [1000/10000], Loss: 0.0342
# Epoch [2000/10000], Loss: 0.0114
# Epoch [3000/10000], Loss: 0.0066
# Epoch [4000/10000], Loss: 0.0046
# Epoch [5000/10000], Loss: 0.0035
# Epoch [6000/10000], Loss: 0.0028
# Epoch [7000/10000], Loss: 0.0024
# Epoch [8000/10000], Loss: 0.0020
# Epoch [9000/10000], Loss: 0.0018
# Epoch [10000/10000], Loss: 0.0016
# Predictions: tensor([[0.],
#         [1.],
#         [1.],
#         [0.]], device='cuda:0')

Enjoy!

Jan Wedekind

Installing Pytorch with AMD ROCm on GNU/Linux