Top 7 Linux GPU Monitoring and Diagnostic Command-Line Tools

A video card is a circuit board that controls what is displayed on a computer monitor. Its main chip, the graphics processing unit (GPU), renders 3D images and graphics for Linux gaming and other workloads. Let us look at the top 7 Linux GPU monitoring and diagnostic command-line tools for troubleshooting GPU issues.

The following tools handle GPU monitoring and diagnostics on Linux and on other operating systems such as FreeBSD. Most Linux and FreeBSD users these days use Nvidia, Intel, or AMD GPUs.

Finding information about GPU on Linux

To get the GPU info, simply run:
$ sudo lshw -C display -short
$ lspci -v | more

The lshw command prints output similar to the following:

H/W path        Device      Class       Description
===============================================================
/0/100/1/0                  display     TU117M [GeForce GTX 1650 Mobile / Max-Q]
/0/100/2        /dev/fb0    display     UHD Graphics 630 (Mobile)
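
On a machine with many PCI devices, it can help to cut the lspci listing down to graphics adapters only. The following is a minimal sketch, assuming lspci labels GPUs with one of the class names "VGA compatible controller", "3D controller", or "Display controller". The function name filter_gpus is made up for this example.

```shell
#!/usr/bin/env bash
# filter_gpus: hypothetical helper that keeps only graphics devices
# from `lspci` output read on stdin (class names are an assumption).
filter_gpus() {
    awk '/VGA compatible controller|3D controller|Display controller/'
}

# Query the real hardware only when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -nn | filter_gpus
fi
```

Reading from stdin keeps the filter usable on saved lspci output copied from another machine.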

1. glmark2 – Stress-testing GPU performance on Linux

glmark2 is an OpenGL 2.0 and ES 2.0 benchmark command-line utility. We can install it as follows:
$ sudo apt install glmark2
Now run it as follows:
$ glmark2
The benchmark then runs through its test scenes and stress-tests your GPU on Linux.
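
To compare runs before and after a driver change, it may be handy to pull out just the final score. This sketch assumes glmark2 ends its output with a summary line of the form "glmark2 Score: N". The helper name extract_score and the log file name are illustrative only.

```shell
#!/usr/bin/env bash
# extract_score: hypothetical helper. Prints the number from the
# "glmark2 Score: N" summary line (assumed format) read on stdin.
extract_score() {
    awk '/glmark2 Score:/ { print $NF }'
}

# Example usage (opens the benchmark window, so it is left commented out):
#   glmark2 | tee glmark2.log | extract_score
```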

Install the gpustat tool with pip:
$ pip install gpustat
$ pip3 install gpustat

Run it as follows:
$ gpustat
$ gpustat -cp

Here we see the names and PIDs of the processes running on the Nvidia GPU:

nixcraft-wks01 Tue Nov 24 15:46:37 2020 450.80.02
[0] GeForce GTX 1650 with Max-Q Design | 39'C, ?? %, 2 % | 962 / 3911 MB | Xorg/2454(100M) Xorg/3504(325M) gnome-shell/3689(181M) firefox/4614(1M) firefox/5036(1M) firefox/5143(1M)

See help:
$ gpustat -h
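
The memory column in the gpustat output above ("962 / 3911 MB") can be turned into a percentage for quick threshold checks. This is a rough sketch, assuming the memory figures sit in the third '|'-separated field as in that sample and that the output carries no color codes when piped. The function name gpustat_mem_percent is hypothetical.

```shell
#!/usr/bin/env bash
# gpustat_mem_percent: hypothetical helper. Reads gpustat lines on stdin
# and prints used GPU memory as a percentage, one value per GPU line.
# Assumes the third '|'-separated field looks like " 962 / 3911 MB ".
gpustat_mem_percent() {
    awk -F'|' 'NF >= 3 && $3 ~ /[0-9]+ *\/ *[0-9]+/ {
        split($3, m, "/")              # m[1] = used, m[2] = total plus unit
        if (m[2] + 0 > 0)
            printf "%.1f\n", 100 * (m[1] + 0) / (m[2] + 0)
    }'
}

# Example usage:
#   gpustat | gpustat_mem_percent
```

For the sample line shown earlier, 962 MB out of 3911 MB works out to 24.6.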

4. intel_gpu_top – Displaying a top-like summary of Intel GPU usage on Linux

First, install the tool:
$ sudo apt install intel-gpu-tools
## CentOS/RHEL/Fedora Linux users can try the dnf command ##
$ sudo dnf install intel-gpu-tools

Fedora, RHEL, and CentOS Linux users can also use the podman command to run the same tools from a container image:
$ podman run --rm --privileged registry.freedesktop.org/drm/igt-gpu-tools/igt:master
The tool gathers data using perf performance counters (PMU) exposed by i915 and by other platform drivers such as RAPL (power) and Uncore IMC (memory bandwidth). Run it as follows on a Linux system:
$ sudo intel_gpu_top
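
Because intel_gpu_top reads its counters from the i915 driver, it can be worth confirming that i915 is actually bound to the GPU before running it. The check below is only a sketch: it assumes `lspci -k` reports the bound driver on a line reading "Kernel driver in use: i915", and the helper name uses_i915 is invented for this example.

```shell
#!/usr/bin/env bash
# uses_i915: hypothetical helper. Reads `lspci -k` output on stdin and
# succeeds (exit status 0) if any device reports the i915 kernel driver.
uses_i915() {
    grep -q 'Kernel driver in use: i915'
}

# Check the real hardware only when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    if lspci -k | uses_i915; then
        echo "i915 is active; intel_gpu_top should have data to read"
    else
        echo "i915 not detected; intel_gpu_top may not work here"
    fi
fi
```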

5. nvidia-smi – NVIDIA System Management Interface program

The nvidia-smi tool provides monitoring and management capabilities for each of NVIDIA’s Tesla, Quadro, GRID, and GeForce devices from the Fermi and higher architecture families. GeForce Titan series devices are supported for most functions, with very limited information provided for the remainder of the GeForce brand. NVSMI is a cross-platform tool that supports all standard NVIDIA driver-supported Linux distributions and FreeBSD. Once the Nvidia driver is installed on Ubuntu Linux, install it as follows:
$ sudo apt install nvidia-smi
Open the terminal and then run:
$ nvidia-smi -q -g 0 -d UTILIZATION -l 1
$ sudo nvidia-smi
$ nvidia-smi --help

Here is what we see:

Tue Nov 24 15:57:43 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 165...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   40C    P8     3W /  N/A |   1011MiB /  3911MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2454      G   /usr/lib/xorg/Xorg                100MiB |
|    0   N/A  N/A      3504      G   /usr/lib/xorg/Xorg                357MiB |
|    0   N/A  N/A      3689      G   /usr/bin/gnome-shell              179MiB |
|    0   N/A  N/A      4614      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5036      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5143      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      6406      G   ...AAAAAAAA== --shared-files      225MiB |
|    0   N/A  N/A     14462      G   ...AAAAAAAA== --shared-files      131MiB |
+-----------------------------------------------------------------------------+
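
The boxed table is meant for people to read. For scripts, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options. The sketch below flags GPUs that run hotter than a chosen limit. The helper name hot_gpus and the 80 C threshold are assumptions made for illustration, not recommended values.

```shell
#!/usr/bin/env bash
# hot_gpus: hypothetical helper. Reads "index, name, temperature" CSV rows
# on stdin and prints the rows whose temperature exceeds the limit given
# as the first argument (degrees C, defaults to 80).
hot_gpus() {
    awk -F', *' -v limit="${1:-80}" '$3 + 0 > limit + 0 { print }'
}

# Query the real GPU only when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,temperature.gpu \
               --format=csv,noheader,nounits | hot_gpus 80
fi
```

With the 40C reading shown in the table above, this would print nothing, since the card is well under the limit.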

6. nvtop – NVIDIA GPU top

Another fancy but very useful tool for NVIDIA GPUs is nvtop, an ncurses-based GPU status viewer similar to the htop or top commands. We can install it as follows:
$ sudo apt install nvtop
## Run the tool ##
$ nvtop

The following commands are available while nvtop is on screen:

  • Up – Select (highlight) the previous process.
  • Down – Select (highlight) the next process.
  • Left / Right – Scroll in the process row.
  • + – Sort in ascending order.
  • - – Sort in descending order.
  • F1 – Select a signal to send to the highlighted process.
  • F2 – Select the field for sorting. The current sort field is highlighted inside the header bar.
  • F3, q, Esc – Exit nvtop and return to your shell.

7. radeontop – Tool to show AMD GPU utilization on Linux

View your AMD GPU utilization on Linux, both the total activity percentage and the individual blocks. Install and run it as follows:
$ sudo apt install radeontop
$ sudo radeontop

It works with R600 and newer GPUs; even Southern Islands should work fine. It works with both the open source AMD drivers and the closed-source AMD Catalyst drivers.
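
For logging or scripting, radeontop can dump samples as text instead of drawing its interactive screen (the -d option, with - meaning stdout, and -l to limit the number of samples). The sketch below extracts the overall utilization figure. It assumes each dump line contains a field such as "gpu 4.17%", and the helper name radeon_gpu_percent is hypothetical.

```shell
#!/usr/bin/env bash
# radeon_gpu_percent: hypothetical helper. Reads radeontop dump lines on
# stdin and prints the overall "gpu" utilization figure from each line.
# Assumes dump lines contain a field such as "gpu 4.17%".
radeon_gpu_percent() {
    sed -n 's/.*[ ,]gpu \([0-9.]*\)%.*/\1/p'
}

# Example usage (needs an AMD GPU, so it is left commented out):
#   sudo radeontop -d - -l 1 | radeon_gpu_percent
```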

Conclusion

You learned about the various Linux GPU commands and tools for monitoring and diagnostic purposes on Linux and BSD-based systems. Let me know if I missed your favorite tool in the comment section below.

Posted by Contributor