Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux: add GPU meter and process column #1288

Merged
merged 6 commits into from
Mar 27, 2024
Merged

Conversation

cgzones
Copy link
Member

@cgzones cgzones commented Aug 29, 2023

Based on the DRM client usage stats1 add statistics for GPU time spend
and percentage utilization.

Tested with i915 and amdgpu.

@cgzones cgzones force-pushed the linux_gpu branch 4 times, most recently from 53dc054 to 833420f Compare August 29, 2023 12:00
@BenBE
Copy link
Member

BenBE commented Aug 29, 2023

Does this recognize/work with multi-GPU setups?

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the fopen style API is hard to transform into liburing usage later on, thus try to avoid it. We also have a function xReadfile and xReadfileat to read a whole file at once into a appropriately sized buffer. For fdinfo files this should be sufficient and avoid the overhead of the fopen API.

linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.h Show resolved Hide resolved
linux/GPUMeter.h Show resolved Hide resolved
@BenBE BenBE added Linux 🐧 Linux related issues new feature Completely new feature labels Aug 29, 2023
@BenBE BenBE linked an issue Aug 29, 2023 that may be closed by this pull request
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated
Comment on lines 127 to 118
String_eq(ename, "0") ||
String_eq(ename, "1") ||
String_eq(ename, "2"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - had the same thought, I think this little optimization is not always valid unfortunately (e.g. a program closes one of these three at some point, as happens in some daemons, and then they later get reused).

@Explorer09
Copy link
Contributor

I wish the autoTitleRightAlign flag be split to a separate PR. The reason is simply I don't like the current design of the flag, and I think it deserves more discussion before introducing it into the codebase.

@cgzones cgzones force-pushed the linux_gpu branch 2 times, most recently from 9546934 to 44ec193 Compare September 3, 2023 19:47
linux/LinuxProcess.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPU.c Outdated Show resolved Hide resolved
linux/GPUMeter.c Outdated Show resolved Hide resolved
linux/GPUMeter.c Outdated
Comment on lines 39 to 83
static int humanTimeUnit(char* buffer, size_t size, unsigned long long int value) {

if (value < 1000)
return snprintf(buffer, size, "%3lluns", value);

value /= 1000;

if (value < 1000)
return snprintf(buffer, size, "%3lluus", value);

value /= 1000;

if (value < 1000)
return snprintf(buffer, size, "%3llums", value);

value /= 1000;

if (value < 600)
return snprintf(buffer, size, "%3llus", value);

value /= 60;

if (value < 600)
return snprintf(buffer, size, "%3llum", value);

value /= 60;

if (value < 96)
return snprintf(buffer, size, "%3lluh", value);

value /= 24;

return snprintf(buffer, size, "%3llud", value);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use xSnprintf?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some questions regarding this humanTimeUnit() function.

  1. Is this meant for printing values in meters or data cells or both?
  2. It seems that you are constraining the output to 5 characters (e.g. 999ns). Is this intentional?
  3. Why are there 600s, 600m and 96h thresholds, in particular? As I rewrote the Row_printTime() function in PR Rewrite Row_printTime() with various improvements #1325, I start to think that such thresholds in humanTimeUnit() don't make sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I had in the back of my mind, that there had been a PR regarding the time formatting code recently, I wrote that comment in #1321 so we can somehow focus these efforts.

For 600s and 600m the limits are kinda obvious with 10m and 10h respectively. For 96h (AKA 4d) the reason is likely to stay with whole days below a 3 digit value for hours.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why not format the time like 9.5m, 1h59m and 1d23h respectively. As I wrote #1325, I have an idea on how we can format the time within the 5 character limit.

  0ns - 999ns
1.0us - 9.9us
 10us - 999us
1.0ms - 999ms
 1.0s - 59.9s
1m00s - 9m59s
10.0m - 59.9m
1h00m - 9h59m
10.0h - 23.9h
1d00h - 9d23h
10.0d - 99.9d
 100d -  364d
 1.0y - 99.9y
 100y - 9999y
  inf -   inf

With respect to #1317, I believe the format like 2h30m would be better than 150m.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible alternative, mostly re-uses the ranges from your proposal.

  0ns - 999ns
1.0us - 9.9us
 10us - 999us
1.0ms - 999ms
 1.0s - 59.9s (Or 01.00 - 59.99 compared to the time column)
 1:00 - 59:59
 1h00 - 23h59
1d00h - 9d23h
10.0d - 99.9d
 100d -  364d
1y00m - 9y11m
10.0y - 99.9y
 100y - 9999y
  inf -   inf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenBE No, the colon is ambiguous where there are hh:mm and mm:ss notations.
59m59 might be okay though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The colon would be consistent with existing formatting though. Yes, 59m59 would work, but it's not the highest option on my list …

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used xSnprintf().

Added formatting for X.Y(us|ms|s).

The input values are difference values to the prior scan so they should be lower than the scan interval (1.5s).

linux/LinuxProcess.c Outdated Show resolved Hide resolved
linux/LinuxProcessTable.c Outdated Show resolved Hide resolved
linux/LinuxProcessTable.c Outdated Show resolved Hide resolved
@cgzones cgzones marked this pull request as ready for review January 9, 2024 21:12
The difference of scans is useful for utilization calculations.

To avoid divisions by 0 on first scan set monotonicMs also on first
scan.
Instead of handling PERCENT_CPU as a special case for whether to align
the title of a dynamically sized column to the right or the left
introduce a new flag, which can be reused by other columns.
Bail out early if the passed time is 0 and shadow the result.
Based on the DRM client usage stats[1] add statistics for GPU time spend
and percentage utilization.

[1]: https://www.kernel.org/doc/html/latest/gpu/drm-usage-stats.html
Instead of ignoring the standard file descriptors 0, 1 and 2 scan for
GPU usage of a process only if that process had activity last cycle or
its last scan was more than five seconds ago.
@cgzones cgzones requested a review from BenBE January 9, 2024 21:30
@BenBE BenBE requested a review from natoscott January 9, 2024 23:19
@cgzones cgzones merged commit 0318589 into htop-dev:main Mar 27, 2024
12 checks passed
@cgzones cgzones deleted the linux_gpu branch March 27, 2024 18:49
@kryptobolt07
Copy link

The GPU meter doesn't seem to work on my intel haswell series processor(Intel Celeron 2955u),at least in the way i expect it to,the gpu utilizationn is displayed to be zero for example when browsing the web whereas using intel's own 'intel-gpu-tools' one can notice the render engines to be in active use.

@BenBE BenBE added this to the 3.4.0 milestone Jun 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Linux 🐧 Linux related issues new feature Completely new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add GPU utilization meter
5 participants