kernel: Implement core event counters #859

Open
htejun opened this issue Oct 29, 2024 · 4 comments
Labels: help wanted (Extra attention is needed), kernel (Expose a kernel issue)

Comments

@htejun
Contributor

htejun commented Oct 29, 2024

The kernel sometimes has to take actions that override the BPF scheduler's decisions. Non-exhaustive list:

  • If ops.select_cpu() returns a CPU which can't be used by the task, the core scheduler code silently picks a fallback CPU.
  • When dispatching to a local DSQ, the CPU may have gone offline in the meantime. In this case, the task is bounced to the global DSQ.
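The two overrides above could each bump a counter at the point where the kernel deviates from the BPF scheduler's choice. The following is a minimal userspace sketch of that idea, not the kernel implementation: the names `ev_id`, `ev_stats`, `pick_cpu()` and `dispatch_local()` are hypothetical, and CPUs are modeled as bits in a 64-bit mask purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event identifiers; illustrative only, not the kernel's names. */
enum ev_id {
	EV_SELECT_CPU_FALLBACK,	/* requested CPU was not usable by the task */
	EV_LOCAL_DSQ_OFFLINE,	/* target CPU went offline before dispatch */
	EV_NR,
};

/* One counter per event type. */
struct ev_stats {
	uint64_t cnt[EV_NR];
};

/* True if @cpu is a valid bit index and is set in @mask. */
static bool cpu_in_mask(uint64_t mask, int cpu)
{
	return cpu >= 0 && cpu < 64 && (mask & (1ULL << cpu));
}

/*
 * Return the CPU to use. If @requested is not allowed for the task,
 * count the override and return @fallback instead.
 */
int pick_cpu(struct ev_stats *st, uint64_t allowed_mask, int requested,
	     int fallback)
{
	if (cpu_in_mask(allowed_mask, requested))
		return requested;
	st->cnt[EV_SELECT_CPU_FALLBACK]++;
	return fallback;
}

/*
 * Return true if the task can stay on @cpu's local queue. If @cpu is
 * offline, count the event and return false so that the caller bounces
 * the task to the global queue.
 */
bool dispatch_local(struct ev_stats *st, uint64_t online_mask, int cpu)
{
	if (cpu_in_mask(online_mask, cpu))
		return true;
	st->cnt[EV_LOCAL_DSQ_OFFLINE]++;
	return false;
}
```

Counting at the override site keeps the accounting cheap and makes an otherwise silent fallback visible to whoever reads the counters later.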

In addition, there are common events that can be interesting but not easily visible:

  • If SCX_OPS_ENQ_LAST is not set, the number of times that a task continued to run because there were no other tasks on the CPU.
  • Similar for SCX_OPS_ENQ_EXITING.
  • Similar for bypass mode.

Statistics like the above can be collected and made accessible to the BPF scheduler through a kfunc interface for visibility and sanity checks. It may also make sense to implement a threshold mechanism so that, e.g., if the BPF scheduler picks an invalid CPU more than 2% of the time over some duration, an ops error is triggered, and so on.
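One way to picture the threshold idea is to compare counter deltas between two sampling points. The sketch below is only an assumption about how such a check might look: `ev_window` and `ev_threshold_exceeded()` are hypothetical names, the caller is assumed to sample the cumulative counters once per window, and triggering the ops error is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counter values seen at the previous sampling point (hypothetical). */
struct ev_window {
	uint64_t prev_total;	/* cumulative number of CPU selections */
	uint64_t prev_bad;	/* cumulative number of invalid selections */
};

/*
 * Return true if invalid selections made up more than @pct percent of
 * all selections since the previous call. The window is advanced on
 * every call, so each call evaluates one sampling interval.
 */
bool ev_threshold_exceeded(struct ev_window *w, uint64_t total,
			   uint64_t bad, unsigned int pct)
{
	uint64_t d_total = total - w->prev_total;
	uint64_t d_bad = bad - w->prev_bad;

	w->prev_total = total;
	w->prev_bad = bad;

	/* No selections in this window: nothing to judge. */
	if (!d_total)
		return false;

	/* Integer form of d_bad / d_total > pct / 100. */
	return d_bad * 100 > d_total * pct;
}
```

With `pct` set to 2, a window of 1000 selections containing 21 invalid ones would exceed the threshold, while a window with exactly 20 would not.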

@htejun added the help wanted and kernel labels Oct 29, 2024
@multics69 self-assigned this Oct 31, 2024
@multics69
Contributor

The first patchset was posted: https://lore.kernel.org/lkml/[email protected]/

@multics69
Contributor

multics69 commented Feb 5, 2025

The following seven events were merged.

@multics69
Contributor

In addition, I will add one more useful event, which counts how many times the default time slice has been set, and add a filesystem interface so that the event counters can easily be peeked at from userspace.

@multics69
Contributor

multics69 commented Feb 8, 2025

One more event was added:


2 participants