Python Tracing #191

Open
cvonelm opened this issue Jul 6, 2021 · 4 comments

Comments

@cvonelm
Member

cvonelm commented Jul 6, 2021

This is a write-up of what I've discovered about Python tracing so far:

What makes Python tracing possible (and tracing for a whole host of other applications) are User Statically Defined Tracepoints (USDT).

USDT allows developers to define their own static tracepoints in userland code, much like static tracepoints do for kernel code. Python has a whole host of different USDT tracepoints, but the most useful to us is:

function__entry(const char * filename, const char *funcname, int lineno)

This tracepoint records function entry.

Recording this tracepoint with perf is possible, but to my knowledge we cannot access the addresses of filename and funcname from userspace, and there is no guarantee that those references will still be valid when we read the perf ring buffer anyway.

The most effective way seems to be to read the USDT tracepoint with BPF and then write the necessary information to a BPF ring buffer, which we can then poll from lo2s.
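A minimal sketch of that idea, using the bcc Python bindings only because they keep the example short (a libbpf version would follow the same shape). It assumes a CPython built with `--with-dtrace`, root privileges, and a kernel with BPF ring buffer support. The names `BPF_PROGRAM`, `event_t`, `trace_entry`, `format_event` and `trace` are made up for illustration, and `lineno` (the third argument) is left out to keep the sketch short.

```python
# Sketch only: the bcc import is done lazily inside trace(), so the helper
# below can be used without bcc installed.

BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

struct event_t {
    char filename[128];
    char funcname[64];
};

BPF_RINGBUF_OUTPUT(events, 8);  // ring buffer of 8 pages

int trace_entry(struct pt_regs *ctx) {
    struct event_t event = {};
    u64 filename_ptr = 0, funcname_ptr = 0;

    // USDT arguments are 1-indexed: filename, funcname, lineno.
    bpf_usdt_readarg(1, ctx, &filename_ptr);
    bpf_usdt_readarg(2, ctx, &funcname_ptr);

    // Copy the strings now, while the pointers are still valid.
    bpf_probe_read_user_str(event.filename, sizeof(event.filename),
                            (void *)filename_ptr);
    bpf_probe_read_user_str(event.funcname, sizeof(event.funcname),
                            (void *)funcname_ptr);

    events.ringbuf_output(&event, sizeof(event), 0);
    return 0;
}
"""


def format_event(filename: bytes, funcname: bytes) -> str:
    """Render one function-entry event as 'funcname (filename)'."""
    return f"{funcname.decode(errors='replace')} ({filename.decode(errors='replace')})"


def trace(pid: int) -> None:
    """Attach to function__entry in process `pid` and print events until Ctrl-C."""
    from bcc import BPF, USDT  # requires bcc and root

    usdt = USDT(pid=pid)
    usdt.enable_probe(probe="function__entry", fn_name="trace_entry")
    bpf = BPF(text=BPF_PROGRAM, usdt_contexts=[usdt])

    def on_event(ctx, data, size):
        event = bpf["events"].event(data)
        print(format_event(event.filename, event.funcname))

    bpf["events"].open_ring_buffer(on_event)
    try:
        while True:
            bpf.ring_buffer_poll()
    except KeyboardInterrupt:
        pass
```

Calling `trace(pid)` with the PID of a running Python process would print one line per Python function entry. In lo2s the polling loop would feed the trace writer instead of printing.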

https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md contains everything that we can do with bcc right now, which might be worth a look for things beyond Python tracing.

As bcc has since been superseded by libbpf, the above information is no longer up to date.

@bmario
Member

bmario commented Oct 28, 2021

Maybe we can learn from this:

https://github.com/benfred/py-spy

@cvonelm
Member Author

cvonelm commented Jun 10, 2022

Documenting how py-spy does it (roughly):

  1. Read the location of the Python binary from /proc.
  2. Parse the debug symbols of the binary for the location of the _PyRuntime symbol. This is the main struct containing the execution state.
  3. Read the process memory (also from procfs?) of the Python interpreter and locate _PyRuntime.
  4. Cast it to the InterpreterState struct. From there on, the fields can be used to deduce the execution state.
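Steps 1 and 3 can be sketched with procfs alone. This is not py-spy's actual code, just an illustration under some assumptions: the interpreter is a position-independent executable whose first segment is mapped at file offset 0, the symbol lives in the binary itself rather than in a shared libpython, and the symbol offset from step 2 is already known. All function names here are hypothetical.

```python
import os
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps."""
    # Fields: address range, perms, offset, dev, inode, optional path.
    fields = line.split(maxsplit=5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], int(fields[2], 16), path)


def interpreter_path(pid: int) -> str:
    """Step 1: resolve the binary the process is running."""
    return os.readlink(f"/proc/{pid}/exe")


def load_base(pid: int, binary: str) -> Optional[int]:
    """Find where the start of `binary` is mapped in the process."""
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            mapping = parse_maps_line(line)
            if mapping.path == binary and mapping.offset == 0:
                return mapping.start
    return None


def read_memory(pid: int, address: int, size: int) -> bytes:
    """Step 3: read raw bytes from the target (needs ptrace permission)."""
    # Unbuffered, so only the requested range is touched; may return fewer bytes.
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(size)
```

With a symbol offset `runtime_offset` obtained in step 2, the raw `_PyRuntime` bytes would be `read_memory(pid, load_base(pid, interpreter_path(pid)) + runtime_offset, size)`. Interpreting those bytes (step 4) depends on the struct layout of the specific Python version.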

cvonelm changed the title from "Python Tracing and other BPF shenanigans" to "Python Tracing" on May 8, 2023

@tilsche
Member

tilsche commented Feb 20, 2024

Now that's some promising progress as opposed to the usual hacks: https://docs.python.org/3/howto/perf_profiling.html#perf-profiling
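For context, a rough sketch of what a consumer of that support might look like. With perf support enabled (Python 3.12+, for example via `python -X perf`), the interpreter writes symbol entries for Python functions to `/tmp/perf-<pid>.map`, which a sampling tool could use to resolve addresses. The `py::<name>:<file>` symbol naming is taken from the examples in the linked HOWTO; the helper names below are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class PerfMapEntry(NamedTuple):
    start: int
    size: int
    symbol: str


def parse_perf_map_line(line: str) -> PerfMapEntry:
    """Parse one 'START SIZE symbol' line of a perf map file (hex numbers)."""
    start, size, symbol = line.rstrip("\n").split(" ", 2)
    return PerfMapEntry(int(start, 16), int(size, 16), symbol)


def split_python_symbol(symbol: str) -> Optional[Tuple[str, str]]:
    """Split 'py::<name>:<file>' into (name, file); None for other symbols."""
    if not symbol.startswith("py::"):
        return None
    name, _, filename = symbol[len("py::"):].partition(":")
    return name, filename


def resolve(address: int, entries: list) -> Optional[PerfMapEntry]:
    """Return the entry whose [start, start + size) range contains address."""
    for entry in entries:
        if entry.start <= address < entry.start + entry.size:
            return entry
    return None
```

A tool would load all lines of the map file once per process and call `resolve` for each sampled instruction pointer.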
