Otel allocation profiling POC #422

Draft
nsavoire wants to merge 1 commit into main from nsavoire/otel_allocation_profiling

nsavoire
Collaborator

What does this PR do?

Make libdd_profiling.so usable with the otel profiler by adding a new mode that does not start ddprof and by adding a USDT probe in the sampling path.

(TODO) better description
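
For readers unfamiliar with USDT probes, here is a minimal sketch of what a probe in a sampling path can look like. It is not the code from this PR: the `AllocSampler` type, the fixed-interval sampling rule, and the `example_profiling` / `alloc_sample` provider and probe names are all assumptions made for illustration. It relies only on the `DTRACE_PROBE2` macro from SystemTap's `<sys/sdt.h>`, and compiles the probe out when that header is not installed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical provider/probe names; none of these identifiers come from the PR.
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>  // SystemTap SDT header, provides DTRACE_PROBE2
#    define ALLOC_SAMPLE_PROBE(addr, size) \
       DTRACE_PROBE2(example_profiling, alloc_sample, addr, size)
#  endif
#endif
#ifndef ALLOC_SAMPLE_PROBE
// Header not available: the probe becomes a no-op so the sketch still builds.
#  define ALLOC_SAMPLE_PROBE(addr, size) ((void)(addr), (void)(size))
#endif

// Simplified sampler: fires the probe once every `interval` allocated bytes.
// Real allocation profilers usually randomize the interval; this one does not.
struct AllocSampler {
  std::uint64_t interval;
  std::uint64_t remaining;

  explicit AllocSampler(std::uint64_t interval_bytes)
      : interval(interval_bytes), remaining(interval_bytes) {
    assert(interval_bytes > 0);
  }

  // Returns true when this allocation was sampled (and the probe was hit).
  bool on_alloc(std::uintptr_t addr, std::size_t size) {
    if (size < remaining) {
      remaining -= size;
      return false;
    }
    remaining = interval;
    // An external tracer (for example an eBPF program) can attach here and
    // read `addr` and `size` as the probe arguments.
    ALLOC_SAMPLE_PROBE(addr, size);
    return true;
  }
};
```

When no tracer is attached, a probe site of this kind costs a single `nop` instruction, which is consistent with the unchanged benchmark results reported below.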

@pr-commenter

pr-commenter bot commented Aug 27, 2024

Benchmark results for collatz

Parameters

| | Baseline | Candidate |
|---|---|---|
| config | baseline | candidate |
| profiler-version | ddprof 0.18.0+5849ebfa.39514389 | ddprof 0.18.0+39546354.42964150 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results

| scenario | Δ mean execution_time |
|---|---|
| scenario:ddprof -S bench-collatz --preset cpu_only collatz_runner.sh | same |

@pr-commenter

pr-commenter bot commented Aug 27, 2024

Benchmark results for BadBoggleSolver_run

Parameters

| | Baseline | Candidate |
|---|---|---|
| config | baseline | candidate |
| profiler-version | ddprof 0.18.0+5849ebfa.39514389 | ddprof 0.18.0+39546354.42964150 |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results

| scenario | Δ mean execution_time |
|---|---|
| scenario:ddprof -S bench-bad-boggle-solver BadBoggleSolver_run work 1000 | unsure [+1.363ms; +15.533ms] or [+0.056%; +0.642%] |

}
PerfClock::init(static_cast<PerfClockSource>(rb.perf_clock_source));
} else {
PerfClock::init(PerfClockSource::kClockMonotonic);
Collaborator

Because monotonic is what the full host profiler would expect, and the TSC logic is not available.

Collaborator Author

@nsavoire nsavoire Aug 27, 2024

In otel profiler mode, PerfClock is not used to timestamp samples (that is done on the eBPF side). I had to initialize it because it is also used to decide when to periodically check for newly loaded libraries.

Collaborator Author

And indeed TSC calibration is done on ddprof side and is not available here.
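
To illustrate the point made in this thread, the sketch below shows how a monotonic clock alone is enough to schedule a periodic task such as re-checking for newly loaded libraries. It is an assumption-laden stand-in, not PerfClock: the `PeriodicCheck` class and its methods are hypothetical, and it uses `std::chrono::steady_clock`, which is monotonic and needs no calibration step.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of ddprof): reports when a periodic task is
// due, based only on a monotonic clock.
class PeriodicCheck {
public:
  using Clock = std::chrono::steady_clock;  // monotonic, never goes backwards

  explicit PeriodicCheck(Clock::duration period,
                         Clock::time_point start = Clock::now())
      : _period(period), _next(start + period) {
    assert(period > Clock::duration::zero());
  }

  // Returns true at most once per period. `now` is injectable so the logic
  // can be tested without sleeping.
  bool due(Clock::time_point now = Clock::now()) {
    if (now < _next) {
      return false;
    }
    _next = now + _period;  // schedule the next check relative to this one
    return true;
  }

private:
  Clock::duration _period;
  Clock::time_point _next;
};
```

A caller in the sampling path would do something like `if (check.due()) { /* rescan loaded libraries */ }`; the timestamps never leave the process, so their source does not need to match the one used for samples.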

r1viollet
r1viollet previously approved these changes Aug 27, 2024
Collaborator

@r1viollet r1viollet left a comment

LGTM
I do not see any issue with merging this. I assume the size impact is minimal for regular ddprof users.
Minor: we could have some form of test that ensures the probe is available.
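
One possible shape for such a test, as a rough sketch only: SystemTap-style USDT probes are recorded in ELF notes whose owner name is `stapsdt`, normally in a section named `.note.stapsdt`, so a byte scan of the built library for that string works as a cheap smoke check. The function name `has_stapsdt_marker` is hypothetical, and this heuristic neither parses the notes nor verifies a specific probe name.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Crude heuristic, not the project's test: returns true if the file contains
// the "stapsdt" string used by SystemTap SDT notes (owner name and section
// name ".note.stapsdt"). It does not validate the ELF structure.
bool has_stapsdt_marker(const std::string &path) {
  assert(!path.empty());
  std::ifstream in(path, std::ios::binary);
  if (!in) {
    return false;  // unreadable file is reported as "no probe found"
  }
  const std::string bytes((std::istreambuf_iterator<char>(in)),
                          std::istreambuf_iterator<char>());
  return bytes.find("stapsdt") != std::string::npos;
}
```

A stricter check would parse the notes (the output of `readelf -n` on the library shows them) and compare the provider and probe names.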

@r1viollet
Collaborator

[ RUN      ] allocation_tracker.start_stop
/go/src/github.com/DataDog/apm-reliability/ddprof-build/ddprof/test/allocation_tracker-ut.cc:141: Failure
Expected: (sample->dyn_size_stack) < (hdr->size - sizeof_allocation_event(0)), actual: 18088 vs 18088
[  FAILED  ] allocation_tracker.start_stop (323 ms)

@nsavoire nsavoire force-pushed the nsavoire/otel_allocation_profiling branch from ccc1c4d to 95c30f3 Compare August 27, 2024 12:58
@nsavoire
Collaborator Author

[ RUN      ] allocation_tracker.start_stop
/go/src/github.com/DataDog/apm-reliability/ddprof-build/ddprof/test/allocation_tracker-ut.cc:141: Failure
Expected: (sample->dyn_size_stack) < (hdr->size - sizeof_allocation_event(0)), actual: 18088 vs 18088
[  FAILED  ] allocation_tracker.start_stop (323 ms)

Just saw that, USDT probe seems to have a big impact on stack size in sanitized mode.

@nsavoire nsavoire force-pushed the nsavoire/otel_allocation_profiling branch from 95c30f3 to 3c0d453 Compare August 27, 2024 16:57
@nsavoire nsavoire force-pushed the nsavoire/otel_allocation_profiling branch from 3c0d453 to 3954635 Compare August 27, 2024 18:37
@r1viollet
Collaborator

r1viollet commented Aug 27, 2024

I'm not sure if the crash is reproducible and a product of the PR (relaunching the CI)

Collaborator

@r1viollet r1viollet left a comment

LGTM

@r1viollet
Collaborator

I agree on the TODO about a better description 😄

@@ -427,12 +453,12 @@ DDRes AllocationTracker::push_alloc_sample(uintptr_t addr,
const auto *stack_base_ptr = reinterpret_cast<const std::byte *>(&p);
auto stack_size = to_address(tl_state.stack_bounds.end()) - stack_base_ptr;

// stack will be saved in save_context, add some margin to account for call
// stack will be saved in save_context, add some ` to account for call
Collaborator

minor, comment was broken here

@r1viollet
Collaborator

@nsavoire do you mind moving this to draft if we do not plan on merging soon?

@nsavoire nsavoire marked this pull request as draft January 7, 2025 14:30