-
-
Notifications
You must be signed in to change notification settings - Fork 469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenTelemetry event tracing to connections #3102
Add OpenTelemetry event tracing to connections #3102
Conversation
Honestly, just smoke tests. We'll have to rig up Prometheus or Signoz or something in a test harness to really test it manually. The decorator around the connection lifetime, does that even need all the if/then checks? Shouldn't it be, only apply the decorator if that option is selected in the first place? You want to absolutely minimize the runtime steps because everything is in the hot path there. For event metrics, I assumed we'd use an IDocumentSessionListener anyway |
Yep, I’ll fix that, I wrote the thing bottom up, so the additional checks are not needed.
Also, the LINQ tests are currently failing on CI. I haven’t investigated this as yet, do you want me to raise an issue for this?
Let me have a look at tests later in the week, we might be able to use some of the fakes Microsoft recently added.
Thanks,
Sean.
From: Jeremy D. Miller ***@***.***>
Sent: Tuesday, April 2, 2024 2:14 AM
To: JasperFx/marten ***@***.***>
Cc: Sean Farrow ***@***.***>; Author ***@***.***>
Subject: Re: [JasperFx/marten] Add OpenTelemetry event tracing to connections (PR #3102)
Honestly, just smoke tests. We'll have to rig up Prometheus or Signoz or something in a test harness to really test it manually.
The decorator around the connection lifetime, does that even need all the if/then checks? Shouldn't it be, only apply the decorator if that option is selected in the first place? You want to absolutely minimize the runtime steps because everything is in the hot path there.
For event metrics, I assumed we'd use an IDocumentSessionListener anyway
—
Reply to this email directly, view it on GitHub<#3102 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AALDK7WJJPBWDFPMB7VN3TTY3IA63AVCNFSM6AAAAABFQLQCQ2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMZQHEYDCMRZGY>.
You are receiving this because you authored the thread.Message ID: ***@***.***>
|
…e SessionOptions class.
…connection events.
…ss and supporting classes.
9abe946
to
72e9668
Compare
public static string MartenTenantId = "MartenTentantId"; | ||
public static string MartenCorelationId = "MartenCorelationId"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be const
are static readonly
?
When they should be setable, then make them a property.
"Marten", | ||
typeof(MartenTracing).Assembly.GetName().Version!.ToString()); | ||
|
||
public static Activity? StartConnectionActivity(Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags =null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit
public static Activity? StartConnectionActivity(Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags =null) | |
public static Activity? StartConnectionActivity(Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags = null) |
return StartActivity("connection", parentActivity, tags); | ||
} | ||
|
||
public static Activity StartActivity(string spanName, Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags =null, ActivityKind activityKind =ActivityKind.Internal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static Activity StartActivity(string spanName, Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags =null, ActivityKind activityKind =ActivityKind.Internal) | |
public static Activity StartActivity(string spanName, Activity? parentActivity = null, IEnumerable<KeyValuePair<string, object?>>? tags = null, ActivityKind activityKind = ActivityKind.Internal) |
|
||
public static Activity StartActivity(string spanName, Activity? parentActivity =null, IEnumerable<KeyValuePair<string, object?>>? tags =null, ActivityKind activityKind =ActivityKind.Internal) | ||
{ | ||
return ActivitySource.CreateActivity(spanName, activityKind, parentActivity?.ParentId, tags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be
return ActivitySource.CreateActivity(spanName, activityKind, parentActivity?.ParentId, tags); | |
return ActivitySource.StartActivity(spanName, activityKind, parentActivity?.ParentId, tags); |
?
Otherwise an explicit Start
needs to be done on the activity.
Or this method should be called CreateActivity
too, else it's very misleading.
private const string MartenCommandExecutionStarted = "MartenCommandExecutionStarted"; | ||
private const string MartenCommandExecutionFailed = "MartenCommandExecutionFailed"; | ||
private const string MartenBatchExecutionStarted = "MartenBatchExecutionStarted"; | ||
private const string MartenBatchExecutionFailed = "MartenBatchExecutionFailed"; | ||
private const string MartenBatchPagesExecutionStarted = "MartenBatchPagesExecutionStarted"; | ||
private const string MartenBatchPagesExecutionFailed = "MartenBatchPagesExecutionFailed"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According OTel semantics these should have different names which we should follow.
private const string MartenCommandExecutionStarted = "MartenCommandExecutionStarted"; | |
private const string MartenCommandExecutionFailed = "MartenCommandExecutionFailed"; | |
private const string MartenBatchExecutionStarted = "MartenBatchExecutionStarted"; | |
private const string MartenBatchExecutionFailed = "MartenBatchExecutionFailed"; | |
private const string MartenBatchPagesExecutionStarted = "MartenBatchPagesExecutionStarted"; | |
private const string MartenBatchPagesExecutionFailed = "MartenBatchPagesExecutionFailed"; | |
private const string MartenCommandExecutionStarted = "marten.command.execution.started"; | |
private const string MartenCommandExecutionFailed = "marten.command.execution.failed"; | |
private const string MartenBatchExecutionStarted = "marten.batch.execution.started"; | |
private const string MartenBatchExecutionFailed = "marten.batch.execution.failed"; | |
private const string MartenBatchPagesExecutionStarted = "marten.batch.pages.execution.started"; | |
private const string MartenBatchPagesExecutionFailed = "marten.batch.pages.execution.failed"; |
} | ||
catch (Exception e) | ||
{ | ||
if (_openTelemetryOptions.TrackConnectionEvents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same. I'll stop to comment on these, but there are some more below.
{ | ||
if (_openTelemetryOptions.TrackConnectionEvents) | ||
{ | ||
_databaseActivity.AddEvent(new ActivityEvent("Database command execution started")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Event name should be a constant too, and have a name following OTel semantics (see first comment).
/// <summary> | ||
/// Used to track OpenTelemetry events for a connection, for example when a command or data reader has been executed. This defaults to false. | ||
/// </summary> | ||
public bool TrackConnectionEvents { get; set; } = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public bool TrackConnectionEvents { get; set; } = false; | |
public bool TrackConnectionEvents { get; set; } |
false
is the default by C# language specs, so can be omitted.
@@ -93,6 +101,23 @@ internal IConnectionLifetime Initialize(DocumentStore store, CommandRunnerMode m | |||
throw new DefaultTenantUsageDisabledException(); | |||
} | |||
|
|||
var innerConnectionLifetime =GetInnerConnectionLifetime(store, mode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var innerConnectionLifetime =GetInnerConnectionLifetime(store, mode); | |
var innerConnectionLifetime = GetInnerConnectionLifetime(store, mode); |
@@ -93,6 +101,23 @@ internal IConnectionLifetime Initialize(DocumentStore store, CommandRunnerMode m | |||
throw new DefaultTenantUsageDisabledException(); | |||
} | |||
|
|||
var innerConnectionLifetime =GetInnerConnectionLifetime(store, mode); | |||
|
|||
var corelationId = Activity.Current is not null ? Activity.Current.RootId : Guid.NewGuid().ToString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there's no activity that is recored (Current = null
), is it OK to just create a new id?
I don't know what's best here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand things, if the current activity is null, this means you are the first activity in the chain, so creating a new activity/id is the right thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the current activity is null, this means you are the first activity in the chain
Being the first one in the chain means that there's no parent, but Current
is set an Activity
.
When there's no listeneres interested in an activity or no listeners are registered, then ActivitySource.StartActivity
will return null
, hence Activity.Current
will be null
too.
So when there's no activity I don't know if just creating a new (random) Id is correct.
Maybe in that case it should just be empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Activity would be null if the app isn't actually emitting Otel with a configured exporter. Sorry, just what @gfoidl already said
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeremydmiller
So should we set the current activity if it is null? or not start an activity at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's common for the Activity to be null, so just expect it. If you look in the Wolverine code, you'll see that every call to Activity is Activity?.AddTag() or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code needed at all? The activity will internally check for a current activity and grab the correlation ID if required.
@SeanFarrow One combination of the LInq tests will fail periodically on CI, but always succeed on a subsequent run. I've never been able to reproduce it locally. |
…ould aggregate to null
…ct uses the Id of the original document. Closes JasperFxGH-3096
|
||
public Task<int> ExecuteAsync(NpgsqlCommand command, CancellationToken token = new CancellationToken()) | ||
{ | ||
if (_openTelemetryOptions.TrackConnectionEvents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this checked here if it's already checked when initializing this object?
var innerConnectionLifetime =GetInnerConnectionLifetime(store, mode); | ||
|
||
var corelationId = Activity.Current is not null ? Activity.Current.RootId : Guid.NewGuid().ToString(); | ||
var tags = new Dictionary<string, object> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer TagsList
when dealing with OTEL tags. This should also be lazy initialized only if EventTracingConnectionLifetime
is being used.
return OpenTelemetryOptions.TrackConnectionEvents | ||
? new EventTracingConnectionLifetime(innerConnectionLifetime, OpenTelemetryOptions, | ||
StartConnectionActivity(Activity.Current, tags)) | ||
: innerConnectionLifetime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we only allocate the event tracing lifetime if there's any active listeners:
return OpenTelemetryOptions.TrackConnectionEvents | |
? new EventTracingConnectionLifetime(innerConnectionLifetime, OpenTelemetryOptions, | |
StartConnectionActivity(Activity.Current, tags)) | |
: innerConnectionLifetime; | |
return (OpenTelemetryOptions.TrackConnectionEvents && ActivitySource.HasListeners()) | |
? new EventTracingConnectionLifetime(innerConnectionLifetime, OpenTelemetryOptions, | |
StartConnectionActivity(Activity.Current, tags)) | |
: innerConnectionLifetime; |
…time if we have listeners.
OK, I think I've fixed everything, can people please check they are happy. I'll then look at tests. |
@jeremydmiller I'm just adding tests. Things gone a bit mad at work as I've just switched teams. I'll start the event/other stuff at the end of next week when things calm down. |
Update: I should be code complete by tomorrow evening UK time. The only thing to then do is the docs.
Thanks,
Sean.
From: Jeremy D. Miller ***@***.***>
Sent: Wednesday, April 17, 2024 7:41 PM
To: JasperFx/marten ***@***.***>
Cc: Sean Farrow ***@***.***>; Mention ***@***.***>
Subject: Re: [JasperFx/marten] Add OpenTelemetry event tracing to connections (PR #3102)
@SeanFarrow<https://github.com/SeanFarrow> Anything new on this? Is there anything you're blocked on?
—
Reply to this email directly, view it on GitHub<#3102 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AALDK7V6KZCOHDD4HRLZBLDY5263VAVCNFSM6AAAAABFQLQCQ2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANRRHE3TGNBVGI>.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
…e SessionOptions class.
…connection events.
…ss and supporting classes.
…time if we have listeners.
…t is inline with OTel standards.
…tivity methods in the MartenTracing class.
…thod nullable in the MartenTracing class and remove unneeded using declarations.
…ing OpenTelemetry when the appropriate option is configured and we have listeners.
…lemetry activity.
…eanFarrow/marten into feature/sf/2909-open-telemetry
@SeanFarrow You'd done a ton man, I'll take you up on some docs, but I'm good to pull this in sooner rather than later. I can easily deal w/ whatever CI hiccups are happening. |
@jeremydmiller I'm done and happy. Thanks. |
Cool, and thanks Sean! I'll try to pull this in in the next couple days. Might be until next week before 7.9 gets out with this though. |
This PR add event tracing to IConnectionLifetime.
@jeremydmiller I have a few questions:
How do we want to test the EventConnectionLifetime and MartenTracing classes?
When do we want to update the documentation?
Do we want to wait to merge this until all the other OTel stuff is done?
Thanks,
Sean.