-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add APM Single Step Instrumentation #981
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #981 +/- ##
==========================================
+ Coverage 57.16% 57.17% +0.01%
==========================================
Files 164 164
Lines 19574 19703 +129
==========================================
+ Hits 11189 11265 +76
- Misses 7732 7777 +45
- Partials 653 661 +8
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 3 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
|
273f183
to
25ec543
Compare
Thanks @CharlyF for the PR! In both helm and Cluster Agent, we have some validation of SSI parameters. I was wondering if https://github.com/DataDog/datadog-operator/blob/main/controllers/datadogagent/clusteragent.go#L689 would be the right place to put it for the Operator. The validation logic is the following:
|
Oh I see this logic is in features.go. Will throwing an error there fail Operator installation? |
@@ -228,6 +252,47 @@ func (f *apmFeature) ManageDependencies(managers feature.ResourceManagers, compo | |||
// ManageClusterAgent allows a feature to configure the ClusterAgent's corev1.PodTemplateSpec | |||
// It should do nothing if the feature doesn't need to configure it. | |||
func (f *apmFeature) ManageClusterAgent(managers feature.PodTemplateManagers) error { | |||
if f.singleStepInstrumentation != nil && f.singleStepInstrumentation.enabled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The outer check should only have if f.singleStepInstrumentation != nil
.
It's possible to have f.singleStepInstrumentation.enabled='false' && f.singleStepInstrumentation.enabledNamespaces != nil
- in this case SSI is disabled in the whole cluster, but enabled in some namespaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if f.singleStepInstrumentation != nil {
// Unsupported configuration
if len(f.singleStepInstrumentation.disabledNamespaces) > 0 &&
len(f.singleStepInstrumentation.enabledNamespaces) > 0 {
return error
}
if f.singleStepInstrumentation.enabled && f.singleStepInstrumentation.enabledNamespaces {
// SSI instrumentation is enabled on the cluster level
log.warn("SSI will be enabled in the whole cluster")
setEnv(DDAPMInstrumentationEnabled, true)
} else if !f.singleStepInstrumentation.enabled && f.singleStepInstrumentation.disabledNamespaces {
// SSI instrumentation is disabled on the cluster level
log.warn("SSI will be disabled in the whole cluster")
setEnv(DDAPMInstrumentationEnabled, false)
} else if len(f.singleStepInstrumentation.enabledNamespaces) > 0 {
// SSI instrumentation is enabled on the namespace level
setEnv(DDAPMInstrumentationEnabled, false)
setEnv(DDAPMInstrumentationEnabledNamespaces, parsedNs)
} else if len(f.singleStepInstrumentation.disabledNamespaces) > 0
{
// SSI instrumentation is disabled on the namespace level
setEnv(DDAPMInstrumentationEnabled, true)
setEnv(DDAPMInstrumentationDisabledNamespaces, parsedNs)
}
setEnv(DDAPMInstrumentationLibVersions, libVersions)
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm that would be super confusing UX imo. The flag explicitly says enabled: false
but then it's not actually disabled? Why should we do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
singleStepInstrumentation.enabled
is a cluster level configuration, singleStepInstrumentation.enabledNamespace
is a namespace level configuration. SSI can be disabled in the cluster, but enabled on a specific namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not do that please. This would be unique behavior, no other product does that. It's also confusing, as a user you'd think you're disabling the feature but it's not actually the case. Global enable/disable should be global, namespace-level config should be namespace level, etc. and global should override everything, that's how every other config of the agent works.
c661027
to
810d6c8
Compare
if f.singleStepInstrumentation != nil { | ||
if len(f.singleStepInstrumentation.disabledNamespaces) > 0 && len(f.singleStepInstrumentation.enabledNamespaces) > 0 { | ||
// This configuration is not supported | ||
return fmt.Errorf("`enabledNamespaces` and `disabledNamespaces` cannot be set together") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Behavior I saw when testing DataDog/helm-charts#1252 was not to error and instead use the list based on the flag selection - i.e. if enabled use disabledNamespaces
, if disabled use enabledNamespaces
. Returning error will prevent reconciling and any agent installation.
Could you confirm this is what we want, @liliyadd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was actually a bug in Helm Chart that didn't fail the helm chart installation when both enabledNamespaces and disabledNamespaces are set. With DataDog/helm-charts#1261 it won't be possible.
I ran the following tests in the staging cluster:
Cluster agent env variables:
Test app behavior:
Cluster agent env variables:
Test app behavior:
Cluster agent env variables:
Test app behavior:
4.features.apm.instrumentation.disabledNamespaces is set
Cluster agent env variables:
Test app behavior:
5.features.apm.instrumentation.disabledNamespaces && features.apm.instrumentation.enabledNamespaces are set
Result:
Cluster agent env variables:
Test app behavior:
7.features.apm.instrumentation.enabled is enabled && features.apm.instrumentation.disabledNamespaces is set
Cluster agent env variables:
Test app behavior:
8.features.apm.instrumentation.enabled is enabled && features.apm.instrumentation.enabledNamespaces is set
Cluster agent env variables:
Test app behavior:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in the PR look good to me. There are some discrepancies between Operator and current Helm chart behavior around disabling SSI in a Datadog namespace by default that @CharlyF and I discussed - Operator won't support disabling SSI in DD namespace and this behavior needs to be documented for the end user. The logic of disabling SSI in DD namespace will be moved to Cluster Agent in 7.51 Agent release allowing Helm Chart cleanup and making Helm and Operator consistent.
What does this PR do?
Define a new struct in the APM feature to allow configuration of the Single Instrumentation Step. Sister PR of DataDog/helm-charts#1211.
Added a couple of tests, the CRD change and the main code in distinct commits.
Motivation
To be confirmed if the following is the expected format.
but the DCA's config seems good:
Additional Notes
Anything else we should know when reviewing?
Minimum Agent Versions
Are there minimum versions of the Datadog Agent and/or Cluster Agent required?
Describe your test plan
should start the Cluster Agent and configure:
DD_APM_INSTRUMENTATION_ENABLED
,DD_APM_INSTRUMENTATION_DISABLED_NAMESPACES
andDD_APM_INSTRUMENTATION_LIB_VERSIONS
.If you use
DD_APM_INSTRUMENTATION_DISABLED_NAMESPACES
andDD_APM_INSTRUMENTATION_ENABLED_NAMESPACES
, it will error out (not supported for now?).Checklist
bug
,enhancement
,refactoring
,documentation
,tooling
, and/ordependencies
qa/skip-qa
label