Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make it possible to configure named resource detectors #119

Open
codeboten opened this issue Sep 9, 2024 · 6 comments
Open

make it possible to configure named resource detectors #119

codeboten opened this issue Sep 9, 2024 · 6 comments

Comments

@codeboten
Copy link
Contributor

It would be great to have the capability to configure resource detectors via the configuration file.

For example, the Collector needs to be able to specify that the environment variable resource detector should be used to address open-telemetry/opentelemetry-collector#10909

@jack-berg
Copy link
Member

In opentelemetry-java we have the ability to enable / disable resource detectors based on name using langauge-specific environment variables: otel.java.[enabled|disabled].resource.providers

The name is the FQCN.

@jack-berg
Copy link
Member

Dotnet auto instrumentation also has resource detector ids: https://github.com/open-telemetry/opentelemetry-dotnet-instrumentation/blob/main/docs/config.md#resource-detectors

According to @matt-hensley, PHP and JS also have the same concept in auto instrumentation projects.

This all suggests that resource detector names are a "defacto" feature in the spec, despite them not be explicitly written down.

I think we should: 1. Extend the spec to give resource detectors a name / id field. 2. Extend the configuration data model.

@codeboten do you think this is required for stability?

@matt-hensley
Copy link

Additional links:
JavaScript uses OTEL_NODE_RESOURCE_DETECTORS
PHP uses OTEL_PHP_DETECTORS

Also, being able to specify resource detectors via configuration is something end users have requested.

@marcalff
Copy link
Member

Instead of providing a name for a resource detector, should a resource detector be an extension point instead ?

This would be aligned with other parts of the configuration provided as components (custom exporter, etc).

@jack-berg
Copy link
Member

Right now, a typical resource configuration might look like:

resource:
  attributes:
    - name: service.name
       value: my-service
    - name: service.namespace
       value: acme
  attributes_list: ${OTEL_RESOURCE_ATTRIBUTES}
  detectors:
      included:
        - process.*
      excluded:
        - process.command_args

Capabilities:

  • Statically define attributes
  • Incorporate resource attributes from OTEL_RESOURCE_ATTRIBUTES standard env var
  • Pattern matching allow / deny list for attributes that come from resource detectors

A couple of things are missing:

  • Config conflicts with declarative config philosophy of "what you see is what you get". Depends on resource detectors automatically contributing attributes whether or not the user has selected them.
  • No way to enable or disable a resource detector. You might say that we can just have a policy where all resource detectors are enabled by default, and a user can selectively turn off attributes using .resource.detectors.excluded, but some resource detectors make network calls and should not be enabled by default.
  • No way to configure a resource detector. Sometimes resource detectors need configuration properties, like this openshift detector which has configurable address / token / tls properties that it uses to resolve resource attributes from a network location.

These types of requirements suggest that a resource detector should be a proper extension point as @marcalff suggests. But I think we probably need a bit more.

Here's what an ideal configuration might look like:

resource:
  detectors:
    - host:
    - container:
    - process:
    - openshift:
        address: https://api.example.com
        token: "token"

Notes:

  • Detectors are "what you see is what you get". Your resource won't have any attributes unless you explicitly include it in the list. This aligns with the rest of declarative config, but means that implementations need to have a well documented set of detector names for users to reference.
  • Detectors is an array, allowing the user to influence the priority of attributes if multiple resource detectors contribute the same attribute key.
  • Most detectors won't have any configuration options (i.e. host:) but some will (i.e. openshift: ...). Not being able to pass configuration to a detector would be an oversight.

What to do about .resource.detectors.included and .resource.detectors.excluded? Its still good to have a standard mechanism to specify an allow / deny list of attributes from the detector. But none of the options seem appealing:

  1. If we put included / excluded as properties in the .resource.detectors, then we could be exposed to conflicts if the names of these properties ever expand and collide with user-provider resource detector names:
resource:
  detectors:
    - process:
      excluded:
        - process.command_line
    - openshift:
        address: https://api.example.com
        token: "token"
  1. This suggests that we could nest the user-provided detector names inside a property, but this results in a lot of nesting and poor ergonomics:
resource:
  detectors:
    - detector:
        process:
      excluded:
        - process.command_line
    - detector:
        openshift:
          address: https://api.example.com
          token: "token"
  1. Another option is to say that included / excluded are not configurable at the invidual detector level. Instead, they are extracted to some common top level properties and apply universally. This isn't actually that bad. At first I thought that this would be insufficient because you may want to control included / excluded on an individual detector basis. But the only reason you'd want to do this is to resolve conflicts, and we already have a conflict resolution mechanism by specifying detectors as an array.
resource:
  detector_common:
    excluded:
      - process.command_line
  detectors:
    - process:
    - openshift:
        address: https://api.example.com
        token: "token"

What do folks think? Any other ideas?

@jack-berg
Copy link
Member

I brought this up in the 2/13/25 java SIG. Was interested in hearing what the java SIG had to say about an opt-in approach to resource detection as I proposed in option 3 above. Talked about a variety of an alternatives and their shortcomings:

Object with entries for each detector, each with enabled property for each, optional additional detector-specific properties

  • No ability to specify prioritization between detectors
  • Mixes standard enabled property with detector-specific properties (no precedent for this in declarative config)
  • Verbose
  • Ambiguous how enabled interacts with language / distribution defaults
resource:
  detectors:
    foo:
      enabled: true
    bar:
      enabled: true
      host:
      token:
    baz:
      enabled: true

Array of detector objects, each enabled property and required detector object to specify / configure the specific detector

  • Solves prioritization problem
  • Verbose
  • Ambiguous how enabled interacts with language / distribution defaults
resource:
  detectors:
    - detector:
         foo:
       enabled: true
    - detector:
         bar:
           host:
           token:
      enabled: true

Enabled / disable detector object arrays

  • Solves prioritization problem
  • Verbose
  • Ambiguous how enabled / disabled interacts with language / distribution defaults
resource:
  detectors:
    enabled:
      - foo:
          host:
          token:
      - bar:
    disabled:
      - baz:

The common themes are verbosity and ambiguous interaction with language / distribution defaults, which are solved with option 3 from this comment.

Of course there is the concern that having users explicitly opt-in to resource detectors is cumbersome, but:

  • Explicitly stating the desired configuration is aligned with existing patterns in declarative config
  • We have examples to help guide users to do the right thing
  • The collector's resourcedetection processor is opt-in and appears to be well received by users

@trask brought up a good point that the downside of an opt-in approach is diminished with good tooling for distributions to customize declarative config. For example, the otel java agent could by default customize the resource detectors by adding a default set if the user doesn't set the field.


I think wherever we land on this, I'm inclined to keep resource detection as experimental property for some time to work out the kinks, using the mechanisms for experimental properties laid out in #142. The lack of specification around resource detectors means there is more debate to be had, and more prototyping needed to feel comfortable with the design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

4 participants