
log_to_metric transform should allow deriving the metric's tags from a log field (without explicitly listing every possible key) #14744

Open
breathe opened this issue Oct 5, 2022 · 1 comment
Labels
transform: log_to_metric Anything `log_to_metric` transform related type: feature A value-adding code addition that introduce new functionality.

Comments

breathe commented Oct 5, 2022

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

As far as I can tell, the log_to_metric transform has no way to derive an arbitrary set of tags from a log event.

A configuration like the following fails, because the tags configuration property must be a map rather than a template string:

    metrics:
      - type: "histogram"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
        tags: "{{TAGS}}"

Attempted Solutions

  • I tried the tags property both with and without template expansion.
  • I looked into whether I could use remap to add the tags back onto the metric after it had been converted from a log into a metric, but I'm not clear on whether it's possible to pass information through the pipeline in a way that allows this.

I've ended up working around this for now with a lua transform. At the moment I only care about gauge metrics, so I've only implemented the transform for the gauge type (I suspect some of the other metric types need more sophisticated logic).

It would be really nice if I could instead just point the log_to_metric transform at a field containing all the tags I want defined on the output metric. To guard against high-cardinality metrics, I'd use the tag_cardinality_limit transform.
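
The cardinality guard mentioned above could look roughly like the following. This is a minimal sketch, assuming the transform sits directly downstream of the gauge conversion; the component name `limit_tag_cardinality` and the limit of 500 are illustrative choices, not values taken from the original setup.

```yaml
transforms:
  limit_tag_cardinality:          # hypothetical component name
    type: tag_cardinality_limit
    inputs:
      - remap_gauge_log_to_metric
    mode: exact                   # track every observed tag value exactly
    value_limit: 500              # max distinct values per tag key (illustrative)
    limit_exceeded_action: drop_tag   # drop the offending tag, keep the metric
```

With `drop_tag`, a metric that exceeds the limit is still forwarded, only without the tag that overflowed; `drop_event` would discard the whole metric instead.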

I'm filing this feature request to document a use case where the lua transform seems to be required (as suggested in the lua transform docs). My full configuration, including the lua workaround, follows:

transforms:
  parsing:
    type: "remap"
    inputs:
      - snowflake_s3
    source: |-
      . = parse_json!(string!(.message))

  route_logs_by_type:
    type: route
    inputs:
      - parsing
    route:
      counter: .TYPE == "counter"
      histogram: .TYPE == "histogram"
      gauge: .TYPE == "gauge"
      set: .TYPE == "set"
      summary: .TYPE == "summary"
      log: .TYPE == "log"

  remap_counter_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.counter
    metrics:
      - type: "counter"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"

  remap_histogram_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.histogram
    metrics:
      - type: "histogram"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"

  # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","tags":{"env":"dev","schema":"STAGING","service":"snowflake","table":"SOME_LOG"},"timestamp":"2022-10-05T21:09:16Z","kind":"absolute","gauge":{"value":0.0}}
  remap_gauge_log_to_metric:
    type: lua
    version: "2"
    inputs:
      - route_logs_by_type.gauge
    hooks:
      process: |-
        function (event, emit)
          event.metric = {
            name = event.log.NAME,
            namespace = event.log.NAMESPACE,
            kind = "absolute",
            timestamp = os.date("!*t"),
            tags = event.log.TAGS,
            gauge = {
              value = event.log.FIELD
            }
          }
          event.log = nil
          emit(event)
        end

  # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","timestamp":"2022-10-05T21:04:04.776646Z","kind":"absolute","gauge":{"value":0.0}}
  # remap_gauge_log_to_metric:
  #   type: log_to_metric
  #   inputs:
  #     - route_logs_by_type.gauge
  #   metrics:
  #     - type: "gauge"
  #       field: "FIELD"
  #       name: "{{NAME}}"
  #       namespace: "{{NAMESPACE}}"

  remap_set_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.set
    metrics:
      - type: "set"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"

  remap_summary_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.summary
    metrics:
      - type: "summary"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"

  remap_log_field_to_message:
    type: remap
    inputs:
      - route_logs_by_type.log
    source: |-
      .message = .FIELD
      del(.FIELD)
      .ddsource = .NAMESPACE
      del(.NAMESPACE)
      .service = .TAGS.service
      .ddtags = .TAGS
      del(.TAGS)
      del(.ddtags.service)
      del(.TYPE)
      .timestamp = now()

sinks:
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - remap_counter_log_to_metric
      - remap_histogram_log_to_metric
      - remap_gauge_log_to_metric
      - remap_set_log_to_metric
      - remap_summary_log_to_metric
    default_api_key: "${DD_API_KEY:?err}"

  datadog_logs:
    type: datadog_logs
    inputs:
      - remap_log_field_to_message
      - route_logs_by_type._unmatched
    default_api_key: "${DD_API_KEY:?err}"

  console_output:
    type: console
    inputs:
      - remap*
    encoding:
      codec: "text"

tests:
  - name: "parsing -> parsing"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}

    outputs:
      - extract_from: parsing
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.TYPE))

  - name: "parsing -> route_logs_by_type.gauge"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}

    outputs:
      - extract_from: route_logs_by_type.gauge
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.NAMESPACE))

  - name: "parsing -> remap_gauge_log_to_metric"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":0,"NAME":"storage.table.retained_bytes.avg","NAMESPACE":"snowflake","TAGS":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"TYPE":"gauge"}

    outputs:
      # {"name":"storage.table.retained_bytes.avg","namespace":"snowflake","tags":{"env":"dev","schema":"STAGING","service":"snowflake","table":"STORAGE_LOG"},"timestamp":"2022-10-05T21:09:16Z","kind":"absolute","gauge":{"value":0.0}}
      - extract_from: remap_gauge_log_to_metric
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.name))
              assert!(exists(.namespace))
              assert!(exists(.tags))
              assert!(exists(.timestamp))
              assert!(exists(.kind))

  - name: "parsing -> remap_log_field_to_message"
    inputs:
      - type: raw
        insert_at: parsing
        value: |-
          {"FIELD":"test logging","NAMESPACE":"snowflake","TAGS":{"env":"dev","service":"snowflake"},"TYPE":"log"}

    outputs:
      # {"ddsource":"snowflake","ddtags":{"env":"dev"},"message":"test logging","service":"snowflake","timestamp":"2022-10-05T21:48:30.823011Z"}
      - extract_from: remap_log_field_to_message
        conditions:
          - type: vrl
            source: |-
              assert!(exists(.ddsource))
              assert!(exists(.ddtags))
              assert!(exists(.message))
              assert!(exists(.service))
              assert!(exists(.timestamp))
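
The gauge workaround above could plausibly be extended to counters in the same way. The following is a minimal sketch rather than a tested configuration. It assumes that each log line carries a delta (hence `kind = "incremental"`), that `FIELD` holds the amount to add, and that tag values should be coerced to strings before being attached to the metric. It would replace the existing `remap_counter_log_to_metric` component so that the sink wiring stays unchanged.

```yaml
transforms:
  remap_counter_log_to_metric:
    type: lua
    version: "2"
    inputs:
      - route_logs_by_type.counter
    hooks:
      process: |-
        function (event, emit)
          -- Assumption: metric tag values should be strings, so coerce defensively.
          local tags = {}
          for key, value in pairs(event.log.TAGS or {}) do
            tags[key] = tostring(value)
          end
          event.metric = {
            name = event.log.NAME,
            namespace = event.log.NAMESPACE,
            -- Assumption: each log line is a delta, not a running total.
            kind = "incremental",
            timestamp = os.date("!*t"),
            tags = tags,
            counter = {
              value = event.log.FIELD
            }
          }
          event.log = nil
          emit(event)
        end
```

Histogram, set, and summary metrics carry more structure than a single value, so they would likely need more than this one-field mapping.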

Proposal

Expand log_to_metric with a new parameter that derives all of the metric's tags from a particular field in the log event. That field would be required to be a map from string to string.
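
To make the proposal concrete, the configuration could look something like the sketch below. The option name `tags_from_field` is purely hypothetical and used only for illustration; it is not an existing log_to_metric option in 0.24.1, and the actual name and shape would be up to the maintainers.

```yaml
transforms:
  remap_gauge_log_to_metric:
    type: log_to_metric
    inputs:
      - route_logs_by_type.gauge
    metrics:
      - type: "gauge"
        field: "FIELD"
        name: "{{NAME}}"
        namespace: "{{NAMESPACE}}"
        # HYPOTHETICAL option, not part of log_to_metric in 0.24.1:
        # copy every key/value pair of the TAGS map onto the metric as tags.
        tags_from_field: "TAGS"
```

For the sample gauge event used in the tests above, this would be expected to produce the same tags (env, schema, service, table) that the lua workaround emits.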

References

Version

vector 0.24.1 (x86_64-apple-darwin 8935681 2022-09-12)

@breathe breathe added the type: feature A value-adding code addition that introduce new functionality. label Oct 5, 2022
@spencergilbert spencergilbert added the transform: log_to_metric Anything `log_to_metric` transform related label Oct 6, 2022
@titaneric

I want exactly the same feature! I hope I can dig into implementing it.


3 participants