Add Otel section #53699

Open
wants to merge 1 commit into
base: master
154 changes: 136 additions & 18 deletions docs/operator-guides/collecting-metrics.md
@@ -16,28 +16,146 @@ Airbyte uses Datadog to monitor Airbyte Cloud performance on a [number of metric

![Datadog's Airbyte Integration Dashboard](assets/DatadogAirbyteIntegration_OutOfTheBox_Dashboard.png)

<!--
## Airbyte OpenTelemetry Integration
## OpenTelemetry metrics monitoring (Self-Managed Enterprise only) {#otel}

Setting up this integration for Airbyte instances involves three straightforward steps:
Airbyte Self-Managed Enterprise generates metrics about syncs and the volume of data moved. You can configure Airbyte to send these metrics to an OpenTelemetry collector endpoint and consume them in the monitoring tool of your choice. Airbyte does not send traces or logs.

1. **Deploy an OpenTelemetry Collector**: Follow the official [Kubernetes Getting Started documentation](https://opentelemetry.io/docs/collector/getting-started/#kubernetes) to deploy a collector in your kubernetes cluster.
These metrics give you insight into the health of your deployment in the following areas.

2. **Update the chart values**: Modify your `values.yaml` file in the Airbyte repository to include the `metrics-reporter` container. This submits Airbyte metrics to the OpenTelemetry collector:
- Resource provisioning: Monitor API requests and sync attempts to ensure your deployment has adequate resources

```yaml
global:
metrics:
metricClient: "otel"
otelCollectorEndpoint: "http://otel-collector.opentelemetry.svc:4317"
- Sync performance: Track sync duration and data volume moved to understand performance

metrics:
enabled: true
```
- System health: Monitor sync status and completion rates to ensure system stability

:::note
Update the value of `otelCollectorEndpoint` with your collector URL.
:::
### Configure OpenTelemetry metrics

3. **Re-deploy Airbyte**: With the updated chart values, you're ready to deploy your Airbyte application by upgrading the chart.
-->
1. Deploy an OpenTelemetry collector if you don't already have one. See the [OpenTelemetry documentation](https://opentelemetry.io/docs/collector/getting-started/#kubernetes) for instructions. If you use Datadog as your monitoring tool, Datadog provides a guide to [set up a collector and exporter](https://docs.datadoghq.com/opentelemetry/collector_exporter/).

2. Update your `values.yaml` file to enable OpenTelemetry.

   ```yaml
   global:
     edition: enterprise # This is an enterprise-only feature
     metrics:
       enabled: true
       otlp:
         enabled: true
         collectorEndpoint: "YOUR_ENDPOINT" # The OTel collector endpoint Airbyte sends metrics to. You configure this endpoint outside of Airbyte as part of your OTel deployment.
   ```

3. Redeploy Airbyte with the updated values.

Airbyte sends metrics to the collector you specified in your configuration.
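
To illustrate the collector side of step 1, here is a minimal sketch of an OpenTelemetry Collector configuration that accepts OTLP metrics and prints them for inspection. It assumes Airbyte exports over OTLP/gRPC on port 4317, which this guide does not confirm, and it uses the collector's `debug` exporter only as a placeholder for the exporter of your monitoring tool.

```yaml
# Sketch of a collector config, not an Airbyte-provided file.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317 # Assumption: Airbyte sends OTLP over gRPC to this port

exporters:
  debug:
    verbosity: detailed # Prints received metrics to the collector's own log output

service:
  pipelines:
    metrics: # Airbyte sends metrics only, so no traces or logs pipeline is defined
      receivers: [otlp]
      exporters: [debug]
```

With a collector like this running in your cluster, `collectorEndpoint` in `values.yaml` would point at that collector's address. The exact hostname and port depend on how you deployed the collector.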

### Available metrics

The following metrics are available. They're published every minute.

<table>
<thead>
<tr>
<th>Metric</th>
<th>Tag</th>
<th>Example Value</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="8"><code>airbyte.syncs</code></td>
<td><code>connection_id</code></td>
<td>653a067e-cd0b-4cab-96b5-5e5cb03f159b</td>
</tr>
<tr>
<td><code>workspace_id</code></td>
<td>bed3b473-1518-4461-a37f-730ea3d3a848</td>
</tr>
<tr>
<td><code>job_id</code></td>
<td>23642492</td>
</tr>
<tr>
<td><code>status</code></td>
<td>success, failed</td>
</tr>
<tr>
<td><code>attempt_count</code></td>
<td>3</td>
</tr>
<tr>
<td><code>version</code></td>
<td>1.5.0</td>
</tr>
<tr>
<td><code>source_connector_id</code></td>
<td>82c7fb2d-7de1-4d4e-b12e-510b0d61e374</td>
</tr>
<tr>
<td><code>destination_connector_id</code></td>
<td>3cb42982-755b-4644-9ed4-19651b53ebdd</td>
</tr>
<tr>
<td rowspan="6"><code>airbyte.gb_moved</code></td>
<td><code>connection_id</code></td>
<td>653a067e-cd0b-4cab-96b5-5e5cb03f159b</td>
</tr>
<tr>
<td><code>workspace_id</code></td>
<td>bed3b473-1518-4461-a37f-730ea3d3a848</td>
</tr>
<tr>
<td><code>job_id</code></td>
<td>23642492</td>
</tr>
<tr>
<td><code>source_connector_id</code></td>
<td>82c7fb2d-7de1-4d4e-b12e-510b0d61e374</td>
</tr>
<tr>
<td><code>destination_connector_id</code></td>
<td>3cb42982-755b-4644-9ed4-19651b53ebdd</td>
</tr>
<tr>
<td><code>version</code></td>
<td>1.5.0</td>
</tr>
<tr>
<td rowspan="6"><code>airbyte.sync_duration</code></td>
<td><code>connection_id</code></td>
<td>653a067e-cd0b-4cab-96b5-5e5cb03f159b</td>
</tr>
<tr>
<td><code>workspace_id</code></td>
<td>bed3b473-1518-4461-a37f-730ea3d3a848</td>
</tr>
<tr>
<td><code>job_id</code></td>
<td>23642492</td>
</tr>
<tr>
<td><code>source_connector_id</code></td>
<td>82c7fb2d-7de1-4d4e-b12e-510b0d61e374</td>
</tr>
<tr>
<td><code>destination_connector_id</code></td>
<td>3cb42982-755b-4644-9ed4-19651b53ebdd</td>
</tr>
<tr>
<td><code>version</code></td>
<td>1.5.0</td>
</tr>
<tr>
<td rowspan="3"><code>airbyte.api_requests</code></td>
<td><code>workspace_id</code></td>
<td>bed3b473-1518-4461-a37f-730ea3d3a848</td>
</tr>
<tr>
<td><code>endpoint</code></td>
<td>/v1/connections/sync</td>
</tr>
<tr>
<td><code>status</code></td>
<td>200</td>
</tr>
</tbody>
</table>