diff --git a/docs/src/main/sphinx/admin.md b/docs/src/main/sphinx/admin.md
index cb7879475eed..7f76ccb98da9 100644
--- a/docs/src/main/sphinx/admin.md
+++ b/docs/src/main/sphinx/admin.md
@@ -8,9 +8,12 @@ running, and managing Trino clusters.
admin/web-interface
admin/preview-web-interface
+admin/logging
admin/tuning
admin/jmx
admin/opentelemetry
+admin/openmetrics
+admin/properties
admin/spill
admin/resource-groups
admin/session-property-managers
diff --git a/docs/src/main/sphinx/admin/logging.md b/docs/src/main/sphinx/admin/logging.md
new file mode 100644
index 000000000000..d693935e9e18
--- /dev/null
+++ b/docs/src/main/sphinx/admin/logging.md
@@ -0,0 +1,133 @@
# Logging

Trino includes numerous features to better understand and monitor a running
system, such as [](/admin/opentelemetry) or [](/admin/jmx). Logging and its
configuration are another important aspect of operating and troubleshooting
Trino.

(logging-configuration)=
## Configuration

Trino application logging configuration is optional and located in the
`log.properties` file in the `etc` configuration directory of your Trino
installation, as set by the [launcher](running-trino).

Use the file to add specific loggers and configure their minimum log levels.
Every logger has a name, which is typically the fully qualified name of the
class that uses the logger. Loggers have a hierarchy based on the dots in the
name, like Java packages. The four log levels are `DEBUG`, `INFO`, `WARN`, and
`ERROR`, sorted by decreasing verbosity.

For example, consider the following log levels file:

```properties
io.trino=WARN
io.trino.plugin.iceberg=DEBUG
io.trino.parquet=DEBUG
```

The preceding configuration changes the level for all loggers in the `io.trino`
namespace from the default `INFO` to `WARN` to make logging less verbose. It
also increases logging verbosity to `DEBUG` for the Iceberg connector in the
`io.trino.plugin.iceberg` namespace and for the Parquet file reader and writer
support in the `io.trino.parquet` namespace, for troubleshooting purposes.

Additional loggers can include other package namespaces from libraries and
dependencies embedded within Trino or part of the Java runtime, for example:

* `io.airlift` for the [Airlift](https://github.com/airlift/airlift) application
  framework used by Trino.
* `org.eclipse.jetty` for the [Eclipse Jetty](https://jetty.org/) web server
  used by Trino.
* `org.postgresql` for the [PostgreSQL JDBC driver](https://github.com/pgjdbc)
  used by the PostgreSQL connector.
* `javax.net.ssl` for TLS from the Java runtime.
* `java.io` for I/O operations.

There are numerous additional properties available to customize logging in
[](config-properties), with details documented in [](/admin/properties-logging)
and in the following example sections.

## Log output

By default, logging output is file-based with rotated files in `var/log`:

* `launcher.log` for output from the application startup by the
  [launcher](running-trino). It is only used if the launcher starts Trino in
  the background, and is therefore not used in the Trino container.
* `http-request.log` for HTTP request logs, mostly from the [client
  protocol](/client/client-protocol) and the [Web UI](/admin/web-interface).
* `server.log` for the main application log of Trino, including logging from all
  plugins.

## JSON and TCP channel logging

Trino supports logging to JSON-formatted output files with the configuration
`log.format=json`.
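
For example, a minimal sketch of the relevant properties in
[](config-properties) might look like the following, assuming you also want to
attach annotations from a file named `etc/annotations.properties` as described
next:

```properties
# Emit JSON-formatted log lines instead of the default plain text
log.format=json
# Optional annotations attached to every log line, explained in the following
node.annotation-file=etc/annotations.properties
```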

Optionally, you can set `node.annotation-file` to the path of a properties file
such as the following example:

```properties
host_ip=1.2.3.4
service_name=trino
node_name=${ENV:MY_NODE_NAME}
pod_name=${ENV:MY_POD_NAME}
pod_namespace=${ENV:MY_POD_NAMESPACE}
```

The annotations file supports environment variable substitution, so the
preceding example attaches the node name, pod name, pod namespace, and other
information to every log line. When running Trino on Kubernetes, you can
[expose a wide range of pod information as environment
variables](https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/)
for use in the log.

TCP logging allows you to log to a TCP socket instead of a file with the
configuration `log.path=tcp://<server_ip>:<server_port>`. The endpoint must be
available at the configured `server_ip` and `server_port`, and is assumed to be
stable.

You can use an application such as [fluentbit](https://fluentbit.io/) as a
consumer for these JSON-formatted logs.

Example fluentbit configuration file `config.yaml`:

```yaml
pipeline:
  inputs:
    - name: tcp
      tag: trino
      listen: 0.0.0.0
      port: 5170
      buffer_size: 2048
      format: json
  outputs:
    - name: stdout
      match: '*'
```

Start the application with the command:

```shell
fluent-bit -c config.yaml
```

Use the following Trino properties configuration:

```properties
log.path=tcp://localhost:5170
log.format=json
node.annotation-file=etc/annotations.properties
```

File `etc/annotations.properties`:

```properties
host_ip=1.2.3.4
service_name=trino
pod_name=${ENV:HOSTNAME}
```

As a result, Trino logs appear as structured JSON log lines in the fluentbit
output, and can also be [forwarded into a configured logging
system](https://docs.fluentbit.io/manual/pipeline/outputs).
diff --git a/docs/src/main/sphinx/admin/openmetrics.md b/docs/src/main/sphinx/admin/openmetrics.md
new file mode 100644
index 000000000000..8596e564ba05
--- /dev/null
+++ b/docs/src/main/sphinx/admin/openmetrics.md
@@ -0,0 +1,300 @@
# Trino metrics with OpenMetrics

Trino supports the metrics standard [OpenMetrics](https://openmetrics.io/),
which originated with the open-source systems monitoring and alerting toolkit
[Prometheus](https://prometheus.io/).

Metrics are automatically enabled and available on the coordinator at the
`/metrics` endpoint. The endpoint is protected with the configured
[authentication](security-authentication), identical to the
[](/admin/web-interface) and the [](/client/client-protocol).

For example, you can retrieve metrics data from an unsecured Trino server
running on `localhost:8080` with the arbitrary username `foo`:

```shell
curl -H X-Trino-User:foo localhost:8080/metrics
```

The result follows the [OpenMetrics
specification](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md)
and looks similar to the following example output:

```
# TYPE io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_Min gauge
io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_Min NaN
# TYPE io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_P25 gauge
io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_P25 NaN
# TYPE io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_Total gauge
io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_Total 0.0
# TYPE io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_P90 gauge
io_airlift_http_client_type_HttpClient_name_ForDiscoveryClient_CurrentResponseProcessTime_P90 NaN
```

The same data is available when accessing the endpoint with a browser and
logging in manually.

The user, `foo` in the example, must have read permission to system information
on a secured deployment, and the URL and port must be adjusted accordingly.

Each Trino node, meaning the coordinator and all workers, provides its own
metrics independently.

Use the property `openmetrics.jmx-object-names` in [](config-properties) to
define the JMX object names to include when retrieving metrics. Multiple
object names must be separated with `|`. Object names use the package namespace
of the metric. Append `:*` to expose all metrics in a namespace, use `name` to
select specific classes, or use `type` for specific metric types.

Examples:

* `trino.plugin.exchange.filesystem:name=FileSystemExchangeStats` for metrics
  from the `FileSystemExchangeStats` class in the
  `trino.plugin.exchange.filesystem` package.
* `trino.plugin.exchange.filesystem.s3:name=S3FileSystemExchangeStorageStats`
  for metrics from the `S3FileSystemExchangeStorageStats` class in the
  `trino.plugin.exchange.filesystem.s3` package.
* `io.trino.hdfs:*` for all metrics in the `io.trino.hdfs` package.
* `java.lang:type=Memory` for all memory metrics in the `java.lang` package.
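
For example, a combined configuration in [](config-properties) might look like
the following sketch; the selected object names are illustrative and depend on
which metrics you want to expose:

```properties
openmetrics.jmx-object-names=trino.plugin.exchange.filesystem:name=FileSystemExchangeStats|java.lang:type=Memory
```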

Typically, Prometheus or a similar application is configured to monitor the
endpoint. The same application can then be used to inspect the metrics data.

Trino also includes a [](/connector/prometheus) that allows you to query
Prometheus data using SQL.

## Examples

The following sections provide small examples and tips for typical usage.

Other configurations with tools such as
[grafana-agent](https://grafana.com/docs/agent/latest/) or the [Grafana Alloy
OpenTelemetry agent](https://grafana.com/docs/alloy/latest/) are also possible,
and can use platforms such as [Cortex](https://cortexmetrics.io/) or [Grafana
Mimir](https://grafana.com/oss/mimir/mimir) for metrics storage and related
monitoring and analysis.

### Simple example with Docker and Prometheus

The following steps provide a simple demo setup to run
[Prometheus](https://prometheus.io/) and Trino locally in Docker containers.

Create a shared network for both servers called `platform`:

```shell
docker network create platform
```

Start Trino in the background:

```shell
docker run -d \
  --name=trino \
  --network=platform \
  --network-alias=trino \
  -p 8080:8080 \
  trinodb/trino:latest
```

The preceding command starts Trino and adds it to the `platform` network with
the hostname `trino`.

Create a `prometheus.yml` configuration file with the following content, which
points Prometheus at the `trino` hostname:

```yaml
scrape_configs:
- job_name: trino
  basic_auth:
    username: trino-user
  static_configs:
    - targets:
      - trino:8080
```

Start Prometheus from the same directory as the configuration file:

```shell
docker run -d \
  --name=prometheus \
  --network=platform \
  -p 9090:9090 \
  --mount type=bind,source=$PWD/prometheus.yml,target=/etc/prometheus/prometheus.yml \
  prom/prometheus
```

The preceding command adds Prometheus to the `platform` network. It also mounts
the configuration file into the container so that metrics from Trino are
gathered by Prometheus.

Now everything is running.

Install and run the [Trino CLI](/client/cli) or any other client application and
submit a query such as `SHOW CATALOGS;` or `SELECT * FROM tpch.tiny.nation;`.

Optionally, log into the [Trino Web UI](/admin/web-interface) at
[http://localhost:8080](http://localhost:8080) with any username. Press the
**Finished** button and inspect the details for the completed queries.

Access the Prometheus UI at [http://localhost:9090/](http://localhost:9090/),
select **Status** > **Targets**, and see the configured endpoint for Trino
metrics.

To see an example graph, select **Graph**, add the metric name
`trino_execution_name_QueryManager_RunningQueries` in the input field, and press
**Execute**. Press **Table** for the raw data or **Graph** for a visualization.

As a next step, run more queries and inspect the effect on the metrics.
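
You can also query the standard Prometheus HTTP API directly. For example, the
following request, assuming the port mapping from the preceding steps, returns
the current value of the running queries metric as JSON:

```shell
curl 'http://localhost:9090/api/v1/query?query=trino_execution_name_QueryManager_RunningQueries'
```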

Once you are done, you can stop the containers:

```shell
docker stop prometheus
docker stop trino
```

You can start them again for further testing:

```shell
docker start trino
docker start prometheus
```

Use the following commands to completely remove the network and containers:

```shell
docker rm trino
docker rm prometheus
docker network rm platform
```

### Coordinator and worker metrics with Kubernetes

To get a complete picture of the metrics on your cluster, you must access the
coordinator and the worker metrics. This section details tips for setting up
this scenario with the [Trino Helm chart](https://github.com/trinodb/charts) on
Kubernetes.

Add an annotation to flag all cluster nodes for scraping in your values for the
Trino Helm chart:

```yaml
coordinator:
  annotations:
    prometheus.io/trino_scrape: "true"
worker:
  annotations:
    prometheus.io/trino_scrape: "true"
```

Configure metrics retrieval from the workers in your Prometheus configuration:

```yaml
  - job_name: trino-metrics-worker
    scrape_interval: 10s
    scrape_timeout: 10s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_trino_scrape]
        action: keep # scrape only pods with the trino scrape annotation
        regex: true
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: keep # do not try to scrape non-Trino containers
        regex: trino-worker
      - action: hashmod
        modulus: $(SHARDS)
        source_labels:
          - __address__
        target_label: __tmp_hash
      - action: keep
        regex: $(SHARD)
        source_labels:
          - __tmp_hash
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: replace
        target_label: container
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: ".+_FifteenMinute.+|.+_FiveMinute.+|.+IterativeOptimizer.+|.*io_airlift_http_client_type_HttpClient.+"
        action: drop # drop some highly granular metrics
      - source_labels: [__meta_kubernetes_pod_name]
        regex: ".+"
        target_label: pod
        action: replace
      - source_labels: [__meta_kubernetes_pod_container_name]
        regex: ".+"
        target_label: container
        action: replace

    scheme: http
    tls_config:
      insecure_skip_verify: true
    basic_auth:
      username: myuser # replace with a user with system information permission
      # DO NOT ADD PASSWORD
```

The worker configuration uses a user with access to system information, does
not set a password, and accesses the workers over HTTP.
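
As a sketch, granting such a user read access to system information with
[file-based access control](/security/file-system-access-control) could look
like the following `rules.json` fragment; the user name `myuser` is a
placeholder matching the preceding configuration:

```json
{
  "system_information": [
    {
      "user": "myuser",
      "allow": ["read"]
    }
  ]
}
```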

Configure metrics retrieval from the coordinator in your Prometheus
configuration:

```yaml
  - job_name: trino-metrics-coordinator
    scrape_interval: 10s
    scrape_timeout: 10s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_trino_scrape]
        action: keep # scrape only pods with the trino scrape annotation
        regex: true
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: keep # do not try to scrape non-Trino containers
        regex: trino-coordinator
      - action: hashmod
        modulus: $(SHARDS)
        source_labels:
          - __address__
        target_label: __tmp_hash
      - action: keep
        regex: $(SHARD)
        source_labels:
          - __tmp_hash
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: replace
        target_label: container
      - action: replace # override the address with the HTTPS ingress address
        target_label: __address__
        replacement: {{ .Values.trinourl }}
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: ".+_FifteenMinute.+|.+_FiveMinute.+|.+IterativeOptimizer.+|.*io_airlift_http_client_type_HttpClient.+"
        action: drop # drop some highly granular metrics
      - source_labels: [__meta_kubernetes_pod_name]
        regex: ".+"
        target_label: pod
        action: replace
      - source_labels: [__meta_kubernetes_pod_container_name]
        regex: ".+"
        target_label: container
        action: replace

    scheme: https
    tls_config:
      insecure_skip_verify: true
    basic_auth:
      username: myuser # replace with a user with system information permission
      password_file: /some/password/file
```

The coordinator configuration uses a user with access to system information,
requires password authentication, and accesses the coordinator over HTTPS.
diff --git a/docs/src/main/sphinx/connector/kafka.md b/docs/src/main/sphinx/connector/kafka.md
index a417cf6a85b7..528e57b6b7ec 100644
--- a/docs/src/main/sphinx/connector/kafka.md
+++ b/docs/src/main/sphinx/connector/kafka.md
@@ -79,7 +79,7 @@ creates a catalog named `sales` using the configured connector.
### Log levels

Kafka consumer logging can be verbose and pollute Trino logs. To lower the
-{ref}`log level <log-levels>`, simply add the following to `etc/log.properties`:
+[log level](logging-configuration), simply add the following to `etc/log.properties`:

```text
org.apache.kafka=WARN
diff --git a/docs/src/main/sphinx/installation/deployment.md b/docs/src/main/sphinx/installation/deployment.md
index b1d100428ef3..b1d2cb9d85ae 100644
--- a/docs/src/main/sphinx/installation/deployment.md
+++ b/docs/src/main/sphinx/installation/deployment.md
@@ -241,24 +241,9 @@ properties for topics such as {doc}`/admin/properties-general`,
{doc}`/admin/properties-query-management`,
{doc}`/admin/properties-web-interface`, and others.

-(log-levels)=
-### Log levels
-
-The optional log levels file, `etc/log.properties`, allows setting the
-minimum log level for named logger hierarchies. Every logger has a name,
-which is typically the fully qualified name of the class that uses the logger.
-Loggers have a hierarchy based on the dots in the name, like Java packages.
-For example, consider the following log levels file:
-
-```text
-io.trino=INFO
-```
-
-This would set the minimum level to `INFO` for both
-`io.trino.server` and `io.trino.plugin.hive`.
-The default minimum level is `INFO`,
-thus the above example does not actually change anything.
-There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
+Further configuration can include [](/admin/logging), [](/admin/opentelemetry),
+[](/admin/jmx), [](/admin/openmetrics), and other functionality described in the
+[](/admin) section.

(catalog-properties)=
### Catalog properties
diff --git a/docs/src/main/sphinx/security/ldap.md b/docs/src/main/sphinx/security/ldap.md
index 1ade3a221826..8acb524629a7 100644
--- a/docs/src/main/sphinx/security/ldap.md
+++ b/docs/src/main/sphinx/security/ldap.md
@@ -284,7 +284,7 @@ Verify the password for a keystore file and view its contents using
### Debug Trino to LDAP server issues

If you need to debug issues with Trino communicating with the LDAP server,
-you can change the {ref}`log level <log-levels>` for the LDAP authenticator:
+you can change the [log level](logging-configuration) for the LDAP authenticator:

```none
io.trino.plugin.password=DEBUG
```
diff --git a/docs/src/main/sphinx/security/oauth2.md b/docs/src/main/sphinx/security/oauth2.md
index ca9feae7c1f0..932017d49de4 100644
--- a/docs/src/main/sphinx/security/oauth2.md
+++ b/docs/src/main/sphinx/security/oauth2.md
@@ -243,7 +243,7 @@ The following configuration properties are available:
(trino-oauth2-troubleshooting)=
## Troubleshooting

-To debug issues, change the {ref}`log level <log-levels>` for the OAuth 2.0
+To debug issues, change the [log level](logging-configuration) for the OAuth 2.0
authenticator:

```none