Add prometheus_textfile source #22547

Open
UiP9AV6Y opened this issue Mar 2, 2025 · 1 comment
Labels
type: feature A value-adding code addition that introduce new functionality.

Comments

UiP9AV6Y commented Mar 2, 2025

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

We already use Vector as a metrics relay that aggregates multiple metric sources into a single endpoint via the prometheus_exporter sink. However, we cannot integrate file-based metrics into this workflow without adding another component. Currently we deploy a Prometheus node_exporter with only the textfile collector enabled alongside Vector and instruct Vector to scrape the metrics from there. Ideally, we would like Vector to read the metrics from the files itself.

Vector already offers file, exec, and stdin sources, but those are only for managing logs.

Attempted Solutions

Deploying node_exporter with only the textfile collector enabled alongside Vector. The data flow is: metrics generator (e.g. a cron job running a script) -> file.prom -> node_exporter -> Vector -> ...
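
To make the first hop of that flow concrete, here is a minimal Python sketch of a generator script writing file.prom. It is an illustration only: the function name `write_prom_file` and the metric in the usage note are hypothetical, not taken from an actual deployment. It writes to a temporary file and renames it into place, so a reader that only picks up `*.prom` files never sees a half-written file.

```python
import os
import tempfile


def write_prom_file(path: str, lines: list[str]) -> None:
    """Write metric lines to `path` atomically (temp file + rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(lines) + "\n")
        # mkstemp creates the file with mode 0600; relax it so a
        # collector running as another user can read the result.
        os.chmod(tmp_path, 0o644)
        # os.replace swaps the file in one step on POSIX systems.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A cron job could call it as `write_prom_file("/path/to/textfile-dir/backup.prom", ["# TYPE backup_last_success_timestamp_seconds gauge", "backup_last_success_timestamp_seconds 1740900000"])`; the directory and metric name are placeholders.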

Proposal

The prometheus_scrape source already does most of the work of parsing and validating, but it only supports TCP-based inputs. I reckon most of that code could be shared with a potential prometheus_textfile source implementation.

References

No response

Version

vector 0.45.0 (x86_64-unknown-linux-gnu 063cabb 2025-02-24 14:52:02.810034614)

UiP9AV6Y added the type: feature label Mar 2, 2025
Contributor

jorgehermo9 commented Mar 4, 2025

Hmmm, I think we could benefit a lot from a prometheus codec (like the influxdb one) for all sources. You are reading Prometheus text from a file, but it is also useful, for example, to read it from a kafka source (to decouple and scale horizontally). This is something I've been thinking about contributing for a while.

Either a prometheus codec (the config under decoding.codec=...) or a prometheus VRL function (the same as we did with influxdb, where we have both) could work with this approach.

The prometheus codec would be easy to do as the parser is already done and used in [other prometheus-related components](https://github.com/search?q=repo%3Avectordotdev%2Fvector+prometheus%3A%3Aparser&type=code).

The VRL function would be a bit more difficult, as we would have to either decouple the internal prometheus-parser lib from the Vector repo (and move it elsewhere) or use a different parser for VRL, such as https://crates.io/crates/prometheus-parse. But I'm not sure the latter would be consistent with Vector's behavior unless we also change the internal prometheus-parser.

With that codec, you would read the whole file content (I hope this is possible with the current file source...) and then use decoding.codec=prometheus to generate metrics from it.
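
To illustrate what such a decoding step does, independent of whether the text came from a file or from Kafka, here is a rough Python sketch. It is not Vector's parser and makes no claim to match its behavior: `Sample` and `decode_prometheus_text` are hypothetical names, escape sequences in label values are left as-is, and histogram/summary series (`_bucket`, `_sum`, `_count`) are not mapped back to their `# TYPE` line.

```python
import re
from dataclasses import dataclass, field

# metric name, optional {labels}, value, optional millisecond timestamp
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+'
    r'(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


@dataclass
class Sample:
    name: str
    value: float
    kind: str = "untyped"
    labels: dict = field(default_factory=dict)


def decode_prometheus_text(payload: str) -> list:
    """Turn a whole text-format payload into a list of samples."""
    types = {}
    samples = []
    for raw in payload.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Remember "# TYPE <name> <kind>"; skip HELP and plain comments.
            parts = line.split(None, 3)
            if len(parts) == 4 and parts[1] == "TYPE":
                types[parts[2]] = parts[3]
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            raise ValueError(f"malformed sample line: {line!r}")
        labels = dict(LABEL_RE.findall(match["labels"] or ""))
        samples.append(
            Sample(
                name=match["name"],
                value=float(match["value"]),  # also accepts NaN, +Inf, -Inf
                kind=types.get(match["name"], "untyped"),
                labels=labels,
            )
        )
    return samples
```

Because the function takes the full payload rather than one line at a time, `# TYPE` lines can be matched to the samples that follow them, which is why the whole file content needs to reach the decoder in one piece.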
