Request for adding data generation for system package #6168
Comments
@andresrc @aspacca @bturquet @tommyers-elastic As soon as we have the implementation of the corpus spec done in elastic-package, we should move over the existing specs and then tackle the system integration, or at least a subset of its datasets, as I expect these to be among the most requested datasets for testing. @maryam-saeidi Can you share a bit more detail on exactly which metrics you are interested in? From the links you shared, it seems to be cpu and network. Any others? |
@ruflin At the moment, I would like to test adding a condition for CPU usage and getting alerts related to the hosts that met that threshold. Now my question is: How do ECS host fields relate to system integration? Can I expect all those fields to be added? Also, regarding what system fields we can alert based on, I checked release-oblt ( But for my test, CPU, Memory, and Network fields are enough to start (plus the ECS host fields if it is applicable) |
Yes |
this is true for schema-c:
I indeed have to investigate whether the tool supports ECS fields coming from https://github.com/elastic/integrations/blob/main/packages/system/data_stream/cpu/fields/ecs.yml, or whether they are in the output because they are also defined in https://github.com/elastic/integrations/blob/main/packages/system/data_stream/cpu/fields/agent.yml if you want to generate schema-c data (i.e. post-ingest pipeline, which does mean you should disable the ingest pipeline when ingesting in please beware, as discussed: unless you are able to tweak the generated data through the fields generation configuration so that it triggers the rule you want to test, you cannot be sure that the generated data will contain events that trigger that rule. For that, https://github.com/elastic/geneve is a better tool, but it is more limited regarding the generation of all the fields of the document. I think there is some way to generate the fields affecting the rule, as well as the ECS ones, through geneve; @cavokz might be more helpful here. |
Thanks @aspacca. Ccing @charlie-pichette. @maryam-saeidi, Geneve is not very good at generating realistic data, neither in the fields of the generated documents nor in the content of those fields. What Geneve is good at is adding the fields mentioned in a query and putting content in them that would satisfy said query, and therefore a rule. If, for example, you have this query (not sure if I got the units right here):
You would get something similar to
You see that aside for If this is something that interests you, we need to find the way to integrate Geneve with tools that generate better "background" data on top of which Geneve can adjust/add the fields as needed. |
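To make the "adjust/add the fields on top of background data" idea concrete, here is a minimal sketch. It is not Geneve's API: the `Condition` type, the `withConditionSatisfied` helper, and the `system.cpu.total.norm.pct` field used in the usage note are all hypothetical names chosen for illustration. It takes a background document produced by some other generator and overwrites a single field so that a simple threshold condition holds.

```typescript
// Hypothetical types and helpers for illustration only; not part of Geneve.
type Comparator = ">" | ">=" | "<" | "<=";

interface Condition {
  field: string;        // dotted field path, e.g. "system.cpu.total.norm.pct"
  comparator: Comparator;
  threshold: number;
}

type Doc = Record<string, unknown>;

// Pick a value that satisfies the condition, offset from the threshold by `delta`.
function valueSatisfying(c: Condition, delta = 0.05): number {
  switch (c.comparator) {
    case ">":
      return c.threshold + delta;
    case ">=":
      return c.threshold;
    case "<":
      return c.threshold - delta;
    case "<=":
      return c.threshold;
  }
}

// Return a copy of the background document with the condition's field set
// to a satisfying value. Intermediate objects are created when missing.
function withConditionSatisfied(background: Doc, c: Condition): Doc {
  const doc = JSON.parse(JSON.stringify(background)) as Doc; // deep copy
  const keys = c.field.split(".");
  const last = keys.pop() as string;
  let node: Record<string, unknown> = doc;
  for (const key of keys) {
    if (typeof node[key] !== "object" || node[key] === null) {
      node[key] = {};
    }
    node = node[key] as Record<string, unknown>;
  }
  node[last] = valueSatisfying(c);
  return doc;
}
```

Calling `withConditionSatisfied({ host: { name: "host-0" } }, { field: "system.cpu.total.norm.pct", comparator: ">", threshold: 0.8 })` would keep `host.name` and add a CPU value above 0.8. A real integration would also need to handle conditions that span multiple events, which this sketch does not attempt.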
Elastic has quite a few data generation tools out there. Since in many observability cases the data we are interested in comes from packages, for system metrics I would rather focus on the data generated by elastic-package and extend it for these use cases than extend geneve. |
Indeed, I was thinking of integrating Geneve with other tools rather than extending it. For instance, we already evaluated the idea of adding support for package-integrations in Geneve (elastic/geneve#113) and concluded that it's not a good idea. |
@maryam-saeidi https://github.com/elastic/logen may also be of value. |
@charlie-pichette I get 404 when I try to access the repo |
Perhaps @tammytorbert can provide access to Logen. |
We will for sure create the assets for the system metrics in elastic-package. Still, for @maryam-saeidi's use case it might not be the right solution, because it cannot create data that triggers a rule.
As @ruflin mentioned, in the context of observability "the data we are interested in comes from packages", and that's what the corpus generator tool handles very well, but it lacks a way to drive data according to a query/rule across multiple events. We talked a while ago about having the two tools somehow be able to "speak to each other", and I see @maryam-saeidi's scenario as a good one to start building upon: what do you think? |
## Summary

Closes #157189

This PR adds a metric threshold integration test. This is the first step in adding more test coverage for observability rules.

**Steps during the test**

1. Generating fake host data by using a similar implementation as https://github.com/elastic/high-cardinality-cluster
   - Data is generated for the last 15 mins
   - Implementation was simplified only to cover fake hosts and was converted to typescript
2. Creating an action using an index connector
3. Creating a metric threshold rule containing step number 2 action
4. Checking the status of the rule to be active
5. Checking the triggered action to have the correct parameters
6. Checking the generated alert to have the correct information
7. Clean up

**How to run locally**

- Run server
  ```
  node scripts/functional_tests_server --config x-pack/test/api_integration/apis/metrics_ui/config.ts
  ```
- Then run the test
  ```
  node scripts/functional_tests__runner --include-pack/test/api_integration/apis/metrics_ui/cometric_threshold_rule.ts --config x-pack/test/api_integration/apis/metrics_ui/config.ts
  ```

**Reference**

I created elastic/integrations#6168 to find a better way to generate data and make sure that data matches what metricbeats generates

---------

Co-authored-by: kibanamachine <[email protected]>
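Step 1 of the PR description (fake host data for the last 15 minutes) could look roughly like the sketch below. This is not the code from the PR: the `FakeHostDoc` shape, the `generateFakeHostDocs` function, the `host-N` naming, the one-minute interval, and the fixed CPU value are all assumptions made for illustration.

```typescript
// Hypothetical document shape; the field names are assumptions, not taken from the PR.
interface FakeHostDoc {
  "@timestamp": string;
  host: { name: string };
  system: { cpu: { total: { norm: { pct: number } } } };
}

// Generate one document per host per interval, covering the window
// [now - windowMinutes, now] with both endpoints included.
function generateFakeHostDocs(
  hostCount: number,
  now: Date,
  windowMinutes = 15,
  intervalSeconds = 60,
  cpuPct = 0.9 // fixed value so a threshold rule below 0.9 would be met
): FakeHostDoc[] {
  const docs: FakeHostDoc[] = [];
  const end = now.getTime();
  const start = end - windowMinutes * 60 * 1000;
  for (let t = start; t <= end; t += intervalSeconds * 1000) {
    for (let i = 0; i < hostCount; i++) {
      docs.push({
        "@timestamp": new Date(t).toISOString(),
        host: { name: `host-${i}` },
        system: { cpu: { total: { norm: { pct: cpuPct } } } },
      });
    }
  }
  return docs;
}
```

With two hosts and the defaults, this yields 16 timestamps (both ends of the 15-minute window included) and therefore 32 documents, which a test could then bulk-index before creating the rule.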
Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as |
@lalit-satapathy ^ Would be great to get this in as it would help with development and testing. |
Yes, will help on this.
@maryam-saeidi, we already have the rally benchmark supported for system.cpu and system.memory. Is this something you can give a try, and we can extend it to system.network in the future? If you need help running the corpus generator tool, please let us know. |
Summary
As an Actionable Observability team member, I am looking for a way to generate system data (according to what metricbeat/elastic agent does) to test infra-alert rules. At the moment, I am using high_cardinality_indexer to generate that according to fake_host template.
This approach has the following challenges:
My end goal is to use this tool in Kibana for API integration testing of infra-alert rules.
Related topic