[FR] Generate investigation guides #4358

Mikaayenson · 2025-01-08T21:16:30Z

Pull Request

Issue link(s):

Closes https://github.com/elastic/ia-trade-team/issues/160
Closes https://github.com/elastic/ia-trade-team/issues/146
Closes https://github.com/elastic/security-team/issues/7112
Related to https://github.com/elastic/ia-trade-team/issues/167
Related to https://github.com/elastic/ia-trade-team/issues/487

Summary - What I changed

Added a unit test that enforces investigation guides for Elastic prebuilt rules
Added a disclaimer to the rules whose guides are GenAI-generated, per @approksiu
Note: This will add a large number of rule changes, which impacts the number of assets shipped cc. @banderror
Note: Does not relocate setup info that was originally in the note field, as our build process will automatically migrate it to the setup field at build time.
Normalized the header Triage and analysis
Leveraged a bit of code to automate the process.
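
As an illustration of the header normalization mentioned above, here is a minimal sketch. The PR does not show how the header was normalized; the regular expression and the function name `normalize_guide_header` are assumptions made for this example only.

```python
import re

# Canonical header text taken from the PR summary.
CANONICAL_HEADER = "## Triage and analysis"

# Matches the level-2 header in any capitalization, tolerating extra spaces.
# The set of variants handled here is an assumption, not the PR's actual logic.
_HEADER_RE = re.compile(
    r"^##[ \t]*triage[ \t]+and[ \t]+analysis[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def normalize_guide_header(note: str) -> str:
    """Rewrite any capitalization variant of the guide header to the canonical form."""
    return _HEADER_RE.sub(CANONICAL_HEADER, note)


if __name__ == "__main__":
    sample = "## Triage and Analysis\n\n### Investigating Example Rule\n"
    print(normalize_guide_header(sample))
```

Only level-2 headers are rewritten, so deeper headings that happen to contain the same words are left alone.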

Details

from pathlib import Path

import tomlkit
from tomlkit import string

# `rules` is assumed to be an already-loaded collection of rule objects.
GUIDE_HEADERS = ("## Triage and analysis", "## Triage and Analysis")

for rule in rules:
    note = rule.contents.data.note
    # Only process rules whose note is missing or lacks a guide header
    if note is None or not any(header in note for header in GUIDE_HEADERS):
        updated_date_field = "2025/01/08"

        # Add in generated guide
        rule_uuid = rule.id

        generated_guide_path = Path(f"results/guides/{rule_uuid}_guide.md")
        with generated_guide_path.open() as f:
            new_guide = f.read()

        if note:
            note = f"{new_guide}\n\n{note}"
        else:
            note = new_guide

        with rule.path.open("r", encoding="utf-8") as g:
            toml_data = tomlkit.parse(g.read())

            # update the date
            toml_data["metadata"]["updated_date"] = updated_date_field

            # update the note field
            toml_data["rule"]["note"] = string(note, multiline=True)

        # save the toml_data back to the file
        with rule.path.open("w", encoding="utf-8") as g:
            g.write(tomlkit.dumps(toml_data))

    else:
        print(rule.path)
        print(f"Skipping {rule.id}")


# Updating the date (useful after multiple tweaks)

import subprocess
from pathlib import Path
import tomlkit

# Get the list of modified files using Git
modified_files = subprocess.check_output(
    ["git", "diff", "--name-only", "--relative"],
    text=True
).strip().split("\n")

# Filter only the relevant files (rules and building blocks)
rule_files = [
    Path(file) for file in modified_files
    if file.startswith(("rules/", "rules_building_block/")) and file.endswith(".toml")
]

updated_date_field = "2025/01/08"

for rule_file in rule_files:
    with rule_file.open("r", encoding="utf-8") as f:
        toml_data = tomlkit.parse(f.read())

    # Update the `metadata.updated_date` field
    if "metadata" in toml_data and isinstance(toml_data["metadata"], dict):
        toml_data["metadata"]["updated_date"] = updated_date_field

    with rule_file.open("w", encoding="utf-8") as f:
        f.write(tomlkit.dumps(toml_data))

    print(f"Updated: {rule_file}")

How To Test / Review

Review the content for general consistency
Unit tests should pass
Import the rules into the UI to confirm the guides are formatted properly (attachment: 8.18-consolidated-rules.ndjson.txt). Note: The number of rules is getting large, so a few were removed from the ndjson so that the UI would import them.
Would like someone from security-docs to review cc. @jmikell821

Sample UI: AWS IAM Password Recovery Requested

Important

Please make the changes and use the "suggest changes" feature in lieu of comments. That way the changes can be accepted as a batch change.

Checklist

Added a label for the type of pr: bug, enhancement, schema, maintenance, Rule: New, Rule: Deprecation, Rule: Tuning, Hunt: New, or Hunt: Tuning so guidelines can be generated
Added the meta:rapid-merge label if planning to merge within 24 hours
Secret and sensitive material has been managed correctly
Automated testing was updated or added to match the most common scenarios
Documentation and comments were added for features that require explanation

protectionsmachine · 2025-01-08T21:16:44Z

Mikaayenson · 2025-01-08T21:17:21Z

detection_rules/rule_formatter.py

+                    if 'query' in osquery_item and isinstance(osquery_item['query'], str):
+                        # Transform instances of \ to \\ as calling write will convert \\ to \.
+                        # This will ensure that the output file has the correct number of backslashes.
+                        osquery_item['query'] = osquery_item['query'].replace("\\", "\\\\")


I noticed this when trying to run toml-lint on our rules. If this is not accounted for, the loader breaks on formatting issues.

Mikaayenson · 2025-01-08T21:18:08Z

tests/test_all_rules.py

+
+        for rule in self.all_rules:
+            if not rule.contents.data.is_elastic_rule:
+                continue  # Don't enforce on non-Elastic rules


Question: should we go ahead and enforce this on all rules?
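
For readers unfamiliar with the check being discussed, here is a minimal sketch of the idea. It is not the test in `tests/test_all_rules.py`; `RuleStub` and `rules_missing_guides` are hypothetical stand-ins for the repository's rule objects and test logic.

```python
from dataclasses import dataclass
from typing import Optional

GUIDE_HEADER = "## Triage and analysis"


@dataclass
class RuleStub:
    """Hypothetical stand-in for a loaded rule; not the repository's rule class."""

    name: str
    note: Optional[str]
    is_elastic_rule: bool = True


def rules_missing_guides(rules: list[RuleStub]) -> list[str]:
    """Return names of Elastic rules whose note lacks the guide header."""
    return [
        rule.name
        for rule in rules
        if rule.is_elastic_rule and GUIDE_HEADER not in (rule.note or "")
    ]


if __name__ == "__main__":
    sample = [
        RuleStub("With guide", "## Triage and analysis\n\n..."),
        RuleStub("No guide", None),
        RuleStub("Custom rule", None, is_elastic_rule=False),
    ]
    print(rules_missing_guides(sample))
```

Enforcing on all rules, as the question above suggests, would amount to dropping the `is_elastic_rule` condition.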

w0rk3r · 2025-01-08T22:57:23Z

tests/test_all_rules.py

+    def test_note_contains_triage_and_analysis(self):
+        """Ensure the note field contains Triage and analysis content for Elastic rules."""


Added a unit test that enforces investigation guides for Elastic prebuilt rules

Is this needed? If we are going to go all in for the generated guides, couldn't we have a weekly/monthly automation that generates guides for new rules and open a PR?

My preference is to have the complete rule at initial release, so users get the full picture and we do not introduce additional updates. I can understand there might be exceptions for urgent releases.

IMO: That adds more complexity when reviewing, potentially distracting reviewers from what matters the most: rule logic

Plus, it adds one more step to get the rules done, as authors would need to fix the generated guide before pushing the PR

This question is related to #4358 (comment). FWIW, we now have a GitHub action job that can run on a detection-rules branch to generate a guide (given the rule uuid), so the idea would be to generate the guide assuming it follows the same general guidance as this PR.

In another vein, an alternative idea would be to remove the unit test and make generating guides a step of the release where guides are created in a single PR prior to shipping, but ideally the original authors would still review those anyway. It may be easier to review in the context of the PR when the rule was created.

Aegrah

This is very good, saving us lots of time, while still providing some very basic guidance to junior SOC personnel. My few cents:

I think we should avoid providing OSQuery queries, or specifically mention somewhere that the OSQueries that are generated might be wrong. I understand this is part of the initial disclaimer, but I still think it's a bit silly to provide wrong queries if we state that "these have been reviewed to improve its accuracy and relevance". Another option would be to just refer to the OSquery documentation, and mention the pre-built OSQuery packs + our hunting queries.
AI sometimes provides duplicate information or useless/too broad investigation steps. I'm curious whether working with a 2-step model would work. Provide input from this AI into a new AI with more strict rules to filter out these mistakes/inconsistencies. This might also filter out the cases where AI is writing about paths that do not exist for example.
In general this is great to start. I am just curious whether the prompt could be altered to make more specific investigations for certain activity, because it remains very broad. I understand the more specific you go, the more it will get wrong. Curious to see what options we could have here.
I also wonder the same thing as @w0rk3r does with regards to unit testing enforcements. Having CI/CD handle that for us would be convenient.

rules/linux/collection_linux_clipboard_activity.toml

rules/linux/credential_access_ssh_backdoor_log.toml

rules/linux/defense_evasion_chattr_immutable_file.toml

rules/linux/discovery_pspy_process_monitoring_detected.toml

rules/linux/persistence_kernel_driver_load.toml

banderror · 2025-01-09T12:17:14Z

Note: This will add a large number of rule changes which impacts the number of assets shipped cc. @banderror

@Mikaayenson Thank you for the heads up. How many new rule versions are we adding -- 902? cc @xcrzx

It's awesome that we're adding investigation guides to the remaining prebuilt rules!

sodhikirti07

Reviewed the ML investigation guides, and overall, the content looks good. I've added a few minor comments regarding Osquery and some descriptive inconsistencies in the investigation steps. Wonder if we could make the guides more concise by removing duplicate information or having the model summarize it further?

rules/integrations/beaconing/command_and_control_beaconing.toml

approksiu · 2025-01-10T14:11:22Z

@Mikaayenson @w0rk3r Regarding OSQuery queries, how many are suggested for these new guides? Would it be possible to have the interactive osqueries, like here:

We'd potentially need manual review/fixes for them.
What do you think?

susan-shu-c

Left some comments on SecML packages. Thanks for doing this!
For potential updates in the future, I'm wondering if we could generate fewer points in each section (such as "response and remediation") so that the LLM generates the most relevant ones? From my first impression, it feels that there are some less important points that it brought up just to fulfill some sort of bullet point count. But it's mostly a first impression.

rules/integrations/beaconing/command_and_control_beaconing.toml

rules/integrations/beaconing/command_and_control_beaconing_high_confidence.toml

rules/integrations/ded/exfiltration_ml_high_bytes_destination_geo_country_iso_code.toml

rules/integrations/ded/exfiltration_ml_high_bytes_destination_ip.toml

sodhikirti07

Suggesting some small changes.

rules/integrations/ded/exfiltration_ml_high_bytes_destination_geo_country_iso_code.toml

rules/integrations/ded/exfiltration_ml_high_bytes_destination_ip.toml

rules/integrations/ded/exfiltration_ml_high_bytes_destination_region_name.toml

rules/integrations/dga/command_and_control_ml_dga_high_sum_probability.toml

rules/integrations/lmd/lateral_movement_ml_spike_in_connections_to_a_destination_ip.toml

Mikaayenson · 2025-01-11T03:17:43Z

Update Jan 10 - Regenerated Guides

I took all the feedback and regenerated guides. Essentially making these changes:

Removed OSQuery recommendations because some were incorrect and were being manually removed based on the suggested changes. Most of the feedback thus far was about OSQuery issues.
Attempted to deduplicate investigative steps from response and remediation steps
Modified the prompts to try to generate more specific and useful steps vs appearing to fill a number requirement
Updated the disclaimer to match other notes used in the guide.
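
The PR does not say how the deduplication between investigation steps and response and remediation steps was done (it may have been handled purely through the prompts). As one illustrative post-processing approach, the sketch below drops remediation bullets that repeat an investigation bullet after simple text normalization. The function names are hypothetical.

```python
import re


def _normalize(step: str) -> str:
    """Lowercase a bullet and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", step.lower()).strip()


def drop_repeated_steps(investigation: list[str], remediation: list[str]) -> list[str]:
    """Return the remediation steps that do not repeat an investigation step."""
    seen = {_normalize(step) for step in investigation}
    return [step for step in remediation if _normalize(step) not in seen]


if __name__ == "__main__":
    investigation = ["Review the process tree."]
    remediation = ["review the process tree", "Isolate the host."]
    print(drop_repeated_steps(investigation, remediation))
```

Exact matching after normalization only catches near-verbatim repeats; paraphrased duplicates would need a fuzzier comparison.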

cc. folks who've already provided feedback so far @susan-shu-c @sodhikirti07 @Aegrah @w0rk3r

rules_building_block/collection_archive_data_zip_imageload.toml

terrancedejesus · 2025-01-15T14:18:27Z

@Mikaayenson - Someone may have covered it already, but typically we add Resources: Investigation Guide tag to any rule with an investigation guide. I didn't see that with these changes.

Mikaayenson · 2025-01-15T14:27:24Z

Resources: Investigation Guide

For consistency, this makes sense. On the other hand, I think we should remove these tags since all rules have guides now (but I'd prefer not to at the moment, to avoid version-bumping every rule).

Also, we have test_investigation_guide_tag, but as you know it's being skipped atm.
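
For context, a tag consistency check of the kind mentioned here could look roughly like the sketch below. It is not the repository's `test_investigation_guide_tag`; rules are modeled as plain dictionaries with hypothetical `name`, `note`, and `tags` keys.

```python
GUIDE_TAG = "Resources: Investigation Guide"
GUIDE_HEADER = "## Triage and analysis"


def rules_missing_guide_tag(rules: list[dict]) -> list[str]:
    """Return names of rules that contain a guide but lack the guide tag."""
    missing = []
    for rule in rules:
        has_guide = GUIDE_HEADER in (rule.get("note") or "")
        has_tag = GUIDE_TAG in rule.get("tags", [])
        if has_guide and not has_tag:
            missing.append(rule["name"])
    return missing


if __name__ == "__main__":
    sample = [
        {"name": "Tagged", "note": "## Triage and analysis\n", "tags": [GUIDE_TAG]},
        {"name": "Untagged", "note": "## Triage and analysis\n", "tags": ["OS: Linux"]},
    ]
    print(rules_missing_guide_tag(sample))
```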

terrancedejesus

Manual review looks good to me.

Mikaayenson · 2025-01-17T15:16:51Z

Update Jan 17 - Test for Guides

Migrated a test from [FR] Add investigation guide checks #2994 into this PR to check for consistency among guides
Removed new guides on rules that were staged for deprecation (caught during consistency check) - unit test will fail until they are actually deprecated (>1 month staged)
Added more guides for latest new rules

Aegrah

Did a review of a sub-sample, which all now look pretty good for fully automatically generated IGs.

susan-shu-c

LGTM thanks for addressing comments

[FR] Generate investigation guides

265d8cc

Mikaayenson added the enhancement (New feature or request) and Security Content labels Jan 8, 2025

Mikaayenson requested review from w0rk3r, approksiu, DefSecSentinel, imays11, Samirbous, Aegrah and terrancedejesus January 8, 2025 21:16

Mikaayenson self-assigned this Jan 8, 2025

Mikaayenson commented Jan 8, 2025

Normalize the investigation guide header

736b7c5

w0rk3r reviewed Jan 8, 2025

Aegrah reviewed Jan 9, 2025

sodhikirti07 reviewed Jan 9, 2025

rules/integrations/beaconing/command_and_control_beaconing.toml (outdated, resolved)

susan-shu-c reviewed Jan 10, 2025

sodhikirti07 reviewed Jan 10, 2025

Prep - reset files to main for the next bulk update

114a4b4

nastasha-solomon self-requested a review January 10, 2025 20:46

Mikaayenson and others added 2 commits January 10, 2025 20:31

Merge branch 'main' into gen_investigation_guides

d128721

regenerate guides

c6b8239

Mikaayenson requested review from sodhikirti07, Aegrah and w0rk3r January 11, 2025 03:22

botelastic bot added the labels Integration: Endpoint (Elastic Endpoint Security), Integration: GCP (GCP related rules), Integration: Google Workspace, Integration: Microsoft 365, Integration: Okta (okta related rules), OS: Linux, and python (Internal python for the repository) Jan 14, 2025

w0rk3r reviewed Jan 15, 2025

rules_building_block/collection_archive_data_zip_imageload.toml (outdated, resolved)

update dates

826ad60

remove guides from building block rules

4622931

Mikaayenson and others added 7 commits January 15, 2025 09:09

update unit test on investigation guides

de38f90

Merge branch 'main' into gen_investigation_guides

61e2014

Merge branch 'main' into gen_investigation_guides

f391346

add new guides

d9e7372

bump pyproject.toml version

1b2f712

add tag

b8974fb

Merge branch 'main' into gen_investigation_guides

6bc02d3

botelastic bot added the Integration: CyberArkPas (CyberArkPas integration) label Jan 15, 2025

bump pyproject.toml version

defcd7f

terrancedejesus approved these changes Jan 15, 2025

Mikaayenson mentioned this pull request Jan 17, 2025

[FR] Add investigation guide checks #2994

Closed

Mikaayenson and others added 3 commits January 17, 2025 08:51

Add unit test for guide consistency

fe66adb

Merge branch 'main' into gen_investigation_guides

ed3192b

Add more guides

fd0470c

Aegrah approved these changes Jan 17, 2025

Merge branch 'main' into gen_investigation_guides

1f3ff7f

susan-shu-c approved these changes Jan 17, 2025

protectionsmachine commented Jan 8, 2025

Enhancement - Guidelines

Documentation and Context

Code Standards and Practices

Testing

Additional Checks
