Ys_splunk_enhance (#37078)
* start

* continue

* add validate indexes

* little fix

* improve

* add exception to validate bathed events

* code improvement

* finish

* build improvements

* fix failing unit tests

* add test

* release notes

* build improvements

* build improvements

* remove dot

* put dot

* build improvements

* build improvements

* fix

* build effort

* pre commit

* pre commit

* pre commit

* pre commit

* pre commit

* pre commit

* improve docs

* CR improvements

* release notes and improvements

* .

* update release notes

* remove unify file

* erase release notes

* readd release notes

* change pack metadata

* change pack metadata

* add dot

* fix readme
YaelShamai authored Nov 17, 2024
1 parent 012c901 commit 95f9bfc
Showing 6 changed files with 354 additions and 34 deletions.
16 changes: 13 additions & 3 deletions Packs/SplunkPy/Integrations/SplunkPy/README.md
@@ -516,22 +516,32 @@ Parses the raw part of the event.

### splunk-submit-event-hec
***
Sends events to an HTTP event collector using the Splunk platform JSON event protocol.
Sends events to Splunk. If the `batch_event_data` or `entry_id` argument is provided, all arguments related to a single event are ignored.

##### Base Command

`splunk-submit-event-hec`
##### Input

| **Argument Name** | **Description** | **Required** |
| --- | --- | --- |
| event | The event payload key-value pair. An example string: "event": "Access log test message.". | Required |
| event | The event payload key-value pair. An example string: "event": "Access log test message.". | Optional |
| fields | Fields for indexing that do not occur in the event payload itself. Accepts multiple, comma-separated, fields. | Optional |
| index | The index name. | Optional |
| host | The hostname. | Optional |
| source_type | The user-defined event source type. | Optional |
| source | The user-defined event source. | Optional |
| time | The epoch-formatted time. | Optional |
| request_channel | A channel identifier (ID) to which the request is sent; must be a Globally Unique Identifier (GUID). **If the indexer acknowledgment is turned on, a channel is required.** | Optional |
| batch_event_data | A batch of events to send to Splunk. For example, `{"event": "something happened at 14/10/2024 12:29", "fields": {"severity": "INFO", "category": "test2, test2"}, "index": "index0","sourcetype": "sourcetype0","source": "/example/something" } {"event": "something happened at 14/10/2024 13:29", "index": "index1", "sourcetype": "sourcetype1","source": "/example/something", "fields":{ "fields" : "severity: INFO, category: test2, test2"}}`. **If provided, the arguments related to a single event and the `entry_id` argument are ignored.** | Optional |
| batch_event_data | A batch of events to send to Splunk. For example, `{"event": "something happened at 14/10/2024 12:29", "fields": {"severity": "INFO", "category": "test2, test2"}, "index": "index0","sourcetype": "sourcetype0","source": "/example/something" } {"event": "something happened at 14/10/2024 13:29", "index": "index1", "sourcetype": "sourcetype1","source": "/example/something", "fields":{ "fields" : "severity: INFO, category: test2, test2"}}`. **If provided, the arguments related to a single event and the `entry_id` argument are ignored.** | Optional |
| entry_id | The entry ID in Cortex XSOAR of the file containing a batch of events. The content of the file should be valid batch event data, in the same format as the `batch_event_data` argument. **If provided, the arguments related to a single event are ignored.** | Optional |

##### Batched events description
This command sends events to Splunk, either as a single event or as a batch of multiple events.
To send a single event, use the `event`, `fields`, `host`, `index`, `source`, `source_type`, and `time` arguments.
To send a batch of events, use either the `batch_event_data` argument or the `entry_id` argument (for a file uploaded to Cortex XSOAR).
Batch format requirements: The batch must be a single string containing valid JSON dictionaries, each representing one event, separated by whitespace rather than commas. Each dictionary should include all the fields needed for its event. For example: `{"event": "event occurred at 14/10/2024 12:29", "fields": {"severity": "INFO", "category": "test1"}, "index": "index0", "sourcetype": "sourcetype0", "source": "/path/event1"} {"event": "event occurred at 14/10/2024 13:29", "index": "index1", "sourcetype": "sourcetype1", "source": "/path/event2", "fields": {"severity": "INFO", "category": "test2"}}`.
This formatted string can be passed directly via `batch_event_data`; alternatively, if it is saved in a file uploaded to Cortex XSOAR, provide the file's `entry_id` (e.g., ${File.[4].EntryID}).
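The batch format above (whitespace-separated JSON objects, no commas) can be normalized into a regular JSON list before submission. The following is a minimal sketch of that normalization; the `normalize_batch` helper name is illustrative and not part of the integration:

```python
import json
import re

def normalize_batch(events_str: str) -> list[dict]:
    """Turn whitespace-separated JSON event objects into a list of dicts."""
    # Events are separated by whitespace rather than commas: insert commas
    # between adjacent objects ("} {" -> "},{") and wrap the result in brackets.
    joined = re.sub(r"}\s*{", "},{", events_str)
    return json.loads(f"[{joined}]")

batch = (
    '{"event": "event occurred at 14/10/2024 12:29", "index": "index0"} '
    '{"event": "event occurred at 14/10/2024 13:29", "index": "index1"}'
)
events = normalize_batch(batch)
# Two separate event dicts, each ready to validate and submit.
```

Note that this sketch assumes no quoted value inside an event contains a literal `}` immediately followed by `{`.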

##### Context Output

149 changes: 126 additions & 23 deletions Packs/SplunkPy/Integrations/SplunkPy/SplunkPy.py
@@ -15,6 +15,7 @@
from splunklib.binding import AuthenticationError, HTTPError, namespace


INTEGRATION_LOG = "Splunk- "
OUTPUT_MODE_JSON = 'json' # type of response from splunk-sdk query (json/csv/xml)
# Define utf8 as default encoding
params = demisto.params()
@@ -2615,6 +2616,81 @@ def splunk_submit_event_command(service: client.Service, args: dict):
return_results(f'Event was created in Splunk index: {r.name}')


def validate_indexes(indexes, service):
    """Validates that all provided Splunk indexes exist within the Splunk service instance."""
    real_indexes_names = {real_index.name for real_index in service.indexes}
    return set(indexes).issubset(real_indexes_names)


def get_events_from_file(entry_id):
    """
    Retrieves event data from a file in Demisto by its entry ID.
    Args:
        entry_id (str): The entry ID corresponding to the file containing event data.
    Returns:
        str: The content of the file as a string.
    """
get_file_path_res = demisto.getFilePath(entry_id)
file_path = get_file_path_res["path"]
with open(file_path, encoding='utf-8') as file_data:
return file_data.read()


def parse_fields(fields):
"""
Parses the `fields` input into a dictionary.
- If `fields` is a valid JSON string, it is converted into the corresponding dictionary.
- If `fields` is not valid JSON, it is wrapped as a dictionary with a single key-value pair,
where the key is `"fields"` and the value is the original `fields` string.
Examples:
1. Input: '{"severity": "INFO", "category": "test2, test2"}'
Output: {"severity": "INFO", "category": "test2, test2"}
2. Input: 'severity: INFO, category: test2, test2'
Output: {"fields": "severity: INFO, category: test2, test2"}
"""
if fields:
try:
parsed_fields = json.loads(fields)
except Exception:
demisto.debug('Fields provided are not valid JSON; treating as a single field')
parsed_fields = {'fields': fields}
return parsed_fields
return None


def ensure_valid_json_format(events: str | dict):
"""Converts a batch of events to a valid JSON format for processing.
Args:
events (str | dict): The batch of events to be formatted as JSON.
Raises:
DemistoException: If the input cannot be converted to a valid JSON format, an exception is raised.
Returns:
list: A list of JSON objects derived from the input events.
"""
try:
events_str = str(events)

events_str = events_str.replace("'", '"')
rgx = re.compile(r"}[\s]*{")
valid_json_events = rgx.sub("},{", events_str)
valid_json_events = json.loads(f"[{valid_json_events}]")
return valid_json_events
except Exception as e:
raise DemistoException(f'{str(e)}\nMake sure that the events are in the correct format.')


def splunk_submit_event_hec(
hec_token: str | None,
baseurl: str,
@@ -2625,27 +2701,39 @@ def splunk_submit_event_hec(
source_type: str | None,
source: str | None,
time_: str | None,
request_channel: str | None
request_channel: str | None,
batch_event_data: str | None,
entry_id: str | None,
service
):
if hec_token is None:
raise Exception('The HEC Token was not provided')

parsed_fields = None
if fields:
try:
parsed_fields = json.loads(fields)
except Exception:
parsed_fields = {'fields': fields}
if batch_event_data:
events = batch_event_data

args = assign_params(
event=event,
host=host,
fields=parsed_fields,
index=index,
sourcetype=source_type,
source=source,
time=time_
)
elif entry_id:
demisto.debug(f'{INTEGRATION_LOG} - loading events data from file with {entry_id=}')
events = get_events_from_file(entry_id)

else:
parsed_fields = parse_fields(fields)

events = assign_params(
event=event,
host=host,
fields=parsed_fields,
index=index,
sourcetype=source_type,
source=source,
time=time_
)
valid_json_events = ensure_valid_json_format(events)

indexes = [d.get('index') for d in valid_json_events if d.get('index')]

if not validate_indexes(indexes, service):
raise DemistoException('Index name does not exist in your Splunk instance.')

headers = {
'Authorization': f'Splunk {hec_token}',
@@ -2654,15 +2742,23 @@
if request_channel:
headers['X-Splunk-Request-Channel'] = request_channel

if entry_id or batch_event_data:
    data = events
else:
    data = json.dumps(events)

demisto.debug(f'{INTEGRATION_LOG} sending {len(valid_json_events)} events')

return requests.post(
f'{baseurl}/services/collector/event',
data=json.dumps(args),
data=data,
headers=headers,
verify=VERIFY_CERTIFICATE,
)


def splunk_submit_event_hec_command(params: dict, args: dict):
def splunk_submit_event_hec_command(params: dict, service, args: dict):
hec_token = params.get('cred_hec_token', {}).get('password') or params.get('hec_token')
baseurl = params.get('hec_url')
if baseurl is None:
@@ -2676,18 +2772,25 @@ def splunk_submit_event_hec_command(params: dict, args: dict):
source = args.get('source')
time_ = args.get('time')
request_channel = args.get('request_channel')
batch_event_data = args.get('batch_event_data')
entry_id = args.get('entry_id')

if not event and not batch_event_data and not entry_id:
raise DemistoException("Invalid input: Please specify one of the following arguments: `event`, "
"`batch_event_data`, or `entry_id`.")

response_info = splunk_submit_event_hec(hec_token, baseurl, event, fields, host, index, source_type, source, time_,
request_channel)
request_channel, batch_event_data, entry_id, service)

if 'Success' not in response_info.text:
return_error(f"Could not send event to Splunk {response_info.text}")
else:
response_dict = json.loads(response_info.text)
if response_dict and 'ackId' in response_dict:
return_results(f"The event was sent successfully to Splunk. AckID: {response_dict['ackId']}")
return_results(f"The events were sent successfully to Splunk. AckID: {response_dict['ackId']}")
else:
return_results('The event was sent successfully to Splunk.')
return_results('The events were sent successfully to Splunk.')


def splunk_edit_notable_event_command(base_url: str, token: str, auth_token: str | None, args: dict) -> None:
@@ -3151,7 +3254,7 @@ def main(): # pragma: no cover
token = get_auth_session_key(service)
splunk_edit_notable_event_command(base_url, token, auth_token, args)
elif command == 'splunk-submit-event-hec':
splunk_submit_event_hec_command(params, args)
splunk_submit_event_hec_command(params, service, args)
elif command == 'splunk-job-status':
return_results(splunk_job_status(service, args))
elif command.startswith('splunk-kv-') and service is not None:
6 changes: 5 additions & 1 deletion Packs/SplunkPy/Integrations/SplunkPy/SplunkPy.yml
@@ -454,7 +454,7 @@ script:
Event payload key-value pair.
String example: "event": "Access log test message".
name: event
required: true
required: false
- description: Fields for indexing that do not occur in the event payload itself. Accepts multiple, comma-separated, fields.
name: fields
- description: The index name.
@@ -469,6 +469,10 @@
name: time
- description: A channel identifier (ID) to which the request is sent; must be a Globally Unique Identifier (GUID). If the indexer acknowledgment is turned on, a channel is required.
name: request_channel
- description: 'A batch of events to send to Splunk. For example, `{"event": "something happened at 14/10/2024 12:29", "fields": {"severity": "INFO", "category": "test2, test2"}, "index": "index0","sourcetype": "sourcetype0","source": "/example/something" } {"event": "something happened at 14/10/2024 13:29", "index": "index1", "sourcetype": "sourcetype1","source": "/example/something", "fields":{ "fields" : "severity: INFO, category: test2, test2"}}`. If provided, all arguments except `request_channel` are ignored.'
name: batch_event_data
- description: The entry ID in Cortex XSOAR of the file containing a batch of events. If provided, the arguments related to a single event are ignored.
name: entry_id
description: Sends events to an HTTP Event Collector using the Splunk platform JSON event protocol.
name: splunk-submit-event-hec
- arguments:
