Fix index count for manual events #3066

Closed
20 changes: 20 additions & 0 deletions .github/workflows/build-image.yml
@@ -0,0 +1,20 @@
name: build-image

on:
pull_request:
types: [opened, synchronize, reopened]

jobs:
build-timesketch-image:
runs-on: ubuntu-latest

steps:
- name: Checkout repository
uses: actions/checkout@v3

- name: Build Docker image
uses: docker/build-push-action@v4
with:
context: docker/release/build
build-args: |
BRANCH=${{ github.head_ref }}
17 changes: 0 additions & 17 deletions .github/workflows/documentation.yml

This file was deleted.

5 changes: 4 additions & 1 deletion .github/workflows/e2e-tests.yml
@@ -1,5 +1,8 @@
name: e2e-tests
on:
push:
branches:
- master
pull_request:
types: [opened, synchronize, reopened]
jobs:
@@ -8,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
os: [ubuntu-20.04, ubuntu-22.04]
os: [ubuntu-20.04]
steps:
- uses: actions/checkout@v2
- name: Set up infrastructure with docker compose
7 changes: 5 additions & 2 deletions .github/workflows/linters.yml
@@ -1,6 +1,9 @@
name: linters

on:
push:
branches:
- master
pull_request:
types: [opened, synchronize, reopened]

@@ -9,8 +12,8 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
os: [ubuntu-20.04, ubuntu-22.04]
python-version: ['3.8', '3.10']
os: [ubuntu-20.04]
python-version: ['3.10']

steps:
- uses: actions/checkout@v2
46 changes: 46 additions & 0 deletions .github/workflows/publish-image.yml
@@ -0,0 +1,46 @@
name: Build and publish Timesketch Docker image

on:
push:
branches: ['master']
release:
types: [published]

env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}

jobs:
build-and-publish-timesketch-image:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write

steps:
- name: Checkout repository
uses: actions/checkout@v3

- name: Log in to the Container registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Extract metadata (tags, labels) for Docker
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

- name: Build and publish Docker image
uses: docker/build-push-action@v4
with:
context: docker/release/build
build-args: |
BRANCH=${{ github.ref_name }}
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

7 changes: 5 additions & 2 deletions .github/workflows/unit-tests.yml
@@ -1,6 +1,9 @@
name: unit-tests

on:
push:
branches:
- master
pull_request:
types: [opened, synchronize, reopened]

@@ -10,8 +13,8 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
os: [ubuntu-20.04, ubuntu-22.04]
python-version: ['3.8', '3.10']
os: [ubuntu-20.04]
python-version: ['3.10']
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
4 changes: 4 additions & 0 deletions api_client/python/timesketch_api_client/client.py
@@ -340,6 +340,10 @@ def _create_session(

session = requests.Session()

# GCP IAP
if token := os.getenv("AUTHORIZATION_TOKEN"):
session.headers = {"Authorization": f"Bearer {token}"}

# If using HTTP Basic auth, add the user/pass to the session
if auth_mode == "http-basic":
session.auth = (username, password)
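To see how the new IAP handling is meant to be exercised from the caller's side, here is a minimal sketch; it assumes the client is created through `config.get_client()` as in the project's API docs, and that an IAP identity token has already been obtained out of band (the token value and sketch ID are placeholders).

```python
import os

from timesketch_api_client import config

# The new _create_session() code reads this variable and attaches it to every
# request as a Bearer token, e.g. when Timesketch sits behind GCP IAP.
os.environ["AUTHORIZATION_TOKEN"] = "<identity-token-from-iap>"

ts_client = config.get_client()   # the session now carries the Authorization header
sketch = ts_client.get_sketch(1)  # placeholder sketch ID
```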
77 changes: 41 additions & 36 deletions api_client/python/timesketch_api_client/sketch.py
@@ -15,24 +15,15 @@
from __future__ import unicode_literals

import copy
import os
import json
import logging
import os

import pandas

from . import analyzer
from . import aggregation
from . import definitions
from . import error
from . import graph
from . import aggregation, analyzer, definitions, error, graph
from . import index as api_index
from . import resource
from . import search
from . import searchtemplate
from . import story
from . import timeline

from . import resource, search, searchtemplate, story, timeline

logger = logging.getLogger("timesketch_api.sketch")

@@ -49,7 +40,9 @@ class Sketch(resource.BaseResource):
"""

# Add in necessary fields in data ingested via a different mechanism.
_NECESSARY_DATA_FIELDS = frozenset(["timestamp", "datetime", "message"])
_NECESSARY_DATA_FIELDS = frozenset(
["timestamp", "timestamp_desc", "datetime", "message"]
)

def __init__(self, sketch_id, api, sketch_name=None):
"""Initializes the Sketch object.
@@ -1785,8 +1778,8 @@ def generate_timeline_from_es_index(
self,
es_index_name,
name,
index_name="",
description="",
timeline_filter_id=None,
timeline_update_query=True,
provider="Manually added to OpenSearch",
context="Added via API client",
data_label="OpenSearch",
@@ -1803,8 +1796,11 @@
Args:
es_index_name: name of the index in OpenSearch.
name: string with the name of the timeline.
index_name: optional string for the SearchIndex name, defaults
to the same as the es_index_name.
timeline_filter_id: optional string to filter on documents in an
index with multiple timelines.
timeline_update_query: optional boolean that determines whether an
update by query is executed to add the timeline ID to the
documents. Defaults to True.
description: optional string with a description of the timeline.
provider: optional string with the provider name for the data
source of the imported data. Defaults to "Manually added
@@ -1831,16 +1827,20 @@
raise ValueError("Timeline name needs to be provided.")

# Step 1: Make sure the index doesn't exist already.
for index_obj in self.api.list_searchindices():
if index_obj is None:
continue
if index_obj.index_name == es_index_name:
raise ValueError("Unable to add the ES index, since it already exists.")
# This step is only executed when the index is used for a single timeline.
if (None, True) == (timeline_filter_id, timeline_update_query):
for index_obj in self.api.list_searchindices():
if index_obj is None:
continue
if index_obj.index_name == es_index_name:
raise ValueError(
"Unable to add the ES index, since it already exists."
)

# Step 2: Create a SearchIndex.
resource_url = f"{self.api.api_root}/searchindices/"
form_data = {
"searchindex_name": index_name or es_index_name,
"searchindex_name": es_index_name,
"es_index_name": es_index_name,
}
response = self.api.session.post(resource_url, json=form_data)
@@ -1865,7 +1865,6 @@
# Step 3: Verify mappings to make sure data conforms.
index_obj = api_index.SearchIndex(searchindex_id, api=self.api)
index_fields = set(index_obj.fields)

if not self._NECESSARY_DATA_FIELDS.issubset(index_fields):
index_obj.status = "fail"
raise ValueError(
@@ -1910,19 +1909,25 @@
)

# Step 5: Add the timeline ID into the dataset.
resource_url = f"{self.api.api_root}/sketches/{self.id}/event/add_timeline_id/"
form_data = {
"searchindex_id": searchindex_id,
"timeline_id": timeline_dict["id"],
}
response = self.api.session.post(resource_url, json=form_data)

if response.status_code not in definitions.HTTP_STATUS_CODE_20X:
error.error_message(
response,
message="Unable to add timeline identifier to data",
error=ValueError,
# This step is skipped if `timeline_update_query` is `False`, because the
# documents will be ingested with the timeline ID already set as a field.
if timeline_update_query:
resource_url = (
f"{self.api.api_root}/sketches/{self.id}/event/add_timeline_id/"
)
form_data = {
"searchindex_id": searchindex_id,
"timeline_id": timeline_dict["id"],
"timeline_filter_id": timeline_filter_id,
}
response = self.api.session.post(resource_url, json=form_data)

if response.status_code not in definitions.HTTP_STATUS_CODE_20X:
error.error_message(
response,
message="Unable to add timeline identifier to data",
error=ValueError,
)

# Step 6: Add a DataSource object.
resource_url = f"{self.api.api_root}/sketches/{self.id}/datasource/"
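The server-side effect of Step 5 is not shown in this diff. As a rough illustration only (not the actual Timesketch implementation), the update by query that the API triggers can be pictured as an OpenSearch `update_by_query` call that stamps the new timeline ID onto the documents matching the filter ID; the host, index name and IDs below are placeholders.

```python
from opensearchpy import OpenSearch

os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])  # placeholder connection

# Add the timeline ID to every document that carries the given filter ID.
os_client.update_by_query(
    index="my-manual-index",  # placeholder index name
    body={
        "query": {"term": {"__ts_timeline_filter_id": "1"}},
        "script": {
            "lang": "painless",
            "source": "ctx._source['__ts_timeline_id'] = params.timeline_id",
            "params": {"timeline_id": 42},  # placeholder timeline ID
        },
    },
)
```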
@@ -2,6 +2,7 @@
FROM ubuntu:22.04

ARG PPA_TRACK=stable
ARG BRANCH

# Prevent needing to configure debian packages, stopping the setup of
# the docker container.
@@ -19,9 +20,9 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*

# Install Timesketch
RUN wget https://raw.githubusercontent.com/google/timesketch/master/requirements.txt
RUN wget https://raw.githubusercontent.com/hnhdev/timesketch/${BRANCH}/requirements.txt
RUN pip3 install -r requirements.txt
RUN pip3 install https://github.com/google/timesketch/archive/master.zip
RUN pip3 install https://github.com/hnhdev/timesketch/archive/$BRANCH.zip

# Install Plaso
RUN add-apt-repository -y ppa:gift/$PPA_TRACK
35 changes: 30 additions & 5 deletions docs/developers/api-upload-data.md
@@ -245,27 +245,52 @@ work with Timesketch. This function does limited checking before making it
available. The timeline may or may not work in Timesketch, depending on
multiple factors._

The data that is ingested needs to have a few fields already set before it can be
ingested into Timesketch:

- message
- timestamp
- datetime
- timestamp
- timestamp_desc

The datetime field also needs to be mapped as a date, not a text string.

Sample code showing how to ingest data that is already in OpenSearch into Timesketch:

- Method 1 - generate a timeline from an index in OpenSearch
- Method 2 - generate a timeline from an index in OpenSearch that contains documents
  from multiple timelines, filtered by the field `__ts_timeline_filter_id`
- Method 3 - create a timeline and use its identifier to ingest the timeline into OpenSearch

```python
from timesketch_api_client import config

ts_client = config.get_client()
sketch = ts_client.get_sketch(SKETCH_ID)

# Method 1 - Single timeline from a single index
sketch.generate_timeline_from_es_index(
es_index_name=OPENSEARCH_INDEX_NAME,
name=TIMELINE_NAME,
provider='My Custom Ingestion Script',
context='python my_custom_script.py --ingest',
)

# Method 2 - Multiple timelines from a single index
sketch.generate_timeline_from_es_index(
es_index_name=OPENSEARCH_INDEX_NAME,
name=TIMELINE_NAME,
timeline_filter_id="1",
provider='My Custom Ingestion Script',
context='python my_custom_script.py --ingest',
)

# Method 3 - Multiple timelines from a single index, where the timeline ID is returned
timeline = sketch.generate_timeline_from_es_index(
es_index_name=OPENSEARCH_INDEX_NAME,
name=TIMELINE_NAME,
timeline_update_query=False,
provider='My Custom Ingestion Script',
context='python my_custom_script.py --ingest',
)

# Use `timeline.id` as the value of the `__ts_timeline_id` field in the documents
# that will be ingested into the index, e.g. with a Logstash filter:
# filter { mutate { add_field => { "__ts_timeline_id" => "${TIMELINE_ID}" } } }
```
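For Method 3 the documents must already carry the timeline ID when they are indexed. Besides the Logstash route noted above, a rough sketch using the opensearch-py client could look like the following; the connection details and the event content are placeholders and not part of the Timesketch API.

```python
from opensearchpy import OpenSearch, helpers

os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])  # placeholder connection

# A pre-tagged event with the fields Timesketch requires.
events = [
    {
        "message": "Example event",
        "datetime": "2023-01-01T00:00:00+00:00",
        "timestamp": 1672531200000000,    # epoch value matching the datetime field
        "timestamp_desc": "Event logged",
        "__ts_timeline_id": timeline.id,  # the ID returned by Method 3 above
    },
]

# Bulk-index the events into the same index that was registered with
# generate_timeline_from_es_index().
helpers.bulk(
    os_client,
    ({"_index": OPENSEARCH_INDEX_NAME, "_source": event} for event in events),
)
```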