Package esalert-plus is a fork of Akagi201/esalert, a simple framework for real-time alerts on data in Elasticsearch. esalert-plus adds more features, like Elasticsearch X-Watcher, and supports Elasticsearch 7.6.2+.
- Esalert's runtime configs.
- Configs can be passed from command-line, environment or config file.
alerts = "configs/esalert.d"
es-addr = "10.41.41.145:9200, 10.41.41.145:9201, 10.41.41.145:9202"
es-user = "elastic"
es-pass = "123456"
lua-init = ""
lua-vms = ""
log-level = "debug"
log-dir = "logs"
dingding-webhook = ""
slack-webhook = ""
force-run = ""
A YAML file, or a directory of YAML files, containing alert definitions.
Address of an Elasticsearch instance. Multiple comma-separated addresses may be given. Defaults to 127.0.0.1:9200.
Username for Elasticsearch. Defaults to elastic.
Password for Elasticsearch. Defaults to changeme.
If set, the given Lua script file is executed when each Lua VM is initialized.
How many Lua VMs should be used. Each VM is completely independent of the others, and requests are executed on whichever VM is available at that moment. This keeps Lua scripts from all being blocked on the same OS thread. Defaults to 1.
Adjusts the log level. Valid options are: error, warn, info, debug. Defaults to info.
Directory to write logs to. Defaults to stdout.
Slack webhook URL, required if using any Slack actions.
DingDing webhook URL, required if using any DingDing actions.
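The es-addr value shown above is a single string holding several comma-separated addresses. As a rough illustration of how such a value could be split, here is a minimal Go sketch. The helper name splitESAddrs is hypothetical and is not taken from esalert-plus; only the comma-separated form and the 127.0.0.1:9200 default come from this document.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultESAddr mirrors the documented default for es-addr.
const defaultESAddr = "127.0.0.1:9200"

// splitESAddrs is a hypothetical helper (not part of esalert-plus). It splits
// a comma-separated es-addr value into host:port entries, trims surrounding
// whitespace, and skips empty entries. An empty value yields the default.
func splitESAddrs(value string) []string {
	var addrs []string
	for _, part := range strings.Split(value, ",") {
		if addr := strings.TrimSpace(part); addr != "" {
			addrs = append(addrs, addr)
		}
	}
	if len(addrs) == 0 {
		return []string{defaultESAddr}
	}
	return addrs
}

func main() {
	fmt.Println(splitESAddrs("10.41.41.145:9200, 10.41.41.145:9201, 10.41.41.145:9202"))
	fmt.Println(splitESAddrs(""))
}
```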
- Alert configs contain all the data processing which should be performed.
- Esalert runs with one or more alerts defined in its configuration, each one operating independently of the others.
- Alert configs can be in one file or a directory of files.
- Alert configs use yaml format. Each file contains an array of alerts.
# esalert.yml
- name: alert_foo
# other alert parameters
- name: alert_bar
# other alert parameters
OR
# esalert.d/foo.yml
- name: alert_foo
# other alert parameters
- name: alert_foo2
# other alert parameters
and
# esalert.d/bar.yml
- name: alert_bar
# other alert parameters
- name: alert_bar2
# other alert parameters
A single alert has the following fields in its document (all are required):
- name: something_unique
  interval: "*/5 * * * * *"
  search_index: # see the search subsection
  search_type: # see the search subsection
  search: # see the search subsection
  metadata: # see the metadata subsection
  throttle_period: # see the throttle_period
  process: # see the process subsection
This is an arbitrary string to identify the alert. It must be unique amongst all of the defined alerts.
A jobber-style interval string describing when the search should be run and the process step applied to its results.
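The example interval "*/5 * * * * *" has six fields. Assuming the jobber field order (second, minute, hour, day of month, month, day of week), it fires every five seconds. The Go sketch below checks only the seconds field and understands only "*", "*/n" and a plain number; the function names are hypothetical, and this is not the parser esalert-plus uses.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fieldMatches reports whether one interval field accepts value. This sketch
// only understands "*", "*/n" and a plain number; real jobber-style strings
// support more forms.
func fieldMatches(field string, value int) bool {
	if field == "*" {
		return true
	}
	if strings.HasPrefix(field, "*/") {
		step, err := strconv.Atoi(field[2:])
		return err == nil && step > 0 && value%step == 0
	}
	n, err := strconv.Atoi(field)
	return err == nil && n == value
}

// secondMatches checks only the first field of a six-field interval,
// assumed here to be the seconds field.
func secondMatches(interval string, second int) bool {
	fields := strings.Fields(interval)
	return len(fields) == 6 && fieldMatches(fields[0], second)
}

func main() {
	for _, s := range []int{0, 3, 5, 10} {
		fmt.Println(s, secondMatches("*/5 * * * * *", s))
	}
}
```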
The search which should be performed against elasticsearch. The results are simply held onto for the process step, nothing else is done with them at this point.
search_index: filebeat-{{.Format "2006.01.02"}}
search_type: logs
# conveniently, json is valid yaml
search: {
    "query": {
        "query_string": {
            "query":"severity:fatal"
        }
    }
}
- See query dsl docs for more on how to formulate query objects.
- See query string docs for more on how to formulate query strings.
- All three fields (search_index, search_type and search) can have go templating applied.
- See the alert context subsection for more information on what fields/methods are available to use.
Metadata that can be shared between the search and process phases.
A threshold alert periodically checks whether your data is above, below, equal to, or between certain thresholds within a given time interval.
Once the search is performed the results are kept in the context, which is then passed into this step. The process lua script then checks these results against whatever conditions are desired, and may optionally return a list of actions to take. See the alert context section for all available fields in ctx.
process:
    lua_file: ./foo-process.lua
OR
process:
    lua_inline: |
        if ctx.HitCount > 10 then
            return {
                {
                    type = "log",
                    message = "got " .. ctx.HitCount .. " hits",
                }
            }
        end
        -- To indicate no actions, you can return an empty table, nil, or simply
        -- don't return at all
        return {}
The table returned by process is a list of actions which should be taken. Each action has a type and subsequent fields based on that type.
Simply logs an INFO message to the console or to the log-dir you have configured. Useful if you're testing an alert and don't want to set up any real actions yet, or if you want to write monitoring information to disk for a later step.
{
    type = "log",
    message = "Performing action for alert " .. ctx.Name,
}
Creates and executes an HTTP request. A warning is logged if anything except a 2xx response code is returned.
{
    type = "http",
    method = "POST", -- optional, defaults to GET
    url = "http://example.com/some/endpoint?ARG1=foo",
    headers = { -- optional
        -- keys that are not plain identifiers need the bracket syntax in Lua
        ["X-FOO"] = "something",
    },
    body = "some body for " .. ctx.Name, -- optional
}
Triggers an event in Slack. The --slack-webhook param must be set in the runtime configuration in order to use this action type.
{
    type = "slack",
    text = "some text"
}
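Triggers an event in DingDing. The dingding-webhook param must be set in the runtime configuration in order to use this action type. You must build the JSON message passed in text yourself.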
{
    type = "dingding",
    text = "{\"msgtype\": \"text\", \"text\": { \"content\": \"" .. msg .. "\"}}"
}
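Because the text field carries a complete JSON document, plain string concatenation produces invalid JSON as soon as msg contains a double quote or a backslash. The Go sketch below shows the same payload shape built with a JSON encoder, so the escaping is visible. The type and function names are hypothetical; only the msgtype/text/content layout is taken from the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dingText and dingMessage are hypothetical types that mirror the payload
// layout used in the example above.
type dingText struct {
	Content string `json:"content"`
}

type dingMessage struct {
	MsgType string   `json:"msgtype"`
	Text    dingText `json:"text"`
}

// dingTextPayload returns the JSON string for a text message with the given
// content, with quotes and backslashes escaped by the encoder.
func dingTextPayload(content string) (string, error) {
	b, err := json.Marshal(dingMessage{MsgType: "text", Text: dingText{Content: content}})
	return string(b), err
}

func main() {
	payload, err := dingTextPayload(`disk "sda" is full`)
	fmt.Println(payload, err)
}
```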
Through its lifecycle each alert has a context object attached to it. The results from the search step are included in it, as well as other data. Here is a description of the available data in the context, as well as how to use it.
NOTE THAT THE CONTEXT IS READ-ONLY IN ALL CASES
{
    Name      string // The alert's name
    StartedTS uint64 // The timestamp the alert started at
    // The following are filled in by the search step
    TookMS      uint64      // Time search took to complete, in milliseconds
    HitCount    interface{} // The total number of documents matched
    HitMaxScore float64     // The maximum score of all the documents matched
    // Array of actual documents matched. Keep in mind that unless you manually
    // define a limit in your search query this will be capped at 10 by
    // elasticsearch. Usually HitCount is the important data point anyway
    Hits []{
        Index  string  // The index the hit came from
        Type   string  // The type the document is
        ID     string  // The unique id of the document
        Score  float64 // The document's score relative to the query
        Source object  // The actual document
    }
    // If an aggregation was defined in the search query, the results will be
    // set here
    Aggregations object
}
Within lua scripts the context is made available as a global variable called ctx. Fields on it are directly addressable using the above names, for example ctx.HitCount and ctx.Hits[1].ID.
In some areas go templates, provided by the text/template package, are used to add some dynamic capabilities to otherwise static configuration fields. In these places the context is made available as the root object. For example, {{.HitCount}}.
In addition to the fields defined above, the root template object also has some methods on it which may be helpful for working with dates. All methods defined on go's time.Time object are available. For example, to build the filebeat index name for the current day:
filebeat-{{.Format "2006.01.02"}}
And to do the same, but for yesterday:
filebeat-{{(.AddDate 0 0 -1).Format "2006.01.02"}}