updated readme, almost done. Need to figure out galaxy integration (again)
DrPsychick committed Sep 7, 2018
1 parent cfd9b68 commit 6a120c0
Showing 7 changed files with 92 additions and 49 deletions.
10 changes: 5 additions & 5 deletions .travis.yml
@@ -1,5 +1,5 @@
---
services:
- docker
language: python
python: "2.7"
@@ -20,7 +20,7 @@ before_install:
install:
# Install ansible and jq.
- "pip install ansible jq"

# Add ansible.cfg to pick up roles path.
- printf "[defaults]\nroles_path = ../" > ansible.cfg

@@ -35,16 +35,16 @@ script:

# Run the role/playbook again, checking to make sure it's idempotent.
- >
ansible-playbook -i tests/inventory tests/test.yml --connection=local --sudo
| grep -q 'changed=0.*failed=0'
&& (echo 'Idempotence test: pass' && exit 0)
|| (echo 'Idempotence test: fail' && exit 1);
# TEST = migration
- >
if [ "$TEST" = "migration" ]; then
if [ "$TEST" = "migration" ]; then
curl -s http://localhost:8086/query?db=test_agg --data-urlencode "q=SELECT * FROM test_agg.rp_7d.test";
result=$(curl -s http://localhost:8086/query?db=test_agg --data-urlencode "q=SELECT MEAN(value) FROM test_agg.rp_7d.test" | jq .results[0].series[0].values[0][1]);
[ "$result" -ne 35 ] && (echo "Aggregation test failed: '$result' != 35"; exit 1) || (echo "Aggregation test: pass"; exit 0);
fi
99 changes: 74 additions & 25 deletions README.md
@@ -3,45 +3,94 @@
Configure InfluxDB for downsampling
===================================

Motivation:
InfluxDB uses a default retention policy that keeps data **forever** in 7-day shards - in RAW format (data points every 10 or 30 seconds, depending on your input configuration).
Of course this is a good default, but once you have old data and want to introduce downsampling without losing data, it's a **lot** of manual work to set up all the queries etc.

So ... I have done this for you!
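To give you an idea of that manual work: for every database you would otherwise hand-write retention policies and continuous queries like the following. This is a minimal InfluxQL sketch with made-up database, policy and measurement names, but it is roughly the kind of statements the role issues for you:

```
-- keep 14 days of raw data in the default retention policy
CREATE DATABASE "telegraf" WITH DURATION 14d NAME "rp_14d"

-- keep 5-minute aggregates for a year in a second retention policy
CREATE RETENTION POLICY "rp_1y" ON "telegraf" DURATION 52w REPLICATION 1

-- continuously downsample new raw data into the coarser policy
CREATE CONTINUOUS QUERY "cq_cpu_5m" ON "telegraf" BEGIN
  SELECT mean("usage_user") AS "usage_user"
  INTO "telegraf"."rp_1y"."cpu"
  FROM "telegraf"."rp_14d"."cpu"
  GROUP BY time(5m), *
END
```

Now imagine writing that for every measurement and every aggregation level - and backfilling the old data on top.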

Two usage scenarios:
* You already have an InfluxDB running and it's getting BIG, so you want to introduce downsampling on-the-fly to make things faster and cheaper
* You intend to use InfluxDB and want to set it up with downsampling in mind (so it does not grow big over time in the first place)

Honestly, the two use cases are not much different. The biggest difference is the time it takes to run through the playbook when you enable backfilling. Of course, if you work on existing data, don't forget to have a proper backup!

Preparation
-----------
As preparation you don't need much, except knowing exactly how you want to downsample your data, as you need to set up your configuration first.

Setup
-----

The easiest setup is to create a role in your own repository and add this:
* Decide on the name of the setup, let's call the role "" and the setup "frank"
* *hint:* you can have any number of setups configured in this role. You just always have to load **your** role first (defining the setup) and then **DrPsychick.ansible-influx-downsampling** for each setup.

`tasks/main.yml`
```
---
- name: "Include definition from influxdb_{{vars_name}}.yml"
  include_vars: influxdb_{{vars_name}}.yml
  when: vars_name is defined
```

`vars/influxdb_frank.yml`
Take one from the [examples/](examples/) directory as a base for your own.
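For orientation, a setup file roughly follows this shape. This is a hypothetical minimal sketch; key names like `retention_policy`, `backfill` and `measurements` are taken from `defaults/main.yml` and the tasks - check the examples for the full, authoritative structure:

```
---
ansible_influx_databases:
  telegraf_14d:
    name: "telegraf_14d"
    retention_policy: { amount: 14, unit: "d" }
    source: { name: "telegraf" }
    backfill: { step: 1 }
    measurements: { cpu }
```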

Now in your playbook, include both roles:
```
- name: InfluxDB
  hosts: localhost
  roles:
    - { role: , vars_name: "frank" }
    - { role: DrPsychick.ansible-influx-downsampling }
```
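If you maintain several setups, repeat the pair of roles per setup, e.g. ("sandra" being a second made-up setup name, with the same placeholder for your own role name as above):

```
- name: InfluxDB
  hosts: localhost
  roles:
    - { role: , vars_name: "frank" }
    - { role: DrPsychick.ansible-influx-downsampling }
    - { role: , vars_name: "sandra" }
    - { role: DrPsychick.ansible-influx-downsampling }
```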


Attention
=========
If you enable **backfill**:
* Check the size of your data first. Depending on the number of series in a measurement, you need to configure the time range for backfilling. A good default is "1d".
* Timeouts: your InfluxDB as well as the calls in this playbook may time out! Or you may hit other limits in influxdb.conf.

My settings for backfilling 9 GB of data over 5 aggregation levels on a Docker container with 3 GB of RAM (no CPU limit for backfilling):
* `ansible_influx_databases`, 5 levels: 14d@1m, 30d@5m, 90d@15m, 1y@1h, 3y@3h
* `ansible_influx_timeout`: 600 (10 minutes)
* influxdb.conf: `query-timeout="600s", max-select-point=200000000, max-select-series=1000000, log-queries-after="10s"`
* duration:

My full setup can be found in [examples/full-5level-backfill-compact/](examples/full-5level-backfill-compact/)
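For reference, the query limits quoted above belong in the `[coordinator]` section of `influxdb.conf` (InfluxDB 1.x; these are my values, tune them to your data volume):

```
[coordinator]
  query-timeout = "600s"
  log-queries-after = "10s"
  max-select-point = 200000000
  max-select-series = 1000000
```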

History
=======

Version 0.3:

* [ ] full readme -> docs
* [ ] multiple examples -> docs/example
* [ ] more tests:
  * [ ] run backfill without CQ and switch RP on existing data (compact/evict old data)
  * [ ] run backfill without CQ during operation (configurable timing of input) and switch RP
* [ ] howto switch retention policy (cleanup after all is setup)
  * [ ] Case: copy from "autogen", no CQ, drop source after backfill + set default RP -> see test
* [ ] shift RPs by "spread" seconds: 60+/-5sec EVERY 5m+-1s,2s,3s,... + step in seconds
* [ ] add RP shard duration option

Version 0.2:

* [ ] Update description + basic readme
* [ ] Check variables upfront (define clear dependencies) and print useful error messages before acting
* [x] fix: continuous_query is required even if empty (bad usability)
* [ ] more tests:
  * [x] test parallel tests
  * [x] prepare seeding (generator or file?)
  * [x] run downsampling + backfill on existing DB (needs seed)
  * [ ] run backfill with step X (on RP with 7d)
* [x] set RP default yes/no
* [x] improve/extend dict structure (BC break!)
* [x] update continuous queries (drop+create)
* [x] stats (total data points written per DB / average downsampling ratio)
* [x] support selective group by in backfill and continuous query

Version 0.1:

4 changes: 2 additions & 2 deletions defaults/main.yml
@@ -17,11 +17,11 @@ ansible_influx_databases:
measurements: { cpu }

# Predefined set of queries for standard telegraf inputs
# You can selectively overwrite these with the variable "my_ansible_influx_queries"
# Use the same structure as below.

# Attention!
# columns have to be named explicitly, otherwise influxdb will prepend the aggregation
# method name (e.g. mean(usage_user) -> mean_usage_user)
# see https://github.com/influxdata/influxdb/issues/7332
ansible_influx_queries:
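To illustrate the naming gotcha from the comment above with a hypothetical query: without an explicit alias, the target field of `mean(usage_user)` becomes `mean_usage_user`, so every column is aliased:

```
SELECT mean("usage_user") AS "usage_user" INTO "rp_1y"."cpu" FROM "cpu" GROUP BY time(5m)
```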
2 changes: 0 additions & 2 deletions examples/basic.yml
@@ -14,5 +14,3 @@ ansible_influx_databases:
# cq_resample: only useful when doing cq
# backfill: only needed when doing cq
# measurements: only needed when doing cq


1 change: 0 additions & 1 deletion tasks/influxdb_database.yml
@@ -120,4 +120,3 @@
- name: '{{db_prefix}} Average series downsampling'
debug: msg="Average series downsampling = {{ (mm_downsampling_totals|map('float')|sum(start=0) / mm_downsampling_totals|length) |round(2) }} %"
when: ifx_backfill and mm_downsampling_totals|length > 0

22 changes: 11 additions & 11 deletions tasks/influxdb_measurement.yml
@@ -34,9 +34,9 @@
when: ifx_db.source is defined

- name: '{{mm_prefix}} Get SOURCE measurement count(*)'
uri:
url: "{{ansible_influx_url}}/query?db={{ifx_db.source.name}}"
method: POST
body: "q=SELECT COUNT(*) FROM {{source_mm}} WHERE time >= now() - {{ifx_db.retention_policy.amount+ifx_db.retention_policy.unit}}"
return_content: yes
register: ansible_influx_mm_count
Expand All @@ -50,7 +50,7 @@

- name: '{{mm_prefix}} Count on SOURCE'
debug: msg="Max count on SOURCE {{measurement}} = {{mm_count_source}}"
when: mm_count_source|int > 0

- name: '{{mm_prefix}} Create list of fields'
set_fact:
@@ -67,7 +67,7 @@
# now() - rp.amount+rp.unit - x*bf.step
body: >
q={{cq_select}} INTO {{target_mm}} FROM {{source_mm}}
WHERE time >= now() - {{seq}}{{ifx_db.retention_policy.unit}}
AND time < now() - {{seq|int - ifx_db.backfill.step|default(1)|int}}{{ifx_db.retention_policy.unit}}
{{bf_where}} GROUP BY time({{cq_interval}}),{{cq_groupby|join(',')}}
return_content: yes
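To make the window arithmetic above concrete (hypothetical names and values): with `retention_policy: { amount: 14, unit: d }` and `backfill.step: 1`, the iteration with `seq=14` renders roughly this query, backfilling exactly one day:

```
SELECT mean("usage_user") AS "usage_user" INTO "telegraf_14d"."rp_14d"."cpu" FROM "telegraf"."autogen"."cpu"
WHERE time >= now() - 14d AND time < now() - 13d
GROUP BY time(1m), *
```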
@@ -84,9 +84,9 @@

- name: '{{mm_prefix}} Print result from backfill'
debug: var=ansible_influx_mm_backfill
when: (ansible_influx_mm_backfill is succeeded
and ansible_influx_mm_backfill is not changed
and (ansible_influx_mm_backfill.results|map(attribute='skipped')|flatten|default([])|unique != [ true ]))
or ansible_influx_mm_backfill is failed

- name: '{{mm_prefix}} Sum up written data points'
@@ -99,9 +99,9 @@
when: mm_backfill and ansible_influx_mm_backfill is changed

- name: '{{mm_prefix}} Drop continuous query {{ifx_cq_name}}'
uri:
url: "{{ansible_influx_url}}/query"
method: POST
body: 'q=DROP CONTINUOUS QUERY "{{ifx_cq_name}}" ON "{{ifx_db.name}}"'
when: ifx_db.source is defined and ifx_cq_name in ifx_cqs

3 changes: 0 additions & 3 deletions tasks/main.yml
@@ -24,12 +24,9 @@
#- debug: var=ansible_influx_cqs
- set_fact:
ifx_cqs: "{{ansible_influx_cqs.json.results[0].series |rejectattr('values', 'callable') |map(attribute='values') |flatten |select('match', '^(?!CREATE).*') |list if ansible_influx_cqs.json.results[0].series is defined else []}}"

#- debug: var=ifx_cqs

- name: Setup databases
include_tasks: influxdb_database.yml database={{db_item}}
with_items: "{{ansible_influx_databases|sort}}"
loop_control: { loop_var: db_item }

