The ovis-hpc/ldms-containers git repository contains recipes and scripts for
building Docker images of the various LDMS components, namely:
- ovishpc/ldms-dev: an image containing dependencies for building OVIS binaries
  and developing LDMS plugins.
- ovishpc/ldms-samp: an image containing the ldmsd binary and sampler plugins.
- ovishpc/ldms-agg: an image containing the ldmsd binary, sampler plugins, and
  storage plugins (including SOS).
- ovishpc/ldms-maestro: an image containing maestro and etcd.
- ovishpc/ldms-ui: an image containing UI back-end elements, providing LDMS data
  access over HTTP (uwsgi + django + ovis-hpc/numsos + ovis-hpc/sosdb-ui +
  ovis-hpc/sosdb-grafana).
- ovishpc/ldms-grafana: an image containing grafana and the SOS data source
  plugin for grafana (sosds).
Table of Contents:
- Brief Overview About Docker Containers
- Sites WITHOUT internet access
- SYNOPSIS
- EXAMPLES
- LDMS Sampler Container
- LDMS Aggregator Container
- Maestro Container
- LDMS UI Back-End Container
- LDMS-Grafana Container
- SSH port forwarding to grafana
- Building Containers
A docker container is a runnable instance of an image. In Linux, it is
implemented using namespaces (namespaces(7)). The docker create command creates
a container that can later be started with docker start, while docker run
creates and starts the container in one go.
When a container starts, the first process being run, or the root process, is
the program specified by the --entrypoint CLI option or the ENTRYPOINT
Dockerfile directive. When the root process exits or is killed, the container
status becomes exited. The docker stop command sends SIGTERM to the root
process, and the docker kill command sends SIGKILL to the root process. The
other processes in the container are also terminated or killed when the root
process is terminated or killed. docker ps shows "running" containers, while
docker ps -a shows ALL containers (including exited ones).
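For illustration, a minimal container lifecycle could look like the following
(a sketch; "ubuntu" is just a placeholder image):
$ docker create --name=demo ubuntu sleep 600   # create only (root process: sleep)
$ docker start demo                            # start the root process
$ docker ps                                    # "demo" is listed as running
$ docker stop demo                             # sends SIGTERM to the root process
$ docker ps -a                                 # "demo" is now listed as exited
$ docker rm demo                               # remove the container (and its writable layer)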
When a container is created (before it is started), its mount namespace
(mount_namespaces(7)) is prepared by the Docker engine. This isolates the
container's filesystem from the host. The Docker image is the basis of the
filesystem mounted in the container. The image itself is read-only;
modifications to files/directories inside the container at runtime are made on
a writable layer on top of the image. The layers are "unified" and presented to
the container as a single filesystem by OverlayFS (the driver preferred by
Docker, but other drivers like btrfs could also be used). A Docker image is
actually a collection of "layers" of root directories (/). When a container is
stopped (the root process exited or was killed), the writable top layer still
persists until the docker rm command removes the container.
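The writable layer can be observed with standard Docker commands; a small
sketch (again, "ubuntu" is just a placeholder image):
$ docker run --name=w ubuntu touch /hello.txt  # the root process creates a file, then exits
$ docker diff w                                # lists /hello.txt as added in the writable layer
$ docker rm w                                  # discards the container and its writable layer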
The network namespace (network_namespaces(7)) and the process namespace
(pid_namespaces(7)) of a container are normally isolated, but they can also use
the host's namespaces. The LDMS sampler containers (ovishpc/ldms-samp) require
the host process namespace (--pid=host option) so that ldmsd reads the host's
/proc data. Otherwise, we would be collecting the container's metric data.
Other LDMS containers do not need the host process namespace. For the network
namespace, it is advisable to use the host's network namespace
(--network=host) to fully utilize RDMA hardware on the host with minimal effort
in network configuration.
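To see the effect of --pid=host, one can compare the number of PIDs visible in
/proc with and without the option. A sketch, relying on the fact that the
ovishpc/ldms-dev entrypoint passes its arguments through to /bin/bash:
$ docker run --rm ovishpc/ldms-dev -c 'ls -d /proc/[0-9]* | wc -l'            # container PID namespace: only a few
$ docker run --rm --pid=host ovishpc/ldms-dev -c 'ls -d /proc/[0-9]* | wc -l' # host PID namespace: all host processes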
- On your laptop (or a machine that HAS Internet access)
$ docker pull ovishpc/ldms-dev
$ docker pull ovishpc/ldms-samp
$ docker pull ovishpc/ldms-agg
$ docker pull ovishpc/ldms-maestro
$ docker pull ovishpc/ldms-ui
$ docker pull ovishpc/ldms-grafana
$ docker save ovishpc/ldms-dev > ovishpc-ldms-dev.tar
$ docker save ovishpc/ldms-samp > ovishpc-ldms-samp.tar
$ docker save ovishpc/ldms-agg > ovishpc-ldms-agg.tar
$ docker save ovishpc/ldms-maestro > ovishpc-ldms-maestro.tar
$ docker save ovishpc/ldms-ui > ovishpc-ldms-ui.tar
$ docker save ovishpc/ldms-grafana > ovishpc-ldms-grafana.tar
# Then, copy these tar files to the site
- On the site that has NO Internet access
$ docker load < ovishpc-ldms-dev.tar
$ docker load < ovishpc-ldms-samp.tar
$ docker load < ovishpc-ldms-agg.tar
$ docker load < ovishpc-ldms-maestro.tar
$ docker load < ovishpc-ldms-ui.tar
$ docker load < ovishpc-ldms-grafana.tar
Then, the images are available locally (no need to docker pull).
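To confirm that the images were imported, one can list them (a quick, hedged
check):
$ docker images | grep ovishpc   # the six ovishpc/ldms-* images should be listed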
In this section, the options in [ ] are optional. Please see the # comments
right after the options for their descriptions. Please also note that the
options BEFORE the Docker image name are for docker run, and the options AFTER
the image name are for the entrypoint script. The entrypoint options for each
image are handled as follows:
- ovishpc/ldms-dev: entrypoint options are passed through to /bin/bash.
- ovishpc/ldms-samp: entrypoint options are passed through to ldmsd.
- ovishpc/ldms-agg: entrypoint options are passed through to ldmsd.
- ovishpc/ldms-maestro: entrypoint options are ignored.
- ovishpc/ldms-ui: entrypoint options are passed through to uwsgi.
- ovishpc/ldms-grafana: entrypoint options are passed through to the
  grafana-server program.
# Pulling images
$ docker pull ovishpc/ldms-dev
$ docker pull ovishpc/ldms-samp
$ docker pull ovishpc/ldms-agg
$ docker pull ovishpc/ldms-maestro
$ docker pull ovishpc/ldms-ui
$ docker pull ovishpc/ldms-grafana
# munge remark: the munge.key file must be owned by 101:101 (which is
# munge:munge in the container) and have mode 0600.
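# For example, the key could be prepared on the host as follows (the path is
# illustrative; use the file you actually intend to bind-mount):
$ chown 101:101 /on-host/munge.key && chmod 600 /on-host/munge.key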
# ovishpc/ldms-maestro
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
-v /on-host/ldms_cfg.yaml:/etc/ldms_cfg.yaml:ro # bind ldms_cfg.yaml, used by maestro_ctrl
ovishpc/ldms-maestro # the image name
# ovishpc/ldms-samp
$ docker run -d --name=<CONTAINER_NAME> --network=host --pid=host --privileged
-e COMPID=<NUMBER> # set COMPID environment variable
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
ovishpc/ldms-samp # the image name
-x <XPRT>:<PORT> # transport, listening port
[ -a munge ] # use munge authentication
[ OTHER LDMSD OPTIONS ]
# ovishpc/ldms-agg
$ docker run -d --name=<CONTAINER_NAME> --network=host --pid=host --privileged
-e COMPID=<NUMBER> # set COMPID environment variable
[ -v /on-host/storage:/storage:rw ] # bind 'storage/'. Could be any path, depending on ldmsd configuration
[ -v /on-host/dsosd.json:/etc/dsosd.json:ro ] # bind dsosd.json configuration, if using dsosd to export SOS data
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
ovishpc/ldms-agg # the image name
-x <XPRT>:<PORT> # transport, listening port
[ -a munge ] # use munge authentication
[ OTHER LDMSD OPTIONS ]
# Run dsosd to export SOS data
$ docker exec -it <CONTAINER_NAME> /bin/bash
(<CONTAINER_NAME>) $ rpcbind
(<CONTAINER_NAME>) $ export DSOSD_DIRECTORY=/etc/dsosd.json
(<CONTAINER_NAME>) $ dsosd >/var/log/dsosd.log 2>&1 &
(<CONTAINER_NAME>) $ exit
# ovishpc/ldms-ui
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
-v /on-host/dsosd.conf:/opt/ovis/etc/dsosd.conf # dsosd.conf file, required to connect to dsosd
-v /on-host/settings.py:/opt/ovis/ui/sosgui/settings.py # sosdb-ui Django setting file
ovishpc/ldms-ui # the image name
[ --http-socket=<ADDR>:<PORT> ] # addr:port to serve, ":80" by default
[ OTHER uWSGI OPTIONS ]
# ovishpc/ldms-grafana
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
[ -v /on-host/grafana.ini:/etc/grafana/grafana.ini:ro ] # custom grafana config
[ -e GF_SERVER_HTTP_ADDR=<ADDR> ] # env var to override Grafana IP address binding (default: all addresses)
[ -e GF_SERVER_HTTP_PORT=<PORT> ] # env var to override Grafana port binding (default: 3000)
ovishpc/ldms-grafana # the image name
[ OTHER GRAFANA-SERVER OPTIONS ] # other options to grafana-server
# -------------------------------------
# configuration files summary
# -------------------------------------
# - /on-host/dsosd.json: contains a dictionary mapping hostnames to SOS
#   container locations on the host, e.g.
#   {
#     "host1": {
#       "dsos_cont":"/storage/cont_host1"
#     },
#     "host2": {
#       "dsos_cont":"/storage/cont_host2"
#     }
#   }
#
# - /on-host/dsosd.conf: contains host names (one per line) of the dsosd, e.g.
# host1
# host2
#
# - /on-host/settings.py: Django settings. Pay attention to DSOS_ROOT and
# DSOS_CONF variables.
In this example, we have an 8-node cluster with host names cygnus-01 to
cygnus-08. cygnus-0[1-4] are used as compute nodes (deploying ovishpc/ldms-samp
containers). cygnus-0[5-6] are used as L1 aggregators (ovishpc/ldms-agg
containers without storage). cygnus-07 is used as the L2 aggregator with DSOS
storage (ovishpc/ldms-agg with dsosd). cygnus-07 will also host the
ovishpc/ldms-maestro, ovishpc/ldms-ui and ovishpc/ldms-grafana containers. We
will be running commands from cygnus-07. The cluster has munged pre-configured
and running on all nodes with the same key.
Configuration files used in this example are listed at the end of the section.
The following is the list of commands that deploy the various containers on the
cygnus cluster:
# Start sampler containers on cygnus-01,02,03,04
root@cygnus-07 $ pdsh -w cygnus-0[1-4] 'docker run -d --name=samp --network=host --pid=host --privileged -v /run/munge:/run/munge:ro -e COMPONENT_ID=${HOSTNAME#cygnus-0} ovishpc/ldms-samp -x rdma:411 -a munge'
# Notice the COMPONENT_ID environment variable setup using Bash substitution.
# The COMPONENT_ID environment variable is later used in LDMSD sampler plugin
# configuration `component_id: ${COMPONENT_ID}` in the `ldms_cfg.yaml` file.
# Start L1 aggregator containers on cygnus-05,06
root@cygnus-07 $ pdsh -w cygnus-0[5-6] docker run -d --name=agg1 --network=host --pid=host --privileged -v /run/munge:/run/munge:ro ovishpc/ldms-agg -x rdma:411 -a munge
# Start L2 aggregator container on cygnus-07
root@cygnus-07 $ docker run -d --name=agg2 --network=host --pid=host --privileged -v /run/munge:/run/munge:ro -v /store:/store:rw ovishpc/ldms-agg -x rdma:411 -a munge
# Start dsosd in the `agg2`, our L2 aggregator container
root@cygnus-07 $ echo 'rpcbind ; dsosd > /var/log/dsosd.log 2>&1 &' | docker exec -i agg2 /bin/bash
# Start maestro container on cygnus-07
root@cygnus-07 $ docker run -d --name=maestro --network=host --privileged -v /run/munge:/run/munge:ro -v ${PWD}/ldms_cfg.yaml:/etc/ldms_cfg.yaml:ro ovishpc/ldms-maestro
# Start Django UI container
root@cygnus-07 $ docker run -d --name=ui --network=host --privileged -v ${PWD}/dsosd.conf:/opt/ovis/etc/dsosd.conf -v ${PWD}/settings.py:/opt/ovis/ui/sosgui/settings.py ovishpc/ldms-ui
# Start Grafana container
root@cygnus-07 $ docker run -d --name=grafana --privileged --network=host ovishpc/ldms-grafana
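Once the containers are up, the data flow can be sanity-checked with ldms_ls
from the L2 aggregator container; a hedged example, assuming agg2 listens on
the cygnus-07-iw RDMA interface as configured above:
# List the sets aggregated at the L2 aggregator
root@cygnus-07 $ docker exec agg2 ldms_ls -x rdma -p 411 -h cygnus-07-iw -a munge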
Related configuration files
# dsosd.conf
cygnus-07
# ldms_cfg.yaml
xprt: &xprt "rdma"
daemons:
  - names : &samp-names "samp-[1-4]"
    hosts : &samp-hosts "cygnus-0[1-4]-iw"
    endpoints :
      - names : &samp-eps "cygnus-0[1-4]-iw-ep"
        ports : 411
        xprt : *xprt
        maestro_comm : True
        auth :
          name : munge
          plugin : munge
  - names : &L1-names "agg-[11-12]"
    hosts : &L1-hosts "cygnus-0[5-6]-iw"
    endpoints :
      - names : &L1-eps "agg-[11-12]-ep"
        ports : 411
        xprt : *xprt
        maestro_comm : True
        auth :
          name : munge
          plugin : munge
  - names : &L2-name "agg-2"
    hosts : &L2-host "cygnus-07-iw"
    endpoints :
      - names : &L2-ep "agg-2-ep"
        ports : 411
        xprt : *xprt
        maestro_comm : True
        auth :
          name : munge
          plugin : munge
aggregators:
  - daemons : *L1-names
    peers :
      - daemons : *samp-names
        endpoints : *samp-eps
        reconnect : 1s
        type : active
        updaters :
          - mode : pull
            interval : "1.0s"
            offset : "200ms"
            sets :
              - regex : .*
                field : inst
  - daemons : *L2-name
    peers:
      - daemons : *L1-names
        endpoints : *L1-eps
        reconnect : 1s
        type : active
        updaters :
          - mode : pull
            interval : "1.0s"
            offset : "400ms"
            sets :
              - regex : .*
                field : inst
samplers:
  - daemons : *samp-names
    plugins :
      - name : meminfo # Variables can be specific to plugin
        interval : "1s" # Used when starting the sampler plugin
        offset : "0s"
        config : &simple_samp_config
          component_id : "${COMPONENT_ID}"
          perm : "0777"
stores:
  - name : sos-meminfo
    daemons : *L2-name
    container : meminfo
    schema : meminfo
    flush : 10s
    plugin :
      name : store_sos
      config :
        path : /store
# settings.py
"""
Django settings for sosgui project.
Generated by 'django-admin startproject' using Django 1.8.2.
For more information on this file, see
https://docs.djangoproject.com/en/1.8/topics/settings/
For the full list of settings and their values, see
https://docs.djangoproject.com/en/1.8/ref/settings/
"""
# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
import os
import json
log = open('/var/log/sosgui/settings.log', 'a')
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/1.8/howto/deployment/checklist/
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'blablablablablablablablablablablablablablablablablabla'
# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True
ALLOWED_HOSTS = [
    '*',
]
APPEND_SLASH = False
STATIC_ROOT = os.path.join(BASE_DIR, "assets")
AUTH_USER_MODEL = 'sosdb_auth.SosdbUser'
# Application definition
INSTALLED_APPS = (
    'corsheaders',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'container',
    'jobs',
    'objbrowser',
    'sos_db',
    'sosdb_auth',
)

try:
    from . import ldms_settings
    INSTALLED_APPS = INSTALLED_APPS + ldms_settings.INSTALLED_APPS
except:
    pass

try:
    from . import grafana_settings
    INSTALLED_APPS = INSTALLED_APPS + grafana_settings.INSTALLED_APPS
except:
    pass

try:
    from . import baler_settings
    INSTALLED_APPS = INSTALLED_APPS + baler_settings.INSTALLED_APPS
except:
    pass
MIDDLEWARE = (
    'corsheaders.middleware.CorsMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
    'django.middleware.security.SecurityMiddleware',
)

ROOT_URLCONF = 'sosgui.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [
            '/opt/ovis/ui/templates',
        ],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.contrib.auth.context_processors.auth',
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
WSGI_APPLICATION = 'sosgui.wsgi.application'
# Database
# https://docs.djangoproject.com/en/1.8/ref/settings/#databases
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'UTC'
USE_I18N = True
USE_L10N = True
USE_TZ = True
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.8/howto/static-files/
STATIC_URL = '/static/'
STATICFILES_DIRS = [
    '/opt/ovis/ui/static/',
]
SESSION_EXPIRE_AT_BROWSER_CLOSE = True
SOS_ROOT = "/store/"
DSOS_ROOT = "/store/"
DSOS_CONF = "/opt/ovis/etc/dsosd.conf"
LOG_FILE = "/var/log/sosgui/sosgui.log"
LOG_DATE_FMT = "%F %T"
ODS_LOG_FILE = "/var/log/sosgui/ods.log"
ODS_LOG_MASK = "255"
ODS_GC_TIMEOUT = 10
BSTORE_PLUGIN="bstore_sos"
os.environ.setdefault("BSTORE_PLUGIN_PATH", "/opt/ovis/lib64")
os.environ.setdefault("SET_POS_KEEP_TIME", "3600")
try:
    import ldms_cfg
    LDMS_CFG = ldms_cfg.aggregators
except Exception as e:
    log.write(repr(e)+'\n')
    LDMS_CFG = { "aggregators" : [] }

try:
    import syslog
    SYSLOG_CFG = syslog.syslog
except Exception as e:
    log.write('SYSLOG_SETTINGS ERR '+repr(e)+'\n')
    SYSLOG_CFG = { "stores" : [] }
# SYNOPSIS
$ docker run -d --name=<CONTAINER_NAME> --network=host --pid=host --privileged
-e COMPID=<NUMBER> # set COMPID environment variable
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
ovishpc/ldms-samp # the image name
-x <XPRT>:<PORT> # transport, listening port
[ -a munge ] # use munge authentication
[ OTHER LDMSD OPTIONS ] # e.g. -v INFO
The ovishpc/ldms-samp entrypoint executes ldmsd -F, making it the leader
process of the container. Users can append [OPTIONS], and they will be passed
to the ldmsd -F CLI. If -a munge is given, the entrypoint script checks whether
/run/munge is a bind-mount from the host. If so, munge encoding/decoding is
done through munged on the host via the bind-mounted /run/munge -- there is no
need to run munged inside the container. Otherwise, if -a munge is given and
/run/munge is not bind-mounted from the host, the entrypoint script runs munged
and tests it BEFORE starting ldmsd.
Usage examples:
## On a compute node
# Pull the container image
$ docker pull ovishpc/ldms-samp
# Start ldmsd container, with host network namespace and host PID namespace;
# - COMPID env var is HOSTNAME without the non-numeric prefixes and the leading
# zeroes (e.g. nid00100 => 100, nid10000 => 10000). Note that this uses
# bash(1) Parameter Expansion and Pattern Matching features.
#
# - serving on socket transport port 411 with munge authentication
#
# - using host munge
$ docker run -d --name=samp --network=host --pid=host --privileged \
-e COMPID=${HOSTNAME##*([^1-9])} \
-v /run/munge:/run/munge:ro \
ovishpc/ldms-samp -x sock:411 -a munge
We encourage using maestro to configure a cluster of ldmsd. However, if there
is a need to configure an ldmsd manually, one can do so from within the
container. In this case:
$ docker exec -it samp /bin/bash
(samp) $ ldmsd_controller --xprt sock --port 411 --host localhost --auth munge
LDMSD_CONTROLLER_PROMPT>
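As an illustration only (this is not done by the image automatically), a
meminfo sampler could be loaded and started at this prompt with the standard
ldmsd_controller commands; the placeholder values follow the document's <...>
convention:
LDMSD_CONTROLLER_PROMPT> load name=meminfo
LDMSD_CONTROLLER_PROMPT> config name=meminfo producer=<NAME> instance=<NAME>/meminfo component_id=<COMPID>
LDMSD_CONTROLLER_PROMPT> start name=meminfo interval=1000000 offset=0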
# SYNOPSIS
$ docker run -d --name=<CONTAINER_NAME> --network=host --pid=host --privileged
-e COMPID=<NUMBER> # set COMPID environment variable
[ -v /on-host/storage:/storage:rw ] # bind 'storage/'. Could be any path, depending on ldmsd configuration
[ -v /on-host/dsosd.json:/etc/dsosd.json:ro ] # bind dsosd.json configuration, if using dsosd to export SOS data
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
ovishpc/ldms-agg # the image name
-x <XPRT>:<PORT> # transport, listening port
[ -a munge ] # use munge authentication
[ OTHER LDMSD OPTIONS ]
# dsosd to export SOS data
$ docker exec -it <CONTAINER_NAME> /bin/bash
(<CONTAINER_NAME>) $ rpcbind
(<CONTAINER_NAME>) $ export DSOSD_DIRECTORY=/etc/dsosd.json
(<CONTAINER_NAME>) $ dsosd >/var/log/dsosd.log 2>&1 &
(<CONTAINER_NAME>) $ exit
The ovishpc/ldms-agg entrypoint executes ldmsd -F, making it the leader process
of the container. It also handles -a munge the same way that ovishpc/ldms-samp
does. In the case of exporting SOS data through dsosd, the dsosd daemon has to
be started after the container is up (see the example below).
Example usage:
## On a service node
# Pull the container image
$ docker pull ovishpc/ldms-agg
# Start ldmsd container, using host network namespace;
# - with host munge
# - serving port 411
# - The `-v /on-host/storage:/storage:rw` option maps the on-host storage
#   location `/on-host/storage` to the `/storage` location in the container.
#   Data written to `/storage/` in the container will persist in
#   `/on-host/storage/` on the host.
$ docker run -d --name=agg --network=host --privileged \
-v /run/munge:/run/munge:ro \
-v /on-host/storage:/storage:rw \
ovishpc/ldms-agg -x sock:411 -a munge
# Start the dsosd service for remote SOS container access (e.g. by the UI), by
# first bringing up a shell inside the container, then starting rpcbind and dsosd.
$ docker exec -it agg /bin/bash
(agg) $ rpcbind
(agg) $ export DSOSD_DIRECTORY=/etc/dsosd.json
(agg) $ dsosd >/var/log/dsosd.log 2>&1 &
(agg) $ exit
The dsosd.json file contains a collection of container_name - path mappings for
each host. For example:
{
  "host1": {
    "dsos_cont":"/storage/cont_host1",
    "tmp_cont":"/tmp/ram_cont"
  },
  "host2": {
    "dsos_cont":"/storage/cont_host2",
    "tmp_cont":"/tmp/ram_cont"
  }
}
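Although maestro is the recommended way to configure aggregation, an aggregator
can also be configured by hand from inside the container. A sketch, assuming a
sampler is reachable at <SAMPLER_HOST> on sock:411 and the aggregator itself
listens on sock:411 as in the example above:
$ docker exec -it agg /bin/bash
(agg) $ ldmsd_controller --xprt sock --port 411 --host localhost --auth munge
LDMSD_CONTROLLER_PROMPT> prdcr_add name=samp1 host=<SAMPLER_HOST> xprt=sock port=411 type=active interval=1000000
LDMSD_CONTROLLER_PROMPT> prdcr_start name=samp1
LDMSD_CONTROLLER_PROMPT> updtr_add name=upd interval=1000000 offset=100000
LDMSD_CONTROLLER_PROMPT> updtr_prdcr_add name=upd regex=.*
LDMSD_CONTROLLER_PROMPT> updtr_start name=upd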
# SYNOPSIS
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
[ -v /run/munge:/run/munge:ro ] # expose host's munge to the container
[ -v /on-host/munge.key:/etc/munge/munge.key:ro ] # use container's munged with custom key
-v /on-host/ldms_cfg.yaml:/etc/ldms_cfg.yaml:ro # bind ldms_cfg.yaml, used by maestro_ctrl
ovishpc/ldms-maestro # the image name
ovishpc/ldms-maestro containers run at least two daemons: etcd and maestro.
They may also run munged if the host's munge is not used (i.e.
-v /run/munge:/run/munge:ro is not given to docker run).
The entrypoint script does the following:
- starts etcd
- starts munged if the host's munge is not used
- executes maestro_ctrl with --ldms_config /etc/ldms_cfg.yaml. Notice that the
  ldms_cfg.yaml file is supplied by the user via the -v option.
- executes the maestro process. maestro will periodically connect to all ldmsd
  specified in ldms_cfg.yaml and send them the corresponding configuration.
REMARK: For now, the etcd
and maestro
processes in the
ovishpc/ldms-maestro
container run as stand-alone processes. We will support a
cluster of ovishpc/ldms-maestro
containers in the future.
Example usage:
## On a service node
# Pull the container image
$ docker pull ovishpc/ldms-maestro
# Start maestro container, using host network namespace, and using host's munge
$ docker run -d --network=host --privileged \
-v /run/munge:/run/munge:ro \
-v /my/ldms_cfg.yaml:/etc/ldms_cfg.yaml:rw \
ovishpc/ldms-maestro
Please see ldms_cfg.yaml for an example.
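To verify that the daemons came up, standard Docker commands can be used (a
hedged check; add --name=maestro to the docker run above, or substitute the
auto-generated container name shown by docker ps):
$ docker top maestro          # should list etcd and maestro among the processes
$ docker logs maestro | tail  # entrypoint output, useful if something went wrong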
# SYNOPSIS
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
-v /on-host/dsosd.conf:/opt/ovis/etc/dsosd.conf # dsosd.conf file, required to connect to dsosd
-v /on-host/settings.py:/opt/ovis/ui/sosgui/settings.py # sosdb-ui Django setting file
ovishpc/ldms-ui # the image name
[ --http-socket=<ADDR>:<PORT> ] # addr:port to serve, ":80" by default
[ OTHER uWSGI OPTIONS ]
The ovishpc/ldms-ui entrypoint executes the uwsgi process with the sosgui
(back-end GUI) WSGI application module. It is the only process in the
container. By default, uwsgi in this container listens on port 80; the
--http-socket=<ADDR>:<PORT> option overrides this behavior. Other options given
after the image name on the docker run command line are passed to the uwsgi
command as well.
The sosgui WSGI application requires two configuration files:
- dsosd.conf: contains a list of dsosd hostnames, one per line. See here for an
  example.
- settings.py: contains the WSGI application settings. Please pay attention to
  DSOS_ROOT and DSOS_CONF. See here for an example.
Usage example:
## On a service node
# Pull the container image
$ docker pull ovishpc/ldms-ui
# Start ldms-ui container, using host network namespace
$ docker run -d --name=ui --network=host --privileged \
-v /HOST/dsosd.conf:/opt/ovis/etc/dsosd.conf \
-v /HOST/settings.py:/opt/ovis/ui/sosgui/settings.py \
ovishpc/ldms-ui
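A quick, hedged way to confirm that the back-end is serving (assumes the
default port 80 and that curl is available on the host):
$ curl -sI http://localhost:80/ | head -n 1   # an HTTP status line indicates uwsgi is up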
# SYNOPSIS
$ docker run -d --name=<CONTAINER_NAME> --network=host --privileged
[ -v /on-host/grafana.ini:/etc/grafana/grafana.ini:ro ] # custom grafana config
[ -e GF_SERVER_HTTP_ADDR=<ADDR> ] # env var to override Grafana IP address binding (default: all addresses)
[ -e GF_SERVER_HTTP_PORT=<PORT> ] # env var to override Grafana port binding (default: 3000)
ovishpc/ldms-grafana # the image name
[ OTHER GRAFANA-SERVER OPTIONS ] # other options to grafana-server
ovishpc/ldms-grafana is based on grafana/grafana-oss:9.1.0-ubuntu with the SOS
data source plugin for accessing distributed SOS data. The grafana server
listens on port 3000 by default. The options specified after the image name on
the docker run command line are passed to the grafana-server command.
## On a service node
# Pull the container image
$ docker pull ovishpc/ldms-grafana
# Start ldms-grafana container, this will use port 3000
$ docker run -d --name=grafana --privileged --network=host ovishpc/ldms-grafana
# Use a web browser to navigate to http://HOSTNAME:3000 to access grafana
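If port 3000 is already taken on the host, the listening port can be overridden
with the environment variable shown in the SYNOPSIS; for example (4000 is an
arbitrary choice):
$ docker run -d --name=grafana --privileged --network=host -e GF_SERVER_HTTP_PORT=4000 ovishpc/ldms-grafana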
In the case that the grafana server cannot be accessed directly, use SSH port forwarding as follows:
(laptop) $ ssh -L 127.0.0.1:3000:127.0.0.1:3000 LOGIN_NODE
(LOGIN_NODE) $ ssh -L 127.0.0.1:3000:127.0.0.1:3000 G_HOST
# Assuming that the ldms-grafana container is running on G_HOST.
Then, you should be able to access the grafana web server via
http://127.0.0.1:3000/
on your laptop.
TL;DR: edit config.sh, customize the *_REPO, *_BRANCH and *_OPTIONS variables,
then run ./scripts/build-all.sh.
The following steps describe the build process executed by the
scripts/build-all.sh script:
1. Build the ovishpc/ldms-dev docker image. This "development" image contains
   development programs and libraries for building the /opt/ovis binaries and
   dsosds.
2. Build the /opt/ovis binaries with the scripts/build-ovis-binaries.sh script.
   The environment variables specified in the config.sh file inform the build
   script which repositories or branches to check out and build (see the
   config.sh sketch after this list). The variables, categorized by component,
   are as follows:
   - ovis: the main component of the OVIS project (ldmsd and LDMS python)
     - OVIS_REPO
     - OVIS_BRANCH
   - sos: the Scalable Object Storage technology
     - SOS_REPO
     - SOS_BRANCH
   - maestro: the ldmsd cluster configurator
     - MAESTRO_REPO
     - MAESTRO_BRANCH
   - numsos:
     - NUMSOS_REPO
     - NUMSOS_BRANCH
   - sosdb-ui:
     - SOSDBUI_REPO
     - SOSDBUI_BRANCH
   - sosdb-grafana:
     - SOSDBGRAFANA_REPO
     - SOSDBGRAFANA_BRANCH
   The binaries output directory (absolute, or relative to the top source
   directory) is specified by the OVIS variable in config.sh.
3. Build the dsosds grafana data source plugin for SOS data access with
   scripts/build-dsosds.sh. The following environment variables in config.sh
   determine which repository and branch to check the code out from for
   building dsosds:
   - DSOSDS_REPO
   - DSOSDS_BRANCH
   The dsosds output directory (absolute, or relative to the top source
   directory) is specified by the DSOSDS variable in config.sh.
4. Build the ovishpc/ldms-samp image using the ovis binaries built in step 2.
   The LDMS Sampler Image contains only ldmsd, the sampler plugins and their
   dependencies. The storage plugins are not included.
   - See recipes/ldms-samp/docker-build.sh and recipes/ldms-samp/Dockerfile.
   - Also see OVIS_OPTIONS in config.sh for the build options that
     enable/disable plugins.
5. Build the ovishpc/ldms-agg image using the ovis binaries built in step 2.
   The LDMS Aggregator Image contains SOS, ldmsd and all plugins (both samplers
   and stores).
   - See recipes/ldms-agg/docker-build.sh and recipes/ldms-agg/Dockerfile.
   - Also see OVIS_OPTIONS in config.sh for the build options that
     enable/disable plugins.
6. Build the ovishpc/ldms-maestro image using the maestro binaries from the
   ovis binaries built in step 2. This image also includes etcd, a dependency
   of maestro.
7. Build the ovishpc/ldms-ui image using the UI components from the ovis
   binaries built in step 2 (ovis/ui/). The image includes the uwsgi web server
   that is used to serve the sosdb-ui Django application, providing SOS data
   access over HTTP.
8. Build the ovishpc/ldms-grafana image based on the grafana image and include
   the dsosds grafana data source plugin built in step 3. A container
   instantiated from this image is basically a grafana server with the dsosds
   data source plugin pre-installed.
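A hypothetical config.sh excerpt is sketched below; the repository URLs, branch
names and output directories are placeholders to be adjusted for your own
forks/branches, not prescribed values:
# config.sh (illustrative excerpt -- values are placeholders)
OVIS_REPO=https://github.com/ovis-hpc/ovis
OVIS_BRANCH=OVIS-4
SOS_REPO=https://github.com/ovis-hpc/sos
SOS_BRANCH=master
MAESTRO_REPO=https://github.com/ovis-hpc/maestro
MAESTRO_BRANCH=master
OVIS=ovis        # output directory for the OVIS binaries (step 2)
DSOSDS=dsosds    # output directory for the dsosds plugin (step 3)
# then build everything
$ ./scripts/build-all.sh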
Note that many of the docker-build.sh scripts use tar to create the docker
build context (the set of files/directories for the Docker build process to
ADD) instead of using the working directory that contains the Dockerfile. This
is so that we don't have to copy the selected files from ovis into each of the
Dockerfile directories.
It is also possible to manually run an ovishpc/ldms-dev container, build your
own version of ovis (e.g. with a new plugin), and package a custom
ovishpc/ldms-samp image with recipes/ldms-samp/docker-build.sh, because the
docker-build.sh script uses whatever binaries are available in the ovis
directory.
Happy hacking! :)