Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ~{var} expansion in yaml #45

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

baallan
Copy link
Collaborator

@baallan baallan commented May 20, 2022

After the discussion of supporting ~{} directly in ldms (which
identified the lack of resolution of variables beyond the same
statement as a problem), this patch provides the capability via
tilde-expansion to write most ldms plugin specifications as only two
lines while allowing override for special cases. This patch also lets
us get all the hard-wired yet incorrect (in specific plugin cases) default plugin
settings out of the python layer.

Example:
we can specify most sampler plugins as:

    plugins :
      - <<: *sampler-all-defaults
        name : tx2mon
      - <<: *sampler-all-defaults
        name : lustre_client
      - <<: *sampler-all-defaults
        name : loadavg

while supporting special cases:

     - <<: *sampler-time-defaults
        name : procstat
        config :
          <<: *sampler-config-defaults
          maxcpu : 0
          schema : procstat_0
      - <<: *sampler-time-defaults
        name : sysclassib
        config :
          <<: *sampler-config-defaults
          ports : "mlx5_0.1,mlx5_1.1"
          schema : sysclassib_lserver
      - <<: *sampler-time-defaults
        name : dstat
        config :
          <<: *sampler-config-defaults
          io : 1
          stat : 1
          statm : 1
          fdtypes : 1
          auto-schema : 1

if we define defaults as:

sampler_time_defaults : &sampler-time-defaults
  interval : "1s"
  offset : "0s"

sampler_all_defaults : &sampler-all-defaults
  <<: *sampler-time-defaults
  config : &sampler-config-defaults
    job_set : ${host}/jobid
    component_id: ${COMPONENT_ID}
    schema : ~{name}
    instance : ${host}/~{schema}
    producer : ${host}
    perm : "0644"

Similarly for stores:

stores:
  - <<: *csv-defaults
    schema : jobid
  - <<: *csv-defaults
    schema : lustre_client
  - <<: *csv-defaults
    schema : loadavg
  - <<: *csv-defaults
    schema : tx2mon
  - <<: *csv-defaults
    schema : meminfo

with special cases:

  - <<: *csv-defaults
    schema : meminfo_l1
    metrics:
      - Active
      - MemFree

  - <<: *csv-defaults
    schema : dstat_37
    producers:
      - regex : "wp1.*"
      - regex : "wp2.*"

if we define store defaults:

store_config_defaults: &csv-defaults
  name : csv-~{schema}
  daemons : *agg-name
  container : store_csv
  plugin :
    name : store_csv
    config :
      path : /ldmslocal/store
      altheader : 0
      typeheader : 1
      create_uid : 3031
      create_gid : 3031

After the discussion of supporting ~{} directly in ldms (which
identified the lack of resolution of variables beyond the same
statement as a problem), this patch provides the capability via
tilde-expansion to write most ldms plugin specifications as only two
lines while allowing override for special cases. This patch also lets
us get all the hard-wired yet incorrect (in some cases) default plugin
settings out of the python layer.

Example:
we can specify most sampler plugins as:
    plugins :
      - <<: *sampler-all-defaults
        name : tx2mon
      - <<: *sampler-all-defaults
        name : lustre_client
      - <<: *sampler-all-defaults
        name : loadavg
while supporting special cases:
     - <<: *sampler-time-defaults
        name : procstat
        config :
          <<: *sampler-config-defaults
          maxcpu : 0
          schema : procstat_0
      - <<: *sampler-time-defaults
        name : sysclassib
        config :
          <<: *sampler-config-defaults
          ports : "mlx5_0.1,mlx5_1.1"
          schema : sysclassib_lserver
      - <<: *sampler-time-defaults
        name : dstat
        config :
          <<: *sampler-config-defaults
          io : 1
          stat : 1
          statm : 1
          fdtypes : 1
          auto-schema : 1

if we define defaults as:

sampler_time_defaults : &sampler-time-defaults
  interval : "1s"
  offset : "0s"

sampler_all_defaults : &sampler-all-defaults
  <<: *sampler-time-defaults
  config : &sampler-config-defaults
    job_set : ${host}/jobid
    component_id: ${COMPONENT_ID}
    schema : ~{name}
    instance : ${host}/~{schema}
    producer : ${host}
    perm : "0644"

Similarly for stores:
stores:
  - <<: *csv-defaults
    schema : jobid
  - <<: *csv-defaults
    schema : lustre_client
  - <<: *csv-defaults
    schema : loadavg
  - <<: *csv-defaults
    schema : tx2mon
  - <<: *csv-defaults
    schema : meminfo

with special cases:
  - <<: *csv-defaults
    schema : meminfo_l1
    metrics:
      - Active
      - MemFree

  - <<: *csv-defaults
    schema : dstat_37
    producers:
      - regex : "wp1.*"
      - regex : "wp2.*"

if we define store defaults:
store_config_defaults: &csv-defaults
  name : csv-~{schema}
  daemons : *agg-name
  container : store_csv
  plugin :
    name : store_csv
    config :
      path : /ldmslocal/store
      altheader : 0
      typeheader : 1
      create_uid : 3031
      create_gid : 3031
@baallan
Copy link
Collaborator Author

baallan commented May 20, 2022

ovis-hpc/ldms#882 is the referenced issue that lead to this patch.
@tom95858

@tom95858
Copy link
Contributor

@baallan, could you please elaborate on "...this patch also lets
us get all the hard-wired yet incorrect (in specific plugin cases) default plugin
settings out of the python layer."

@tom95858
Copy link
Contributor

@baallan, ~{name} is used all over the place, which "name" is being substituted. It seems to me that whatever namespace scope you have presumed could create all kinds of confusion for the configuration writer.

@baallan
Copy link
Collaborator Author

baallan commented Jul 24, 2023

@baallan, could you please elaborate on "...this patch also lets us get all the hard-wired yet incorrect (in specific plugin cases) default plugin settings out of the python layer."

@tom95858
maestro_ctrl has hardwired some defaults, particularly producer, instance, component_id (line 579) that are not correct for all use cases (eg plugins that don't use instance, producers that are not defined by only hostname, component_id that do not come from the environment). At present in maestro these cannot be overridden. They should not be defaulted as currently, they should come from the user's input. A similar situation existed for stores at the time i wrote the patch, but may have since been rectified.

Separately from the 'remove hardwiring' implementation detail:

  • generating these 3 default values via ansible seems to be entirely plausible, though I have not done so myself.
  • In the absence of ansible or something extremely like it, this tilde-expansion patch allows the configuring admin/user to specify site-specific string patterns (such as the choices currently hardwired) dependent on other fields they already defined in the input.

@baallan
Copy link
Collaborator Author

baallan commented Jul 24, 2023

@baallan, ~{name} is used all over the place, which "name" is being substituted. It seems to me that whatever namespace scope you have presumed could create all kinds of confusion for the configuration writer.

@tom95858 The 'name' value being substituted is the value of the 'name' parameter at the same scope in nearly all cases. It is user-defined, not presumed. This seems pretty unconfusing to me. For example:

    plugins :
      - <<: *sampler-all-defaults
        name : tx2mon

is how you get the tx2mon sampler name defined and use all the site-specific defaults from the user-defined template sampler-all-defaults:

sampler_all_defaults : &sampler-all-defaults
  <<: *sampler-time-defaults
  config : &sampler-config-defaults
    job_set : ${host}/jobid
    component_id: ${COMPONENT_ID}
    schema : ~{name}
    instance : ${host}/~{schema}
    producer : ${host}
    perm : "0644"

Now, one could also rewrite this as ~{plugin_name} or ~{plugin_instance} for "enhanced clarity" in a near-future where we have multiple instances of plugins.

There are specific use cases where a user might want to look upward in the yaml dictionary tree rather than only at immediate peer key/value pairs, and this is also supported. One typical such case is being able to define a widely seen key/val pair such as "cluster": "clustername". Support for this tilde-subst would let us write a single yaml dictionary that could be reused as the configuration for each different cluster monitored on an L2; this helps us avoid the bugs that come from "clone yaml and diverge" configuration practices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants