What can F* do for configuration? #2938

briangmilnes · 2023-05-21T00:39:54Z

briangmilnes
May 21, 2023

I've always been frustrated by configuration systems in later
languages. We're not eating our own dogfood (as the awful saying
goes). They all have multiple configuration systems. I've personally
written five over the years, in lisp twice, in C twice and in SML
once.

Only the SML one, which was simply a signature, a functor and a
module was fairly clean. But it did not allow constant folding of
static calls where configuration variables may not be defined
nor offline configuration checking.

My experience in industry with configurations were pretty bad. Two
million core dumps one day at Lycos, hours and hours down on various
systems at Amazon with bugs regressing frequently. The Linux kernel
now has a huge tool for handling its thousand configurations.

Looking at the literature for configuration functional programming,
there are a few hundred citations to configuration papers and they are
all very convoluted. Implicit arguments, new versions of type classes
and so on. So an interesting little area of research.

So I thought I'd have a shot at it and see what I could get F* to do
to improve the state of the art.

What I have always yearned for is:

A) You write the configuration in the language. So you get full
syntax, semantics and type checking.

B) You have a complete suite of other checks (valid combinations, path
checking) and so on.

C) You compile and incrementally link in just a module of
configurations, after it validates.

D) It reads all the configurations into variables in modules or closures just
once at startup, allowing any additional checks to trigger fail fast
exceptions.

I wrote a simple Configuration that simply held a list of an ADT
of types of configurations: Static, purely functional. So we get A. I
coded up a set of B and ran them in assert(validConfig(conf)) by compute().
Which means you get a fully functional compile/validate time checks,
B, without writing an external checking program in some invented
little language. Validation by normalization is really very very cool.

On C, OCaml is using the native C compiler, and the gnu compilers can
be configured for incremental linking.

So on D, I'm not sure when and how and when F* folds constants on
purely function calls. Can someone explain? A top level variable
binding to a configuration lookup works and writes the code expected
in OCaml. But the top level lets are evaluated when the module is
accessed first and it's lazy by default in OCaml (there is a workaround
that the F* compiler could implement).

Another way to do D would be to allow F* to use normalization to
evaluate a term to a constant, an eval(findBool configs "DEBUG") by
compute() for example. But that probably has some large semantics
impact.

Another issue with this is that you want any missing or incorrect
configurations to be caught on startup and throw an exception. This is
called fail-fast in the reliability literature. This should not be a
problem in generated C code, it will initialize at startup time and
can raise a signal or exit the program.

It's not clear to me that allowing exceptions would not force all of
your configuration variables to have an EXN type and thus lift up all
Tot functions that touch a config into EXN.

Disadvantages of this style

i) You have to have a linker on a server (somewhere) which is a small
security leak. But almost all systems have this.

ii) Even incremental linking may be slower than changing a config
file, testing it, and pushing it out to servers. Either every server
has to relink, or a larger binary has to be shipped out.

iii) Incremental linking may prelude some inter file linker level
optimizations.

iv) I should probably read some more of the research papers to ensure
that I have not missed any system doing a better job.

v) What this does to the semantics of F* is not clear. You need some
way to get failfast in the semantics of configurations, and eager
module loading, without raising everything into an EXN or worse an ML
effect.

To summarize

F* has some real advantages for configuration but would require an eager
module initialization for some and perhaps another effect which may not work.

P.S. GPT-4 was helpful on the module semantics but it thinks it knows F* uses
lazy module initialization but I have no idea where it got that idea.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What can F* do for configuration? #2938

{{title}}

Replies: 0 comments

Select a reply

What can F* do for configuration? #2938

briangmilnes May 21, 2023

Replies: 0 comments

briangmilnes
May 21, 2023