sexpr: An S-expression library for Rust

sexpr strives to be the canonical library for reading, manipulating and writing S-expressions in Rust.

The parser is fully featured, performant and can be configured to read almost any S-expression variant, including the “Standard”, “Advanced” and “Canonical” formats.

Predefined compatibility configurations exist for a dozen standards and protocols including:

  • Standard
  • RFC 2693-compatible SPKI
  • SMTLIB & SMTLIBv2
  • GPG / libgcrypt
  • KiCad
  • R5RS-compatible abstract-syntax

Individual format options can be enabled or disabled, allowing you to parse virtually any variant.

Configuration

sexpr accepts numerous configuration options.

Predefined Configurations

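| Name      | square_brackets | semi_comment | colon_keywords | hex_escapes | pipe_action    | Notes                                                               |
|-----------+-----------------+--------------+----------------+-------------+----------------+---------------------------------------------------------------------|
| STANDARD  |                 |              |                |             | Base64Interior | A generic ‘standard’ s-expression                                   |
| SMTLIB    |                 |              |                |             | QuoteInterior  | A common interchange format for SAT and SMT solvers                 |
| KICAD     |                 |              |                |             | None           | A computer-aided design program                                     |
| GUILE     |                 |              |                |             | None           | A scheme intended for embedding in existing C programs              |
| CANONICAL |                 |              |                |             | None           | A common, interchangeable encoding for many cryptographic protocols |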

Configuration Variables

semi_comment

Line comments can be enabled when parsing s-expressions by setting semi_comment = Some(&["#", ";"]).

Everything from the comment marker to the next newline or EOF is ignored. Unlike proposals such as SRFI 62, this does not comment out interior s-expressions.
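To make the line-comment rule concrete, here is a minimal standalone sketch. It does not use sexpr itself: `strip_line_comments` is a hypothetical helper, and its `prefixes` argument only stands in for the `semi_comment` list. As a simplification it does not recognise string literals, so a marker inside a quoted string would also be cut.

```rust
/// Hypothetical helper (not part of sexpr): removes line comments from `input`.
/// `prefixes` stands in for the `semi_comment` list, e.g. `&["#", ";"]`.
/// Simplification: comment markers inside string literals are not protected.
fn strip_line_comments(input: &str, prefixes: &[&str]) -> String {
    input
        .lines()
        .map(|line| {
            // Cut the line at the earliest occurrence of any comment marker.
            let cut = prefixes
                .iter()
                .filter_map(|p| line.find(*p))
                .min()
                .unwrap_or(line.len());
            &line[..cut]
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Only the remainder of a line is dropped; an s-expression that follows a marker on the next line is untouched, which is the difference from SRFI 62 datum comments.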

colon_keywords

Many Scheme implementations assign a special meaning to atoms beginning with #: or :. sexpr can parse these as ‘keywords’, or it can treat the prefix as valid starting characters of an ordinary symbol. For example: (item :keyword value :keyword2 value)

You can control this behaviour with ParseConfig.allow_keywords = Some(&["#", "#:"])

square_brackets

Some Lisp implementations and s-expressions allow square brackets ([ and ]) to be used as an alternative bracket character.

Square brackets must still be matched: every [ needs a closing ].
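The matching rule can be sketched with a small stack-based check. This is an illustration only, not sexpr's parser: `brackets_balanced` is a hypothetical function, and it assumes that a closer must match an opener of the same kind and that brackets inside strings or comments do not need special handling.

```rust
/// Hypothetical check (not part of sexpr): returns true when every
/// '(' and '[' is closed by a bracket of the same kind, in the right order.
fn brackets_balanced(input: &str) -> bool {
    let mut stack: Vec<char> = Vec::new();
    for c in input.chars() {
        match c {
            // Remember each opener until its closer arrives.
            '(' | '[' => stack.push(c),
            ')' => {
                if stack.pop() != Some('(') {
                    return false;
                }
            }
            ']' => {
                if stack.pop() != Some('[') {
                    return false;
                }
            }
            _ => {}
        }
    }
    // Anything left on the stack is an unclosed opener.
    stack.is_empty()
}
```

Under these assumptions `(define [x 1])` is accepted, while `(define [x 1)]` is rejected because the brackets cross.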

radix_escape
#

Libgcrypt, GPG and numerous Scheme implementations allow you to enclose a hexadecimal string in # characters: (q #61626364656667#)

You can enable this with the hex_escapes option.
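As an illustration of what a #...# hexadecimal string denotes, the following standalone sketch decodes such a token into bytes. `decode_hex_string` is a hypothetical name, not sexpr's API, and the sketch assumes an even number of hex digits with no embedded whitespace.

```rust
/// Hypothetical decoder (not part of sexpr) for tokens such as `#616263#`.
/// Returns None unless the token is wrapped in '#' and holds an even
/// number of hexadecimal digits.
fn decode_hex_string(token: &str) -> Option<Vec<u8>> {
    let inner = token.strip_prefix('#')?.strip_suffix('#')?;
    if inner.len() % 2 != 0 || !inner.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    // Every pair of hex digits becomes one byte.
    (0..inner.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&inner[i..i + 2], 16).ok())
        .collect()
}
```

With this sketch, the token from the example above, `#61626364656667#`, decodes to the bytes of the ASCII string `abcdefg`.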

#b and #x

In a similar fashion, #b and #x are used to specify binary and hexadecimal encodings of the number that follows the trailing letter.

(two-hundred-fifty-five #xff) would be encoded as List(Symbol(two-hundred-fifty-five), U64(255)). Similarly, (sixteen #b10000) would be encoded as List(Symbol(sixteen), U64(16)).

You can control if both of these are accepted with the radix_escape option.
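The #x and #b prefixes map directly onto the standard library's radix parsing. The sketch below is standalone and hypothetical (`parse_radix_literal` is not a sexpr function); it assumes unsigned values that fit in a u64 and rejects signs and empty digit strings.

```rust
/// Hypothetical parser (not part of sexpr) for `#x...` and `#b...` literals.
fn parse_radix_literal(token: &str) -> Option<u64> {
    // The letter after '#' selects the radix.
    let (radix, digits) = if let Some(rest) = token.strip_prefix("#x") {
        (16, rest)
    } else if let Some(rest) = token.strip_prefix("#b") {
        (2, rest)
    } else {
        return None;
    };
    // Reject empty input and anything that is not a digit in this radix
    // (from_str_radix alone would accept a leading '+').
    if digits.is_empty() || !digits.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    u64::from_str_radix(digits, radix).ok()
}
```

Here `#xff` yields 255 and `#b10000` yields 16, matching the U64 values in the examples above.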

Both

When both of these options are enabled in tandem, sexpr uses the character that follows the # to determine which kind of radix specification is meant.

Neither

If radix_escape is false, the initial # character will be treated as an atom.

parse_pipe_behavior

Standard decoding treats the | character as a valid starting literal to any Atom, although two other options are permitted:

Advanced-style: Rivest-style ‘advanced’ encodings dictate that a string between two | characters be decoded from Base64 into a stream of u8 (octets).

Use ParseConfig.pipe_action = ParsePipeBehavior::Base64Interior

SMTLIBv2: SMT and SAT solvers using this format use the | character to quote its interior, preserving line breaks and other whitespace in a Symbol.

Use ParseConfig.pipe_action = ParsePipeBehavior::QuoteInterior
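The quoting behaviour can be pictured with a small standalone sketch. `read_pipe_quoted` is a hypothetical helper, not sexpr's implementation, and it assumes the quoted interior contains no escaped | characters.

```rust
/// Hypothetical reader (not part of sexpr): if `input` starts with '|',
/// returns the text up to the next '|' and the remaining input.
/// Whitespace and line breaks inside the pipes are kept verbatim.
fn read_pipe_quoted(input: &str) -> Option<(&str, &str)> {
    let rest = input.strip_prefix('|')?;
    let end = rest.find('|')?;
    // '|' is one byte wide, so `end + 1` is a valid char boundary.
    Some((&rest[..end], &rest[end + 1..]))
}
```

Under Base64Interior the same interior would instead be handed to a Base64 decoder rather than kept as symbol text.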

transport

Today, sexpr supports the most common form of S-expression transport encoding, RFC 4648 Base64. To indicate that you’d like to encode or decode an S-expression as Base64, modify your configuration as follows:

let mut config = STANDARD.copy();
config.transport = TransportEncoding::Base64;

If you’d like to add a new transport encoding, simply add a variant to the TransportEncoding enum and provide an implementation of the SexpTransport trait for it; the rest is handled for you.
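For readers unfamiliar with what the Base64 transport does to the bytes, here is a self-contained sketch of standard RFC 4648 Base64 encoding with padding. It is independent of sexpr: `base64_encode` is a hypothetical free function and does not reflect the signature of SexpTransport, which is not shown here.

```rust
/// The standard RFC 4648 Base64 alphabet.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical encoder (not sexpr's API): padded Base64 per RFC 4648.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Missing input bytes are represented by '=' padding.
        if chunk.len() > 1 {
            out.push(ALPHABET[(n >> 6) as usize & 63] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[n as usize & 63] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```

For example, the seven bytes of `(1 2 3)` encode to `KDEgMiAzKQ==`.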

Encoding

In a 2012 Dr. Dobb’s retrospective, Karl Eiger noted that S-expressions have been in continuous use longer than any other format that remains in widespread use today.

Despite this long history, there is no canonical way to encode a variety of different abstract data structures.

Sequences

let vec: Vec<i32> = vec![1,2,3];
sexpr::encode(&vec)

Result:

(1 2 3)

let hs: HashSet<i32> = vec!(1, 2, 3).into_iter().collect();
sexpr::encode(&hs)

Result:

(1 2 3)
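The sequence mapping can be sketched without the crate. `encode_seq` below is a hypothetical stand-in for the idea, not sexpr::encode: it writes each element with its Display form, separated by spaces, inside one pair of parentheses.

```rust
use std::fmt::Display;

/// Hypothetical stand-in (not sexpr::encode): renders any iterable of
/// displayable items as a flat list such as `(1 2 3)`.
fn encode_seq<T: Display>(items: impl IntoIterator<Item = T>) -> String {
    let atoms: Vec<String> = items.into_iter().map(|item| item.to_string()).collect();
    format!("({})", atoms.join(" "))
}
```

Note that Rust's HashSet does not guarantee an iteration order, so an encoder that simply walks the set may emit the elements in a different order from run to run; an ordered collection such as BTreeSet makes the output deterministic.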

Hash Tables

let mut ht = HashMap::new();
ht.insert('a', 1);
ht.insert('b', 2);
ht.insert('c', 3);
sexpr::encode(&ht);

Result:

((a . 1) (b . 2) (c . 3))
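The association-list shape shown above can be reproduced with a short standalone sketch. `encode_map` is hypothetical, not sexpr's encoder, and it takes a BTreeMap rather than a HashMap so that the pair order is deterministic (a HashMap has no guaranteed iteration order).

```rust
use std::collections::BTreeMap;
use std::fmt::Display;

/// Hypothetical stand-in (not sexpr::encode): renders a map as a list of
/// dotted pairs, one `(key . value)` pair per entry, in key order.
fn encode_map<K: Display, V: Display>(map: &BTreeMap<K, V>) -> String {
    let pairs: Vec<String> = map
        .iter()
        .map(|(k, v)| format!("({} . {})", k, v))
        .collect();
    format!("({})", pairs.join(" "))
}
```

Calling it on a map built with `BTreeMap::from([('a', 1), ('b', 2), ('c', 3)])` gives the same text as the result above.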

Tuple

Struct
struct TupleStruct(i32, i32, i32);
let ts = TupleStruct(1, 2, 3);
sexpr::encode(&ts);

Result:

((_field0 1) (_field1 2) (_field2 3))

Struct

Ordinary
struct Color {
     r: u8,
     g: u8,
     b: u8,
}
sexpr::encode(&Color {r: 1, g: 2, b: 3});
sexpr::encode(&Color {r: 1, g: 2, b: 3}, (true));

Result:

((variant Color) ((r 1) (g 2) (b 3)))
((r 1) (g 2) (b 3))

Tuple Struct
struct Kangaroo(u32, String);
sexpr::encode(&Kangaroo(34, "William".to_string()));

Result:

Newtype
struct Inches(u64);
sexpr::encode(&Inches(128));

Result:

128

Unit
struct Instance;

Result:

nil

Enum

enum E {
    W { a: i32, b: i32 },
    X(i32, i32),
    Y(i32),
    Z,
}

E::W { a: 0, b: 0 };
E::X(0, 0);
E::Y(0);
E::Z;

Result:

((variant W) ((a 0) (b 0)))
((variant X) (0 0))
((variant Y) 0)
((variant Z))
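To show how the four variant shapes line up with the results above, here is a self-contained sketch that builds them by hand. Everything in it is an assumption for illustration: the small `Sexp` type, its `I64` variant, and the helpers `sym`, `tag`, `field` and `encode_e` are hypothetical and are not sexpr's actual types or encoder.

```rust
use std::fmt;

/// Hypothetical value type for this sketch (not sexpr's own).
enum Sexp {
    Symbol(String),
    I64(i64),
    List(Vec<Sexp>),
}

impl fmt::Display for Sexp {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Sexp::Symbol(s) => write!(f, "{}", s),
            Sexp::I64(n) => write!(f, "{}", n),
            Sexp::List(items) => {
                write!(f, "(")?;
                for (i, item) in items.iter().enumerate() {
                    if i > 0 {
                        write!(f, " ")?;
                    }
                    write!(f, "{}", item)?;
                }
                write!(f, ")")
            }
        }
    }
}

enum E {
    W { a: i32, b: i32 },
    X(i32, i32),
    Y(i32),
    Z,
}

fn sym(s: &str) -> Sexp {
    Sexp::Symbol(s.to_string())
}

/// Builds the `(variant Name)` tag that leads every encoded variant.
fn tag(name: &str) -> Sexp {
    Sexp::List(vec![sym("variant"), sym(name)])
}

/// Builds a named field such as `(a 0)`.
fn field(name: &str, value: i32) -> Sexp {
    Sexp::List(vec![sym(name), Sexp::I64(i64::from(value))])
}

/// Hand-written encoding of `E` following the shapes listed above.
fn encode_e(e: &E) -> Sexp {
    match e {
        // Struct variant: tag, then a list of named fields.
        E::W { a, b } => Sexp::List(vec![
            tag("W"),
            Sexp::List(vec![field("a", *a), field("b", *b)]),
        ]),
        // Tuple variant: tag, then a list of positional values.
        E::X(x, y) => Sexp::List(vec![
            tag("X"),
            Sexp::List(vec![Sexp::I64(i64::from(*x)), Sexp::I64(i64::from(*y))]),
        ]),
        // Newtype variant: tag, then the bare value.
        E::Y(y) => Sexp::List(vec![tag("Y"), Sexp::I64(i64::from(*y))]),
        // Unit variant: the tag alone.
        E::Z => Sexp::List(vec![tag("Z")]),
    }
}
```

Formatting `encode_e(&E::W { a: 0, b: 0 })` with `to_string()` gives `((variant W) ((a 0) (b 0)))`, and the other three variants print the remaining lines of the result in the same way.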