This document’s main goal is to collect ideas and propositions about a next-generation CL (standard, specification, or other name). This is NOT a template for a new specification, but merely a place for discussion.
Ideas should be submitted as pull requests, and discussions should take place in issues. Keep in mind that many things proposed here can contradict each other, resulting in several differently designed languages. However, this is a step forward compared to single messages and abrupt threads spread across random chats, blogs and forums. Surely it is time to start accumulating ideas, no matter how fundamental or small they are. While the main inspiration comes from Common Lisp, the people contributing to this usually know more than one language.
The Common Lisp standard was finalized in 1994. The major goal was to unite the many (relatively big as well as small) dialects of Lisp, making it the ultimate successor of the MacLisp series of Lisps. The natural advantage of this approach was uniting communities around a single project, and thus getting a large amount of manpower for further development. However, as a result the developers had to maintain a certain level of backwards compatibility with all the previously existing dialects, resulting in a language whose features are not always consistent with each other and which carries a certain amount of historical baggage.
So far it seems that all suggestions can be divided into three categories – fully backwards compatible, almost backwards compatible (with very minor refactorings required)… and basically a new Lisp language. The new-language suggestions tend to lean towards low-level features, as the demand for high-level ones is already satisfied by the high flexibility of CL.
Fully backwards compatible propositions share goals with the CDR project, read below.
- CDR Project: the project is mainly focused on backwards-compatible suggestions, implemented incrementally.
- Honorable mentions: the List of proposed ANSI revisions, and also the List of clarifications.
- CIEL is an attempt at creating a batteries-included Common Lisp distribution. It shares goals with the first part of this document, about fully backwards-compatible changes.
This is the very reason the Iterate library exists. However, having a loop construct (a basic feature) as a library makes it very non-obvious, especially for newcomers. The “Iterate vs LOOP” argument and certain design choices (LOOP being “unlispy” vs Iterate relying on a code walker) are issues to be resolved.
There are many possible improvements for hash tables: readable hash tables, initial values for `make-hash-table`, typed hash tables, arbitrary `:test` functions, and standard support for weak hash-table keys and values.
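As a sketch of the kind of gap meant here: today, populating a fresh hash table requires an explicit loop (or a utility such as `alexandria:plist-hash-table`), and `:test` only accepts the four standard predicates. The helper below is purely illustrative.

```lisp
;; Illustrative only: building a hash table from a plist, which the standard
;; MAKE-HASH-TABLE cannot do by itself (no :initial-contents, no custom :test).
(defun plist-hash-table (plist &rest hash-table-args)
  (let ((table (apply #'make-hash-table hash-table-args)))
    (loop for (key value) on plist by #'cddr
          do (setf (gethash key table) value))
    table))

;; (plist-hash-table '(:a 1 :b 2) :test #'eq)
;; A custom :test, such as a case-insensitive string comparison, is rejected
;; by conforming implementations unless they provide an extension for it.
```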
`let` handling multiple-value bindings and/or list pattern matching. Global variables with `defglobal`; see `sb-ext:defglobal`.
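For comparison, the status quo requires a different binding operator for each shape of binding; the snippet below uses only standard CL and shows the forms a unified `let` would subsume.

```lisp
;; Today: one operator per binding style.
(multiple-value-bind (quotient remainder) (floor 7 2)
  (list quotient remainder))            ; => (3 1)

(destructuring-bind (a (b c)) '(1 (2 3))
  (list a b c))                         ; => (1 2 3)

;; A unified LET would allow mixing plain, multiple-value and destructuring
;; bindings in one form (hypothetical syntax, not standard CL).
```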
This includes, but is not limited to, TCO cases, inlining, compiler-macro expansion, the general safety of `(safety 0)`, etc.
There are a lot of generic, half-baked ones. This may also include discussion of what can be added to or changed about lambda-list structure, such as pattern matching.
The existing one is decent, but surely there are improvements to be made. This part needs specific problems to be described.
Depending on the implementation, numbers sometimes are `eq` and sometimes are not. Fixing this can be complicated for some implementations, so this issue may rather be a consequence of other issues that should be highlighted. Some `equalp` quirks are also questionable.
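A quick illustration of the portability trap (results are implementation-dependent where noted):

```lisp
(eq 5 5)               ; => T or NIL, implementation-dependent
(eq 5.0 5.0)           ; => T or NIL, implementation-dependent
(eql 5.0 5.0)          ; => T (always)
(let ((x 5)) (eq x x)) ; => T or NIL; implementations may copy numbers at any time
(equalp "Foo" "foo")   ; => T, one of the EQUALP quirks (case-insensitive strings)
```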
(foo bar "this is just one \ \ string literal with only single spaces")
As well as special characters in string literals via something analogous to \x3F, \177, \n, \t, \u+1234.
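A rough sketch of what such escapes can look like today as a user-defined reader macro; the `#"` dispatch character and the set of escapes are purely illustrative, not the proposed syntax.

```lisp
(defun read-escaped-string (stream subchar arg)
  "Read a string literal that understands \\n and \\t escapes."
  (declare (ignore subchar arg))
  (with-output-to-string (out)
    (loop for char = (read-char stream t nil t)
          until (char= char #\")
          do (write-char (if (char= char #\\)
                             (let ((next (read-char stream t nil t)))
                               (case next
                                 (#\n #\Newline)
                                 (#\t #\Tab)
                                 (t   next)))
                             char)
                         out))))

(set-dispatch-macro-character #\# #\" #'read-escaped-string)

;; #"line one\nline two"  now reads as a string containing a real newline.
```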
Perform all expansion on an expression in a given macro environment. Optionally report all free variables.
That implies `eval` being able to evaluate things that only make sense in a certain environment.
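For context, the standard operators only expand the outermost layer; full expansion is left to implementation extensions (for example the sb-cltl2 contrib on SBCL) or portability libraries. A small illustration:

```lisp
(defmacro square (x) `(* ,x ,x))

(macroexpand-1 '(square (square 3)))
;; => (* (SQUARE 3) (SQUARE 3))   ; the inner SQUARE is left untouched

;; There is no standard, environment-aware "expand everything" operator.
;; #+sbcl (sb-cltl2:macroexpand-all '(square (square 3)))
;; => (* (* 3 3) (* 3 3))
```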
More security in certain areas.
Different ways of compilation, described in more detail; for example block compilation, akin to what is done in SBCL.
Extensible data structures of different kinds. The protocol for sequences is also a thing to discuss.
While laziness can, theoretically speaking, be implemented as a library, efficient (that is, production-grade) laziness is nontrivial to build. Therefore, it makes sense for the maintainers of the language to implement it (at some point) as part of the (semi-)standard library.
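A minimal delay/force sketch (the names and representation are illustrative, not a proposed API) shows why a naive library version is easy but an efficient one is not: every delayed value pays for a closure, a structure and a check on each access.

```lisp
(defstruct (promise (:constructor %make-promise (thunk)))
  thunk
  (value nil)
  (forced-p nil))

(defmacro delay (form)
  "Wrap FORM so it is evaluated at most once, on demand."
  `(%make-promise (lambda () ,form)))

(defun force (promise)
  (unless (promise-forced-p promise)
    (setf (promise-value promise) (funcall (promise-thunk promise))
          (promise-forced-p promise) t
          (promise-thunk promise) nil))   ; drop the closure after forcing
  (promise-value promise))

;; (force (delay (expensive-computation)))
```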
Some things that are in the standard could live in utility libraries such as Alexandria, while some things from Alexandria are too useful not to include.
Instead of closer-mop we should have just mop. This includes both what is currently in the MOP and some additions: better definition lookup, everything that concerns structures, etc.
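For illustration, this is what portable MOP access currently looks like through the closer-mop compatibility layer (the snippet assumes Quicklisp is available); in a “just MOP” world the same operators would simply be part of the language.

```lisp
(ql:quickload :closer-mop)

(defclass point ()
  ((x :initarg :x)
   (y :initarg :y)))

(mapcar #'closer-mop:slot-definition-name
        (closer-mop:class-slots
         (closer-mop:ensure-finalized (find-class 'point))))
;; => (X Y)
```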
Macros that can be bound to variables, passed as arguments and returned from functions. A more detailed explanation.
A standard way to build them, maybe in different forms, with/without tree shaking.
(At least) BSD sockets interface standardization.
Every modern language designed since 1990 includes a BSD sockets library in its core.
Interfaces are not standardized: there are incompatible implementation-specific extensions, and portability projects such as trivial-sockets and IOlib.
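As an illustration of the status quo, portable client code has to go through a third-party layer; the sketch below uses usocket (named here only as an example of such a layer, alongside the libraries mentioned above) and assumes Quicklisp.

```lisp
(ql:quickload :usocket)

(let ((socket (usocket:socket-connect "example.com" 80)))
  (unwind-protect
       (let ((stream (usocket:socket-stream socket))
             (crlf   (format nil "~C~C" #\Return #\Linefeed)))
         (format stream "HEAD / HTTP/1.0~A~A" crlf crlf)
         (force-output stream)
         (read-line stream))        ; first line of the server's response
    (usocket:socket-close socket)))
```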
At least some control over allocation is in high demand, as is better support for dynamic-extent. For more specific examples look here.
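The main standard lever today is the `dynamic-extent` declaration, which merely permits (and does not require) stack allocation; a minimal example:

```lisp
(defun sum (&rest numbers)
  ;; The &rest list may be stack-allocated by the compiler, avoiding heap
  ;; allocation and GC pressure for this call.
  (declare (dynamic-extent numbers))
  (reduce #'+ numbers))

(sum 1 2 3 4 5)   ; => 15
```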
Standardized, and a set of basic functions to work with them.
Standardized code walking primitives: one body of user code which correctly walks all special forms.
There is hu.dwim.walker as a library.
As a compatibility library, here is how it looks for a specific implementation.
- Have all constants be named like `+constant+` (wrapped in plus signs).
- Have all dynamic variables be named like `*dynamic-variable*` (wrapped in ‘earmuffs’).
- Either:
  - have all predicates named like `*?` or `*-p` for full consistency.
  - have 2 naming patterns (as it is now) but actually use them consistently:
    - have predicates be named like `*p` if `*` is 1 word.
    - have predicates be named like `*-p` if `*` is 2+ words separated by hyphens.
Having the standard dictate that incorrect usage of the above be a compilation or even runtime error would make this change (almost?) definitely backwards incompatible (to an unknown (but potentially huge) extent).
It has been the community consensus for many years that these naming patterns should be used based on what meaning a given name holds. Any exception to this rule is unexpected, unnecessary, and probably should be treated as an error, because the way a variable is named conveys the intended way of using it. Some compilers (e.g. SBCL) even give warnings about misuse of dynamic-variable-looking variables based on this particular convention.
The names for constants are not consistent across the board. Some examples of ‘incorrectly’ named constants (from just the standard):

```
cl:most-negative-fixnum
cl:most-positive-fixnum
cl:internal-time-units-per-second
cl:array-rank-limit
cl:array-dimension-limit
```
Example of incorrectly used dynamic variable naming pattern (before the proposed change) (SBCL implementation):
```lisp
(let ((*a* 1))
  (setf *a* 2))

; file: /tmp/slimer0wqXc
; in: LET ((*A* 1))
;     (LET ((COMMON-LISP::*A* 1))
;       (SETF COMMON-LISP::*A* 2))
;
; caught COMMON-LISP:STYLE-WARNING:
;   using the lexical binding of the symbol (COMMON-LISP::*A*), not the
;   dynamic binding, even though the name follows
;   the usual naming convention (names like *FOO*) for special variables
;
; compilation unit finished
;   caught 1 STYLE-WARNING condition
```
Examples of predicates named `*p` vs `*-p` (including a popular 3rd-party library):

‘Good’ examples (before the proposed change):

```
cl:evenp
cl:oddp
cl:stringp
cl:base-string-p
cl:hash-table-p
cl:pathname-match-p
```

‘Bad’ examples (before the proposed change):

```
bordeaux-threads:lock-p
bordeaux-threads:semaphore-p
cl:string-lessp
cl:string-not-lessp
cl:string-greaterp
cl:string-not-greaterp
```
All of the above suggestions apply here as well; however, if a new language is being made, it makes sense to care less about backwards compatibility of any kind and more about features. The suggestions presented here are described very generally and are directions rather than something that can be put into an actual specification.
How should objects be made? Constructors, destructors, (multiple) inheritance vs composition, etc.
There are different object systems for different tasks, but some of them are more easily implemented in terms of others. Currently, classes/structures cannot be parameterized in any way.
CLOS classes are closer to first-class citizens, while structures are less used and less well supported. Both interact with the type system in a certain way, and not always in the best one.
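A small illustration of the asymmetry, using only standard CL:

```lisp
;; Structures: fast, but redefining the type has undefined consequences,
;; and they support only single inheritance via :include.
(defstruct point
  (x 0.0 :type single-float)
  (y 0.0 :type single-float))

;; CLOS classes: redefinable at runtime (existing instances are updated),
;; multiple inheritance, MOP introspection, but typically slower slot access.
(defclass circle ()
  ((radius :initarg :radius :accessor radius)))

;; Neither can be parameterized: there is no portable, compiler-checked way to
;; express, e.g., "a queue of POINTs" without writing a new type by hand.
```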
Which types should and should not be included, boxed vs unboxed types, type parameterization, linear types, dependent types etc. Interaction with other parts such as (compiler-)macros.
The current type system is state of the art, but it is the state of the art of the 1990s. There has been a lot of research in the area over the last decade that should be considered. While not all of it (if any) should go into the spec, the goal is to make it easier for future developers to incorporate their preferred features into the language. A short sketch of what the current type system can and cannot express appears after the list below.
- Performance and types in Lisp
- There are various attempts to extend the type system with proper parameterized types, recursive type definitions, more strict type checks and inference, the major example being Coalton.
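The sketch referenced above: what the current declaration-based type system already handles, and roughly where parameterization stops (all names below are illustrative).

```lisp
;; Derived and compound types work today:
(deftype octet () '(unsigned-byte 8))
(deftype octet-vector (&optional (length '*))
  `(simple-array octet (,length)))

(declaim (ftype (function (octet octet) octet) clamped-add))
(defun clamped-add (a b)
  (min 255 (+ a b)))

;; But there is no portable way to define a checked, parameterized user type
;; such as "list of STRING", a parameterized class, or a recursive type
;; definition; this is the gap projects like Coalton aim to fill.
```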
Depending on the class system, methods (or rather “methods”) can be organized into traits/typeclasses, or generics, or belong to the class (unlikely). They can also specialize either on classes or on types.
The way generics work in CLOS allows for a certain flexibility and the famous runtime redefinition. At the same time, however, the performance of the current approach is quite poor, restricting its use. Sometimes specializing on classes instead of types can be limiting.
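Concretely, standard method dispatch accepts only class and `eql` specializers, which is where that limitation shows up; a minimal illustration:

```lisp
(defclass circle () ((radius :initarg :radius :reader radius)))

(defgeneric area (shape))

(defmethod area ((s circle))          ; specializes on the class CIRCLE
  (* pi (expt (radius s) 2)))

;; Not expressible in standard CLOS: a method specialized on a compound type,
;; e.g. (defmethod scale ((x (integer 0 10))) ...) is invalid; only classes
;; and EQL specializers are allowed.
```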
There are several attempts to deal with the inefficiency (in terms of raw performance and safety) of generic functions, including fast-generic-functions, specialization-store, and others. However, they do not fully avoid the limitations mentioned above.
There are also attempts to do something completely different, such as LIL; they should not be forgotten.
While not a low-level thing on its own, any changes to syntax can imply changes so breaking (for backwards compatibility) that they cannot be put in any of the above categories. Opinions on the subject differ heavily; the opposite approaches suggested are:
No additional syntax at all (even for strings), with the ability to heavily modify the reader and extend it for these kinds of things

vs

A lot of built-in syntactic sugar, for ease of use: for example for literals, but for some other things as well (some examples are presented below), since there are not that many common special syntactic constructs.
The default case, case sensitivity, etc. are also things to discuss, although there isn’t much talk about that part.
This section can be separated in the future if necessary.
The way reader macros should work and the interaction between the user and the reader.
Currently the standard offers only limited hooks into the reader (macro characters and dispatching macro characters); deeper customization, such as changing how tokens and symbols are parsed, is not possible. Opening this up could be a huge improvement, allowing for various modifications.
There are libraries that try to extend the possibilities, such as named-readtables.
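For reference, the hooks that do exist today are macro characters; the classic bracket example below uses only the standard readtable API (named-readtables packages such changes into named, switchable readtables).

```lisp
(defun read-bracket-list (stream char)
  "Make [a b c] read as (list a b c)."
  (declare (ignore char))
  (cons 'list (read-delimited-list #\] stream t)))

(set-macro-character #\[ #'read-bracket-list)
;; Make #\] a terminating character, reusing the standard )-reader.
(set-macro-character #\] (get-macro-character #\) nil))

;; [1 2 (+ 1 2)]  now reads as  (LIST 1 2 (+ 1 2))
```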
If and where `[]` or `{}` can be introduced, slot/structure dot (`.`) access, etc. Compare `(slot-value (slot-value (slot-value x 'foo) 'bar) 'baz)` vs `x.foo.bar.baz` vs `(at x 'foo 'bar 'baz)`.
Available in the Access library: dotted and nested access of data structures (including, but not limited to, slot values). Access is shipped in the CIEL project.
Available in the Rutils library: dot notation and index-based access. `(elt (nth 1 (foo-slot2 (bar-slot1 obj))) 0)` can be written `@obj.slot1.slot2#1#0`.
Operations (functions, macros, etc.) should have a predictable (possibly identical) order of arguments. If an operation takes, for example, 2 arguments (a data structure and an index/key), it is expected that the data structure is always in the same position and the key is always in the same position across the board, regardless of what the actual positions are.
This issue is an unnecessary burden on one’s mind when developing a project, and it forces the developer to break their focus to think about an out-of-place implementation detail instead of the program’s business logic. It is there for no reason other than legacy.
The standard contains functions like (gethash key hashtable) and (aref array index). In the case of gethash, the data structure (the hash table) is the 2nd argument; in the case of aref, the data structure (the array) is the 1st argument. These are just two examples of such inconsistency. One potential example to follow would be the bulk of the list operations, like mapcar, which take the function as the 1st argument and the list(s) (the data structure) as the 2nd and following arguments; those functions are consistent across the board.
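For a side-by-side view (all standard CL operators, shown only to illustrate where the data structure goes):

```lisp
(gethash key hash-table)    ; hash table is the 2nd argument
(aref array index)          ; array is the 1st argument
(elt sequence index)        ; sequence is the 1st argument
(nth index list)            ; list is the 2nd argument
(getf plist indicator)      ; plist is the 1st argument
(mapcar function list)      ; list(s) come last
```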
Allow names to be resolved at runtime, a more convenient export system, etc.
Interaction with pathnames w.r.t. the current OS landscape. One standard way to parse a POSIX or Windows path string, or a URL, into a pathname. Pathnames should have a “method” for this.
The core language can be separated into libraries: a separate condition system, a data structures library, an algorithms library, a math library, a concurrency library, an iteration library, a code-walking library, etc.
SICL and Clasp compilers are built with this idea in mind.
GC vs RAII in some form. There are several alternatives, and the semantics of the language depends heavily on this as well. As this is probably one of the most fundamental questions for Common Lisp (as well as most other Lisps), here is a more detailed description:
- Problems/Limitations/Use-cases with GC
- Aspects include real-time systems, working well with foreign code, and embedded systems, i.e. devices with limited memory; maybe something else
- The whole page is an interesting read: Garbage_collection_(computer_science) - Wikipedia
- https://en.wikipedia.org/wiki/Manual_memory_management#Manual_management_and_correctness
- What are the systems where GC is and isn’t an issue (limited-memory systems?)
- Default to GC, but if GC does have disadvantages, an option to manage memory manually should be provided.
GC vs (semi-)manual memory management touches on many things, including performance, convenience of use, predictability of timing, more complicated data structures, etc.
trivial-garbage provides a portable API to finalizers, weak hash-tables and weak pointers on all major implementations of the Common Lisp programming language.
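A short sketch of that API (using the library’s TG package nickname; assumes Quicklisp is available):

```lisp
(ql:quickload :trivial-garbage)

;; Weak keys: entries disappear once the key is no longer referenced elsewhere.
(defvar *cache* (tg:make-weak-hash-table :weakness :key))

(defvar *weak*
  (let ((object (make-array 1000)))
    ;; Run a cleanup function when OBJECT is collected
    ;; (the finalizer must not close over OBJECT itself).
    (tg:finalize object (lambda () (format t "~&object collected~%")))
    (tg:make-weak-pointer object)))

(tg:weak-pointer-value *weak*)   ; => the array, or NIL once it has been collected
```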
A powerful low-level control construct.
It is up for debate for several reasons, one of them being its interaction with unwind-protect.
- Useful accessors on macro environment objects.
Of course not. Attempts to build a low-level C-like Lisp exist, lots of them: 1, 2, 3, 4, and there are more. Attempts to build a low-level statically-typed Lisp-like language are also well known: 1, 2, and there are more. Two things they presumably lack are a pre-built, well-defined specification, and community visibility and support.
The same can be said about attempts to just upgrade an existing CL implementation, such as the famous CL21.
Common Lisp: The Untold Story and friends have a lot of useful info in them. Paul Khuong’s blog has many notes on potential compiler improvements, although specific to SBCL.
While the discussion is good in itself, the important question is: how can this come to life? There are 3 major components:
- Money
- Time
- People
May not be written until the bulk of this document is finished.