This document’s main goal is to collect ideas and propositions about a next-generation CL (standard, specification, or other name). This is NOT a template for a new specification, but merely a place for discussion.
Ideas should be submitted as pull requests, and discussions should take place in issues. Keep in mind that many things proposed here can contradict each other, resulting in several differently designed languages. However, this is a step forward compared to single messages and abrupt threads spread across random chats, blogs and forums. Surely it is time to start accumulating ideas, no matter how fundamental or small they are. While the main inspiration comes from Common Lisp, the people contributing to this usually know more than one language.
The Common Lisp standard was finalized in 1994. The major goal was to unite the many (relatively big as well as small) dialects of Lisp, making it the ultimate successor of the MacLisp series of Lisps. The natural advantage of this approach was uniting communities around a single project, and thus getting a large amount of manpower for further development. However, as a result the developers had to maintain a certain level of backwards compatibility with all the previously existing dialects, resulting in a language whose features are not always consistent with each other and which carries a certain amount of historical baggage.
So far it seems that all suggestions can be divided into three categories – fully backwards compatible, almost backwards compatible (with very minor refactorings required)… and basically a new Lisp language. The new-language suggestions tend to lean towards low-level features, as the demand for high-level ones is already satisfied by the high flexibility of CL.
Fully backwards compatible propositions share goals with the CDR project, read below.
- CDR Project: the project is mainly focused on backwards-compatible suggestions, implemented incrementally.
- Honorable mentions: the List of proposed ANSI revisions, and also the List of clarifications.
- CIEL is an attempt at creating a batteries-included Common Lisp distribution. It shares goals with the first part of this document, about fully backwards-compatible changes.
This is the very reason the Iterate library exists. However, having a loop construct (a basic feature) as a library makes it very non-obvious, especially for newcomers. The “Iterate vs LOOP” argument and certain design choices (LOOP being “unlispy” vs Iterate relying on a code walker) are issues to be resolved.
There are many possible improvements for hash tables: readable hash tables, initial values for `make-hash-table`, typed hash tables, arbitrary `:test` functions, and standard support for weak hash-table keys and values.
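As a sketch of the kind of gap meant here: today, populating a fresh hash table requires an explicit loop (or a utility such as `alexandria:plist-hash-table`), and `:test` only accepts the four standard predicates. The helper below is purely illustrative.

```lisp
;; Illustrative only: building a hash table from a plist, which the standard
;; MAKE-HASH-TABLE cannot do by itself (no :initial-contents, no custom :test).
(defun plist-hash-table (plist &rest hash-table-args)
  (let ((table (apply #'make-hash-table hash-table-args)))
    (loop for (key value) on plist by #'cddr
          do (setf (gethash key table) value))
    table))

;; (plist-hash-table '(:a 1 :b 2) :test #'eq)
;; A custom :test, such as a case-insensitive string comparison, is rejected
;; by conforming implementations unless they provide an extension for it.
```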
`let` handling multiple-value bindings and/or list pattern matching. Global variables with `defglobal`; see `sb-ext:defglobal`.
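For comparison, the status quo requires a different binding operator for each shape of binding; the snippet below uses only standard CL and shows the forms a unified `let` would subsume.

```lisp
;; Today: one operator per binding style.
(multiple-value-bind (quotient remainder) (floor 7 2)
  (list quotient remainder))            ; => (3 1)

(destructuring-bind (a (b c)) '(1 (2 3))
  (list a b c))                         ; => (1 2 3)

;; A unified LET would allow mixing plain, multiple-value and destructuring
;; bindings in one form (hypothetical syntax, not standard CL).
```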
This includes, but is not limited to, TCO cases, inlining, compiler-macro expansion, the general safety of `(safety 0)`, etc.
There are a lot of generic, half-baked ones. This may also include discussion of what can be added to or changed about lambda-list structure, such as pattern matching.
The existing one is decent, but surely there are improvements to be made. This part needs specific problems to be described.
Depending on the implementation, numbers sometimes are `eq` and sometimes are not. Fixing this can be complicated for some implementations, so this issue may rather be a consequence of other issues that should be highlighted. Some `equalp` quirks are also questionable.
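A quick illustration of the portability trap (results are implementation-dependent where noted):

```lisp
(eq 5 5)               ; => T or NIL, implementation-dependent
(eq 5.0 5.0)           ; => T or NIL, implementation-dependent
(eql 5.0 5.0)          ; => T (always)
(let ((x 5)) (eq x x)) ; => T or NIL; implementations may copy numbers at any time
(equalp "Foo" "foo")   ; => T, one of the EQUALP quirks (case-insensitive strings)
```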
(foo bar "this is just one \ \ string literal with only single spaces")
As well as special characters in string literals via something analogous to \x3F, \177, \n, \t, \u+1234.
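A rough sketch of what such escapes can look like today as a user-defined reader macro; the `#"` dispatch character and the set of escapes are purely illustrative, not the proposed syntax.

```lisp
(defun read-escaped-string (stream subchar arg)
  "Read a string literal that understands \\n and \\t escapes."
  (declare (ignore subchar arg))
  (with-output-to-string (out)
    (loop for char = (read-char stream t nil t)
          until (char= char #\")
          do (write-char (if (char= char #\\)
                             (let ((next (read-char stream t nil t)))
                               (case next
                                 (#\n #\Newline)
                                 (#\t #\Tab)
                                 (t   next)))
                             char)
                         out))))

(set-dispatch-macro-character #\# #\" #'read-escaped-string)

;; #"line one\nline two"  now reads as a string containing a real newline.
```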
Perform all expansion on an expression in a given macro environment. Optionally report all free variables.
That implies `eval` being able to evaluate things that only make sense in a certain environment.
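For context, the standard operators only expand the outermost layer; full expansion is left to implementation extensions (for example the sb-cltl2 contrib on SBCL) or portability libraries. A small illustration:

```lisp
(defmacro square (x) `(* ,x ,x))

(macroexpand-1 '(square (square 3)))
;; => (* (SQUARE 3) (SQUARE 3))   ; the inner SQUARE is left untouched

;; There is no standard, environment-aware "expand everything" operator.
;; #+sbcl (sb-cltl2:macroexpand-all '(square (square 3)))
;; => (* (* 3 3) (* 3 3))
```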
More security in certain areas.
Different ways of compilation, described in more detail; for example block compilation, akin to what is done in SBCL.
Extensible data structures of different kinds. The protocol for sequences is also a thing to discuss.
While laziness can, theoretically speaking, be implemented as a library, efficient (that is, production-grade) laziness is nontrivial to build. Therefore, it makes sense for the maintainers of the language to implement it (at some point) as part of the (semi-)standard library.
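A minimal delay/force sketch (the names and representation are illustrative, not a proposed API) shows why a naive library version is easy but an efficient one is not: every delayed value pays for a closure, a structure and a check on each access.

```lisp
(defstruct (promise (:constructor %make-promise (thunk)))
  thunk
  (value nil)
  (forced-p nil))

(defmacro delay (form)
  "Wrap FORM so it is evaluated at most once, on demand."
  `(%make-promise (lambda () ,form)))

(defun force (promise)
  (unless (promise-forced-p promise)
    (setf (promise-value promise) (funcall (promise-thunk promise))
          (promise-forced-p promise) t
          (promise-thunk promise) nil))   ; drop the closure after forcing
  (promise-value promise))

;; (force (delay (expensive-computation)))
```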
Some things that are in the standard could live in utility libraries such as Alexandria, while some things from Alexandria are too useful not to include.
Instead of closer-mop we should have just mop. This includes both what is currently in the MOP and some additions: better definition lookup, everything that concerns structures, etc.
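For illustration, this is what portable MOP access currently looks like through the closer-mop compatibility layer (the snippet assumes Quicklisp is available); in a “just MOP” world the same operators would simply be part of the language.

```lisp
(ql:quickload :closer-mop)

(defclass point ()
  ((x :initarg :x)
   (y :initarg :y)))

(mapcar #'closer-mop:slot-definition-name
        (closer-mop:class-slots
         (closer-mop:ensure-finalized (find-class 'point))))
;; => (X Y)
```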
Macros that can be bound to variables, passed as arguments and returned from functions. A more detailed explanation.
A standard way to build them, maybe in different forms, with/without tree shaking.
(At least) BSD sockets interface standardization.
Every modern language designed since 1990 includes a BSD sockets library in its core.
Interfaces are not standardized: there are incompatible implementation-specific extensions, and portability projects such as trivial-sockets and IOlib.
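As an illustration of the status quo, portable client code has to go through a third-party layer; the sketch below uses usocket (named here only as an example of such a layer, alongside the libraries mentioned above) and assumes Quicklisp.

```lisp
(ql:quickload :usocket)

(let ((socket (usocket:socket-connect "example.com" 80)))
  (unwind-protect
       (let ((stream (usocket:socket-stream socket))
             (crlf   (format nil "~C~C" #\Return #\Linefeed)))
         (format stream "HEAD / HTTP/1.0~A~A" crlf crlf)
         (force-output stream)
         (read-line stream))        ; first line of the server's response
    (usocket:socket-close socket)))
```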
At least some control over allocation is in high demand, as is better support for dynamic-extent. For more specific examples look here.
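The main standard lever today is the `dynamic-extent` declaration, which merely permits (and does not require) stack allocation; a minimal example:

```lisp
(defun sum (&rest numbers)
  ;; The &rest list may be stack-allocated by the compiler, avoiding heap
  ;; allocation and GC pressure for this call.
  (declare (dynamic-extent numbers))
  (reduce #'+ numbers))

(sum 1 2 3 4 5)   ; => 15
```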
Standardized, and a set of basic functions to work with them.
Standardized code walking primitives: one body of user code which correctly walks all special forms.
There is hu.dwim.walker as a library.
As a compatibility library, here is how it looks for a specific implementation.
- Have all constants be named like `+constant+` (wrapped in plus signs).
- Have all dynamic variables be named like `*dynamic-variable*` (wrapped in ‘earmuffs’).
- Either:
  - have all predicates named like `*?` or `*-p` for full consistency.
  - have 2 naming patterns (as it is now) but actually use them consistently:
    - have predicates be named like `*p` if `*` is 1 word.
    - have predicates be named like `*-p` if `*` is 2+ words separated by hyphens.
Having the standard dictate that incorrect usage of the above be a compilation or even runtime error would make this change (almost?) definitely backwards incompatible (to an unknown (but potentially huge) extent).
It has been the community consensus for many years that these naming patterns should be used based on what meaning a given name holds. Any exception to this rule is unexpected, unnecessary, and probably should be treated as an error, because the way a variable is named conveys the intended way of using it. Some compilers (e.g. SBCL) even give warnings about misuse of dynamic-variable-looking variables based on this particular convention.
The names for constants are not consistent across the board. Some examples of ‘incorrectly’ named constants (from just the standard):

```
cl:most-negative-fixnum
cl:most-positive-fixnum
cl:internal-time-units-per-second
cl:array-rank-limit
cl:array-dimension-limit
```
Example of incorrectly used dynamic variable naming pattern (before the proposed change) (SBCL implementation):
```lisp
(let ((*a* 1))
  (setf *a* 2))

; file: /tmp/slimer0wqXc
; in: LET ((*A* 1))
;     (LET ((COMMON-LISP::*A* 1))
;       (SETF COMMON-LISP::*A* 2))
;
; caught COMMON-LISP:STYLE-WARNING:
;   using the lexical binding of the symbol (COMMON-LISP::*A*), not the
;   dynamic binding, even though the name follows
;   the usual naming convention (names like *FOO*) for special variables
;
; compilation unit finished
;   caught 1 STYLE-WARNING condition
```
Examples of predicates named `*p` vs `*-p` (including a popular 3rd-party library):

‘Good’ examples (before the proposed change):

```
cl:evenp
cl:oddp
cl:stringp
cl:base-string-p
cl:hash-table-p
cl:pathname-match-p
```

‘Bad’ examples (before the proposed change):

```
bordeaux-threads:lock-p
bordeaux-threads:semaphore-p
cl:string-lessp
cl:string-not-lessp
cl:string-greaterp
cl:string-not-greaterp
```
All of the above suggestions apply here as well; however, if a new language is being made, it makes sense to care less about backwards compatibility of any kind and more about features. The suggestions presented here are described very generally and are directions rather than something that can be put into an actual specification.
How should objects be made? Constructors, destructors, (multiple) inheritance vs composition, etc.
There are different object systems for different tasks, but some of them are more easily implemented in terms of others. Currently, classes/structures cannot be parameterized in any way.
CLOS classes are closer to first-class citizens, while structures are less used and less well supported. Both interact with the type system in a certain way, and not always in the best one.
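A small illustration of the asymmetry, using only standard CL:

```lisp
;; Structures: fast, but redefining the type has undefined consequences,
;; and they support only single inheritance via :include.
(defstruct point
  (x 0.0 :type single-float)
  (y 0.0 :type single-float))

;; CLOS classes: redefinable at runtime (existing instances are updated),
;; multiple inheritance, MOP introspection, but typically slower slot access.
(defclass circle ()
  ((radius :initarg :radius :accessor radius)))

;; Neither can be parameterized: there is no portable, compiler-checked way to
;; express, e.g., "a queue of POINTs" without writing a new type by hand.
```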
Which types should and should not be included, boxed vs unboxed types, type parameterization, linear types, dependent types etc. Interaction with other parts such as (compiler-)macros.
The current type system is state of the art, but it is the state of the art of the 1990s. There has been a lot of research in the area over the last decade that should be considered. While not all of it (if any) should go into the spec, the goal is to make it easier for future developers to incorporate their preferred features into the language. A short sketch of what the current type system can and cannot express appears after the list below.
- Performance and types in Lisp
- There are various attempts to extend the type system with proper parameterized types, recursive type definitions, more strict type checks and inference, the major example being Coalton.
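The sketch referenced above: what the current declaration-based type system already handles, and roughly where parameterization stops (all names below are illustrative).

```lisp
;; Derived and compound types work today:
(deftype octet () '(unsigned-byte 8))
(deftype octet-vector (&optional (length '*))
  `(simple-array octet (,length)))

(declaim (ftype (function (octet octet) octet) clamped-add))
(defun clamped-add (a b)
  (min 255 (+ a b)))

;; But there is no portable way to define a checked, parameterized user type
;; such as "list of STRING", a parameterized class, or a recursive type
;; definition; this is the gap projects like Coalton aim to fill.
```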
Depending on the class system, methods (or rather “methods”) can be organized into traits/typeclasses, or generics, or belong to the class (unlikely). They can also specialize either on classes or on types.
The way generics work in CLOS allows for a certain flexibility and the famous runtime redefinition. At the same time, however, the performance of the current approach is quite poor, restricting its use. Sometimes specializing on classes instead of types can be limiting.
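Concretely, standard method dispatch accepts only class and `eql` specializers, which is where that limitation shows up; a minimal illustration:

```lisp
(defclass circle () ((radius :initarg :radius :reader radius)))

(defgeneric area (shape))

(defmethod area ((s circle))          ; specializes on the class CIRCLE
  (* pi (expt (radius s) 2)))

;; Not expressible in standard CLOS: a method specialized on a compound type,
;; e.g. (defmethod scale ((x (integer 0 10))) ...) is invalid; only classes
;; and EQL specializers are allowed.
```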
There are several attempts to deal with the inefficiency (in terms of raw performance and safety) of generic functions, including fast-generic-functions, specialization-store, and others. However, they do not fully avoid the limitations mentioned above.
There are also attempts to do something completely different, such as LIL; they should not be forgotten.
While not a low-level thing on its own, any changes to syntax can imply changes so breaking (for backwards compatibility) that they cannot be put in any of the above categories. Opinions on the subject differ heavily; the opposite approaches suggested are:
No additional syntax at all (even for strings), with the ability to heavily modify the reader and extend it for these kinds of things

vs

A lot of built-in syntactic sugar, for ease of use: for example for literals, but for some other things as well (some examples are presented below), since there are not that many common special syntactic constructs.
The default case, case sensitivity, etc. are also things to discuss, although there isn’t much talk about that part.
This section can be separated in the future if necessary.
The way reader macros should work and the interaction between the user and the reader.
Currently the standard offers only limited hooks into the reader (macro characters and dispatching macro characters); deeper customization, such as changing how tokens and symbols are parsed, is not possible. Opening this up could be a huge improvement, allowing for various modifications.
There are libraries that try to extend the possibilities, such as named-readtables.
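For reference, the hooks that do exist today are macro characters; the classic bracket example below uses only the standard readtable API (named-readtables packages such changes into named, switchable readtables).

```lisp
(defun read-bracket-list (stream char)
  "Make [a b c] read as (list a b c)."
  (declare (ignore char))
  (cons 'list (read-delimited-list #\] stream t)))

(set-macro-character #\[ #'read-bracket-list)
;; Make #\] a terminating character, reusing the standard )-reader.
(set-macro-character #\] (get-macro-character #\) nil))

;; [1 2 (+ 1 2)]  now reads as  (LIST 1 2 (+ 1 2))
```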
If and where `[]` or `{}` can be introduced, slot/structure dot (`.`) access, etc. Compare `(slot-value (slot-value (slot-value x 'foo) 'bar) 'baz)` vs `x.foo.bar.baz` vs `(at x 'foo 'bar 'baz)`.
Available in the Access library: dotted and nested access of data structures (including, but not limited to, slot values). Access is shipped in the CIEL project.
Available in the Rutils library: dot notation and index-based access. `(elt (nth 1 (foo-slot2 (bar-slot1 obj))) 0)` can be written `@obj.slot1.slot2#1#0`.
Operations (functions, macros, etc.) should have a predictable (possibly identical) order of arguments. If an operation takes, for example, 2 arguments (a data structure and an index/key), it is expected that the data structure is always in the same position and the key is always in the same position across the board, regardless of what the actual positions are.
This issue is an unnecessary burden on one’s mind when developing a project, and it forces the developer to break their focus to think about an out-of-place implementation detail instead of the program’s business logic. It is there for no reason other than legacy.
The standard contains functions like (gethash key hashtable) and (aref array index). In the case of gethash, the data structure (the hash table) is the 2nd argument; in the case of aref, the data structure (the array) is the 1st argument. These are just two examples of such inconsistency. One potential example to follow would be the bulk of the list operations, like mapcar, which take the function as the 1st argument and the list(s) (the data structure) as the 2nd and following arguments; those functions are consistent across the board.
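For a side-by-side view (all standard CL operators, shown only to illustrate where the data structure goes):

```lisp
(gethash key hash-table)    ; hash table is the 2nd argument
(aref array index)          ; array is the 1st argument
(elt sequence index)        ; sequence is the 1st argument
(nth index list)            ; list is the 2nd argument
(getf plist indicator)      ; plist is the 1st argument
(mapcar function list)      ; list(s) come last
```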
Allow names to be resolved at runtime, a more convenient export system, etc.
Interaction with pathnames w.r.t. the current OS landscape. One standard way to parse a POSIX or Windows path string, or a URL, into a pathname. Pathnames should have a “method” for this.
The core language can be separated into libraries: a separate condition system, a data structures library, an algorithms library, a math library, a concurrency library, an iteration library, a code-walking library, etc.
SICL and Clasp compilers are built with this idea in mind.
GC vs RAII in some form. There are several alternatives, and the semantics of the language depends heavily on this as well. As this is probably one of the most fundamental questions for Common Lisp (as well as most other Lisps), here is a more detailed description:
- Problems/Limitations/Use-cases with GC
- Aspects include real-time systems, working well with foreign code, and embedded systems, i.e. devices with limited memory; maybe something else
- The whole page is an interesting read: Garbage_collection_(computer_science) - Wikipedia
- https://en.wikipedia.org/wiki/Manual_memory_management#Manual_management_and_correctness
- What are the systems where GC is and isn’t an issue (limited-memory systems?)
- Default to GC, but if GC does have disadvantages, an option to manage memory manually should be provided.
GC vs (semi-)manual memory management touches on many things, including performance, convenience of use, predictability of timing, more complicated data structures, etc.
trivial-garbage provides a portable API to finalizers, weak hash-tables and weak pointers on all major implementations of the Common Lisp programming language.
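A short sketch of that API (using the library’s TG package nickname; assumes Quicklisp is available):

```lisp
(ql:quickload :trivial-garbage)

;; Weak keys: entries disappear once the key is no longer referenced elsewhere.
(defvar *cache* (tg:make-weak-hash-table :weakness :key))

(defvar *weak*
  (let ((object (make-array 1000)))
    ;; Run a cleanup function when OBJECT is collected
    ;; (the finalizer must not close over OBJECT itself).
    (tg:finalize object (lambda () (format t "~&object collected~%")))
    (tg:make-weak-pointer object)))

(tg:weak-pointer-value *weak*)   ; => the array, or NIL once it has been collected
```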
A powerful low-level control construct.
It is up for debate for several reasons, one of them being its interaction with unwind-protect.
- Useful accessors on macro environment objects.
Of course not. Attempts to build a low-level C-like Lisp exist, lots of them: 1, 2, 3, 4, and there are more. Attempts to build a low-level statically-typed Lisp-like language are also well known: 1, 2, and there are more. Two things they presumably lack are a pre-built, well-defined specification, and community visibility and support.
The same can be said about attempts to just upgrade an existing CL implementation, such as the famous CL21.
Common Lisp: The Untold Story and friends have a lot of useful info in them. Paul Khuong’s blog has many notes on potential compiler improvements, although specific to SBCL.
While the discussion is good in itself, the important question is: how can this come to life? There are 3 major components:
- Money
- Time
- People
May not be written until the bulk of this document is finished.