Coding style
This document describes the coding style used in the compiler.
Source files should be named according to their function. The compiler contains many modules, such as `abcheck`, which performs abstract syntax tree verification, and `optfoam`, which performs intermediate code optimisation. File names should consist of alphabetic characters and '_'. If a module becomes large and needs to be split into sub-modules, these should be prefixed with a short name for the main module, followed by an underscore. E.g. `of_cprop` is the `optfoam` module for copy propagation.
Different kinds of identifiers have different naming conventions in the compiler code.
- SUE (struct/union/enum) tags must be camelcase, starting with a lowercase letter. E.g. `struct abSyn;`.
- Typedef names must
  - be camelcase, starting with an uppercase letter. E.g. `typedef struct abSyn AbSyn;`
  - resemble their respective tag name, if they are typedefs of SUE types, so that `struct abSyn` is `AbSyn` and `union foam` is `Foam`.
- SUE tags and typedef names may not contain underscores.
- Enumerators (constants in an `enum` definition) must start with a short name of their enumeration, followed by an underscore, followed by a camelcase name starting with an uppercase letter. E.g. in `enum abSynTag`, we have enumerators such as `AB_Id` and `AB_Assign`. For grouping and iteration purposes, the enumeration may contain all uppercase, underscore-separated names such as `AB_STR_START` and `AB_STR_END`, which mark the first and last literal AST node kind, respectively.
- Constants may be defined in a nameless enumeration or with `#define` and follow the naming convention of enumerators.
- Functions and variables are named in camelcase starting with a lowercase letter. Functions and global variables belonging to a module start with a short name of their module. E.g. all functions and variables in the `absyn` module start with `ab`, and all functions in the `buffer` module start with `buf`.
- Function-style macros (introduced with `#define`) follow the naming convention of functions.
- Function declarations should equal their definition in argument types and names.
- Macros introducing new keywords should be named according to the C keyword naming convention, thus all lowercase, underscore-separated. E.g. `bitsizeof` and `local`.
- Local identifiers, types, enumerators, etc. follow the global naming convention, but do not need to be prefixed with their module name. Local names should prefer brevity to descriptiveness, adding a comment to the declaration if necessary, while global names should prefer descriptiveness to brevity.
- File-scope (static/local) functions and variables do not need to be prefixed with their module name, although this practice is recommended. If a file-scope declaration becomes useful outside the module, no renaming has to be performed if the global naming convention was followed.
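The naming conventions above can be illustrated with a small sketch. The stack module, its `stk` prefix, and every name in it are hypothetical and do not come from the compiler; defining `local` as `static` is likewise an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Keyword-style macro: all lowercase, like a C keyword (assumed meaning). */
#define local	static

/* Nameless enumeration for a constant, named like an enumerator. */
enum {
	STK_MaxDepth = 4
};

/* Enumerators: short enumeration name, underscore, camelcase name. */
enum stkStatus {
	STK_Ok,
	STK_Overflow
};

/* SUE tag in camelcase, starting with a lowercase letter. */
struct stkStack {
	int	items[STK_MaxDepth];
	size_t	depth;
};

/* Typedef names resemble their tags and start with an uppercase letter. */
typedef enum stkStatus	StkStatus;
typedef struct stkStack	StkStack;

/* Function-style macro, named like a function. */
#define stkIsFull(s)	((s)->depth == STK_MaxDepth)

/* Global function: starts with the short name of its module. */
StkStatus
stkPush(StkStack *stack, int item)
{
	assert(stack != NULL);
	if (stkIsFull(stack))
		return STK_Overflow;
	stack->items[stack->depth++] = item;
	return STK_Ok;
}

/* File-scope function: the module prefix is optional but recommended. */
local size_t
stkDepth(StkStack *stack)
{
	return stack->depth;
}
```

Because `stkDepth` already carries the module prefix, it could later be exported without renaming.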
Some general rules on identifier meanings:
- A camelcase name starting with an uppercase letter, not containing underscores, is a type name.
- A camelcase name starting with a lowercase letter, potentially containing underscores, is an identifier used in function and variable declarations.
- A camelcase name starting with an uppercase letter, containing at least one underscore, is a link time constant.
The above expressed as regular expressions:
Regular expression | Meaning |
---|---|
`[A-Z][A-Za-z0-9]*` | type name |
`[a-z][A-Za-z0-9]*` | identifier |
`[A-Z][A-Za-z0-9]*_[A-Za-z0-9]+` | constant |
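As a rough illustration, the three general rules can be turned into a small classifier. This sketch follows the prose rules (identifiers may contain underscores, constants contain at least one), which is slightly more permissive than the patterns in the table. The names `NameKind` and `nameClassify` are hypothetical and not part of the compiler.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum nameKind {
	NAME_Invalid,
	NAME_Type,
	NAME_Identifier,
	NAME_Constant
};

typedef enum nameKind	NameKind;

/*
 * Classify a name by its first character and its use of underscores.
 * Names with other characters, or not starting with a letter, are invalid.
 */
NameKind
nameClassify(const char *name)
{
	const char	*p;

	assert(name != NULL);
	if (!isalpha((unsigned char)name[0]))
		return NAME_Invalid;
	for (p = name; *p != '\0'; p++)
		if (!isalnum((unsigned char)*p) && *p != '_')
			return NAME_Invalid;
	/* A lowercase start marks a function or variable identifier. */
	if (islower((unsigned char)name[0]))
		return NAME_Identifier;
	/* An uppercase start is a constant only if an underscore is present. */
	return strchr(name, '_') != NULL ? NAME_Constant : NAME_Type;
}
```

With these rules, `AbSyn` is classified as a type name, `abNewEmpty` as an identifier, and both `AB_Id` and `AB_STR_START` as constants.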
A C source file should start with a comment containing the module name and a short description of what the module does. The same description goes into the header file. Here, we assume we are writing a new module for escape analysis, which can be used in various optimisations. The header is 78 characters wide, with the last `*` of the first line and the closing `/` of the last line being in the 78th column.
/*****************************************************************************
*
* escanl.c: Escape analysis for FOAM code.
*
* Copyright © 2013 Your name.
*
****************************************************************************/
If you modify an existing file substantially, you should add your own name to the copyright notice, preserving the existing one. If you introduce a new file whose licence differs from the project licence, you must include a short note stating which licence applies. E.g. "This file is licensed under GNU General Public License, version 3 or later".
After the header, all header files used by the module must be included. No `#include` directives are allowed after the first declaration. You may comment the directive with a reason for inclusion. Internal header files are included with `#include "file"`, external ones with `#include <file>`. The first file included must be the header file for the current module. This ensures that each header file is self-contained, so that other modules including it won't hit compile errors inside the header due to missing includes.
#include "escanl.h"
#include "list.h"
#include "strops.h"
#include "dflow.h"
#include "foam.h"
The inclusions should be lexicographically ordered according to ASCII, which means digits come first, then uppercase letters, then '_', then lowercase letters. You may additionally group and order inclusions by compiler layer (port < general < struct < phases < toplevel). The layer ordering is primary, and the alphabetical ordering is secondary.
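ASCII ordering is what `strcmp` produces when comparing file names byte by byte, so the alphabetical part of the rule can be sketched as a plain sort. The helper names `incCompare` and `incSort` and the file names used in the usage note are hypothetical; the layer grouping is not modelled here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two file names byte by byte via strcmp. */
static int
incCompare(const void *a, const void *b)
{
	const char *const	*lhs = a;
	const char *const	*rhs = b;

	return strcmp(*lhs, *rhs);
}

/* Sort an array of include file names into ASCII order, in place. */
void
incSort(const char **names, size_t count)
{
	assert(names != NULL);
	qsort(names, count, sizeof *names, incCompare);
}
```

Sorting `of_util.h`, `ofold.h`, `of2.h` and `of_cprop.h` this way yields `of2.h`, `of_cprop.h`, `of_util.h`, `ofold.h`: the digit sorts first, then the names with '_', then the lowercase letter.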
All additional types used in the module that are not exported by the header file must be declared directly after the header file inclusion block. This includes typedefs and SUE type definitions. Non-exported (opaque) aggregate type definitions (struct/union) must be defined after all local type definitions.
The local declarations block must declare all file-scope functions and contain all file-scope variables. They may use local types, so this block must come after the type definitions block. Macros that help define local functions may be defined above the block and must be undefined below the block. Using macros for function declarations is discouraged.
Local macros (both function-style and constant-style) must be defined after the function declarations. Although syntactically they could occur anywhere in the code, they may use local functions and types, so they should logically appear after the declarations they use. Function-style macros should be seen as inline functions, and constant-style macros as constants. Therefore, macros should only refer to identifiers declared before the macro definitions, and constant-style macros should expand to the same value in every context.
After all local declarations, the module's functions are defined. File-scope and global functions may be mixed so that semantically related functions are defined together.
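The block ordering described above (inclusions, local types, local declarations, local macros, function definitions) might look roughly like the following skeleton. All `esc` names are hypothetical. To keep the sketch self-contained, the exported declaration that would normally live in the module's own header is written inline, and only standard headers are included.

```c
/* The file header comment would come first. */

/*
 * In a real module the first inclusion is the module's own header;
 * this sketch only needs standard headers.
 */
#include <assert.h>
#include <stddef.h>

/* Exported declaration; it would normally live in the header file. */
extern int	escCountEscaping	(const int *flags, size_t count);

/* Local types, declared directly after the inclusion block. */
typedef struct escState	EscState;

struct escState {
	size_t	escaping;
};

/* Local declarations: file-scope functions and variables. */
static void	escVisit	(EscState *state, int flag);

static size_t	escVisited;

/* Local macros, defined after the declarations they may refer to. */
#define escIsEscaping(flag)	((flag) != 0)

/* Function definitions; file-scope and global ones may be mixed. */
static void
escVisit(EscState *state, int flag)
{
	if (escIsEscaping(flag))
		state->escaping++;
	escVisited++;
}

int
escCountEscaping(const int *flags, size_t count)
{
	EscState	state = { 0 };
	size_t		i;

	assert(flags != NULL);
	for (i = 0; i < count; i++)
		escVisit(&state, flags[i]);
	return (int)state.escaping;
}
```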
Alignment is generally done with tab characters, using spaces to align more precisely. A tab is 8 spaces, and any sequence of 8 spaces outside string literals should be replaced with a tab character.
- The `#` indicating the start of a preprocessor directive must appear in the first column of a line. It is followed by a number of spaces according to the `#if`/`#ifdef`/`#ifndef` nesting level, one space per level. Header guards are not considered a nesting level.
- The `include` keyword is followed by a single space and the included file name.
- The `define` keyword is followed by a single space and the macro name. Formatting of the macro body follows the context it will be expanded in.
- Macro definitions are aligned with tab characters, using spaces to align to smaller units.
#include "list.h"
#if HAVE_GMP
# include <gmp.h>
#endif
#define NUM_ZERO	0
#define NUM_MILLION	1000000
- An external (function or variable) declaration in a header file must start with the `extern` keyword, followed by the return type, one or more tab characters and the declared identifier.
- Function declarations are additionally followed by zero or more tab characters and an argument list.
- Declarations and arguments should be aligned with tab characters.
- If an argument list extends beyond 80 characters, a line break should be added after the comma and the next line be aligned one character after the last open left parenthesis.
- Both definitions and declarations of nullary functions must have the `void` keyword as their only parameter.
extern void	abInit		(void);
extern AbSyn	abNewEmpty	(AbSynTag tag,
				 Length argc);
extern void	abFree		(AbSyn ab);
- The return type of a function definition is on a separate line, so that the function name appears in the first column.
- The parameter list in a function definition follows the function name directly, without any spacing.
- Alignment of overlong parameter lists follows the declaration style.
- The opening and closing braces appear in the first column on a separate line.
AbSyn
abNewEmpty(AbSynTag tag,
	   Length argc)
{
	...
}
This section describes how to format all statements and expressions of the C language.
- No space between declarator-name and parameter-list in function definitions.
- One space after `if`, `while` and `for`: `if (...)`.
- One space before and after all operators except `->`, `.`, `[]`, cast and `sizeof`: `(int)arr[sizeof(3 + 4)]->val`.
- No space before or after `(` and `)` in function calls.
Less punctuation is preferred. Parentheses should only be used when the default operator precedence is overridden. E.g. `3 + 2 * 6` instead of `3 + (2 * 6)`. The `sizeof`-expression operator does not require parentheses: `sizeof 3` instead of `sizeof(3)`. The same applies to the `return` statement and `case` label statements: `return 3;` and `case 3:`.
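A short sketch of the punctuation rules in context. The function names and the `EX_Scale` constant are hypothetical; the point is only where parentheses appear and where they are left out.

```c
#include <assert.h>
#include <stddef.h>

enum {
	EX_Scale = 6
};

/* Parentheses appear only where the default precedence is overridden. */
int
exWeight(int kind)
{
	switch (kind) {
	case 3:
		return 3 + 2 * EX_Scale;
	default:
		return (3 + 2) * EX_Scale;
	}
}

/* sizeof applied to an expression needs no parentheses. */
size_t
exCellSize(void)
{
	long	cell = 0;

	return sizeof cell;
}
```

Note that `sizeof` applied to a type name, as in `sizeof(long)`, still requires parentheses in C; only the expression form can drop them.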
Parameter declarations should follow the syntax of local declarations. Thus, function pointers are declared as `Bool (*fptr)(String name)`, although the C standard allows `Bool fptr(String name)` in parameters. A pointer parameter should also not be declared as an incomplete array type, but as a pointer type: `AbSyn *argv` instead of `AbSyn argv[]`.
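The parameter rules can be seen in a small sketch. The `Bool` and `String` typedefs below are stand-ins defined only for this example (the compiler's own definitions may differ), and `exCountIf` and `exIsEmpty` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in typedefs, assumed for this example only. */
typedef int		Bool;
typedef const char	*String;

/*
 * Count the names accepted by the predicate. The predicate is declared
 * as an explicit function pointer, and the name vector as a pointer
 * rather than as an incomplete array type.
 */
size_t
exCountIf(String *namev, size_t namec, Bool (*pred)(String name))
{
	size_t	count = 0;
	size_t	i;

	assert(namev != NULL && pred != NULL);
	for (i = 0; i < namec; i++)
		if (pred(namev[i]))
			count++;
	return count;
}

/* Example predicate: true for the empty string. */
Bool
exIsEmpty(String name)
{
	return name[0] == '\0';
}
```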
After control structure initiators such as `if`, `else` and `while`, as well as after an opening brace `{`, the indentation level is increased by one, so that the next line contains one more tab character than the one with the initiator. The opening brace is on the same line as the control structure, the closing brace is in the same column as the first character of the control structure.
if (cond) {
	...
}
if (othercond)
	...;
`case`-labels appear in the same column as the `switch` keyword. The opening `{` and closing `}` are placed as in the `if` statement. The closing brace of a `case` statement is in the same column as the closing brace of its enclosing `switch` statement.
switch (tag) {
case AB_Id:
	break;
case AB_LitInteger: {
	break;
}
}
Labels for the jump statement `goto` are always placed in the first column. They may be followed by a tab character if they appear on the same line as the labelled statement.
	goto done;
	...
done:	return 0;
The jump statement should only be used for forward jumps. Backward jumps should generally be encoded in higher level control structures such as `while` and `for`.
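A common use of forward-only jumps is a shared exit path for error handling. The following is a minimal sketch with a hypothetical function `exDouble`; every `goto` targets a label further down, and the labels sit in the first column.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Write `text` twice in a row into `out`. Returns 0 on success and -1
 * on failure. All jumps go forward to the exit labels.
 */
int
exDouble(const char *text, char *out, size_t outSize)
{
	int	result = -1;
	size_t	len;
	char	*tmp;

	if (text == NULL || out == NULL)
		goto done;
	len = strlen(text);
	tmp = malloc(2 * len + 1);
	if (tmp == NULL)
		goto done;
	memcpy(tmp, text, len);
	memcpy(tmp + len, text, len + 1);
	if (2 * len + 1 > outSize)
		goto cleanup;
	memcpy(out, tmp, 2 * len + 1);
	result = 0;
cleanup:
	free(tmp);
done:	return result;
}
```

A retry loop, by contrast, would need a backward jump and is better written with `while` or `for`.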
Lines longer than 80 characters should be split over several lines. Following lines should be aligned by tabs and spaces as required. A follow-line must start one column after the last open parenthesis.
ab = abNewFoo(something, some(more),
	      anotherFunctionCall(argToCall(),
				  anotherArg),
	      someMoreStuff());