michalbali256 edited this page Jul 7, 2020 · 3 revisions

The parser component reads the token stream produced by the lexer and recognizes HLASM statements. It inherits from the HLASM recognizer generated by ANTLR (see Third party libraries) and extends it with further operations.

Parser Workflow

The parser (referred to in code as parser_impl) implements the opencode statement provider interface. Following the statement-passing scheme described for statement providers, it therefore parses each statement in two steps:

  1. The parser calls the rule label_instr, which parses the label and instruction fields into their respective structures. The operand and remark fields are stored together as an unparsed string.

  2. After retrieving the processing format, the parser selects a corresponding rule to parse operands. With the rule, it parses the remaining string from the previous step.
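The two steps above can be sketched as follows. This is a minimal C++ illustration, not the actual parser_impl code: the types first_pass_result and processing_format, the function parse_operands, and the simplified field splitting are assumptions made for the example. Only the name label_instr comes from the text, and here it is an ordinary function rather than a grammar rule.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only.
enum class processing_format { machine, assembler, deferred };

struct first_pass_result
{
    std::string label;
    std::string instruction;
    std::string rest; // operand and remark fields, kept unparsed
};

// Step 1: parse the label and instruction fields, keep the rest as a string.
first_pass_result label_instr(const std::string& line)
{
    first_pass_result r;
    std::size_t pos = 0;

    // A label starts in the first column; a leading blank means no label.
    if (!line.empty() && line[0] != ' ')
    {
        pos = std::min(line.find(' '), line.size());
        r.label = line.substr(0, pos);
    }

    // Skip blanks, then read the instruction field.
    pos = std::min(line.find_first_not_of(' ', pos), line.size());
    std::size_t end = std::min(line.find(' ', pos), line.size());
    r.instruction = line.substr(pos, end - pos);

    // Everything after the instruction stays an unparsed string.
    pos = std::min(line.find_first_not_of(' ', end), line.size());
    r.rest = line.substr(pos);
    return r;
}

// Step 2: once the processing format is known, parse the remaining string.
std::vector<std::string> parse_operands(const std::string& rest, processing_format format)
{
    if (format == processing_format::deferred)
        return { rest }; // format not known yet: keep the string as it is

    // Simplification: operands end at the first blank and are split on commas.
    // Real operands may contain quoted blanks and nested parentheses.
    std::string field = rest.substr(0, rest.find(' '));
    std::vector<std::string> operands;
    if (field.empty())
        return operands;

    std::size_t start = 0;
    while (true)
    {
        std::size_t comma = field.find(',', start);
        operands.push_back(field.substr(start, comma - start));
        if (comma == std::string::npos)
            break;
        start = comma + 1;
    }
    return operands;
}
```

In this sketch, calling label_instr on a line and then passing its rest member to parse_operands mirrors the order in which the real parser first asks for the processing format and only then chooses an operand rule.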

To parse the remaining strings, the parser subcomponent contains two parsers. The first parses a source file statement by statement. The second parses the operands from the string passed to it by the first parser.

To ensure that operand ranges are expressed relative to the source file rather than to the passed string, the parser uses a range provider. It lets the second parser produce ranges for reparsed operands that are consistent with the ranges of the other fields. The range provider is initialized with the begin location of the operand field in the statement, and all ranges created during parsing are adjusted to have correct boundaries.
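The adjustment can be pictured with a small sketch. The position and range structs and the adjust member functions below are assumptions for illustration and do not reproduce the project's range provider. The sketch also assumes zero-based lines and columns and ignores the column rules of continuation lines.

```cpp
#include <cassert>
#include <cstddef>

struct position
{
    std::size_t line;
    std::size_t column;
};

struct range
{
    position start;
    position end;
};

// Hypothetical, simplified range provider.
class range_provider
{
    position field_begin_; // where the operand field starts in the source file

public:
    explicit range_provider(position operand_field_begin)
        : field_begin_(operand_field_begin)
    {}

    // Maps a position measured inside the reparsed string to a source position.
    position adjust(position p) const
    {
        // Only the first line of the string shares a source line with the
        // field start, so only there is the column shifted.
        if (p.line == 0)
            return { field_begin_.line, field_begin_.column + p.column };
        return { field_begin_.line + p.line, p.column };
    }

    range adjust(range r) const { return { adjust(r.start), adjust(r.end) }; }
};
```

With an operand field that begins at line 3, column 15, a range covering columns 2 to 5 of the reparsed string maps to columns 17 to 20 of line 3 in the source file.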

Statement Structure

During the parsing of a statement, several structures are created: label_si, instruction_si, operand_si and remark_si (si = semantic information). A collector gathers them and builds them into the structure statement_si.

Label and instruction structures contain either the identifier of a symbol or, in a model statement, a concatenation of strings and variable symbols. The remark field is a plain string because it only carries commentary. The operand field contains a list of the operands used in the statement, which can have several formats.
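A rough sketch of how the four field structures could be collected into one statement is shown below. The structure names follow the text, but their members and the collector's member functions (set_label, add_operand, extract_statement and so on) are hypothetical. The sketch stores every field as plain text, whereas the actual structures also carry ranges and richer content.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the semantic information structures.
struct label_si { std::string text; };
struct instruction_si { std::string text; };
struct operand_si { std::vector<std::string> operands; };
struct remark_si { std::string text; };

struct statement_si
{
    label_si label;
    instruction_si instruction;
    operand_si operands;
    remark_si remark;
};

// Gathers the fields as they are parsed and hands over a complete statement.
class collector
{
    statement_si current_;

public:
    void set_label(std::string text) { current_.label.text = std::move(text); }
    void set_instruction(std::string text) { current_.instruction.text = std::move(text); }
    void add_operand(std::string text) { current_.operands.operands.push_back(std::move(text)); }
    void set_remark(std::string text) { current_.remark.text = std::move(text); }

    // Returns the collected statement and resets the collector for the next one.
    statement_si extract_statement()
    {
        statement_si result = std::move(current_);
        current_ = statement_si{};
        return result;
    }
};
```

Resetting the collector on extraction keeps fields of one statement from leaking into the next, which matters because the same collector is reused for every statement in this sketch.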

Operand Formats

The statement processor can request the parser to retrieve statements with these operand formats:

  • machine/assembler/conditional assembly/macro – instruction operands. Each type of instruction has its specific format.

  • model – operands for model statements. It is a chain of strings and variable symbols.

  • deferred – operands whose format is not yet known. Stored as a string.

Each operand format has a corresponding operand structure. All of them inherit from the abstract operand, and each has further children for the different kinds of operand in that format (see the picture below). The assembler and machine operand structures inherit from the evaluable operand, a common structure for operand objects that are composed of resolvable objects (see HLASM context tables#Symbol Dependency Tables).

Operand structure inheritance.
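The inheritance described above might look roughly like this in C++. Only the ideas of an abstract operand and an evaluable operand come from the text. The concrete class names, the operand_kind enum and the helper count_evaluable are assumptions for this sketch, and the further children of each format are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

enum class operand_kind { machine, assembler, model, deferred };

// Abstract base of every operand structure.
struct operand
{
    virtual ~operand() = default;
    virtual operand_kind kind() const = 0;
};

// Intermediate base for operands composed of resolvable objects.
struct evaluable_operand : operand
{};

struct machine_operand : evaluable_operand
{
    operand_kind kind() const override { return operand_kind::machine; }
};

struct assembler_operand : evaluable_operand
{
    operand_kind kind() const override { return operand_kind::assembler; }
};

struct model_operand : operand
{
    operand_kind kind() const override { return operand_kind::model; }
};

struct deferred_operand : operand
{
    std::string text; // the unparsed operand field
    operand_kind kind() const override { return operand_kind::deferred; }
};

// Counts the operands that derive from the evaluable operand.
std::size_t count_evaluable(const std::vector<std::unique_ptr<operand>>& ops)
{
    std::size_t n = 0;
    for (const auto& op : ops)
        if (dynamic_cast<const evaluable_operand*>(op.get()) != nullptr)
            ++n;
    return n;
}
```

Placing the shared behaviour in an intermediate base lets code that resolves dependencies treat machine and assembler operands uniformly without knowing their concrete type.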

Concatenation Structures

A model statement is a statement that contains a variable symbol in any of its fields. The variable symbol is later substituted by an arbitrary string, and the field is then re-parsed. Such a field is therefore represented as a concatenation of individual sub-fields, each held in a specialized structure. The concatenation can be evaluated to produce the final string.

The helper structures are:

  • char_str – a character string.

  • var_sym – a substitutable variable symbol.

  • dot, equals – characters with a special meaning.

  • sublist – a recursive concatenation enclosed in parentheses.
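Evaluating such a concatenation could be sketched as below. The helper names match the list above, but the concat_point layout and the evaluate function are assumptions for illustration. The sketch follows the HLASM convention that a period directly after a variable symbol only terminates the symbol name and is not part of the result, and it simplifies a sublist to a single nested chain.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class concat_type { char_str, var_sym, dot, equals, sublist };

// One element of a concatenation (hypothetical layout).
struct concat_point
{
    concat_type type;
    std::string text;                 // literal for char_str, name for var_sym
    std::vector<concat_point> nested; // used by sublist only
};

// Substitutes variable symbols and joins the chain into the final string.
std::string evaluate(const std::vector<concat_point>& chain,
    const std::map<std::string, std::string>& variables)
{
    std::string out;
    bool previous_was_var = false;
    for (const concat_point& p : chain)
    {
        bool is_var = false;
        switch (p.type)
        {
            case concat_type::char_str:
                out += p.text;
                break;
            case concat_type::var_sym: {
                // Unknown symbols evaluate to an empty string in this sketch.
                auto it = variables.find(p.text);
                if (it != variables.end())
                    out += it->second;
                is_var = true;
                break;
            }
            case concat_type::dot:
                // A dot right after a variable symbol is only a separator.
                if (!previous_was_var)
                    out += '.';
                break;
            case concat_type::equals:
                out += '=';
                break;
            case concat_type::sublist:
                out += '(' + evaluate(p.nested, variables) + ')';
                break;
        }
        previous_was_var = is_var;
    }
    return out;
}
```

For example, with R set to FLD and A set to 1, the chain for &R.X=(&A) evaluates to FLDX=(1), which is the string the parser would then re-parse.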

Grammar Implementation

Grammar rules describing the parser are separated into several files (see the grammar visualization):

  • hlasm_parser.g4 – Top-level rules.

  • lookahead_rules.g4 – Rules for lookahead mode.

  • label_field_rules.g4 – Rules taking care of the label field of statements.

  • instruction_field_rules.g4 – Rules taking care of the instruction field of statements.

  • operand_field_rules.g4 – Rules taking care of the operand field of statements.

  • macro/machine/assembler/ca/model/deferred_operand_rules.g4 – Various operand field rules.

  • ca/asm_expression_rules.g4 – Rules for expressions.

  • data_def_rules.g4 – Rules for data definition.
