orcalinux
released this
25 Dec 11:03
·
10 commits
to main
since this release
Release v1.0.0 - LL(1) TINY Parser
We are pleased to announce the release of version 1.0.0 of the LL(1) TINY Parser. This release marks the completion of a fully functional LL(1) parser designed for the simplified TINY programming language.
Project Overview
The LL(1) TINY Parser is built using the LL(1) parsing technique, a top-down method that uses one lookahead token to make parsing decisions. This parser efficiently analyzes the syntax of TINY programs, ensuring they conform to the defined grammar rules.
How It Works
-
Lexical Analysis (Tokenization):
- The parser begins by tokenizing the input source code, breaking it down into a sequence of tokens such as identifiers, keywords, operators, and delimiters.
-
Parsing Table:
- An LL(1) parsing table is constructed based on the grammar of the TINY language. This table maps pairs of non-terminals and terminal tokens to specific production rules, guiding the parsing process.
-
Parsing Process:
- The parser utilizes a stack to manage the current state of parsing. The stack is initialized with the start symbol of the grammar and an end-of-input marker.
- During parsing, the parser repeatedly examines the symbol at the top of the stack and the current input token:
- If the top symbol is a terminal:
- It must match the current input token. If it does, the symbol is popped from the stack, and the parser advances to the next token.
- If it does not match, a syntax error is reported.
- If the top symbol is a non-terminal:
- The parser consults the parsing table to determine which production rule to apply.
- The non-terminal is then replaced by the symbols on the right-hand side of the chosen production rule, which are pushed onto the stack in reverse order.
- If no valid production exists for the given pair, a syntax error is reported.
- If the top symbol is a terminal:
-
Termination:
- The parsing process successfully terminates when both the stack and the input buffer are empty, indicating that the entire input has been correctly parsed.
- If discrepancies remain, the parser reports a syntax error.
Usage of the Stack
The stack is a critical component in the LL(1) parsing mechanism, serving several key functions:
-
State Management:
- The stack maintains the sequence of grammar symbols that need to be processed. By pushing and popping symbols based on the parsing table, the stack effectively tracks the parser's progress through the grammar.
-
Derivation Control:
- When a non-terminal is encountered, the stack is updated with the corresponding production rule's right-hand side symbols. This allows the parser to recursively handle nested and hierarchical grammar structures.
-
Error Detection:
- The stack facilitates the detection of syntax errors by ensuring that only valid sequences of symbols, as defined by the grammar, are processed. Any deviation results in an immediate error report.