Commit

Update README.md
deryaerman authored Jan 31, 2025
1 parent 25280ac commit b181c0f
Showing 1 changed file with 20 additions and 12 deletions.
@@ -1,34 +1,38 @@
# CASS Parsing and AST Representation

This repository implements a CASS (Context-Aware Semantics Structure) parser and visitor-based AST generator using ANTLR4. It consists of the following key components:
This repository implements a CASS (Context-Aware Semantics Structure) parser using ANTLR4. It consists of the following key components:

- **`CASS.g4`**: Defines the grammar for parsing C-like syntax using ANTLR4.
- **`MyCASSVisitor.py`**: Implements a visitor pattern to traverse the parse tree and generate an AST.
- **`MyCASSVisitor.py`**: Implements a visitor pattern to generate and traverse the parse tree.
- **`DriverCASS.py`**: Acts as the main entry point for parsing and processing CASS input.
- **`CASSNode.py`**: Defines the AST node structure and serialization utilities.
- **`CASSNode.py`**: Defines the CASS node structure and serialization utilities.

---

## CASS Grammar (`CASS.g4`)
This file defines the ANTLR4 grammar for parsing a subset of C-like syntax, including:
- Function definitions
- Statements (if, while, for, return, etc.)
- Statements (if, while, for, return, switch, case, etc.)
- Expressions (arithmetic, logical, assignment)
- Parenthesized expressions
- Function calls
- Variable declarations
- Arrays, lists and pointers

The grammar ensures a well-structured parse tree that is then visited by `MyCASSVisitor.py`.

---

## CASS Visitor (`MyCASSVisitor.py`)
This module implements the visitor pattern for processing the parse tree generated by ANTLR4. Key functionalities include:
- Handling **function definitions** and **compound statements**.
This module implements the visitor pattern for processing every construct parsed according to the grammar file. Key functionalities include:

- Constructing `CASSNode` objects
- **Labeling for nodes**, including variable declarations, expressions, and operators.
- **Child management**, allowing hierarchical tree representation.
- Distinguishing between **local and global variables**.
- Recognizing **parenthesized expressions** and **operator precedence**.
- Constructing an AST using `CASSNode`.
- Properly formatting function calls and argument lists.
...

The visitor ensures a structured transformation of the parsed syntax into an intermediate AST representation.
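
The visitor responsibilities listed above (labeling nodes, managing children, separating local from global variables, and keeping parenthesized expressions and function calls distinct) can be illustrated with a small stand-alone sketch. It does not use ANTLR4 or the repository's real classes: `ToyVisitor`, the tuple-based parse tree, and the `local:`/`global:` label prefixes are all hypothetical names chosen for illustration.

```python
class ToyVisitor:
    """Illustrative visitor (hypothetical, not MyCASSVisitor).

    The toy parse tree is made of tuples whose first element names the
    node kind; visit() dispatches to the matching visit_<kind> method.
    """

    def __init__(self, local_names):
        # Names declared inside the current function are treated as local.
        self.local_names = set(local_names)

    def visit(self, tree):
        handler = getattr(self, "visit_" + tree[0])
        return handler(tree)

    def visit_num(self, tree):
        return {"label": tree[1], "children": []}

    def visit_var(self, tree):
        # Distinguish local from global variables in the label.
        scope = "local" if tree[1] in self.local_names else "global"
        return {"label": f"{scope}:{tree[1]}", "children": []}

    def visit_paren(self, tree):
        # Keep parentheses as their own node so grouping is not lost.
        return {"label": "()", "children": [self.visit(tree[1])]}

    def visit_binop(self, tree):
        _, op, left, right = tree
        return {"label": op, "children": [self.visit(left), self.visit(right)]}

    def visit_call(self, tree):
        # One child per argument, in source order.
        _, name, args = tree
        return {"label": f"{name}()", "children": [self.visit(a) for a in args]}


# Toy tree for the expression: x + (f(1, g))
tree = ("binop", "+", ("var", "x"),
        ("paren", ("call", "f", [("num", "1"), ("var", "g")])))
result = ToyVisitor(local_names={"x"}).visit(tree)
```

Here `result` is a nested dictionary rooted at the `+` operator. A real ANTLR4 visitor would instead subclass the generated visitor class and receive context objects rather than tuples.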

@@ -45,18 +49,22 @@ This script acts as the core engine for testing and processing input files.

---

## AST Node Representation (`CASSNode.py`)
This file defines the `CassNode` class, which represents nodes in the AST. It includes:
- **Labeling for nodes**, including variable declarations, expressions, and operators.
- **Child management**, allowing hierarchical tree representation.
## Node Representation (`CASSNode.py`)
This file defines the `CassNode` class, which represents nodes in the CASS tree. It includes:
- **Serialization to CASS format**, ensuring proper formatting for output.
- **GraphViz DOT export**, enabling visualization of AST structures.
- **GraphViz DOT export**, enabling visualization of the tree structure.

The `CassNode` class provides the foundational structure for storing and manipulating AST representations.
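
As a rough illustration of the node features described above, the following sketch shows a labeled tree node with child management, a text serialization, and GraphViz DOT export. `SketchNode` is a hypothetical stand-in, not the repository's `CassNode`, and the parenthesized output of `serialize()` is an invented placeholder, not the actual CASS format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Hypothetical stand-in for CassNode: a label plus ordered children."""
    label: str
    children: List["SketchNode"] = field(default_factory=list)

    def add_child(self, child: "SketchNode") -> "SketchNode":
        self.children.append(child)
        return child

    def serialize(self) -> str:
        """Pre-order text form (illustrative only, not the real CASS format)."""
        if not self.children:
            return self.label
        inner = " ".join(c.serialize() for c in self.children)
        return f"({self.label} {inner})"

    def to_dot(self) -> str:
        """Emit a GraphViz DOT digraph with one numbered vertex per node."""
        lines = ["digraph G {"]
        counter = 0

        def walk(node: "SketchNode") -> int:
            nonlocal counter
            my_id = counter
            counter += 1
            escaped = node.label.replace('"', '\\"')
            lines.append(f'  n{my_id} [label="{escaped}"];')
            for child in node.children:
                child_id = walk(child)
                lines.append(f"  n{my_id} -> n{child_id};")
            return my_id

        walk(self)
        lines.append("}")
        return "\n".join(lines)


# Build the tree for "a + 1" and export it.
root = SketchNode("+")
root.add_child(SketchNode("a"))
root.add_child(SketchNode("1"))
text_form = root.serialize()
dot_form = root.to_dot()
```

The string in `dot_form` can be written to a `.dot` file and rendered with the GraphViz `dot` tool to inspect the tree visually.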

---

## How to Run
Ensure you have ANTLR4 installed and available in your environment. To run the parser:
```
java -jar "antlr-4.13.2-complete.jar" -Dlanguage=Python3 -visitor CASS.g4
python DriverCASS.py <input_file.c>
```

## Our Jupyter Notebook
We also created a Jupyter notebook, `Execution.ipynb`, which integrates vectorization and similarity score calculation using a pretrained graph neural network provided by the authors of the MISIM paper. Running it may require installing additional packages.
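
The notebook's scoring code is not reproduced here. As a minimal sketch, assuming the network maps each code snippet to a fixed-length embedding vector and that two embeddings are compared with cosine similarity (an assumption, not something this README confirms), the score could be computed as follows. The function name `cosine_similarity` and the example vectors are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors (assumed metric)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up embeddings standing in for the network's output.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Scores near 1 indicate embeddings pointing in nearly the same direction, and scores near 0 indicate unrelated ones.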
