Project Title
Low-Level Programming language implemented in Python.
This project is available on GitHub.
What?
Petl is a compiled, low-level, stack-based programming language. It is not designed to be used in any production environment, but is instead intended to be a hobbyist project to advance my knowledge of assembly code and compilers. While the compiler is written in Python, it is my intention to make the language self-hosted.
Why?
This project was born from a severe amount of boredom one day. Stuck for ideas, I decided it would be a fun project to undertake. Several weeks later, and the language is finally ready to be shared with the world. Please do not expect much of this language - I'm not trying to write the next Rust or anything.
Dependencies
Because this project uses no external libraries, only four things are required to compile and run Petl code:
- GNU/Linux operating system (Windows Subsystem for Linux (WSL) should work, but is untested)
- Python 3.11 or later
- Netwide Assembler (NASM)
- GCC Compiler Suite (Preinstalled on most UNIX-like systems)
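As a quick way to confirm the prerequisites above, here is a minimal sketch of a check script. The helper name `missing_dependencies` is hypothetical and not part of Petl, and it assumes the assembler and compiler are installed under their usual executable names, `nasm` and `gcc`.

```python
import shutil
import sys


def missing_dependencies(min_python=(3, 11), tools=("nasm", "gcc")):
    """Hypothetical helper (not part of Petl): list the prerequisites that are missing."""
    missing = []
    # Compare the running interpreter against the minimum required version.
    if sys.version_info < min_python:
        missing.append("python>=" + ".".join(str(part) for part in min_python))
    for tool in tools:
        # shutil.which returns None when the executable is not found on PATH.
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All dependencies found")
```

The script only checks that the tools can be found on `PATH`; it does not verify their versions.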
Project Goals
In order to guide the development process of Petl, I have set the following goals for myself:
- No use of external Python Libraries -- COMPLETED
- Code compiling straight to Assembly Language, with no IR -- COMPLETED
- Compatibility with libc functions such as printf -- COMPLETED
- Static and explicit typing with user-defined data structures -- COMPLETED
- Rigorous type checking -- IN PROGRESS
- Garbage Collection -- NOT STARTED
- Self-Hosted Compiler -- IN PROGRESS
- Syntax documentation and example programs -- NOT STARTED, HIGH PRIORITY
How does it work?
Compiler development is a huge rabbit hole that has an entire field of Computer Science dedicated to it. For Petl, I have implemented the minimum compiler pipeline needed to convert source code to executable machine code. It is best described with the following diagram:
The Lexer first converts the source code into a list of tokens - objects representing a small part of a program such as a bracket or a number. This puts the code into a format that is easier to work with and manipulate.
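To make the lexing step concrete, here is a minimal sketch of a regex-based lexer in Python. It is not Petl's actual lexer: the `Token` class, the token kinds, and the tiny arithmetic syntax it accepts are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # assumed token categories, e.g. "NUMBER" or "LPAREN"
    text: str  # the exact source text the token covers


# Illustrative token rules only; Petl's real token set is not shown here.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Convert a source string into a list of Token objects."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"Unexpected character {source[pos]!r} at offset {pos}")
        # Whitespace is matched so it can be consumed, but it produces no token.
        if match.lastgroup != "SKIP":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, `lex("(1 + 23)")` yields five tokens: a left bracket, the number `1`, the operator `+`, the number `23`, and a right bracket.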
After this, the Parser moves through the list of tokens, building a tree that represents the syntax of the code: an Abstract Syntax Tree (AST).
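As an illustration of this step, here is a minimal recursive-descent parser sketch for simple arithmetic. It is an assumption-laden example rather than Petl's parser: tokens are represented as hypothetical `(kind, text)` tuples, and the tree is built from nested tuples such as `("+", left, right)`.

```python
def parse(tokens):
    """Parse a list of (kind, text) tuples into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        # Return the current token, or an end-of-input marker when none remain.
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def primary():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "LPAREN":
            node = expression()
            if advance()[0] != "RPAREN":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def term():
        # '*' and '/' bind tighter than '+' and '-', so they are parsed one level lower.
        node = primary()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, primary())
        return node

    def expression():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expression()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected trailing token {peek()[1]!r}")
    return tree
```

Given the tokens for `1 + 2 * 3`, this returns `("+", ("num", 1), ("*", ("num", 2), ("num", 3)))`, so the multiplication sits deeper in the tree and is evaluated first.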
This tree is then passed to the Assembly Code Emitter, which recursively flattens the tree into raw assembly code. The result is saved to a file that can then be assembled and linked into an executable file.
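The recursive flattening can be sketched as follows. This is a hypothetical emitter, not Petl's: it assumes a nested-tuple tree such as `("+", ("num", 1), ("num", 2))`, produces NASM-style x86-64 instructions as strings, and uses the machine stack for intermediate values, in keeping with a stack-based design.

```python
# Assumed mapping from tree operators to x86-64 instructions (illustrative only).
INSTRUCTIONS = {"+": "add", "-": "sub", "*": "imul"}


def emit(node):
    """Recursively flatten an expression tree into a list of assembly lines."""
    kind = node[0]
    if kind == "num":
        # Leaf node: push the literal onto the stack (suitable for small integers).
        return [f"push {node[1]}"]
    if kind not in INSTRUCTIONS:
        raise ValueError(f"unsupported operator {kind!r}")
    _, left, right = node
    # Emit code for both operands first, so their results are on the stack.
    lines = emit(left) + emit(right)
    lines.append("pop rbx")  # right operand
    lines.append("pop rax")  # left operand
    lines.append(f"{INSTRUCTIONS[kind]} rax, rbx")
    lines.append("push rax")  # leave the result on the stack
    return lines


if __name__ == "__main__":
    print("\n".join(emit(("+", ("num", 1), ("num", 2)))))
```

Each subtree leaves exactly one value on the stack, which is what lets the emitter treat operands of any depth uniformly. A real emitter would also write section headers, an entry point, and calls into libc around these instructions.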