Preface
In many ways, this is more than just a passion project; it's a journey through the realms of curiosity and perseverance. While at times it might seem like an exercise in futility, this project and the subsequent documentation of my adventures in its pursuit serve as a testament to the possibility of completion, not only for myself but for all the fellow procrastinators out there.
Esoteric programming languages have always held a particular fascination for me despite the absence of formal training in computer science. This lack, perhaps paradoxically, heightened my interest in implementing these languages. However, attempting to construct languages based on tutorials or delving into The Dragon Book presented challenges due to my unfamiliarity with both C and C++, which are considered the premier programming languages for such endeavors.
The turning point came with the discovery of Rust and the associated The Rust Programming Language Book, affectionately known as The Book in the community. This marked the moment I felt ready to take on a task as significant as creating an interpreter. While much has been written about the merits (and demerits) of using Rust as a backend for programming languages, most of this information was beyond my current understanding. Notably, the Rust community has successfully implemented various languages, such as Rhai, Kind, and Gleam, proving its versatility. For a hobbyist like me, Rust provided the perfect foundation to realize my project.
Now, why Brainf**k[^note]? Firstly, it's amusing; the name alone is sufficient to capture people's attention. More importantly, it's a simple yet expressive language. With only eight instructions, it discards many other useful features, making it an ideal testbed where someone like me can build, test, iterate, and repeat until the end of time. While a Brainf**k interpreter in Rust is not a novel idea and has been done before, the novelty lies in the fact that I hadn't done it, and therein lies the rub.
In conclusion, I wish to express my gratitude to a few individuals.
- My very patient wife and two boisterous daughters, who not only endured my brainstorming sessions but also occasionally tolerated my absence during this project.
- Tris Oaten of Lost Terminal and No Boilerplate fame, among other things. His Rust evangelism kept my interest in Rust alive, and his life lesson videos played a pivotal role in keeping me on track.
[^note] I am censoring for the sake of propriety, not because I have any issues with cursing.
Introduction
BrainFoamKit
, or BFK
as I like to call it, is a project that aims to implement a brainf**k, (or BF) interpreter in Rust. To make things easier for myself, and to keep up with the times, I have added one more instruction However, it's not JUST an interpreter. The broader scope aim of this project is to implement a complete, if poorly designed, and functional Virtual Machine (the BFKVM) that can interpret the instructions and allow for us to take a cross-section look as it executes the BFK
program.
For this purpose, the project has three distinct parts:
brainfoamkit_lib
, a library that implements theBFKVM
and the interpreter forBFK
.bfkrun
, a binary that can be used to run programs written in eitherBF
orBFK
dialects.bfkview
, a TUI application with a (hopefully) intutive interface that lets you step through BF[K] programs.
Rationale
Brainf**k is an interesting esoteric language. It is turing complete and can theoretically be used as a general purpose programming language. However, it has only 8 individual symbols that are used to instruct the interpreter. This makes it both fun and challenging to implement.
While several C and C++ interpreters for Brainf**k exist, I believe that it is particularly well suited for implementation in Rust due the combination of memory-safety, speed and zero-cost abstractions. Additionally, the interpreter is expected to be non-trivial in complexity while still only scratching the surface of the features Rust has to offer. Thus, it provides an excellent educational opportunity for someone trying to learn Rust.
Implementation Details
Briefly, the interpreter is implemented as a Virtual Machine on top of the existing Rust runtime. This frees us from
low-level hardware constraints, allowing us to focus on the core of the program. The language remains focused on Byte
-level operations and the ASCII code.
More details regarding the implementation, including an EBNF grammar and other design tradeoffs can be found in
the language reference pages for the original BF Specification and the BFK
dialect.
(Planned) Features
- A complete brainfuck interpreter capable of ingesting a brainfuck program and behaving appropriately
- A modular system for the said interpreter, allowing for extensions and modifications
- A configurable brainfuck virtual machine to interpret the programs.
- A fully capable TUI to visualize and step through a brainfuck program
Current Status
- Implement basic building blocks for the Virtual Machine
- Implement the Virtual Machine to run the code
- Implement a parser for parsing the input program
- Design the TUI for the visualizer
- Implement the TUI with the Virtual Machine and parser
Progress for the project can be tracked on the Github Issues and the Github Project pages.
Contributing
See the Contributing for details on how to contribute to the project.
You can contribute to the project through GitPod.
Code of Conduct
This project is governed by the Contributor Code of Conduct Covenant. Details are outlined in the CODE OF CONDUCT.
Getting Started
Installation
The software can be installed in different ways, depending on your taste:
Cargo Install
You can install it directly from crates.io using cargo
:
cargo install brainfoamkit
Cargo Binaries
You can also use the excellent cargo-binstall tool to install the binaries directly from the repository:
cargo install cargo-binstall
cargo binstall brainfoamkit
Manual Installation
You can also install the binaries manually by downloading the appropriate binaries from the releases page and placing them in your $PATH
.
Usage
bfkrun
bfkrun
is the main binary that can be used to run programs written in either BF
or BFK
dialects.
bfkrun
accepts both strings and files as input.
# Use a direct input string
bfkrun --input "+[-->-[>>+>-----<<]<--<---]>-.>>>+.>>..+++[.>]<<<<.+++.------.<<-.>>>>+."
#> Hello, World!
# Use a file as input
# Create a file with the program
echo "+[-->-[>>+>-----<<]<--<---]>-.>>>+.>>..+++[.>]<<<<.+++.------.<<-.>>>>+." > hello_world.bf
# Run the program
bfkrun --file hello_world.bf
#> Hello, World!
bfkview
bfkview
is a TUI application with a (hopefully) intuitive interface that lets you step through BF[K] programs.
Detailed instructions coming soon.
The BFK Interpreter - bfkrun
Draft Chapter
The BFK View Tool - bfkview
Draft Chapter
The Brainf**k Language Reference
Background
Brainf**k is an esoteric and minimalist programming language. This language has entertained and puzzled programmers since its inception. The programming language uses only 8 symbols, with 7 primitive operations and works on a memory array. Despite this simplicity, it happens to be Turing Complete.
This chapter provides an overview of the language, as it was specified by Urban Müller in 1993.
Contents
Acknowledgements
- Urban Müller, for creating the Brainf**k language.
- The Esolangs Wiki for doing a ton of research so I didn't have to.
- The Brainf**k Wikipedia Page for introducing me to this intriguing project.
Background
Esoteric Programming Languages
While the majority of programming languages are designed with the express purpose of providing an easier and more human-readable interface for communicating with computers, a minority of languages choose to go a different way.
These esoteric programming languages, or esolangs, are designed for different purposes that include, but are not limited to, Research, Challenge, Art, and Fun. These languages, then, are not designed to be practical or efficient, but rather to be interesting, challenging and forcing oneself to think.
The examples of the languages that follow each of the purposes above are:
- Research: Binary lambda calculus is an esolang that is designed to be representable by bits or bytes, encoding lambda calculus.
- Challenge: Malbolge is an esolang designed to be extremely difficult to program in.
- Art: Piet is an esolang where programs take the shape of bitmaps of abstract art.
- Fun: LOLCODE is an esolang that purportedly reads as if the lolcats cats were writing code.
Brainf**k
Brainf**k is an esoteric programming language that tries to be minimalist, yet expressive. It happens to be one of the most famous esoteric languages worldwide, not in the least due to the expletive that is part of its name.
The language was designed by Urban Muller in 1993 with the goal of creating a language for which he could create the smallest compiler for Amiga OS 2.0. This goal was realized and he managed to write a 240 byte compiler.
Relationship to Other Esolangs
The design for Brainf**k was inspired by another esolang with the same goal: FALSE. The FALSE compiler was 1024 bytes and was implemented for the 68000 assembler.
Another programming language, \( P'' \) bears a lot of similarity to Brainf**k. This language was designed by Corrado Böhm in 1964 as a way to describe a subclass of turing machines. \( P'' \) is designed for a left-infinite turing machine and relies on two symbols and their permutations. The primitive functions for \( P'' \) can be mapped directly to those of Brainf**k.
Turing Completeness
Brainfk is a Turing Complete language under well-defined conditions. This means that if a certain set of conditions is satisfied, then Brainfk can be used to compute any computable function.
The conditions for Turing Completeness are:
- The array is unbounded, or in other words, the memory tape is infinite. OR
- The array is at least three cells long and each cell can store unbounded (infinite) values.
If either of the above mentioned conditions are met, then Brainfk is a Turing Complete language. The requirements for certain things to be unbounded may seem to indicate that it is impossible. However, in practice, the unbounded condition is irrelevant for practical considerations, particularly if you consider modern computers with large memory sizes. In fact, it should be trivial to implement a VM for Brainfk that has a practically unbounded memory tape.
Language Syntax
Language Semantics
EBNF Grammar
Introduction
This chapter serves to provide an annotated Extended Backus-Naur Form (EBNF) grammar for the language. The grammar is not intended to be a formal specification of the language, but rather a tool for understanding the language's syntax.
The EBNF syntax allows us to express the grammar in a concise and readable way. The grammar is annotated with comments that explain the meaning of each rule. The grammar is also annotated with the precedence and associativity of each operator.
EBNF Crash Course
EBNF is a notation for describing the syntax of a language. It is a formal grammar that is used to describe the syntax of many programming languages, including the language of this book.
Briefly, an EBNF grammar consists of a series of rules. The rules can be "terminal" or "non-terminal". A terminal rule is a rule that is defined in terms of a literal string. A non-terminal rule is a rule that is defined in terms of other rules.
BF Grammar
The BF Grammar will be split into two sections:
- The terminal symbols: This will enumerate all the symbols and what they mean.
- The the non-terminal constructs: These would be the constructs, made up of the terminal symbols that will actually appear in the program.
Terminal Symbols
The following rules map the operations to the terminal text symbols:
Increment
The increment
operation is represented by the +
symbol. This operation increments the value at the current cell by one.
increment = "+";
@startebnf
(* Increment the value at the current cell by one. *)
increment = "+";
@endebnf
Decrement
The decrement
operation is represented by the -
symbol. This operation decrements the value at the current cell by one.
decrement = "-";
@startebnf
(* Decrement the value at the current cell by one. *)
decrement = "-";
@endebnf
Move Right
The move_right
operation is represented by the >
symbol. This operation moves the pointer to the right by one cell. This operation translates into incrementing the memory pointer by one. If this operation is performed when the pointer is at the last cell, then the pointer will wrap around to the first cell.
move_right = ">";
@startebnf
(* Move the pointer to the right by one cell. *)
move_right = ">";
@endebnf
Move Left
The move_left
operation is represented by the <
symbol. This operation moves the pointer to the left by one cell. This operation translates into decrementing the memory pointer by one. If this operation is performed when the pointer is at the first cell, then the pointer will wrap around to the last cell.
move_left = "<";
@startebnf
(* Move the pointer to the left by one cell. *)
move_left = "<";
@endebnf
Output
The output
operation is represented by the .
symbol. This operation outputs the value at the current cell as a character. This operation translates into printing the value at the current cell as a character to whichever output stream is being used. The value printed will be the ASCII representation of the value at the current cell.
output = ".";
@startebnf
(* Output the value at the current cell as a character. *)
output = ".";
@endebnf
Input
The input
operation is represented by the ,
symbol. This operation reads a character from input and stores it in the current cell. This operation translates into reading a character from whichever input stream is being used and storing it in the current cell. The value stored in the cell will be the ASCII value of the character.
input = ",";
@startebnf
(* Read a character from input and store it in the current cell. *)
input = ",";
@endebnf
Loop Start
The loop_start
operation is represented by the \[
symbol. This operation marks the start of a loop. The nearest loop_end
instruction will refer to this. This operation translates into marking the current instruction as the start of a loop.
loop_start = "[";
@startebnf
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
@endebnf
Loop End
The loop_end
operation is represented by the ]
symbol. This operation marks the end of a loop. The nearest loop_start
instruction will refer to this. This operation checks the value in the current cell. If the value is not zero, then the instruction pointer will jump back to the matching loop_start
instruction. If the value is zero, then the instruction pointer will continue to the next instruction. This operation translates into marking the current instruction as the end of a loop.
loop_end = "]";
@startebnf
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
@endebnf
Non-Terminal Constructs
The following rules define the non-terminal constructs that will appear in the program:
Loop
A loop is a set of zero or more instructions delimited by a loop_start
and a loop_end
. The instructions in the loop will be executed repeatedly until the value at the current cell is zero.
loop = loop_start, statement*, loop_end;
@startebnf
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
@endebnf
Statement
A statement can be one of two things:
- A single instruction that can be executed. This would be one of the following:
increment
decrement
move_right
move_left
output
input
- A
loop
. This would be the same as the loop construct.
statement = increment | decrement | move_right | move_left | output | input | loop;
@startebnf
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
@endebnf
Program
The program is a series of zero or more statements. This is the top-level construct that will appear in the program.
program = statement*;
@startebnf
(* A program is a sequence of statements. *)
program = {statement};
@endebnf
Complete Grammar
EBNF Specification
The complete grammar is shown below:
(* Increment the value at the current cell by one. *)
increment = "+";
(* Decrement the value at the current cell by one. *)
decrement = "-";
(* Move the pointer to the right by one cell. *)
move_right = ">";
(* Move the pointer to the left by one cell. *)
move_left = "<";
(* Output the value at the current cell as a character. *)
output = ".";
(* Read a character from input and store it in the current cell. *)
input = ",";
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
(* A program is a sequence of statements. *)
program = {statement};
Railroad Diagram
A railroad diagram is a graphical representation of a grammar. The following railroad diagram was generated from the EBNF grammar above:
@startebnf
title The BF Grammar Specification
(* Increment the value at the current cell by one. *)
increment = "+";
(* Decrement the value at the current cell by one. *)
decrement = "-";
(* Move the pointer to the right by one cell. *)
move_right = ">";
(* Move the pointer to the left by one cell. *)
move_left = "<";
(* Output the value at the current cell as a character. *)
output = ".";
(* Read a character from input and store it in the current cell. *)
input = ",";
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
(* A program is a sequence of statements. *)
program = {statement};
@endebnf