EBNF Grammar
Introduction
This chapter serves to provide an annotated Extended Backus-Naur Form (EBNF) grammar for the language. The grammar is not intended to be a formal specification of the language, but rather a tool for understanding the language's syntax.
The EBNF syntax allows us to express the grammar in a concise and readable way. The grammar is annotated with comments that explain the meaning of each rule. The grammar is also annotated with the precedence and associativity of each operator.
EBNF Crash Course
EBNF is a notation for describing the syntax of a language. It is a formal grammar that is used to describe the syntax of many programming languages, including the language of this book.
Briefly, an EBNF grammar consists of a series of rules. The rules can be "terminal" or "non-terminal". A terminal rule is a rule that is defined in terms of a literal string. A non-terminal rule is a rule that is defined in terms of other rules.
BF Grammar
The BF Grammar will be split into two sections:
- The terminal symbols: This will enumerate all the symbols and what they mean.
- The the non-terminal constructs: These would be the constructs, made up of the terminal symbols that will actually appear in the program.
Terminal Symbols
The following rules map the operations to the terminal text symbols:
Increment
The increment
operation is represented by the +
symbol. This operation increments the value at the current cell by one.
increment = "+";
@startebnf
(* Increment the value at the current cell by one. *)
increment = "+";
@endebnf
Decrement
The decrement
operation is represented by the -
symbol. This operation decrements the value at the current cell by one.
decrement = "-";
@startebnf
(* Decrement the value at the current cell by one. *)
decrement = "-";
@endebnf
Move Right
The move_right
operation is represented by the >
symbol. This operation moves the pointer to the right by one cell. This operation translates into incrementing the memory pointer by one. If this operation is performed when the pointer is at the last cell, then the pointer will wrap around to the first cell.
move_right = ">";
@startebnf
(* Move the pointer to the right by one cell. *)
move_right = ">";
@endebnf
Move Left
The move_left
operation is represented by the <
symbol. This operation moves the pointer to the left by one cell. This operation translates into decrementing the memory pointer by one. If this operation is performed when the pointer is at the first cell, then the pointer will wrap around to the last cell.
move_left = "<";
@startebnf
(* Move the pointer to the left by one cell. *)
move_left = "<";
@endebnf
Output
The output
operation is represented by the .
symbol. This operation outputs the value at the current cell as a character. This operation translates into printing the value at the current cell as a character to whichever output stream is being used. The value printed will be the ASCII representation of the value at the current cell.
output = ".";
@startebnf
(* Output the value at the current cell as a character. *)
output = ".";
@endebnf
Input
The input
operation is represented by the ,
symbol. This operation reads a character from input and stores it in the current cell. This operation translates into reading a character from whichever input stream is being used and storing it in the current cell. The value stored in the cell will be the ASCII value of the character.
input = ",";
@startebnf
(* Read a character from input and store it in the current cell. *)
input = ",";
@endebnf
Loop Start
The loop_start
operation is represented by the \[
symbol. This operation marks the start of a loop. The nearest loop_end
instruction will refer to this. This operation translates into marking the current instruction as the start of a loop.
loop_start = "[";
@startebnf
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
@endebnf
Loop End
The loop_end
operation is represented by the ]
symbol. This operation marks the end of a loop. The nearest loop_start
instruction will refer to this. This operation checks the value in the current cell. If the value is not zero, then the instruction pointer will jump back to the matching loop_start
instruction. If the value is zero, then the instruction pointer will continue to the next instruction. This operation translates into marking the current instruction as the end of a loop.
loop_end = "]";
@startebnf
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
@endebnf
Non-Terminal Constructs
The following rules define the non-terminal constructs that will appear in the program:
Loop
A loop is a set of zero or more instructions delimited by a loop_start
and a loop_end
. The instructions in the loop will be executed repeatedly until the value at the current cell is zero.
loop = loop_start, statement*, loop_end;
@startebnf
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
@endebnf
Statement
A statement can be one of two things:
- A single instruction that can be executed. This would be one of the following:
increment
decrement
move_right
move_left
output
input
- A
loop
. This would be the same as the loop construct.
statement = increment | decrement | move_right | move_left | output | input | loop;
@startebnf
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
@endebnf
Program
The program is a series of zero or more statements. This is the top-level construct that will appear in the program.
program = statement*;
@startebnf
(* A program is a sequence of statements. *)
program = {statement};
@endebnf
Complete Grammar
EBNF Specification
The complete grammar is shown below:
(* Increment the value at the current cell by one. *)
increment = "+";
(* Decrement the value at the current cell by one. *)
decrement = "-";
(* Move the pointer to the right by one cell. *)
move_right = ">";
(* Move the pointer to the left by one cell. *)
move_left = "<";
(* Output the value at the current cell as a character. *)
output = ".";
(* Read a character from input and store it in the current cell. *)
input = ",";
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
(* A program is a sequence of statements. *)
program = {statement};
Railroad Diagram
A railroad diagram is a graphical representation of a grammar. The following railroad diagram was generated from the EBNF grammar above:
@startebnf
title The BF Grammar Specification
(* Increment the value at the current cell by one. *)
increment = "+";
(* Decrement the value at the current cell by one. *)
decrement = "-";
(* Move the pointer to the right by one cell. *)
move_right = ">";
(* Move the pointer to the left by one cell. *)
move_left = "<";
(* Output the value at the current cell as a character. *)
output = ".";
(* Read a character from input and store it in the current cell. *)
input = ",";
(* Marks the start of a loop. The nearest loop_end instruction will refer to this. *)
loop_start = "[";
(* If the value at the current cell is not zero, then jump to the instruction after the matching loop start. *)
loop_end = "]";
(* A loop is a sequence of statements that are executed repeatedly until the value at the current cell is zero. *)
loop = loop_start, {statement}, loop_end;
(* A statement is a single instruction that can be executed, or a loop.*)
statement = increment | decrement | move_right | move_left | output | input | loop;
(* A program is a sequence of statements. *)
program = {statement};
@endebnf