In the previous chapter we taught our compiler and VM to handle arithmetic expressions, booleans, and comparison operators. The bytecode goes in one direction: forward. Every instruction executes exactly once, in order. That changes now.
Conditionals require the ability to skip instructions. When the condition of an if
expression is false, the VM must jump over the consequence and execute the alternative
instead. When it is true, it must execute the consequence and jump over the alternative.
This means we need jump instructions: opcodes that change the instruction pointer
rather than just letting it advance to the next instruction.
This chapter introduces three new opcodes, a technique called back-patching, and the
plumbing to make if/else expressions work end to end. Follow these steps in order.
New opcodes #
Add three opcodes to the Opcode enum in code.zig:
pub const Opcode = enum(u8) {
op_constant,
op_add,
op_sub,
op_mul,
op_div,
op_true,
op_false,
op_equal,
op_not_equal,
op_greater_than,
op_minus,
op_bang,
op_pop,
op_jump_not_truthy, // conditional jump: pop top of stack, jump if not truthy
op_jump, // unconditional jump: always jump
op_null, // push Null onto the stack
};
Their definitions in the lookup table:
| Opcode | Operand widths | Description |
|---|---|---|
| op_jump_not_truthy | &[_]u8{2} | Pop value, jump to 2-byte target address if falsy. |
| op_jump | &[_]u8{2} | Jump to 2-byte target address unconditionally. |
| op_null | &[_]u8{} | Push Null onto the stack (no operands). |
Both jump opcodes carry a single 2-byte (u16) operand – the absolute address in the
instruction stream to jump to. op_null has no operands. It exists solely to provide a
value when an if expression has no else branch and the condition is false.
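To make the operand layout concrete, here is a minimal sketch that encodes a jump by hand: one opcode byte followed by the target address as a big-endian u16. The helper encodeJump is hypothetical and exists only for illustration (in our codebase, code.make does this job), and the byte value 14 assumes op_jump keeps the position it has in the enum listing above.

```zig
const std = @import("std");

/// Hypothetical helper (not part of code.zig): encodes a 3-byte jump.
/// `opcode_byte` stands in for @intFromEnum(Opcode.op_jump).
fn encodeJump(opcode_byte: u8, target: u16) [3]u8 {
    var out: [3]u8 = undefined;
    out[0] = opcode_byte;
    // The operand occupies bytes 1 and 2, most significant byte first.
    std.mem.writeInt(u16, out[1..3], target, .big);
    return out;
}

pub fn main() void {
    // 14 is assumed to be op_jump's value in the enum above.
    const ins = encodeJump(14, 13);
    std.debug.print("{any}\n", .{ins});
}
```

Address 13 fits in the low byte, so the instruction becomes the three bytes 14, 0, 13. A target such as 0x1234 would be stored as 0x12 followed by 0x34.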
Add these to the lookup function:
pub fn lookup(op: Opcode) Definition {
return switch (op) {
.op_constant => .{ .name = "OpConstant", .operand_widths = &[_]u8{2} },
.op_add => .{ .name = "OpAdd", .operand_widths = &[_]u8{} },
.op_sub => .{ .name = "OpSub", .operand_widths = &[_]u8{} },
.op_mul => .{ .name = "OpMul", .operand_widths = &[_]u8{} },
.op_div => .{ .name = "OpDiv", .operand_widths = &[_]u8{} },
.op_true => .{ .name = "OpTrue", .operand_widths = &[_]u8{} },
.op_false => .{ .name = "OpFalse", .operand_widths = &[_]u8{} },
.op_equal => .{ .name = "OpEqual", .operand_widths = &[_]u8{} },
.op_not_equal => .{ .name = "OpNotEqual", .operand_widths = &[_]u8{} },
.op_greater_than => .{ .name = "OpGreaterThan", .operand_widths = &[_]u8{} },
.op_minus => .{ .name = "OpMinus", .operand_widths = &[_]u8{} },
.op_bang => .{ .name = "OpBang", .operand_widths = &[_]u8{} },
.op_pop => .{ .name = "OpPop", .operand_widths = &[_]u8{} },
.op_jump_not_truthy => .{ .name = "OpJumpNotTruthy", .operand_widths = &[_]u8{2} },
.op_jump => .{ .name = "OpJump", .operand_widths = &[_]u8{2} },
.op_null => .{ .name = "OpNull", .operand_widths = &[_]u8{} },
};
}
How conditionals compile #
Before writing any code, we need a mental model of what bytecode an if/else
expression should produce. Take this example:
if (true) { 10 } else { 20 }
The compiled bytecode looks like this:
0000 OpTrue
0001 OpJumpNotTruthy 0010 // if condition is false, jump to 0010
0004 OpConstant 0 // push 10 (consequence)
0007 OpJump 0013 // jump over the alternative
0010 OpConstant 1 // push 20 (alternative)
0013 OpPop // pop the result (top-level expression statement)
The control flow:
- Condition is true: OpTrue pushes true. OpJumpNotTruthy pops it, sees it is truthy, does not jump. OpConstant 0 pushes 10. OpJump jumps to 0013, skipping the alternative. OpPop pops the 10.
- Condition is false: OpFalse pushes false. OpJumpNotTruthy pops it, sees it is not truthy, jumps to 0010. OpConstant 1 pushes 20. Execution continues to 0013. OpPop pops the 20.
For an if without an else:
if (true) { 10 }
0000 OpTrue
0001 OpJumpNotTruthy 0010
0004 OpConstant 0 // push 10 (consequence)
0007 OpJump 0011
0010 OpNull // no else, so push Null
0011 OpPop
When the condition is false, we jump to OpNull, which pushes a Null value. This way
if/else always produces a value, which is important because if is an expression
in Monkey.
Back-patching explained #
Look at the bytecode above one more time:
0000 OpTrue
0001 OpJumpNotTruthy 0010 <-- we emit this BEFORE compiling the consequence
0004 OpConstant 0
0007 OpJump 0013 <-- we emit this BEFORE compiling the alternative
0010 OpConstant 1
0013 OpPop
When the compiler emits OpJumpNotTruthy at position 0001, it does not yet know where
the alternative starts. The consequence has not been compiled yet. Its length is unknown.
The same problem applies to OpJump at position 0007 – we do not know where the
alternative ends.
Back-patching solves this in two steps:
- Emit the jump with a placeholder operand. We use 9999 as the dummy value. It is obviously wrong and will cause a test failure if we forget to patch it, which is exactly what we want.
- After compiling the target code, go back and overwrite the placeholder with the real address.
In concrete terms:
// Step 1: Emit OpJumpNotTruthy with placeholder.
const jump_not_truthy_pos = try self.emit(.op_jump_not_truthy, &[_]usize{9999});
// Step 2: Compile the consequence.
try self.compileBlockStatement(ie.consequence);
// ... remove trailing pop ...
// Step 3: Emit OpJump with placeholder.
const jump_pos = try self.emit(.op_jump, &[_]usize{9999});
// Step 4: Back-patch OpJumpNotTruthy -- now we know where the alternative starts
const after_consequence = self.instructions.items.len;
self.changeOperand(jump_not_truthy_pos, after_consequence);
// Step 5: Compile the alternative (or emit OpNull).
// ...
// Step 6: Back-patch OpJump because we know where the alternative ends.
const after_alternative = self.instructions.items.len;
self.changeOperand(jump_pos, after_alternative);
The changeOperand function writes the real address into the instruction stream at the
position where the placeholder was. Since jump operands are 2 bytes, it writes a u16
at offset position + 1 (the +1 skips past the opcode byte).
Back-patching is a standard technique in single-pass compilers. It is simpler than the
alternatives: making multiple passes over the instruction stream, or using symbolic
labels that get resolved later. We know the exact byte offset to patch because emit
returns the position where each instruction starts.
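The two steps can be tried in isolation, without the rest of the compiler. The sketch below is an illustration under stated assumptions: Buffer, emitWithOperand, patch, and operandAt are made-up names, the buffer is a fixed-size array rather than our ArrayList, and the opcode bytes 13 and 0 are arbitrary stand-ins.

```zig
const std = @import("std");

/// Hypothetical stand-in for the compiler's instruction buffer.
const Buffer = struct {
    bytes: [16]u8 = undefined,
    len: usize = 0,

    /// Appends a 3-byte instruction and returns the position of its opcode.
    fn emitWithOperand(self: *Buffer, opcode_byte: u8, operand: u16) usize {
        const pos = self.len;
        self.bytes[pos] = opcode_byte;
        std.mem.writeInt(u16, self.bytes[pos + 1 ..][0..2], operand, .big);
        self.len += 3;
        return pos;
    }

    /// Overwrites the operand of the instruction that starts at `op_pos`.
    fn patch(self: *Buffer, op_pos: usize, operand: u16) void {
        std.mem.writeInt(u16, self.bytes[op_pos + 1 ..][0..2], operand, .big);
    }

    /// Reads back the operand of the instruction that starts at `op_pos`.
    fn operandAt(self: *const Buffer, op_pos: usize) u16 {
        return std.mem.readInt(u16, self.bytes[op_pos + 1 ..][0..2], .big);
    }
};

pub fn main() void {
    var buf = Buffer{};
    // Step 1: emit the jump with the placeholder and remember where it is.
    const jump_pos = buf.emitWithOperand(13, 9999);
    // Emit the code that the jump has to skip (one 3-byte instruction here).
    _ = buf.emitWithOperand(0, 0);
    // Step 2: the target is now known, so overwrite the placeholder.
    buf.patch(jump_pos, @intCast(buf.len));
    std.debug.print("patched target = {d}\n", .{buf.operandAt(jump_pos)});
}
```

After the patch, the jump at position 0 carries the target 6, the first byte after the skipped instruction. The position returned by the emit step is the only bookkeeping needed.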
Updating the code module #
If you have not done so already, add the three new variants to the Opcode enum in code.zig:
op_jump_not_truthy,
op_jump,
op_null,
Then make sure the lookup function has these entries:
.op_jump_not_truthy => .{ .name = "OpJumpNotTruthy", .operand_widths = &[_]u8{2} },
.op_jump => .{ .name = "OpJump", .operand_widths = &[_]u8{2} },
.op_null => .{ .name = "OpNull", .operand_widths = &[_]u8{} },
That’s it for code.zig. The existing make, readOperands, and disassemble functions
already know how to handle 2-byte operands (from op_constant) and zero-operand opcodes
(from the arithmetic ops). This is the payoff of building a table-driven encoding scheme.
Compiler changes #
All code in this section goes in compiler.zig.
Tracking emitted instructions #
We need to remember what the last two emitted instructions were. This serves two
purposes.
First, back-patching: we need to know the positions of recently emitted jumps.
Second, removing a trailing OpPop: when we compile a block statement inside an if/else, the last statement in the block might be an expression statement, which normally compiles to <expression> OpPop. But we do not want to pop that value, because it is the result of the entire if/else expression. We need to detect and remove that trailing OpPop.
Add a struct and two new fields to the compiler, extend the init method, and add a deinit method:
const EmittedInstruction = struct {
opcode: code.Opcode,
position: usize,
};
pub const Compiler = struct {
instructions: std.ArrayList(u8),
constants: std.ArrayList(object.Object),
allocator: std.mem.Allocator,
last_instruction: ?EmittedInstruction,
previous_instruction: ?EmittedInstruction,
pub fn init(allocator: std.mem.Allocator) Compiler {
return .{
.instructions = std.ArrayList(u8).empty,
.constants = std.ArrayList(object.Object).empty,
.allocator = allocator,
.last_instruction = null,
.previous_instruction = null,
};
}
pub fn deinit(self: *Compiler) void {
self.instructions.deinit(self.allocator);
self.constants.deinit(self.allocator);
}
// ... methods follow ...
};
Updated emit #
Update the emit method on the Compiler struct. It now records the position and opcode of each emitted instruction, shifting last_instruction into previous_instruction:
fn emit(self: *Compiler, op: code.Opcode, operands: []const usize) !usize {
const instruction = try code.make(self.allocator, op, operands);
defer self.allocator.free(instruction);
const pos = self.instructions.items.len;
try self.instructions.appendSlice(self.allocator, instruction);
self.previous_instruction = self.last_instruction;
self.last_instruction = .{ .opcode = op, .position = pos };
return pos;
}
The return value is the byte position where the instruction starts. This is critical for back-patching – when we emit a jump, we save this position so we can come back and fix the operand later.
Back-patching helpers #
Add these two methods to the Compiler struct:
fn replaceOperand(self: *Compiler, op_pos: usize, operand: usize) void {
std.mem.writeInt(
u16,
self.instructions.items[op_pos + 1 ..][0..2],
@intCast(operand),
.big,
);
}
fn changeOperand(self: *Compiler, op_pos: usize, operand: usize) void {
self.replaceOperand(op_pos, operand);
}
op_pos is the byte position of the opcode (returned by emit). The operand starts
at op_pos + 1 (one byte after the opcode) and is 2 bytes wide. We write the new value
as a big-endian u16, replacing whatever was there before (the 9999 placeholder).
changeOperand is a wrapper that exists for clarity at the call site. In a more complex
compiler with variable-length instructions, you might need changeOperand to reconstruct
the entire instruction. For now it delegates directly.
Removing the last OpPop #
When we compile a block like { 10 }, the 10 is an expression statement and normally
compiles to OpConstant 0 followed by OpPop. But inside an if/else, we want the
value 10 to remain on the stack as the result of the conditional. So we remove the
trailing OpPop.
Add these two methods to the Compiler struct:
fn removeLastPop(self: *Compiler) void {
if (self.last_instruction) |last| {
if (last.opcode == .op_pop) {
self.instructions.items.len = last.position;
self.last_instruction = self.previous_instruction;
}
}
}
fn lastInstructionIs(self: *Compiler, op: code.Opcode) bool {
if (self.last_instruction) |last| {
return last.opcode == op;
}
return false;
}
removeLastPop works by truncating the instruction buffer back to the position where
OpPop was written. Since OpPop is a single byte with no operands, this is safe –
we just set items.len back by one byte. We also restore last_instruction to
previous_instruction so the tracking remains accurate.
lastInstructionIs is a predicate we use before calling removeLastPop.
Compiling block statements #
A block statement is a sequence of statements wrapped in curly braces. Add this method to the Compiler struct:
fn compileBlockStatement(self: *Compiler, block: ast.BlockStatement) CompileError!void {
for (block.statements) |stmt| {
try self.compileStatement(stmt);
}
}
This function exists because if/else consequence and alternative branches are both
block statements, and we want a clean call site.
Compiling if expressions #
This is the centerpiece of the chapter. Every piece we have built so far (jump opcodes,
back-patching, removeLastPop, compileBlockStatement) comes together here.
Add this case to the switch (expr) in compileExpression. It goes alongside the existing .integer_literal, .infix, .boolean, and .prefix cases from previous chapters.
.if_expression => |ie| {
// 1. Compile the condition. It leaves a value on the stack.
try self.compileExpression(ie.condition.*);
// 2. Emit OpJumpNotTruthy with a placeholder operand.
// The VM will pop the condition value and jump if it is falsy.
// We do not know the jump target yet.
const jump_not_truthy_pos = try self.emit(
.op_jump_not_truthy,
&[_]usize{9999},
);
// 3. Compile the consequence block.
try self.compileBlockStatement(ie.consequence);
// 4. Remove the trailing OpPop from the consequence.
// The last expression value should stay on the stack,
// because it is the result of the if/else expression.
if (self.lastInstructionIs(.op_pop)) {
self.removeLastPop();
}
// 5. Emit OpJump with a placeholder operand.
// This jumps over the alternative so that after executing the
// consequence, we do not fall through into the alternative.
const jump_pos = try self.emit(.op_jump, &[_]usize{9999});
// 6. Back-patch OpJumpNotTruthy.
// We now know where the alternative starts: it is the current
// position in the instruction stream (right after OpJump).
const after_consequence = self.instructions.items.len;
self.changeOperand(jump_not_truthy_pos, after_consequence);
// 7. Compile the alternative, or emit OpNull if there is none.
if (ie.alternative) |alt| {
try self.compileBlockStatement(alt);
if (self.lastInstructionIs(.op_pop)) {
self.removeLastPop();
}
} else {
_ = try self.emit(.op_null, &[_]usize{});
}
// 8. Back-patch OpJump.
// Now we know where execution should continue after the
// consequence, right after the alternative.
const after_alternative = self.instructions.items.len;
self.changeOperand(jump_pos, after_alternative);
},
Walk through it step by step with if (true) { 10 } else { 20 }:
| Step | Action | Instructions so far |
|---|---|---|
| 1 | Compile true | 0000 OpTrue |
| 2 | Emit OpJumpNotTruthy 9999 | 0001 OpJumpNotTruthy 9999 |
| 3 | Compile { 10 } | 0004 OpConstant 0, 0007 OpPop |
| 4 | Remove trailing OpPop | 0004 OpConstant 0 (OpPop removed) |
| 5 | Emit OpJump 9999 | 0007 OpJump 9999 |
| 6 | Back-patch: 9999 -> 10 | 0001 OpJumpNotTruthy 10 |
| 7 | Compile { 20 } | 0010 OpConstant 1 (trailing OpPop removed) |
| 8 | Back-patch: 9999 -> 13 | 0007 OpJump 13 |
Final bytecode:
0000 OpTrue
0001 OpJumpNotTruthy 10
0004 OpConstant 0 // 10
0007 OpJump 13
0010 OpConstant 1 // 20
0013 ... // whatever comes next
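The addresses in this listing follow directly from instruction widths: one byte for each opcode plus two bytes for each u16 operand. As a sanity check, the small sketch below recomputes them. The helper name startOffsets is made up for illustration and is not part of the compiler.

```zig
const std = @import("std");

/// Returns the byte offset at which each instruction starts, given the
/// total width (opcode byte plus operand bytes) of every instruction.
fn startOffsets(comptime n: usize, widths: [n]usize) [n]usize {
    var offsets: [n]usize = undefined;
    var pos: usize = 0;
    for (widths, 0..) |w, i| {
        offsets[i] = pos;
        pos += w;
    }
    return offsets;
}

pub fn main() void {
    // OpTrue, OpJumpNotTruthy, OpConstant, OpJump, OpConstant, OpPop
    const offsets = startOffsets(6, .{ 1, 3, 3, 3, 3, 1 });
    std.debug.print("{any}\n", .{offsets});
}
```

The computed offsets are 0, 1, 4, 7, 10, and 13, which is why the two back-patched targets are 10 (start of the alternative) and 13 (first instruction after it).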
VM changes #
All code in this section goes in vm.zig.
Truthiness #
The VM needs to decide whether a value is truthy (for OpJumpNotTruthy). Add this function to vm.zig at module level (outside the VM struct, it doesn’t need self).
The rules:
- true is truthy, false is not
- null is not truthy
- Everything else (integers, etc.) is truthy
fn isTruthy(obj: object.Object) bool {
return switch (obj) {
.boolean => |b| b.value,
.null => false,
else => true,
};
}
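A quick way to convince yourself of these rules is to exercise the function on its own. The sketch below assumes a simplified stand-in for object.Object with only three variants; the real union in object.zig may be shaped differently.

```zig
const std = @import("std");

// Simplified stand-in for object.Object (an assumption for this sketch).
const Object = union(enum) {
    integer: struct { value: i64 },
    boolean: struct { value: bool },
    null: struct {},
};

fn isTruthy(obj: Object) bool {
    return switch (obj) {
        .boolean => |b| b.value,
        .null => false,
        else => true,
    };
}

pub fn main() void {
    // Integers are always truthy in Monkey, including zero.
    const zero = Object{ .integer = .{ .value = 0 } };
    std.debug.print("0 is truthy: {}\n", .{isTruthy(zero)});
}
```

Note that 0 counts as truthy here. Only false and null take the jump in OpJumpNotTruthy.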
IP management: two approaches #
Before implementing the jump opcodes, we need to decide how the VM’s main loop manages the instruction pointer. There are two common approaches, and jump instructions make the difference between them much more visible.
Approach 1: loop increments IP (shown for comparison only, do not use this) #
The main loop increments ip by 1 after each iteration. Each opcode handler
advances ip past its operands (but not past the opcode byte itself – the loop
handles that).
pub fn run(self: *VM) !void {
var ip: usize = 0;
while (ip < self.instructions.len) : (ip += 1) {
const op: code.Opcode = @enumFromInt(self.instructions[ip]);
switch (op) {
.op_constant => {
const const_idx = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip += 2; // skip operand bytes; loop adds 1 for the opcode
try self.push(self.constants[const_idx]);
},
.op_jump_not_truthy => {
const target = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip += 2; // skip operand bytes
const condition = self.pop();
if (!isTruthy(condition)) {
ip = target - 1; // -1 because the loop will add 1
}
},
.op_jump => {
const target = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip = target - 1; // -1 because the loop will add 1
},
.op_null => {
try self.push(.{ .null = .{} });
// no operands to skip; loop adds 1
},
.op_add, .op_sub, .op_mul, .op_div => {
try self.executeBinaryOperation(op);
// no operands to skip; loop adds 1
},
.op_pop => {
_ = self.pop();
},
// ... other single-byte opcodes handled the same way ...
}
}
}
Approach 2: each case manages its own IP (we’ll use this path) #
Each opcode handler advances ip by the full size of the instruction. The main loop
does not touch ip at all. Replace the run method on the VM struct in vm.zig with this version:
pub fn run(self: *VM) !void {
var ip: usize = 0;
while (ip < self.instructions.len) {
const op: code.Opcode = @enumFromInt(self.instructions[ip]);
switch (op) {
.op_constant => {
const const_idx = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip += 3; // 1 opcode + 2 operand bytes
try self.push(self.constants[const_idx]);
},
.op_jump_not_truthy => {
const target = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip += 3; // consume the full instruction first
const condition = self.pop();
if (!isTruthy(condition)) {
ip = target; // jump to target directly
}
// if truthy, ip already points past this instruction
},
.op_jump => {
const target = std.mem.readInt(
u16,
self.instructions[ip + 1 ..][0..2],
.big,
);
ip = target; // jump to target directly
},
.op_null => {
try self.push(.{ .null = .{} });
ip += 1;
},
.op_add, .op_sub, .op_mul, .op_div => {
try self.executeBinaryOperation(op);
ip += 1;
},
.op_pop => {
_ = self.pop();
ip += 1;
},
.op_true => {
try self.push(.{ .boolean = .{ .value = true } });
ip += 1;
},
.op_false => {
try self.push(.{ .boolean = .{ .value = false } });
ip += 1;
},
.op_equal, .op_not_equal, .op_greater_than => {
try self.executeComparison(op);
ip += 1;
},
.op_minus => {
try self.executeMinus();
ip += 1;
},
.op_bang => {
try self.executeBang();
ip += 1;
},
}
}
}
This approach has several advantages:
- Jump targets are used as-is. ip = target is obvious. No off-by-one adjustments.
- Each handler is self-contained. You can look at any single case and understand exactly how ip changes. There is no hidden += 1 elsewhere.
- Harder to introduce bugs when adding new opcodes.
Executing jumps #
With the per-case IP approach, the jump handlers are clean.
OpJumpNotTruthy reads the 2-byte target address, advances ip past the full 3-byte instruction, pops the condition value, and if it’s not truthy, sets ip = target. Otherwise ip already points to the next instruction.
OpJump reads the 2-byte target address and sets ip = target. No need to advance past the instruction first since we’re jumping away unconditionally.
OpNull pushes the Null object onto the stack and advances ip by 1.
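To watch these three handlers cooperate, here is a self-contained toy interpreter that uses the same per-case ip management. Everything in it is an assumption made for illustration: the Op enum is a trimmed-down opcode set, push_int carries the integer itself as its operand instead of a constant index, and Value is a simplified stand-in for object.Object.

```zig
const std = @import("std");

// Hypothetical, trimmed-down opcode set for this sketch only.
const Op = enum(u8) { push_true, push_false, push_int, jump_not_truthy, jump, push_null, pop };

const Value = union(enum) { integer: i64, boolean: bool, null };

fn isTruthy(v: Value) bool {
    return switch (v) {
        .boolean => |b| b,
        .null => false,
        else => true,
    };
}

/// Bytecode for `if (<cond>) { 10 } else { 20 }`, laid out like the
/// chapter's listing (offsets in the trailing comments).
fn ifElseProgram(cond: Op) [14]u8 {
    return .{
        @intFromEnum(cond), // 0000
        @intFromEnum(Op.jump_not_truthy), 0, 10, // 0001
        @intFromEnum(Op.push_int), 0, 10, // 0004
        @intFromEnum(Op.jump), 0, 13, // 0007
        @intFromEnum(Op.push_int), 0, 20, // 0010
        @intFromEnum(Op.pop), // 0013
    };
}

/// Runs `code` and returns the value removed by the last executed pop.
fn run(code: []const u8) Value {
    var stack: [8]Value = undefined;
    var sp: usize = 0;
    var last_popped: Value = .null;
    var ip: usize = 0;
    while (ip < code.len) {
        const op: Op = @enumFromInt(code[ip]);
        switch (op) {
            .push_true, .push_false => {
                stack[sp] = .{ .boolean = (op == .push_true) };
                sp += 1;
                ip += 1;
            },
            .push_int => {
                const n = std.mem.readInt(u16, code[ip + 1 ..][0..2], .big);
                stack[sp] = .{ .integer = n };
                sp += 1;
                ip += 3;
            },
            .jump_not_truthy => {
                const target = std.mem.readInt(u16, code[ip + 1 ..][0..2], .big);
                ip += 3; // consume the full instruction first
                sp -= 1; // pop the condition
                if (!isTruthy(stack[sp])) ip = target;
            },
            .jump => {
                ip = std.mem.readInt(u16, code[ip + 1 ..][0..2], .big);
            },
            .push_null => {
                stack[sp] = .null;
                sp += 1;
                ip += 1;
            },
            .pop => {
                sp -= 1;
                last_popped = stack[sp];
                ip += 1;
            },
        }
    }
    return last_popped;
}

pub fn main() void {
    const prog = ifElseProgram(.push_false);
    std.debug.print("{d}\n", .{run(&prog).integer});
}
```

With push_true as the condition the run ends with 10 as the last popped value, and with push_false it ends with 20. In both cases exactly one of the two branches executes.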
Tests #
Bytecode inspection tests #
Create src/compiler_test.zig. These tests verify that the compiler produces
the exact bytecode we expect. They catch problems in the compiler without involving the VM, which makes debugging much easier.
Add the testCompile and concatInstructions helpers, then the three test blocks:
const std = @import("std");
const code = @import("code.zig");
const Compiler = @import("compiler.zig").Compiler;
const Lexer = @import("lexer.zig").Lexer;
const Parser = @import("parser.zig").Parser;
fn testCompile(
arena: std.mem.Allocator,
input: []const u8,
) !Compiler {
var lexer = Lexer.init(input);
var parser = Parser.init(arena, &lexer);
const program = try parser.parseProgram();
var comp = Compiler.init(arena);
try comp.compile(program);
return comp;
}
fn concatInstructions(allocator: std.mem.Allocator, slices: []const []const u8) ![]u8 {
var buf: std.ArrayList(u8) = .empty;
for (slices) |s| {
try buf.appendSlice(allocator, s);
}
return buf.toOwnedSlice(allocator);
}
test "compiler: if expression with consequence and alternative" {
const allocator = std.testing.allocator;
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const comp = try testCompile(arena.allocator(), "if (true) { 10 } else { 20 }");
// Expected instructions:
// 0000 OpTrue
// 0001 OpJumpNotTruthy 10
// 0004 OpConstant 0
// 0007 OpJump 13
// 0010 OpConstant 1
// 0013 OpPop
const expected_instructions = [_][]const u8{
try code.make(allocator, .op_true, &[_]usize{}),
try code.make(allocator, .op_jump_not_truthy, &[_]usize{10}),
try code.make(allocator, .op_constant, &[_]usize{0}),
try code.make(allocator, .op_jump, &[_]usize{13}),
try code.make(allocator, .op_constant, &[_]usize{1}),
try code.make(allocator, .op_pop, &[_]usize{}),
};
defer for (expected_instructions) |instr| allocator.free(instr);
const expected = try concatInstructions(allocator, &expected_instructions);
defer allocator.free(expected);
try std.testing.expectEqualSlices(u8, expected, comp.instructions.items);
// Verify constants
try std.testing.expectEqual(@as(usize, 2), comp.constants.items.len);
try std.testing.expectEqual(@as(i64, 10), comp.constants.items[0].integer.value);
try std.testing.expectEqual(@as(i64, 20), comp.constants.items[1].integer.value);
}
test "compiler: if expression without alternative" {
const allocator = std.testing.allocator;
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const comp = try testCompile(arena.allocator(), "if (true) { 10 }");
// Expected instructions:
// 0000 OpTrue
// 0001 OpJumpNotTruthy 10
// 0004 OpConstant 0
// 0007 OpJump 11
// 0010 OpNull
// 0011 OpPop
const expected_instructions = [_][]const u8{
try code.make(allocator, .op_true, &[_]usize{}),
try code.make(allocator, .op_jump_not_truthy, &[_]usize{10}),
try code.make(allocator, .op_constant, &[_]usize{0}),
try code.make(allocator, .op_jump, &[_]usize{11}),
try code.make(allocator, .op_null, &[_]usize{}),
try code.make(allocator, .op_pop, &[_]usize{}),
};
defer for (expected_instructions) |instr| allocator.free(instr);
const expected = try concatInstructions(allocator, &expected_instructions);
defer allocator.free(expected);
try std.testing.expectEqualSlices(u8, expected, comp.instructions.items);
// Only one constant (10)
try std.testing.expectEqual(@as(usize, 1), comp.constants.items.len);
try std.testing.expectEqual(@as(i64, 10), comp.constants.items[0].integer.value);
}
test "compiler: if expression with comparison condition" {
const allocator = std.testing.allocator;
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
var comp = try testCompile(arena.allocator(), "if (1 < 2) { 10 } else { 20 }");
defer comp.deinit();
// Note: 1 < 2 is rewritten as 2 > 1 by the compiler (from Chapter 7).
// Expected instructions:
// 0000 OpConstant 0 (constant: 2)
// 0003 OpConstant 1 (constant: 1)
// 0006 OpGreaterThan
// 0007 OpJumpNotTruthy 16
// 0010 OpConstant 2 (constant: 10)
// 0013 OpJump 19
// 0016 OpConstant 3 (constant: 20)
// 0019 OpPop
const expected_instructions = [_][]const u8{
try code.make(allocator, .op_constant, &[_]usize{0}),
try code.make(allocator, .op_constant, &[_]usize{1}),
try code.make(allocator, .op_greater_than, &[_]usize{}),
try code.make(allocator, .op_jump_not_truthy, &[_]usize{16}),
try code.make(allocator, .op_constant, &[_]usize{2}),
try code.make(allocator, .op_jump, &[_]usize{19}),
try code.make(allocator, .op_constant, &[_]usize{3}),
try code.make(allocator, .op_pop, &[_]usize{}),
};
defer for (expected_instructions) |instr| allocator.free(instr);
const expected = try concatInstructions(allocator, &expected_instructions);
defer allocator.free(expected);
try std.testing.expectEqualSlices(u8, expected, comp.instructions.items);
}
When a bytecode test fails, the error message shows raw byte sequences. A useful
debugging aid is to add disassemble output to the failure message. This is an optional pattern you can use inside any bytecode test, not a specific function to add to the codebase:
if (!std.mem.eql(u8, expected, comp.instructions.items)) {
const expected_str = try code.disassemble(allocator, expected);
defer allocator.free(expected_str);
const actual_str = try code.disassemble(allocator, comp.instructions.items);
defer allocator.free(actual_str);
std.debug.print("instructions wrong.\nexpected:\n{s}\ngot:\n{s}\n", .{
expected_str,
actual_str,
});
return error.TestUnexpectedResult;
}
VM integration tests #
Add these tests to vm_test.zig. They verify the full pipeline: parse, compile, run,
check the value left on the stack.
This chapter introduces an ExpectedValue tagged union and a testExpectedValue helper
to handle tests that can return integers, booleans, or null. Add this import to the top of vm_test.zig (alongside the existing imports from chapter 6):
const Compiler = @import("compiler.zig").Compiler;
Then replace the testVMRun helper and the old test "integer arithmetic" and test "boolean expressions" tests from chapters 6 and 7 with the code below. The new helpers support integers, booleans, and null, and all future chapters reuse them:
const ExpectedValue = union(enum) {
integer: i64,
boolean: bool,
null: void,
};
fn testVMRun(arena: *std.heap.ArenaAllocator, input: []const u8) !object.Object {
const alloc = arena.allocator();
var l = Lexer.init(input);
var p = Parser.init(alloc, &l);
const program = try p.parseProgram();
var comp = Compiler.init(alloc);
try comp.compile(program);
var machine = VM.init(comp.bytecode());
try machine.run();
return machine.lastPoppedStackElem() orelse error.NoResult;
}
fn testExpectedValue(actual: object.Object, expected: ExpectedValue) !void {
switch (expected) {
.integer => |exp| {
try std.testing.expectEqual(exp, actual.integer.value);
},
.boolean => |exp| {
try std.testing.expectEqual(exp, actual.boolean.value);
},
.null => {
try std.testing.expect(actual == .null);
},
}
}
test "vm: integer arithmetic" {
const allocator = std.testing.allocator;
const tests = [_]struct { input: []const u8, expected: ExpectedValue }{
.{ .input = "1", .expected = .{ .integer = 1 } },
.{ .input = "2", .expected = .{ .integer = 2 } },
.{ .input = "1 + 2", .expected = .{ .integer = 3 } },
.{ .input = "1 - 2", .expected = .{ .integer = -1 } },
.{ .input = "1 * 2", .expected = .{ .integer = 2 } },
.{ .input = "4 / 2", .expected = .{ .integer = 2 } },
.{ .input = "50 / 2 * 2 + 10 - 5", .expected = .{ .integer = 55 } },
.{ .input = "5 + 5 + 5 + 5 - 10", .expected = .{ .integer = 10 } },
.{ .input = "2 * 2 * 2 * 2 * 2", .expected = .{ .integer = 32 } },
.{ .input = "5 * 2 + 10", .expected = .{ .integer = 20 } },
.{ .input = "5 + 2 * 10", .expected = .{ .integer = 25 } },
.{ .input = "5 * (2 + 10)", .expected = .{ .integer = 60 } },
.{ .input = "-5", .expected = .{ .integer = -5 } },
.{ .input = "-10", .expected = .{ .integer = -10 } },
.{ .input = "-50 + 100 + -50", .expected = .{ .integer = 0 } },
.{ .input = "(5 + 10 * 2 + 15 / 3) * 2 + -10", .expected = .{ .integer = 50 } },
};
for (tests) |t| {
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const result = try testVMRun(&arena, t.input);
try testExpectedValue(result, t.expected);
}
}
test "vm: boolean expressions" {
const allocator = std.testing.allocator;
const tests = [_]struct { input: []const u8, expected: ExpectedValue }{
.{ .input = "true", .expected = .{ .boolean = true } },
.{ .input = "false", .expected = .{ .boolean = false } },
.{ .input = "1 < 2", .expected = .{ .boolean = true } },
.{ .input = "1 > 2", .expected = .{ .boolean = false } },
.{ .input = "1 < 1", .expected = .{ .boolean = false } },
.{ .input = "1 > 1", .expected = .{ .boolean = false } },
.{ .input = "1 == 1", .expected = .{ .boolean = true } },
.{ .input = "1 != 1", .expected = .{ .boolean = false } },
.{ .input = "1 == 2", .expected = .{ .boolean = false } },
.{ .input = "1 != 2", .expected = .{ .boolean = true } },
.{ .input = "true == true", .expected = .{ .boolean = true } },
.{ .input = "false == false", .expected = .{ .boolean = true } },
.{ .input = "true == false", .expected = .{ .boolean = false } },
.{ .input = "true != false", .expected = .{ .boolean = true } },
.{ .input = "(1 < 2) == true", .expected = .{ .boolean = true } },
.{ .input = "(1 < 2) == false", .expected = .{ .boolean = false } },
.{ .input = "(1 > 2) == true", .expected = .{ .boolean = false } },
.{ .input = "(1 > 2) == false", .expected = .{ .boolean = true } },
.{ .input = "!true", .expected = .{ .boolean = false } },
.{ .input = "!false", .expected = .{ .boolean = true } },
.{ .input = "!5", .expected = .{ .boolean = false } },
.{ .input = "!!true", .expected = .{ .boolean = true } },
.{ .input = "!!false", .expected = .{ .boolean = false } },
.{ .input = "!!5", .expected = .{ .boolean = true } },
};
for (tests) |t| {
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const result = try testVMRun(&arena, t.input);
try testExpectedValue(result, t.expected);
}
}
test "vm: conditionals" {
const allocator = std.testing.allocator;
const tests = [_]struct {
input: []const u8,
expected: ExpectedValue,
}{
.{ .input = "if (true) { 10 }", .expected = .{ .integer = 10 } },
.{ .input = "if (true) { 10 } else { 20 }", .expected = .{ .integer = 10 } },
.{ .input = "if (false) { 10 } else { 20 }", .expected = .{ .integer = 20 } },
.{ .input = "if (1) { 10 }", .expected = .{ .integer = 10 } },
.{ .input = "if (1 < 2) { 10 }", .expected = .{ .integer = 10 } },
.{ .input = "if (1 < 2) { 10 } else { 20 }", .expected = .{ .integer = 10 } },
.{ .input = "if (1 > 2) { 10 } else { 20 }", .expected = .{ .integer = 20 } },
.{ .input = "if (1 > 2) { 10 }", .expected = .{ .null = {} } },
.{ .input = "if (false) { 10 }", .expected = .{ .null = {} } },
.{
.input = "if ((if (false) { 10 })) { 10 } else { 20 }",
.expected = .{ .integer = 20 },
},
};
for (tests) |t| {
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const result = try testVMRun(&arena, t.input);
try testExpectedValue(result, t.expected);
}
}
The last test case deserves attention:
if ((if (false) { 10 })) { 10 } else { 20 }
The inner if (false) { 10 } has no else branch, so it evaluates to Null. The outer
if then tests Null for truthiness. Since Null is not truthy, the outer if takes
the else branch and produces 20. This test verifies that:
- Nested if expressions compile correctly.
- OpNull is pushed when there is no else branch and the condition is false.
- Null is correctly treated as falsy by isTruthy.
Verify it works #
Make sure you’ve uncommented "src/compiler_test.zig" in build.zig, then run zig build test. No output means all tests passed.