This article covers control flow as taught in CPSC 213, using C and SM213 assembly.

Control Flow With Pointers

Control flow in C is usually expressed with loops and conditionals. Pointer arithmetic adds another way to step through data inside those loops.

int s = 0;
int *p;
int a[] = {2, 4, 8, 10, 12, 14};
void foo(void) {
	p = a;

	while (p < a + 6) {
		s += *p++;
	}

	// The loop above is equivalent to the one below.
	// Reset p and s first: p already sits at a + 6, so without the
	// reset the second loop would not run at all.
	p = a;
	s = 0;
	while (p < a + 6) {
		s += *p;
		p++;
	}
	// Each loop visits every element of the array.
	// When a loop ends, p points one past the last element. Holding that
	// pointer is fine, but dereferencing it is undefined behaviour.
}

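The expression *p++ in the example above can look confusing at first, but it follows from operator precedence in C: postfix ++ binds more tightly than unary *, so it parses as *(p++). The value read is the element p pointed at before the increment.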
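A minimal sketch of the precedence rule, for illustration only. The helper names sum_with_post_increment and bump_first are hypothetical and not part of the course material. The first function uses *p++ to walk an array; the second shows that (*p)++ means something different.

```c
#include <stddef.h>

/* Sums n ints starting at p. *p++ parses as *(p++): read the element p
 * currently points at, then advance p to the next element. */
int sum_with_post_increment(const int *p, size_t n) {
    const int *end = p + n;   /* one past the last element; never dereferenced */
    int s = 0;
    while (p < end) {
        s += *p++;
    }
    return s;
}

/* (*p)++ is different: it increments the value p points at and leaves p
 * itself alone. Returns the value that was stored before the increment. */
int bump_first(int *p) {
    return (*p)++;
}
```

Calling sum_with_post_increment(a, 6) on the array from the example walks all six elements, while bump_first(a) changes a[0] without moving any pointer.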

The pointer-arithmetic form of array access was historically slightly faster than indexing, but modern optimizing compilers typically generate the same code for both.

In the Machine

In the CPSC 213 toy machine, as in most real instruction sets, the program counter (PC) is central to control flow. The PC holds the address of the next instruction to be executed. At the start of each execution cycle, the CPU fetches the instruction at that address and then executes it. This means we can steer execution to wherever we want by changing the value of the PC.
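The role of the PC can be sketched with a tiny interpreter loop. This is a hypothetical toy machine, not SM213: the names toy_op, toy_insn and toy_run, and the three operations, are invented for illustration. The point it shows is that the PC advances after every fetch, and a jump is nothing more than a write to the PC.

```c
#include <stddef.h>

/* Hypothetical toy instructions; these are NOT SM213 encodings. */
enum toy_op { TOY_ADD, TOY_JUMP, TOY_HALT };

struct toy_insn {
    enum toy_op op;
    int arg;   /* amount to add, or index of the jump target */
};

/* Runs prog from index 0 and returns the accumulator at TOY_HALT. */
int toy_run(const struct toy_insn *prog) {
    size_t pc = 0;
    int acc = 0;
    for (;;) {
        struct toy_insn insn = prog[pc];  /* fetch from the address in the PC */
        pc++;                             /* PC now names the next instruction */
        switch (insn.op) {
        case TOY_ADD:  acc += insn.arg;        break;
        case TOY_JUMP: pc = (size_t)insn.arg;  break;  /* redirect control flow */
        case TOY_HALT: return acc;
        }
    }
}
```

A program such as ADD 1, JUMP 3, ADD 100, ADD 10, HALT skips the ADD 100 entirely, because the jump overwrites the PC before that instruction is ever fetched.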

Extending the ISA

To change the PC from a program, we need to extend the ISA with new instructions: jumps and conditional jumps. The conditional ones are what make conditional control flow possible.

Unconditional jumps always transfer control. Conditional jumps in SM213 transfer control based on a comparison between a register value and zero. This style is common in RISC instruction sets.
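As a rough sketch, a compare-with-zero conditional jump only decides which of two values the PC receives. The function names below are hypothetical; they model the "equal to zero" and "greater than zero" tests, with next_pc standing for the address of the instruction that follows the jump.

```c
#include <stdint.h>

/* Next PC for a "jump if register equals zero" instruction. */
uint32_t branch_if_zero(int32_t reg, uint32_t next_pc, uint32_t target) {
    return reg == 0 ? target : next_pc;
}

/* Next PC for a "jump if register is greater than zero" instruction. */
uint32_t branch_if_positive(int32_t reg, uint32_t next_pc, uint32_t target) {
    return reg > 0 ? target : next_pc;
}
```

When the test fails, the PC keeps its incremented value and execution simply falls through to the next instruction.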

Non-testable

There also exist conditional jumps based on the result of the last ALU instruction, as used in Intel-based architectures.

Relative Addressing

In many cases, the distance between the start and the end of a loop or conditional is quite small, while the addresses themselves are quite large. This led to an optimization called relative addressing, where an address is expressed relative to the current address instead of as an absolute value. The new jump instructions can use relative addressing: instead of specifying the destination address, they specify the offset from the current location in the code. A jump of this kind is generally called a branch.

To do so:

  • We use the current value of the PC as the base address
  • Remember that the PC holds the address of the next instruction, not the current one
  • Calculate the difference between the address we want to jump to and the PC, which already accounts for the PC pointing at the next instruction
  • Use the jump instruction to redirect flow by that offset

To note:

  • Jumps using relative addresses are called branches
  • Assembly still specifies the actual address/label
  • The assembler converts it to an offset relative to the PC

When branching, because all instructions are either 2 bytes or 6 bytes, the addresses of instructions are always even. Thus, for a relative branch, we can take the target address, subtract the current value of the PC, and divide by two. The result fits in a single byte, which keeps the whole branch at a neat, small 2-byte instruction.
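The offset arithmetic can be sketched in C as follows. The function names are hypothetical, and the sketch assumes the offset field is a single signed byte that counts 2-byte units from next_pc, the address of the instruction after the branch.

```c
#include <assert.h>
#include <stdint.h>

/* Encodes the one-byte offset field of a relative branch.
 * next_pc is the value the PC holds while the branch executes;
 * target is the address we want to reach. */
int8_t encode_branch_offset(uint32_t next_pc, uint32_t target) {
    int64_t byte_distance = (int64_t)target - (int64_t)next_pc;
    assert(byte_distance % 2 == 0);            /* addresses are always even */
    int64_t units = byte_distance / 2;
    assert(units >= INT8_MIN && units <= INT8_MAX);  /* must fit in one byte */
    return (int8_t)units;
}

/* Recovers the absolute target from the PC and the stored offset. */
uint32_t decode_branch_target(uint32_t next_pc, int8_t offset) {
    return (uint32_t)((int64_t)next_pc + 2 * (int64_t)offset);
}
```

For example, with next_pc at 0x1002 and a target of 0x1010, the byte distance is 14 and the stored offset is 7. Decoding reverses the step by doubling the offset and adding it back to the PC.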

This does, however, mean that branching has a limited reach, and we can derive that limit from the 2-byte instruction format. The first byte holds the opcode, so only one byte is left for the offset. That byte is signed, so it can hold values in the range [-2^7, 2^7 - 1]. Since the stored offset is the byte distance divided by two (because all addresses are even), the reachable distance is double that: [-2^8, 2^8 - 2], that is, from -256 to +254 bytes relative to the PC.
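A small range check makes the limit concrete. A signed byte runs from -128 to 127, so after doubling, the reach is -256 to +254 bytes from the PC. The names below are hypothetical, and the sketch assumes the one-byte, scaled-by-two offset format described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of a one-byte signed offset scaled by 2, in bytes from the PC. */
enum {
    BRANCH_MIN_BYTES = 2 * INT8_MIN,  /* -256 */
    BRANCH_MAX_BYTES = 2 * INT8_MAX   /*  254 */
};

/* Returns true when target can be reached from next_pc with a relative
 * branch. Odd distances are rejected because the offset counts 2-byte units. */
bool branch_reachable(uint32_t next_pc, uint32_t target) {
    int64_t d = (int64_t)target - (int64_t)next_pc;
    return d % 2 == 0 && d >= BRANCH_MIN_BYTES && d <= BRANCH_MAX_BYTES;
}
```

A target outside this window cannot be encoded as a branch and needs a jump that carries a full absolute address.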

Static Control Flow

Static control flow is so called because the target of each jump or branch is known at compile time. This is in contrast to dynamic control flow, which is commonly used for procedure returns.

For procedures, static control flow means that the call target is fully resolved at compile time, so the machine knows exactly where to go. A static procedure call is effectively what we’ve seen in assembly: push arguments onto the stack, save a return address, and jump to the address of the procedure.

This is in contrast to object-oriented languages like Java or C++, where procedure calls can be dynamic: calling foo() on objects of two different types might run completely different code.
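C has no methods, but a function pointer gives a rough picture of the difference. In this sketch the struct animal type and the dog_speak and cat_speak functions are hypothetical stand-ins for two classes that both define foo(); it is not how Java or C++ are required to implement dispatch.

```c
/* Two unrelated implementations behind one hypothetical "speak" slot. */
struct animal {
    int (*speak)(void);   /* call target chosen at run time */
};

int dog_speak(void) { return 1; }
int cat_speak(void) { return 2; }

/* Static call: the target address is fixed when the program is built. */
int call_static(void) {
    return dog_speak();
}

/* Dynamic call: the target is read from memory, so the same call site
 * can reach different procedures depending on the object passed in. */
int call_dynamic(const struct animal *a) {
    return a->speak();
}
```

The single line a->speak() runs dog_speak for one object and cat_speak for another, which is exactly what the compiler cannot resolve ahead of time.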

Note also that for procedures to work properly, the return still needs dynamic control flow. “Static method invocation” refers to the call being static, not the return from the callee.

Dynamic Control Flow

A procedure is a reusable piece of code. In SM213 assembly, we can use jumps to redirect execution to a procedure we’ve defined elsewhere. However, we know from higher-level programming languages that a procedure (in most cases) returns to the code that called it when it’s done. So how is a procedure supposed to know where to return to? The answer is that it can’t know in advance, because it may be called from many different places. This is why dynamic control flow exists: it is control flow where the compiler cannot tell in advance where the code will go next.
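The return problem can be sketched with a toy state machine. This is a hypothetical model, not SM213 code: the enum values stand in for code addresses, pc for the program counter, and ra for wherever the caller saved its return address. The jumps into the procedure use constant targets (static), while the return reads its target from ra (dynamic).

```c
/* Code "addresses" in a hypothetical toy program. */
enum { CALL_SITE_A, AFTER_A, CALL_SITE_B, AFTER_B, PROC_DOUBLE, DONE };

/* Runs the toy program, which calls PROC_DOUBLE from two different
 * places, and returns the final value of x. */
int run_two_calls(int x) {
    int pc = CALL_SITE_A;
    int ra = DONE;   /* plays the role of the saved return address */
    while (pc != DONE) {
        switch (pc) {
        case CALL_SITE_A: ra = AFTER_A; pc = PROC_DOUBLE; break; /* save, then static jump */
        case AFTER_A:     x += 1;       pc = CALL_SITE_B; break;
        case CALL_SITE_B: ra = AFTER_B; pc = PROC_DOUBLE; break; /* save, then static jump */
        case AFTER_B:     pc = DONE;    break;
        case PROC_DOUBLE: x *= 2;       pc = ra;          break; /* dynamic jump back */
        }
    }
    return x;
}
```

The line pc = ra is the only place the procedure decides where to go, and its destination differs between the two calls even though the procedure's code is identical.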