17.8 Application: Control Flow Graphs

In this section, we’ll give a brief introduction to one of the applications of graphs to modelling computer programs. Back in Chapter 16, you learned about abstract syntax trees, which we can use to represent program code. Now, we’re going to look at how to represent not just program code, but also the different possible execution paths through a program. This type of modelling is an extremely powerful tool, and is used by both code checking tools like PythonTA as well as compilers for various programming languages to generate efficient machine instructions from program code.

What is a control flow graph?

Intuitively, a control flow graph is a representation of the different blocks of code in a Python program, and the different paths that the Python interpreter can take through the code. To get a clearer sense of what this means, let’s introduce one foundational definition.

A basic block is a sequence of non-compound statements and expressions in a program’s code that are guaranteed to execute together, one after the other.

Here are some examples and non-examples of basic blocks.

 # A single statement is a basic block. x = 1 # A sequence of multiple statements and function calls is a basic block. x = 5 y = x + 2 z = f(x, y) print(x + y + z) # A basic block can end with a return or raise statement. x = 5 y = x + 2 return f(x, y) # But a sequence of statements with a return/raise in the middle is # NOT a basic block, since the statements after the return/raise aren't # going to execute. x = 5 return x y = x + 2 # Will never execute! # An if statement is not a basic block, since it is a compound statement. # The statements it contains aren't guaranteed to execute one after the other. if x > 5: y = 3 else: y = 4

Typically we treat basic blocks as being maximal, i.e., as large as possible. So if we have a sequence of assignment statements ( x = 5 , y = x + 2 , etc.), we treat them as one big block rather than consisting of multiple single-statement blocks.

Now let’s look at that if statement example in more detail. We can divide it up into three basic blocks: one for the condition ( x > 5 ), then one for the if branch ( y = 3 ) and one for the else branch ( y = 4 ). When we first learned about if statements in Section 3.4, we drew simple diagrams to show the possible ways the Python interpreter could take through our code. We can now formalize this idea, and extend it to other kinds of control flow statements like loop.

A control-flow graph (CFG) of a program is a graph \(G = (V, E)\) where: