Recursion and Iteration
This reading is relatively new, and your feedback will help us
improve it! If you notice mistakes (big or small), if you have questions, if
anything is unclear, if there are things not covered here that you'd like to
see covered, or if you have any other suggestions, please get in touch during office hours or open lab hours, or via email at 6.101help@mit.edu.
Table of Contents
 1) Introduction
 2) Recursion vs. Iteration
 3) List-like Patterns
 4) Tree-like Patterns
 5) Graph-like Patterns
 6) When To Use Recursion and When To Use Iteration
 7) Common Mistakes in Recursion
 8) Summary
 9) Additional Reading: Generators
1) Introduction§
This reading examines recursion more closely by comparing and contrasting it with iteration. Both approaches create repeated patterns of computation. Recursion produces repeated computation by calling the same function recursively, on a simpler or smaller subproblem. Iteration produces repeated computation using for loops or while loops.
If we can come up with an iterative version, do we need recursion at all? In one sense, no, we don't need recursion: any function we can write recursively could also be written iteratively. But some problems lend themselves naturally to a recursive solution. When we try to solve those kinds of problems iteratively instead, we find ourselves simulating recursion anyway, using an explicit agenda to keep track of what we need to do next. For those kinds of problems, it's more straightforward to express the recursive algorithm directly, letting Python handle the agenda bookkeeping using its call stack.
The converse is also true: any function we can write iteratively, with a loop, could also be written recursively. So it's important to be able to think in both ways and choose the one that is most natural and appropriate to the problem we are trying to solve.
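As a small illustration of this equivalence (a hypothetical countdown example, not one of the reading's functions), the same loop can be expressed either way:

```python
def countdown_iter(n):
    # build [n, n-1, ..., 1] with a while loop
    result = []
    while n > 0:
        result.append(n)
        n -= 1
    return result

def countdown_rec(n):
    # the same computation, expressed recursively
    if n <= 0:
        return []
    return [n] + countdown_rec(n - 1)
```

Both produce the same list; the recursive version replaces the loop's updates to `result` and `n` with a smaller subproblem, `countdown_rec(n - 1)`.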
This reading looks at the essential equivalence between these approaches and some of their tradeoffs in simplicity and performance. We'll return to some of the functions we've written in previous readings, both recursive and iterative, and show how to write each one using the other technique.
2) Recursion vs. Iteration§
Let's start with the simple example we used in the last reading: factorial. Mathematically, we could have used either of the following definitions of factorial:
n! = n · (n-1)!   for n > 0, with 0! = 1

or

n! = 1 · 2 · 3 · ⋯ · n
The first definition is recursive, but the second is actually iterative: it doesn't mention n! on the right-hand side of the definition, and the right-hand side effectively describes a loop.
In terms of code, these definitions translate directly to either a recursive or iterative implementation:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
def factorial(n):
    out = 1
    for i in range(1, n + 1):
        out *= i
    return out
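A quick way to convince ourselves that the two versions agree (a sketch; the functions are renamed here so that both definitions can coexist in one file):

```python
def factorial_rec(n):
    # recursive version, following the mathematical definition
    if n == 0:
        return 1
    else:
        return n * factorial_rec(n - 1)

def factorial_iter(n):
    # iterative version, accumulating the product in a loop
    out = 1
    for i in range(1, n + 1):
        out *= i
    return out

# the two versions agree on small inputs
assert all(factorial_rec(n) == factorial_iter(n) for n in range(10))
```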
Which of these implementations would you choose? What are the benefits and drawbacks of each approach?
Here are some points of comparison:
 The iterative version might be more efficient, because it doesn't need to create new frames for the recursive calls.
 The recursive version might feel simpler, a better match for the mathematical definition of factorial.
 The two versions might behave differently with illegal inputs, like n < 0.
What does factorial(-1) do for the iterative version of factorial above?
What does factorial(-1) do for the recursive version of factorial above?
For either version, it would be better if the function produced an error right away when it gets an illegal input:
def factorial(n):
    assert n >= 0, "need a nonnegative integer"
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
The result of factorial(-1) for the recursive version points at an important difference between recursion and iteration in many programming languages (including Python): recursion is limited by a maximum call-stack depth. In Python, this default limit is 1000 calls. When a function tries to call deeper than that, Python produces a RecursionError. This limit can be increased using sys.setrecursionlimit(), but it can't be made unbounded, because the operating system itself puts a limit on Python's call stack.
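For example (a sketch of inspecting and raising the limit; the exact default can vary with the Python version and environment):

```python
import sys

# inspect the current recursion limit (typically 1000 by default)
print(sys.getrecursionlimit())

# allow deeper recursion, still subject to the OS's stack limit
sys.setrecursionlimit(5000)
```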
What does factorial(1001) do for the iterative version of factorial above?
What does factorial(1001) do for the recursive version of factorial above (assuming default Python settings)?
3) List-like Patterns§
The rest of this reading is structured around several common kinds of repeated computation, depending on the shape of the data being processed: list-like (i.e., a sequence of values), tree-like, and graph-like. For each of these forms, we will revisit a recursive or iterative algorithm from a previous reading and discuss how we might write it the other way.
Let's start with lists. Recall sum_list
from the Recursion reading, which we wrote both iteratively and recursively:
def sum_list(x):
    sum = 0
    for num in x:
        sum += num
    return sum
def sum_list(x):
    if not x:
        return 0
    else:
        return x[0] + sum_list(x[1:])
The iterative version naturally uses a for loop.

The recursive version decomposes the list into its first element x[0] and the rest of the list x[1:]. The rest-of-list part is the smaller recursive subproblem. This first/rest decomposition is a very common way to deal with list-like data recursively.
What other recursive decompositions might be natural or plausible for listlike data?
What decomposition does binary search (also known as bisection search) use?
What decomposition does merge sort use?
We might also imagine a decomposition that takes out the last element x[-1] and then recurses on the prefix of the list before it, x[:-1]. This might be the right decomposition for a linear search from the end of the list, for example.
Binary search examines the element in the middle of a sorted list, say m = len(x)//2, and then recurses on either the left part x[:m] or the right part x[m+1:], depending on whether the target element is less than or greater than x[m]. This decomposition might be called middle/left/right.
Merge sort sorts a list by first dividing it into a left half and a right half, recursively sorting each half, and then recombining the sorted sublists by merging them together.
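The middle/left/right decomposition can be sketched as a recursive membership test (contains is a hypothetical helper name, not a function from the reading):

```python
def contains(x, target):
    """Binary search: is target in the sorted list x?"""
    if not x:
        return False  # base case: empty list
    m = len(x) // 2
    if x[m] == target:
        return True
    elif target < x[m]:
        return contains(x[:m], target)    # recurse on the left part
    else:
        return contains(x[m+1:], target)  # recurse on the right part
```

Note that x[m] itself is excluded from both recursive calls, so each subproblem is strictly smaller than the original.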
Notice that in each of these decompositions, the algorithm must be careful to make the recursive subproblems smaller. The first/rest and middle/left/right decompositions do it by peeling off one element of the list (the first or the middle) and excluding it from the recursive subproblems, so that each subproblem is strictly smaller than the original. That guarantee is what ensures the recursion eventually terminates at the base case (which is often, but not always, the empty list).
As an exercise, here is a version of sum_list that uses a left/right recursive decomposition:
def sum_list(x):
    if not x:
        return 0
    else:
        m = len(x) // 2
        return sum_list(x[0:m]) + sum_list(x[m:])
What does sum_list([1,2,3]) do for this version of sum_list?
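Note that a length-1 list never gets smaller under this split: m is 0, so one of the two subproblems is the whole list again. One possible repair, sketched here under the hypothetical name sum_list_lr so it doesn't shadow the version above, adds a second base case:

```python
def sum_list_lr(x):
    # sum_list_lr is a hypothetical name; same idea as sum_list above
    if not x:
        return 0
    elif len(x) == 1:
        return x[0]  # extra base case: a single element can't be split further
    else:
        m = len(x) // 2
        return sum_list_lr(x[:m]) + sum_list_lr(x[m:])
```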
3.1) Accumulating Results Using a Helper Function§
One difference between the recursive and iterative versions of sum_list
is the order in which they do the summation. Although both versions consider the list elements in order from beginning to end, the recursive version defers any addition until it has reached the end of the list, and then it does the sum backwards. So the computation ends up looking like this:
 iterative version: start with 0, then add x[0], then x[1], then x[2], ..., finally add x[n-1]
 recursive version: start with 0, then add x[n-1], then x[n-2], then x[n-3], ..., finally add x[0]
The recursive version is saving up the additions for its recombination step, when it combines the results of the recursive subproblems.
But we don't have to write it this way. We can write a recursive version that does the addition as it goes, by defining a recursive helper function to which we pass the partial sum computed so far:
def sum_list(x):
    def sum_helper(sum, lst):
        if not lst:
            return sum
        else:
            num = lst[0]
            rest = lst[1:]
            return sum_helper(sum + num, rest)
    return sum_helper(0, x)
Although the order of addition doesn't really matter in this case, because addition is associative and commutative, this example demonstrates a general principle: an iterative loop can be transformed into a recursive helper function, by using parameters to carry the local variables that the iterative loop updates on each iteration.
In the loop in sum_list, the relevant local variable is sum, and here is how the loop is transformed into the recursive version shown above:
def sum_list(x):
    sum = 0        # becomes initial call to sum_helper(0, x)
    for num in x:  # becomes num = lst[0], rest = lst[1:]
        sum += num # becomes recursion step sum_helper(sum + num, rest)
    return sum     # becomes the base case
Consider sum_list([1,2,3,4]): specifically, how many calls to sum_helper are made?

One trick that makes this pattern even shorter: we can eliminate the helper function sum_helper and instead recurse on sum_list itself, by giving the additional parameters default initial values. That way, the original call initializes the recursion, and subsequent recursive calls update the parameters:
def sum_list(x, sum=0):
    if not x:
        return sum
    else:
        return sum_list(x[1:], sum + x[0])
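A quick trace of this version (the definition is repeated here so the sketch is self-contained):

```python
def sum_list(x, sum=0):
    if not x:
        return sum
    else:
        return sum_list(x[1:], sum + x[0])

# each recursive call carries the partial sum forward:
#   sum_list([1, 2, 3])  -> sum_list([2, 3], sum=1)
#                        -> sum_list([3], sum=3)
#                        -> sum_list([], sum=6)  -> 6
assert sum_list([1, 2, 3]) == 6
```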
3.2) Optimizations§
List-like recursion in Python suffers from a couple of performance drawbacks:

 Every recursive call creates a new frame, and the recursion depth is limited (to 1000 calls by default). For first/rest decompositions, this means that the length of the list that can be processed is similarly limited. (But for other decompositions, like left-half/right-half, it's not a limitation at all. Why not? What is the maximum-length list that could be processed with successive halving if you are limited to 1000 recursive calls?)

 Every recursive call in a first/rest decomposition needs to copy the rest of the list, using a slice like lst[1:]. Over the course of the entire computation, all this copying adds up to time proportional to the square of the length of the list.
Both of these issues can be fixed by clever optimizations:

 The recursion-depth limit can be fixed by tail-call optimization. If the recursion is written so that the recursive call is the last thing done in the body of the function, like the line return sum_helper(...) in the recursive version of sum_list above, then this recursive call is called a tail call, coming as it does at the tail end of the work the function has to do. Tail-call optimization means that when the runtime system encounters a tail call, it deduces that it will no longer need the frame for the current call and can simply reuse it for the new recursive call, rather than creating a new frame. With tail-call optimization, every recursive call to sum_helper simply reuses the same frame, the recursion depth never exceeds 1, and the performance of the recursive version is essentially like a loop.

Tail-call optimization can't be applied to a recursive call that isn't at the very end of the function. If sum_list were written as we originally had it, with return x[0] + sum_list(x[1:]), then this is not a tail call, because the function still needs to do some more work (adding x[0]) after the recursive call comes back. Tail-call optimization is also blocked if the frame needs to be kept for a function object that was created during the call.

Python unfortunately does not implement tail-call optimization, but other languages do.

 The list-copying problem can be addressed by implementing the list with a linked list (which we will study in detail in a later week) rather than an array, so that the "rest" of a list can be obtained in constant time. Python doesn't do that, so another way to avoid list copying in Python is to use an index to represent the rest of the list, here called i:

def sum_list(x, i=0, sum=0):
    if i >= len(x):
        return sum
    else:
        return sum_list(x, i+1, sum + x[i])
4) Tree-like Patterns§
Having examined sequential, list-like data, let's turn now to tree-shaped data, which might go to arbitrary depth.
For example, recall sum_nested
from the Recursion reading:
def sum_nested(x):
    """
    >>> sum_nested([[1, 2], [3, [4, 5]], [[[[[6]]]]]])
    21
    """
    if not x:
        return 0
    elif isinstance(x[0], list):
        return sum_nested(x[0]) + sum_nested(x[1:])
    else:
        return x[0] + sum_nested(x[1:])
Why do we think of the input to sum_nested as tree-shaped? Because each sublist in x is like an internal node of a tree, with children that might be further sublists, until we reach simple numbers, which are the leaves.
Treeshaped data has a straightforward recursive decomposition that mirrors the tree structure: we make recursive calls to all children.
But notice that sum_nested as written here is not only decomposing the tree structure recursively but also decomposing the list of children recursively.

Which recursive calls to sum_nested are made on children of x, when considering x like a tree?

Which recursive calls to sum_nested are made on the rest of a first/rest decomposition of x, when considering x as a list of children?

To convert this recursive function into an iterative loop, we will reach back to an idea that we used in earlier readings: an agenda. Here, our agenda will keep track of recursive calls that we haven't made yet. Specifically, each item on the agenda will consist of the parameters for a recursive call. The agenda is initialized with the original list (which we will rename to original_x so that we can keep using x for the rest of the code without confusion), and then every time we encounter a recursive call to sum_nested(), we put it on the agenda instead, so that it will be handled in a later iteration of the loop. Here is the code:
def sum_nested(original_x):
    sum = 0
    agenda = [original_x]  # agenda consists of parameters for pending recursive calls
    while agenda:
        x = agenda.pop(0)
        if not x:
            sum += 0
        elif isinstance(x[0], list):
            agenda.insert(0, x[0])
            agenda.insert(0, x[1:])
        else:
            sum += x[0]
            agenda.insert(0, x[1:])
    return sum
We opted in this iterative implementation to add and remove items at the start of the agenda, so that it explores the tree in the same order that the recursive algorithm does: depth-first. But we didn't necessarily have to make that choice. How could we change the code so that the iterative version traverses the tree breadth-first instead? Either way, it doesn't matter for sum_nested, since the numbers can be added in any order, but for other problems, the order might matter.
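One way to make the traversal breadth-first (a sketch, under the hypothetical name sum_nested_bfs): append new work items at the end of the agenda instead of inserting them at the front, so the agenda behaves as a queue rather than a stack.

```python
def sum_nested_bfs(original_x):
    total = 0
    agenda = [original_x]
    while agenda:
        x = agenda.pop(0)          # still remove from the front...
        if not x:
            continue
        if isinstance(x[0], list):
            agenda.append(x[0])    # ...but add new work at the end
            agenda.append(x[1:])
        else:
            total += x[0]
            agenda.append(x[1:])
    return total
```

The sum is the same either way; only the order in which the numbers are visited changes.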
Looking at the transformation we just did from the other point of view, there is an agenda data structure hiding inside the recursive version: the agenda is just the call stack, managed by Python, keeping track of the remaining work in the recursive process. So recursion gives us a convenient and simple way to express a depth-first traversal without having to manage our own agenda data structure.
5) Graph-like Patterns§
Now that we have explored the equivalence between recursive traversal and iterative (depth-first) traversal using an explicit agenda, let's go back to graph-traversal problems we explored in earlier readings. "Graph-like" data can be regarded as vertices connected by edges, but (unlike a tree) a graph may have multiple paths reaching the same vertex, or it may have cycles (paths that return to the same vertex again).
In previous readings, we solved graph-traversal problems using iteration, like this flood-fill function:
def flood_fill(image, starting_location, new_color):
    original_color = get_pixel(image, *starting_location)
    agenda = [starting_location]   # agenda: all of the cells we need to color in
    visited = {starting_location}  # visited set: all pixels ever added to the agenda
    while agenda:
        location = agenda.pop(0)
        set_pixel(image, *location, new_color)
        for neighbor in get_neighbors(location):
            if (neighbor not in visited
                    and get_pixel(image, *neighbor) == original_color):
                visited.add(neighbor)
                agenda.append(neighbor)
Now that we know that we can use the recursive call stack as an agenda, how can we write this recursively? Everywhere we would put a work item on the agenda, let's make a call to a recursive helper function instead, using the work item as its parameter. This recursive helper function replaces the while loop:
def flood_fill(image, starting_location, new_color):
    original_color = get_pixel(image, *starting_location)
    visited = {starting_location}  # visited set: all pixels ever added to the agenda
    def fill_from(location):
        set_pixel(image, *location, new_color)
        for neighbor in get_neighbors(location):
            if (neighbor not in visited
                    and get_pixel(image, *neighbor) == original_color):
                visited.add(neighbor)
                fill_from(neighbor)
    fill_from(starting_location)
Let's reflect on how we're using recursion here and think carefully about whether the recursion would actually stop.
What makes the recursive call fill_from(neighbor)
a smaller or simpler subproblem?
And what is the base case of the recursion?
Before we make the recursive call to fill_from(neighbor), we first add neighbor to the visited set, so that we won't visit that pixel again. The graph of pixels we're trying to fill has effectively become one pixel smaller.

The base case happens when we have no recursive calls to make: when we reach a pixel which is a dead end, because all its neighbor pixels are either not original_color or already visited.
The observation we just made leads to a further observation. For flood fill, we have another guarantee that the recursive subproblem is smaller. What is it?
The fact that every recursive call changes a pixel's color from original_color to new_color also makes the problem smaller. So we actually don't need a visited set when we solve flood fill recursively!
def flood_fill(image, starting_location, new_color):
    original_color = get_pixel(image, *starting_location)
    if original_color == new_color:
        return  # nothing to do
    def fill_from(location):
        set_pixel(image, *location, new_color)
        for neighbor in get_neighbors(location):
            if get_pixel(image, *neighbor) == original_color:
                fill_from(neighbor)
    fill_from(starting_location)
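To make the idea concrete, here is a self-contained sketch that assumes a simple list-of-lists image and 4-way adjacency; the reading's get_pixel, set_pixel, and get_neighbors helpers are inlined as assumptions, so the signatures differ from the versions above:

```python
def flood_fill(image, starting_location, new_color):
    # image: list of lists of color values; location: (row, col) tuple
    rows, cols = len(image), len(image[0])

    def get_pixel(r, c):
        return image[r][c]

    def get_neighbors(location):
        # 4-way adjacency, staying inside the image bounds
        r, c = location
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield (r + dr, c + dc)

    original_color = get_pixel(*starting_location)
    if original_color == new_color:
        return  # nothing to do

    def fill_from(location):
        r, c = location
        image[r][c] = new_color  # recolor this pixel (shrinks the problem)
        for neighbor in get_neighbors(location):
            if get_pixel(*neighbor) == original_color:
                fill_from(neighbor)

    fill_from(starting_location)
```

As in the version above, no visited set is needed: recoloring each pixel before recursing guarantees it is never revisited.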
6) When To Use Recursion and When To Use Iteration§
How might you decide to use recursion?
 When the problem is easy to express recursively (like factorial)
 When the data itself is defined recursively (like a tree often is)
 When depth-first traversal is fine
 When the recursion won't get too deep (this constraint may be relaxed in programming languages other than Python)
Recursive solutions are often shorter and simpler, by moving agenda bookkeeping into the programming language rather than representing it explicitly.
As you gain comfort with recursion, you may find that a recursive solution is easier to express at first, because of natural decompositions (like first/rest, or left-half/right-half, or recursing into children or graph neighbors). So it may be good to start with a recursive approach and then convert it to an iterative algorithm (using one of the strategies here) if you run into problems like recursion-depth limits or excessive copying.
7) Common Mistakes in Recursion§
Here are three common ways that a recursion can go wrong:
 The base case may be missing entirely, or the problem needs more than one base case, but not all the base cases are covered.
 The recursive step doesn't reduce to a smaller subproblem, so the recursion doesn't converge.
 Aliases to a mutable data structure are inadvertently shared, and mutated, among the recursive calls.
We saw the first two issues in an exercise above, when we tried to convert sum_list to use a left/right decomposition. When the subdivided list reached length 1, the decomposition was no longer able to make two smaller subproblems, so we needed to make that into a new base case.
Aliases to mutable data are sometimes necessary in recursion, as we saw above when we shared the visited set across all calls in a recursive traversal of a graph. For flood fill, of course, sharing the mutable image is necessary, because the whole point of the recursion is to fill in more pixels. But when you are doing a recursive depth-first search without a visited set (which, as we saw with flood fill, is often possible when a move in the search automatically makes the problem smaller or simpler), it can be tempting to keep the search state mutable. This then becomes tricky, because multiple pending recursive calls have aliases to the same mutable state. We will see an example of this in the next reading, when we use recursive search to solve Sudoku puzzles.
Look for these mistakes when you're debugging. A RecursionError is a sign of the first two. The third bug can be harder to spot, because it's fundamentally an aliasing bug.
8) Summary§
We've explored several kinds of problems, based on the shape of the data or repeated computation that needs to be done (listlike, treelike, graphlike), and seen how to do the repeated computation both recursively and iteratively.
Along the way, we've seen more strategies for thinking recursively, including:
 recursive decompositions that are commonly used for list-like data (first/rest, left-half/right-half), tree-like data (recursing into children), and graph-like data (recursing to neighbors);
 using a recursive helper function with parameters that accumulate partial results;
 using recursion for a depthfirst traversal of a tree or graph.
9) Additional Reading: Generators§
The general-purpose reading on generators introduces generators as a powerful feature of Python, useful in conjunction with both recursive and iterative functions.