Backtracking Algorithm

Backtracking algorithm is an elegant way to realize nested loop using recursive to search the solution space for your problem, especially for those problems that the depth of nested loop could not be determined beforehand. In this post, I try to demonstrate how to understand and code backtracking algorithm step by step with lots of examples.

Take Algorithm as Technology!

What is Backtracking Algorithm?

I would like to think of backtracking algorithm as a search method. Imaging you are in a forest and you need find a way out. There are multiple choices at each intersection, and you have no idea about which direction or path is right. So, you pick out one direction randomly or in any order you like and keep on going. If it is a dead end, you go back and choose another direction, otherwise you find a way out. During the whole process, there might be lots of “going back and choosing again” moments which is actually the spirit of backtracking.

In analogy, backtracking alrorithm is similar to Depth First Search (DFS) algorithm in graph data structure, but in a more general way. There most likely have no clear graph data structure, and you need to figure out a way to treat the solution space of your problem as a graph. The very basic search method is to construct loops, especially nested loop. However, both DFS and backtracking algorithm leveraging the more powerful and flexible recursive way!

Backtracking algorithm is an effective way to tackle Constraint Satisfaction Problems (CSP), which means to find out a set of objects whose state must satisfy some kind of constraints or limitations. The objects could be anything, such as paths or numbers, as long as the constraint would be met by them. A very classic example of applying backtracking strategy in textbook is the Eight Queen Puzzle problem (or N-Queen Puzzle in general). Normally, there are more than one solution when applying backtracking algorithm.

Tom is sleeping while I am studying backtracking algorithm

Other two types of problems: Free Optimisation Problem (FOP), such as Travelling Salesman Problem (TSP), and Constrained Optimisation Problem (COP), such as city X must be visited after city Y in TSP. We might have more than one way to formalise the same problem. In terms of optimisation problem, if the search space S is defined by continuous variables, we call them numerical optimisation problems, if S is defined by discrete variables, they are combinational optimisation problems.

Backtracking Algorithm Pseudo-Code

This is the backbone of backtracking algorithm:

Function backtracking(current_state):
    if current_state is complete and valid:
        find a solution!
        return
    for each step in possible steps according to the current_state:
        if step is valid (does not violate constraints):
            add step to current_state
            backtracking(current_state)
            remove step from current_state

In different scenarios, how to find the possible steps in accordance with valid current state and constraints is the most difficult part while coding.

Example: Find Out Combinations

Suppose the task is to find out all the $2$-element pairs from a set (mathematical set) such as ABCD. The code needs a nested loop with depth of $2$.

>>> elem = tuple('ABCD')
>>> size = len(elem)
>>> for i in range(size):
...     for j in range(i+1,size):
...         print(elem[i], elem[j])
...         
A B
A C
A D
B C
B D
C D

If the task is changed to find out all $3$-element combinations. The code should be constructed by a nested loop with depth of $3$.

>>> elem = tuple('ABCD')
>>> size = len(elem)
>>> for i in range(size):
...     for j in range(i+1,size):
...         for k in range(j+1,size):
...             print(elem[i], elem[j], elem[k])
...             
A B C
A B D
A C D
B C D

Now you might see the pattern of the code and the hidden issue. When the size of the set is $N$, the task is to find out all $M$-element combinations and both $N$ and $M$ $(M <= N)$ could not be determined while coding, backtracking algorithm kicks in. By leveraging the power of recursive, backtracking gives us a way to tackle this problem.

class comb:

    def __init__(self, theset):
        self.set = theset
        self.N = len(theset)

    def _backtracking(self, M):
        # recursive function starts from end condition
        if len(self.comb) == M:
            yield self.comb
            return
        for i in range(self.ipos, self.N):
            self.comb.append(self.set[i])
            self.ipos = i + 1  # loop starts from next position
            yield from self._backtracking(M)
            self.comb.pop()
    
    def __call__(self, M):
        assert 0 < M <= self.N
        self.ipos = 0
        self.comb = []
        yield from self._backtracking(M)  # generator


print('3-element combinations from ABCD:')
for c in comb(tuple('ABCD'))(3):
    print(c)

print('2-element combinations from ABC:')
for c in comb(tuple('ABC'))(2):
    print(c)

print('4-element combinations from ABCDE:')
for c in comb(tuple('ABCDE'))(4):
    print(c)

The output of this piece of code is below:

3-element combinations from ABCD:
['A', 'B', 'C']
['A', 'B', 'D']
['A', 'C', 'D']
['B', 'C', 'D']
2-element combinations from ABC:
['A', 'B']
['A', 'C']
['B', 'C']
4-element combinations from ABCDE:
['A', 'B', 'C', 'D']
['A', 'B', 'C', 'E']
['A', 'B', 'D', 'E']
['A', 'C', 'D', 'E']
['B', 'C', 'D', 'E']

Perfect! Backtracking works… Actually, recursive is more powerful than loop. In this case, whatever $N$ and $M$ could be, we don’t need to change the code anymore. Backtracking is just like Depth First Search (DFS) algorithm for graph structure, but in a more general way.

This combination problem has two constraints:

fixed size of object set
no duplicates

Example: Find Out Permutation

Find out permutation is a bit harder than combination. First of all, I give out the code implemented with nested loop for later comparison. The task is to find out all $2$-element permutation from ABC.

>>> elem = tuple('ABC')
>>> size = len(elem)
>>> for i in range(size):    
...     for j in range(size):
...         result = elem[i], elem[j]
...         if len(set(result)) == 2:  # length without duplicates
...             print(result)
... 
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'C')
('C', 'A')
('C', 'B')

The issue for this code is the same, not flexible enough. When $M$ and $N$ change, the existed code would become useless. Now, I use backtracking trick to solve this issue.

class perm:

    def __init__(self, theset):
        self.set = theset
        self.N = len(theset)

    def _backtracking(self, M):
        if len(self.perm) == M:
            yield self.perm
            return
        for i in range(0,self.N):
            if self.set[i] not in self.perm:
                self.perm.append(self.set[i])
                yield from self._backtracking(M)
                self.perm.pop()
    
    def __call__(self, M):
        assert 0 < M <= self.N
        self.perm = []
        yield from self._backtracking(M)


print('1-element permutations from AB:')
for c in perm(tuple('AB'))(1):
    print(c)

print('2-element permutations from ABC:')
for c in perm(tuple('ABC'))(2):
    print(c)

print('3-element permutations from ABC:')
for c in perm(tuple('ABC'))(3):
    print(c)

Output:

1-element permutations from AB:
['A']
['B']
2-element permutations from ABC:
['A', 'B']
['A', 'C']
['B', 'A']
['B', 'C']
['C', 'A']
['C', 'B']
3-element permutations from ABC:
['A', 'B', 'C']
['A', 'C', 'B']
['B', 'A', 'C']
['B', 'C', 'A']
['C', 'A', 'B']
['C', 'B', 'A']

Great! The code for permutation is a bit more brutal than for combination. One more constraint compare to combination problem is that the order matters.

Example: N-Queen Problem

On a square $N\times N$ chessboard, the constraints for N-Queen problem is no two queens threaten each other. So, no two queens should be on the same row, column and diagonal.

class NQueen:

    def __init__(self, N):
        self.N = N
        self.chessboard = [[0 for _ in range(N)] for _ in range(N)]
    
    def _check_position(self, col):
        for i in range(self.row):
            # check vertical line
            if self.chessboard[i][col] == 1:
                return False
            # position offset
            pos_off = self.row - i
            # check forward diagonal
            check_pos = col + pos_off
            if 0 <= check_pos < self.N:
                if self.chessboard[i][check_pos] == 1:
                    return False
            # check backward diagonal
            check_pos = col - pos_off
            if 0 <= check_pos < self.N:
                if self.chessboard[i][check_pos] == 1:
                    return False
        return True
    
    def _backtracking(self):
        if self.row == self.N:
            yield self.chessboard
            return
        for i in range(self.N):
            if self._check_position(i) is True:
                self.chessboard[self.row][i] = 1
                self.row += 1
                yield from self._backtracking()
                self.row -= 1
                self.chessboard[self.row][i] = 0
    
    def __call__(self):
        self.row = 0
        yield from self._backtracking()


for n in (i for i in range(1,6)):
    print(f'* N-Queen Problem for N={n}')
    for i,q in enumerate(NQueen(n)(),1):
        print(f'Solution {i}:')
        for i in range(n):
            print(q[i])

Output:

* N-Queen Problem for N=1
Solution 1:
[1]
* N-Queen Problem for N=2
* N-Queen Problem for N=3
* N-Queen Problem for N=4
Solution 1:
[0, 1, 0, 0]
[0, 0, 0, 1]
[1, 0, 0, 0]
[0, 0, 1, 0]
Solution 2:
[0, 0, 1, 0]
[1, 0, 0, 0]
[0, 0, 0, 1]
[0, 1, 0, 0]
* N-Queen Problem for N=5
Solution 1:
[1, 0, 0, 0, 0]
[0, 0, 1, 0, 0]
[0, 0, 0, 0, 1]
[0, 1, 0, 0, 0]
[0, 0, 0, 1, 0]
Solution 2:
[1, 0, 0, 0, 0]
[0, 0, 0, 1, 0]
[0, 1, 0, 0, 0]
[0, 0, 0, 0, 1]
[0, 0, 1, 0, 0]
Solution 3:
[0, 1, 0, 0, 0]
[0, 0, 0, 1, 0]
[1, 0, 0, 0, 0]
[0, 0, 1, 0, 0]
[0, 0, 0, 0, 1]
Solution 4:
[0, 1, 0, 0, 0]
[0, 0, 0, 0, 1]
[0, 0, 1, 0, 0]
[1, 0, 0, 0, 0]
[0, 0, 0, 1, 0]
Solution 5:
[0, 0, 1, 0, 0]
[1, 0, 0, 0, 0]
[0, 0, 0, 1, 0]
[0, 1, 0, 0, 0]
[0, 0, 0, 0, 1]
Solution 6:
[0, 0, 1, 0, 0]
[0, 0, 0, 0, 1]
[0, 1, 0, 0, 0]
[0, 0, 0, 1, 0]
[1, 0, 0, 0, 0]
Solution 7:
[0, 0, 0, 1, 0]
[1, 0, 0, 0, 0]
[0, 0, 1, 0, 0]
[0, 0, 0, 0, 1]
[0, 1, 0, 0, 0]
Solution 8:
[0, 0, 0, 1, 0]
[0, 1, 0, 0, 0]
[0, 0, 0, 0, 1]
[0, 0, 1, 0, 0]
[1, 0, 0, 0, 0]
Solution 9:
[0, 0, 0, 0, 1]
[0, 1, 0, 0, 0]
[0, 0, 0, 1, 0]
[1, 0, 0, 0, 0]
[0, 0, 1, 0, 0]
Solution 10:
[0, 0, 0, 0, 1]
[0, 0, 1, 0, 0]
[1, 0, 0, 0, 0]
[0, 0, 0, 1, 0]
[0, 1, 0, 0, 0]

Excellent! Interestingly, when $N=2$ and $N=3$, there is no solution.

Complexity Analysis

Time Complexity

Backtracking algorithm does not bring down the time complexity associated with the corresponding nested loop method as we can see in the combination example. So, time complexity is still polynomial $O(N^M)$, which is also the potential size of solution space. For different problems, the hidden coefficient could be distinct significantly. The earlier to determine the dead ends during backtracking, the more efficient the algorithm. Like the N-Queen example code, in _check_position method, only check the row numbers smaller than current row.

However, backtracking would also lead to NP complexity. If we don’t know $M$ before kicking off backtracking (we know it even though it is a variable in previous examples), its potential search space is all subsets of $N$. This is a classic NP time complexity problem, such as subset sum problem with exponential time complexity of $O(2^N)$ (if using backtracking algorithm).

Space Complexity

Backtracking’s intention is to find solutions by recursive way. There is only some spaces used on calling stack and no intermediate results are explicitly stored during execution. So, we can say the space complexity of backtracking algorithm is the lovely $O(1)$.

Summary

This post introduces the backtracking algorithm as a powerful recursive technique for exploring solution spaces, particularly when the depth of iteration is unknown. It could be conceptualized as a systematic search method, akin to Depth First Search (DFS) in graph theory, where invalid paths are discarded and alternatives are explored. The core pseudo-code illustrates its recursive nature. Practical examples, including finding combinations, permutations, and solving the N-Queen problem, demonstrate its versatility and how it elegantly handles problems where traditional nested loops fall short due to dynamic depth requirements. While backtracking algorithm doesn’t reduce time complexity, it offers a flexible, recursive approach, and its space complexity is generally $O(1)$ due to minimal explicit storage.

What is Backtracking Algorithm?#

Backtracking Algorithm Pseudo-Code#

Example: Find Out Combinations#

Example: Find Out Permutation#

Example: N-Queen Problem#

Complexity Analysis#

Time Complexity#

Space Complexity#

Summary#