Calculating Maximum Flow with 1 Simple Example

Maximum Flow addresses the problem of finding the largest possible throughput between 2 nodes in a directed graph. In this post, we will first illustrate the concept of Maximum Flow, before properly defining the solution outlined in the Ford-Fulkerson – Edmonds-Karp algorithm. This algorithm will then be implemented into Python code, and tested on a small example. All code presented here can be found on my GitHub.

Table of Contents

Calculating Maximum Flow with 1 Simple Example – image by Abi Plata

What is Maximum Flow?

The problem of Maximum Flow (Max Flow) concerns the largest possible throughput between 2 nodes in a directed graph. In this scenario, each edge in the graph has a source and target node. All the edges have capacity limits, that restrict the throughput to a maximum amount. Figure 1 illustrates this setup below.

Figure 1: A simple directed graph with 4 nodes and 5 edges. Integers next to each edge represent the capacity limits. This example is based off of the one provided here.

An intuitive example would be to think of a series of pipes that carry water. These pipes connect with one another at junction points, and each pipe can have a different diameter. Here each pipe is an edge, and the junction points are nodes. The diameter of the pipe limits how much water can flow through it at any point in time, and therefore is analogous to the capacity limit for the edge.

Looking at Figure 1, we can see that nodes $A$ , $B$ , $C$ , and $D$ represent the junction points for edges $A \rightarrow B$ , $A \rightarrow C$ , $B \rightarrow C$ , $B \rightarrow D$ , and $C \rightarrow D$ , which symbolize our pipes. We can now ask ourselves, how much water can possibly flow from $A$ to $D$ ?

To answer this question, we need to find all possible paths through the graph that start at $A$ , and end at $D$ . For each path, we take note of the minimum capacity as the upper bound for what that path can carry. Here we can see that there are 3 paths: $A \rightarrow B \rightarrow D$ , $A \rightarrow C \rightarrow D$ , and $A \rightarrow B \rightarrow C \rightarrow D$ . The first 2 paths each have a minimum capacity of 2. The third path technically has a minimum capacity of 3, however because of the flow already accounted for with the previous 2 paths, the amount of flow here is limited to 1. Summing these together, we see that the Max Flow for this example is 5.

Ford-Fulkerson - Edmonds-Karp Algorithm

Let’s now outline the algorithm we will use to compute Max Flow. I will be making use of the Ford-Fulkerson method with the Edmonds-Karp algorithm.

Start with a directed graph, with each edge having a positive capacity $C_{edge}$ value associated with it. I assume that there is only 1 edge going from source to target
Prepare a residual graph: this involves adding new edges to the graph with reversed direction, and an initial capacity of 0
Use BFS to identify a path through the residual graph from source to target, where the capacity for all edges involved is > 0
Determine the minimum capacity $C_{min}$ along the discovered path
Update the residual graph in the following way:
- For all edges (source to target) in path update the capacity as follows: $C_{edge} \leftarrow C_{edge} – C_{min}$
- For all edges (target to source), update the capacity as follows: $C_{edge} \leftarrow C_{edge} + C_{min}$
Go back to step 3 above, and repeat for as along as there is a viable path from source to target

Why a Residual Graph?

You might have noticed that the algorithm outlined above is somewhat more complicated than the simple example we walked through with Figure 1. In particular, the residual graph on step 2 should stand out. To outline why we go through the trouble of computing a residual graph, let’s extend our simple example by paying particular attention to the order with which we treat the different paths from $A$ to $D$ .

Now let’s imagine a scenario where the first path we look at is $A \rightarrow B \rightarrow C \rightarrow D$ . The minimum capacity along this path is 3, and once we account for this our graph will look like the one in Figure 2.

Figure 2: Effect of following path $A \rightarrow B \rightarrow C \rightarrow D$ first through the graph. The path is highlighted in green, and remaining capacities along the affected edges are shown.

We can see, the updated capacities along edges $A \rightarrow B$ and $C \rightarrow D$ are now 0. As such, there is no longer any path that traverses from $A$ to $D$ where all the edges involved have some remaining capacity. Therefore, the algorithm would terminate with an incorrect value of 3 for the Max Flow.

The residual graph enables our algorithm to work regardless of the order of paths we analyse. To demonstrate this, let’s work through the above example again but include the residual graph.

Figure 3: Residual graph added (dotted lines for new edges)

To build the residual graph, we add a new set of edges on top of those that are already present. One new edge is added for every existing edge in the example, with its direction reversed and an initial capacity of 0. This is illustrated in Figure 3 above.

Now let’s follow the Ford-Fulkerson – Edmonds-Karp algorithm. Again we’ll start by choosing the $A \rightarrow B \rightarrow C \rightarrow D$ path, and update our graph according to step 5 of our procedure:

Figure 4: Effect of following path $A \rightarrow B \rightarrow C \rightarrow D$ first through the residual graph. The path is shown in green, and the updated reversed edges are shown in orange.

Not only have we updated the capacities along the path, but we have also updated all the introduced reversed edges also. One of these is of particular importance to us: $C \rightarrow B$ . Now we can see that this edge has an available capacity of 3! This therefore opens up the following pathway for us: $A \rightarrow C \rightarrow B \rightarrow D$ .

Figure 6: Effect of following path $A \rightarrow C \rightarrow B \rightarrow D$ second through the residual graph. The path is shown in green, and the updated reversed edges are shown in orange.

The effects of following this second path are shown in Figure 6. This adds a further flow of 2 to node $D$ . Since the first path contributed a flow of 3, this then sums to the correct value of 5.

Implement Maximum Flow in Python

We can now work through an implementation of Max Flow in Python. Let’s start by importing all the required packages:

import numpy as np
import pandas as pd
from typing import Tuple, List

I will now define a set of classes that will encapsulate much of the work we will need to do. The first, MaxFlowGraph, controls operations that directly affect the graph:

class MaxFlowGraph(object):

    def __init__(self, edges: List[Tuple]) -> None:
        self.edges = edges
        nodes = set()
        for edge in edges:
            nodes.update(list(edge)[:-1])
        self.nodes = list(nodes)
        self.nodes.sort()
        self.number_nodes = len(self.nodes)
        self.reset_residual_graph()
        
    def reset_residual_graph(self) -> None:
        self.residual_graph = self.edges.copy()
        count = len(self.residual_graph)
        for i in range(count):
            edge = self.residual_graph[i]
            self.residual_graph.append((edge[1],edge[0],0,))

    def get_adjacency_matrix(self) -> np.array:
        # work out adjacency matrix
        O = np.zeros((self.number_nodes, self.number_nodes))
        for edge in self.residual_graph:
            src, dst, capacity = edge
            if capacity > 0:
                O[self.nodes.index(dst),self.nodes.index(src)] = 1
        return O

    def compute_residual_capacity_on_edge(self, src: str, dst: str, reduction_amount: float) -> None:
        # here I assume there is only 1 edge from src to dst
        for i, edge in enumerate(self.residual_graph):
            edge_src, edge_dst, edge_capacity = edge
            if (src == edge_src) and (dst == edge_dst) and (edge_capacity > 0):
                self.residual_graph[i] = (edge_src, edge_dst, edge_capacity - reduction_amount,)
            elif (src == edge_dst) and (dst == edge_src):
                self.residual_graph[i] = (edge_src, edge_dst, edge_capacity + reduction_amount,)

__init__(self, edges): initializer for a class instance
reset_residual_graph(self): sets up a fresh residual graph (step 2 of our algorithm). This function is meant to be run before each Max Flow calculation
get_adjacency_matrix(self): calculates and return the adjacency matrix for the residual graph
compute_residual_capacity_on_edge(self, src, dot, reduction_amount): function to update the edge $src \rightarrow dst$ that lies along the path found, as well as its mirror opposite edge $dst \rightarrow src$ . This is step 5 in our algorithm

We can now define a class that will handle the graph traversal:

class BFS(object):

    def __init__(self, graph: MaxFlowGraph) -> None:
        self.graph = graph

    def _prepare_for_traversal(self, source: str) -> None:
        self.queue = [source]
        self.visited = dict(zip(self.graph.nodes, len(self.graph.nodes) * [False]))
        self.visited[source] = True

    def _process_unvisited_neighbours(self, node: str) -> None:
        node_index = self.graph.nodes.index(node)
        neighbours_indices = self.graph.get_adjacency_matrix()[:,node_index].nonzero()[0].tolist()
        neighbours = np.array(self.graph.nodes)[neighbours_indices].tolist()
        for neighbour in neighbours:
            if not self.visited[neighbour]:
                self.queue.append(neighbour)
                self.visited[neighbour] = True
    
    def traverse(self, source: str) -> None:
        # prepare for the traversal
        self._prepare_for_traversal(source)
        # process queue
        while self.queue:
            # pop a node from the queue
            node = self.queue.pop(0)
            # go to each unvisited neighbour of node
            self._process_unvisited_neighbours(node)

Note that this class is closely based off the one I implemented in my Breadth-First Search post. Please refer to that article for a detailed explanation for how BFS works.

I will now build upon BFS, to implement a class that can return paths through the graph:

class Path(object):

    def __init__(self, label: str):
        self.label = label
        self.next_level = []

class PathFinder(BFS):
    
    def _add_to_paths(self, paths: Path, node: str, neighbour: str) -> None:
        if paths.label == node:
            paths.next_level.append(Path(neighbour))
        elif (paths.label != node) and paths.next_level:
            for path in paths.next_level:
                self._add_to_paths(path, node, neighbour)    
    
    def _process_unvisited_neighbours(self, node: str) -> None:
        node_index = self.graph.nodes.index(node)
        neighbours_indices = self.graph.get_adjacency_matrix()[:,node_index].nonzero()[0].tolist()
        neighbours = np.array(self.graph.nodes)[neighbours_indices].tolist()
        for neighbour in neighbours:
            if not self.visited[neighbour]:
                self.queue.append(neighbour)
                self.visited[neighbour] = True
                self._add_to_paths(self.paths, node, neighbour)

    def _determine_shortest_path(self, paths: Path, target: str, path_string: str) -> None:
        if paths.label == target:
            self.shortest_path = path_string + target
        elif paths.next_level:
            for path in paths.next_level:
                self._determine_shortest_path(path, target, path_string + paths.label)

    def find_shortest_path(self, source: str, target: str) -> bool:
        # initialize path string
        self.shortest_path = ""
        # define a Path instance to store all possible paths from source
        self.paths = Path(source)
        # traverse graph
        self.traverse(source)
        # identify shortest path that terminates at target
        self._determine_shortest_path(self.paths, target, self.shortest_path)
        # return status
        if self.shortest_path:
            return True
        return False

PathFinder is another class that is very close to one with the same namesake in my previous post on BFS. Please refer to that article for a detailed outline for how this class works. For the present discussion it’s important to highlight:

_process_unvisited_neighbours(self, node): this function is overwrites the one with the same name in BFS
find_shortest_path(self, source, target): uses BFS to find a path from source to target where all edge capacities involved are > 0. This is step 3 of our algorithm

Now we can finally put together a function that can compute Max Flow:

# iterate over all possible paths, compute max flow
def maxflow(pf: PathFinder, src: str, dst: str) -> float:
    
    maxflow = 0
    while pf.find_shortest_path(src, dst):
        # prepare path
        path = pf.shortest_path
        path = list(path)
    
        # get minimum capacity along path
        dfEdges = pd.DataFrame(pf.graph.residual_graph, columns=['src', 'dst', 'capacity'])
        min_capacity = float('inf')
        for i in range(len(path)-1):
            row = dfEdges[(dfEdges.src == path[i]) & (dfEdges.dst == path[i+1])]
            if row.capacity.values[0] < min_capacity:
                min_capacity = row.capacity.values[0]

        # max flow
        maxflow += min_capacity
    
        # compute residual graph
        for i in range(len(path)-1):
            pf.graph.compute_residual_capacity_on_edge(path[i], path[i+1], min_capacity)
    
    return maxflow

The function maxflow takes in a PathFinder object, along with the labels of the source src and target dst nodes we wish to do the computation for.

Test the Implementation

Now we can make use of the classes already defined to test out a calculation of Max Flow. Let’s create a list of edges that represent the same graph as is illustrated in Figure 1:

edges = [("A", "B", 3),("B", "D", 2),("A", "C", 2),("B", "C", 5),("C", "D", 3)]

Note that edges is a list of tuples, with each tuple consisting of: (source, target, capacity).

We can create an instance of a MaxFlowGraph, which should automatically setup the residual graph:

# create a graph instance
graph = MaxFlowGraph(edges)

# view initial residual graph
graph.residual_graph

[('A', 'B', 3),
 ('B', 'D', 2),
 ('A', 'C', 2),
 ('B', 'C', 5),
 ('C', 'D', 3),
 ('B', 'A', 0),
 ('D', 'B', 0),
 ('C', 'A', 0),
 ('C', 'B', 0),
 ('D', 'C', 0)]

We can see that the initializer has already created a residual graph, with all the reverse edges having a capacity of 0.

Let’s check to see that our instance has correctly identified the nodes, and setup the adjacency matrix:

# view detected nodes
graph.nodes

['A', 'B', 'C', 'D']

# view initial adjacency matrix
graph.get_adjacency_matrix()

array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [1., 1., 0., 0.],
       [0., 1., 1., 0.]])

Both of these look good. Note that for the adjacency matrix, the columns represent the source nodes (in the same order as the graph.nodes list), while the rows are the target nodes. Elements within this matrix will only be set to 1 if there is capacity available at that edge.

Now for the real test: let’s create a PathFinder instance, and then attempt to compute the Max Flow from $A$ to $D$ :

# create a PathFinder instance
pf = PathFinder(graph)

# compute maxflow
maxflow(pf, 'A', 'D')

This is the correct answer! Now how have the residual graph and adjacency matrix been modified?

# view final residual graph
graph.residual_graph

[('A', 'B', 0),
 ('B', 'D', 0),
 ('A', 'C', 0),
 ('B', 'C', 4),
 ('C', 'D', 0),
 ('B', 'A', 3),
 ('D', 'B', 2),
 ('C', 'A', 2),
 ('C', 'B', 1),
 ('D', 'C', 3)]

# view final adjacency matrix
graph.get_adjacency_matrix()

array([[0., 1., 1., 0.],
       [0., 0., 1., 1.],
       [0., 1., 0., 1.],
       [0., 0., 0., 0.]])

We can see that both have been updated, as we would expect from an execution of this algorithm.

Final Remarks

In this post you have learned:

What is Maximum Flow (Max Flow)
What the Ford-Fulkerson – Edmonds-Karp algorithm is, and why it was developed
How to implement Max Flow from scratch in Python

I hope you enjoyed this article, and gained some value from it. If you would like to take a closer look at the code presented here, please take a look at my GitHub. If you have any questions or suggestions, please feel free to add a comment below. Your input is greatly appreciated.

Note I have started a New Monthly Newsletter! At the end of each month I will send out this free newsletter to each of my subscribers by email. This is the best way to stay on top of my latest content. Sign up for the newsletter here!

Hi I'm Michael Attard, a Data Scientist with a background in Astrophysics. I enjoy helping others on their journey to learn more about machine learning, and how it can be applied in industry.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.