An early look at the Dart programming language

To content | To menu | To search

Algorithms in Dart: Kosaraju's Algorithm

Let's apply our DirectedGraph class to one of the most well known of graph algorithms - Kosaraju's algorithm for finding strongly connected components of a directed graph. The algorithm performs two passes of a depth first search (DFS) - the first on a directed graph will all arcs reversed and the second on the original graph. The second DFS discovers the strongly connected components by traversing the nodes in the original graph in the reverse order of their finishing times returned by the first DFS on the transformed graph.

Introduction


Now that we have a efficient way to create an adjacency set representation of a graph using our DirectedGraph class that we created in the previous article [1], let's apply it to one of the most well known of graph algorithms - Kosaraju's algorithm for finding strongly connected components of a directed graph. The algorithm performs two passes of a depth first search (DFS) - the first on a directed graph will all arcs reversed and the second on the original graph. We can summarize the key elements of the algorithm in three concise steps:

  1. Compute a transform of the input graph g: g -> grev
  2. Perform the first DFS on grev: Returns order of visit nodes in grev
  3. Perform a second DFS on g in the reverse visit order returned by first DFS

The second DFS will discover the strongly connected components by traversing the nodes in the original graph in the reverse order of their finishing times returned by the first DFS on the transformed graph. Note that we will also introduce another class from Dart's collection API, the ListQueue() [2], which will store our finish times from the first DFS.

The Algorithm


We start off by defining a new private class and adding it to our graphlab library, which also handles importing the required dart:collections library. The graphlab library is the same library that contains our DirectedGraph class:

part of graphlab;

class _Kosaraju {
}

Let's then create a method for transforming the input graph. We could, of course, simply perform the first DFS by traversing all the arcs in the original graph in the reverse direction, but for illustration purposes, we will go ahead and build a new graph to correlate with the three steps that we outlined above. Step 1 can then be implemented as follows:

/// Given a directed graph, return the reverse of that graph.
DirectedGraph transformGraph(DirectedGraph g) {
  DirectedGraph grev = new DirectedGraph();
  // Copy the nodes to grev graph.
  for (var node in g) {
    grev.addNode(node);
  }
  // Then copy the edges while reversing them.
  for (var start in g) {
    for (var end in g.edgesFrom(start)) {
      grev.addEdge(end, start);
    }
  }
  return grev;
}

Both the graph g and the returned graph grev are of type DirectedGraph. Now that we have our transformed graph, we can perform the first DFS (step 2 in the algorithm). Let's take a look at the pseudocode first. We'll actually define two methods - the first controls the outer loop of the DFS which will iterate through each node of the graph and the second updates the finishing times based on the results of exploring all nodes reachable from the start vertex s [3]:

# Pseudocode for first DFS using grev
InitFirstDFS(graph grev)
  mark all nodes unexplored
  current label -> n
  for each vertex v in grev
    FirstDFS(grev, v)

FirstDFS(graph grev, start vertex s)
  for every edge (s, v)
    if v not yet explored
      mark v explored
      FirstDFS(grev, v)
  set finish time (s) -> current label
  current label -> current label - 1

In Dart, we keep track of the finish time of the vertex s by using a ListQueue() from the dart:collections library. A ListQueue guarantees constant time peek and remove operations, and amortized constant time add operations, leading to the O(m + n) upper bound on the DFS running time, where m is the number of edges and n is the number of nodes reachable from the source vertex s. We can use the now familiar HashSet() to keep track of which nodes have been visited as we loop through the DFS recursively. Therefore, implementing the above pseudocode in Dart might look like the following [4]:

/// Given a graph, returns a queue containing the nodes of that graph in
/// the order in which a DFS of that graph finishes expanding the nodes.
ListQueue initFirstDFS(DirectedGraph g) {
  // The resulting ordering of the nodes.
  ListQueue visitOrder = new ListQueue();
  // The set of nodes that we've visited so far.
  HashSet visited = new HashSet();
  // Perform a DFS for each node in g and whose origin is given by node.
  for (var node in g) {
    firstDFS(node, g, visitOrder, visited);
  }
  return visitOrder;
}

/// Recursively explores the given node with a DFS, adding it to the end of the
/// queue once the exploration is complete.
void firstDFS(var node, DirectedGraph g, Queue visitOrder, HashSet visited) {
  // If we've already been at this node, don't explore it again.
  if (visited.contains(node)) {
    return;
  } else {
    // Otherwise, mark that we've been here.
    visited.add(node);
  }
  // Recursively explore all the node's children.
  for (var endpoint in g.edgesFrom(node)) {
    firstDFS(endpoint, g, visitOrder, visited);
  }
  // We're done exploring this node, so add it to the ordered queue of
  // visited nodes.
  visitOrder.addLast(node);
}

Of course, implementing the DFS recursively can lead to stack overflow problems even for relatively small graph sizes. We will need to unwind the recursive implementation above and instead turn to an iterative approach using an explicit stack, thus removing the recursion depth issue. We also replace the HashSet that is tracking our visited nodes with a List of type boolean. This allows us to monitor which nodes have been visited by merely checking a boolean value rather than probing a set of nodes using the contains method. First, we create a simple data structure to track the variables that are pushed onto the programming stack during the recursive algorithm [5]:

/// StackBuffer is a simple data structure that keeps track of stack
/// variables when performing an iterative DFS using an explicit stack.
class StackBuffer {
  final index;
  final node;
  final next;

  StackBuffer(this.node, this.index, this.next);
}

Then we can implement the iterative DFS for the first part of Kosaraju's algorithm as follows:

/// Given a graph, returns a queue containing the nodes of that graph in
/// the order in which a DFS of that graph finishes expanding the nodes.
ListQueue initFirstDFS(DirectedGraph g) {
  // The resulting ordering of the nodes.
  ListQueue visitOrder = new ListQueue();
  // The set of nodes that we've visited so far.
  List visited = new List.filled(g.length + 1, false);
  // Perform a DFS for each node in g and whose origin is given by node.
  for (var node in g) {
    if (!visited[node]) {
      firstDFS(node, g, visitOrder, visited);
    }
  }
  return visitOrder;
}

/// Iteratively explores the given node with a DFS, adding it to the output
/// list once the exploration is complete.
void firstDFS(var node, DirectedGraph g, ListQueue visitOrder, List visited) {
  // Define an explicit stack of type StackBuffer and
  // set its initial value to null.
  StackBuffer stack = null;
  var i = 0;
  bool flag = visited[node] ? false : true;
  // Explicit stack while loop.
  while (true) {
    if (i == 0) {
      visited[node] = flag;
    }
    List children = g.edgesFrom(node).toList();
    var n = children.length;
    // Main iteration loop, loops through children.
    for (; i < n; i++) {
      var child = children[i];
      if (visited[child] != flag) {
        stack = new StackBuffer(node, i + 1, stack);
        node = child;
        i = 0;
        break;
      }
    }
    // If we iterate until no more children to explore,
    // add this node to the visit order.
    if (i == n) {
      visitOrder.addLast(node);
      if (stack == null) {
        return;
      }
      node = stack.node;
      i = stack.index;
      stack = stack.next;
    }
  }
}

Now that we have an ordered list of the nodes according to their finish times in a transformed graph, step 3 of Kosaraju's algorithm says that we can discover the strongly connected components of this graph by exploring the original graph in the reverse order of their finish times using a modified DFS. The strongly connected components of the transformed graph is the same as the strongly connected components of the original graph. The pseudocode for the modified DFS would look like the following:

# Pseudocode for second DFS using g
SecondDFS(graph g, start vertex s)
  if s not yet explored
    label node
    for each endpoint in g reachable from s
      SecondDFS(g, endpoint)

Implementing the above pseudocode in Dart:

/// Recursively marks all nodes reachable from the given node by a DFS with
/// the current label.
void secondDFS(var node, DirectedGraph g, HashMap result, int label) {
  // If we've visited this node before, just return.
  if (result.containsKey(node)) return;
  // Otherwise label the node with the current label.
  result[node] = label;
  // Explore all nodes reachable from this node.
  for (var endpoint in g.edgesFrom(node)) {
    secondDFS(endpoint, g, result, label);
  }
}

Here again, we will need to unwind the recursive algorithm to prevent reaching the recursion depth limit for larger graphs. The iterative implementation of Kosaraju's algorithm would then look like the following:

/// Iteratively mark all nodes reachable from the given node
/// by a DFS with the current label.
void secondDFS(var node, DirectedGraph g, HashMap result, int label,
                 List resultHasKey) {
  ListQueue stack = new ListQueue();
  stack.add(node);
  while (!stack.isEmpty) {
    var child = stack.removeLast();
    if (!resultHasKey[child]) {
      resultHasKey[child] = true;
      result[child] = label;
      for (var endpoint in g.edgesFrom(child)) {
        if (!resultHasKey[endpoint]) {
          stack.add(endpoint);
        }
      }
    }
  }
}

Now that we have developed the methods for each step in the algorithm, let's complete the class that we defined earlier. We add a final method, computeSCC(), that returns an object of type HashMap. It also accepts several optional named parameters. The first is the DirectedGraph grev, which can be passed if it is more efficient to build the reverse graph at the same time the original graph is constructed. The second optional named parameter is the boolean negIndex, which when set to true, allows node values less than zero. We will see the need for this in the next article when we look at boolean satisfiability problems. The HashMap result stores each node and which strongly connected component it belongs to..

part of graphlab;

/**
 * Returns a map of the strongly connected components of a directed graph.
 * Implements an iterative version of Kosaraju's algorithm.
 */

class _Kosaraju {
  static var offset;
  static var scale;

  HashMap computeSCC(DirectedGraph g, {DirectedGraph grev, bool negIndex: false}) {
    /// Create the result map and a counter to keep track of which
    /// depth first search iteration this is.
    HashMap result = new HashMap();
    int iteration = 0;
    if (!negIndex) {
      offset = 0;
      scale = 1;
    } else {
      offset = g.length;
      scale = 2;
    }

    /// Run a depth-first search in the reverse graph to get the order in
    /// which the nodes should be processed and save the results in a queue.
    if (grev == null) {
      grev = transformGraph(g);
    }
    ListQueue visitOrder = initFirstDFS(grev);
    /// Continuously process the the nodes from the queue by running a
    /// depth first search from each unmarked node encountered.
    List resultHasKey = new List.filled(scale * g.length + 1, false);
    while (!visitOrder.isEmpty) {
      // If the last node has already been visited, ignore it and
      // continue on with the next node in the queue until empty.
      var startPoint = visitOrder.removeLast();
      if (resultHasKey[startPoint + offset]) {
        continue;
      }
      // Otherwise, run a depth first search from the node contained in
      // startPoint, updating the result map with everything visited as being
      // at the current iteration level.
      secondDFS(startPoint, g, result, iteration, resultHasKey);
      // Increase the number of the next SCC to label.
      iteration++;
    }
    return result;
  }

  // Step 1:
  /// Given a directed graph, return the reverse of that graph.
  DirectedGraph transformGraph(DirectedGraph g) {
    DirectedGraph grev = new DirectedGraph();
    // Copy the nodes to grev graph.
    for (var node in g) {
      grev.addNode(node);
    }
    // Then copy the edges while reversing them.
    for (var start in g) {
      for (var end in g.edgesFrom(start)) {
        grev.addEdge(end, start);
      }
    }
    return grev;
  }

  // Step 2:
  /// Given a graph, returns a queue containing the nodes of that graph in
  /// the order in which a DFS of that graph finishes expanding the nodes.
  ListQueue initFirstDFS(DirectedGraph g) {
    // The resulting ordering of the nodes.
    ListQueue visitOrder = new ListQueue();
    // The set of nodes that we've visited so far.
    List visited = new List.filled(scale * g.length + 1, false);
    // Perform a DFS for each node in g and whose origin is given by node.
    for (var node in g) {
      if (!visited[node + offset]) {
        firstDFS(node, g, visitOrder, visited);
      }
    }
    return visitOrder;
  }

  /// Iteratively explores the given node with a DFS, adding it to the output
  /// list once the exploration is complete.
  void firstDFS(var node, DirectedGraph g, ListQueue visitOrder, List visited) {
    // Define an explicit stack of type StackBuffer and
    // set its initial value to null.
    StackBuffer stack = null;
    var i = 0;
    bool flag = visited[node + offset] ? false : true;
    // Explicit stack while loop.
    while (true) {
      if (i == 0) {
        visited[node + offset] = flag;
      }
      var children = g.edgesFrom(node).toList();
      var n = children.length;
      // Main iteration loop, loops through children.
      for (; i < n; i++) {
        var child = children[i];
        if (visited[child + offset] != flag) {
          stack = new StackBuffer(node, i + 1, stack);
          node = child;
          i = 0;
          break;
        }
      }
      // If we iterate until no more children to explore,
      // add this node to the visit order.
      if (i == n) {
        visitOrder.addLast(node);
        if (stack == null) {
          return;
        }
        node = stack.node;
        i = stack.index;
        stack = stack.next;
      }
    }
  }

  // Step 3:
  /// Iteratively mark all nodes reachable from the given node
  /// by a DFS with the current label.
  void secondDFS(var node, DirectedGraph g, HashMap result, int label,
                 List resultHasKey) {
    ListQueue stack = new ListQueue();
    stack.add(node);
    while (!stack.isEmpty) {
      var child = stack.removeLast();
      if (!resultHasKey[child + offset]) {
        resultHasKey[child + offset] = true;
        result[child] = label;
        for (var endpoint in g.edgesFrom(child)) {
          if (!resultHasKey[endpoint + offset]) {
            stack.add(endpoint);
          }
        }
      }
    }
  }
}

/// StackBuffer is a simple data structure that keeps track of stack
/// variables when performing an iterative DFS using an explicit stack.
class StackBuffer {
  final index;
  final node;
  final next;

  StackBuffer(this.node, this.index, this.next);
}

The code for the graphlab library, including the iterative version of Kosaraju's algorithm, can be found on Github [6].

Conclusion


In our next article, we wrap a couple of Future based top level functions around our Kosaraju class and use these functions to solve some typical problems involving strongly connected components, including its use in providing a linear time solution to a special case of boolean satisfiability known as the 2-SAT problem [7] [8].


Works Cited

[1] scribeGriff Studios - Collections in Dart: The Directed Graph
[2] Dart API Reference - ListQueue<E>
[3] Algorithms: Design and Analysis, Part I - Stanford Univ at coursera.org
[4] Keith Schwarz (Stanford U) - Kosaraju.java
[5] DFS with Explicit Stack
[6] The graphlab library on Github
[7] Wikipedia: Strongly Connected Components and the 2-SAT Problem
[8] "A linear-time algorithm for testing the truth of certain quantified boolean formulas", Aspvall et all (1979), Information Processing Letters 8: pgs 121–123 (pdf)