How are graphs presented in problems?

In linked list problems, the head of the linked list is provided. In binary tree problems, the root of the tree is given. In graph problems, however, only information about the graph is provided. This information can come in various common formats, and we'll explore a few of them.

It's important to understand that in linked lists and binary trees, you are given actual objects in memory containing data and pointers. With graphs, the graph doesn't physically exist in memory.

Essentially, only the "concept" of the graph exists. The input will provide some information about it, and it’s your task to figure out how to represent and traverse the graph using code.

Often, the nodes of a graph are labeled from 0 to n - 1. The problem statement might not explicitly say that the input is a graph. Sometimes, there might be a narrative, and you need to deduce that the input represents a graph. For example, "there are n cities labeled from 0 to n - 1". You can consider each city as a node, with each city having a unique label.

With binary trees, traversal is straightforward because each node only references node.left and node.right. This lets us focus solely on traversal methods like DFS or BFS. In graphs, however, a node can have many neighbors. Before starting our traversal, we often need to ensure that we can quickly access all the neighbors of any given node.

What does this mean exactly? It’s easiest to understand with an example.

First input format: array of edges

In this format, the input is a 2D array where each element is in the form [x, y], indicating an edge between nodes x and y. The problem might frame these edges within a context, such as "[x, y] means there`'s a highway connecting city x and city y".

Edges can be directed or undirected, as specified in the problem description.

So, why can't we start traversal right away? Imagine we want to begin a DFS from node 0. How do we find its neighbors? We'd have to iterate through the entire input to find all edges involving node 0. Moving to a neighbor node would require another full iteration to find its neighbors, and so on.

This process—iterating through the entire input to find neighbors at each node—is inefficient and slow.

To streamline traversal, we can preprocess the input to quickly access all neighbors of any node. Ideally, we want a data structure that, given a node, returns a list of its neighbors. A hash map is an excellent tool for this.

Consider a hash map graph that maps integers to lists of integers. We can iterate over the input array and, for each [x, y] pair, add y to the list associated with graph[x]. If the edges are undirected, we also add x to the list associated with graph[y]. Once this hash map is built, calling graph[0] will immediately provide all neighbors of node 0.

A useful analogy: imagine you're on Facebook and want to see a list of all your friends. If Facebook stored its graph as an array of edges, you'd need to scan through every single connection worldwide to find your friends, which would be extremely slow given the vast number of connections. Instead, by pre-building the graph, you can simply click the friends tab on your profile to instantly see your friends.

1
2unordered_map<int, vector<int>> buildGraph(vector<vector<int>>& edges) {
3    unordered_map<int, vector<int>> graph;
4    for (vector<int>& edge : edges) {
5        int x = edge[0], y = edge[1];
6        graph[x].push_back(y);
7        // graph[y].push_back(x); // uncomment the above line if the graph is undirected
8    }
9    return graph;
10}
11

Second input format: adjacency list

In an adjacency list, the nodes will be numbered from 0 to n - 1. The input will be a 2D integer array , let's call it graph. graph[i] will be a list of all the outgoing edges from the 𝑖^th node.

The graph shown in the image can be represented using an adjacency list like this: graph = [[1], [2], [0, 3], []].

With this format, we can directly access all the neighbors of any node without needing any pre-processing. This makes the adjacency list a very convenient format. For example, to find all the neighbors of node 6, we simply look at graph[6].

Third input format: adjacency matrix

The next format is an adjacency matrix. Once again, the nodes will be numbered from 0 to n - 1 . You will be given a 2D matrix of size n x n, let's call it graph. If graph[i][j] == 1, that means there is an outgoing edge from node i to node j.

When given this format, you have two options. During traversal, for any node, you can iterate over graph[node], and if graph[node][i] == 1, you know that node i is a neighbor. Alternatively, you can pre-process the graph similar to how we did with an array of edges. You can build a hash map and iterate over the entire graph. If graph[i][j] == 1, then add j to the list associated with graph[i]. This way, during traversal, you won't need to iterate n times at each node to find its neighbors. This method is especially useful when nodes have only a few neighbors, but n is large.

Both approaches have a time complexity of 𝑂(𝑛²).

Last input format: matrix

The final format we'll discuss is more subtle but quite common. Here, the input is a 2D matrix, and the problem describes a scenario where each square represents something, and these squares are interconnected in some way. For example, "Each square of the matrix is a village. Villages trade with their neighboring villages, which are directly above, left, right, or below them.".

In this scenario, each square at (row, col) in the matrix is a node, and its neighbors are at row - 1, col), (row, col - 1), (row + 1, col), and (row, col + 1), provided they are within bounds.

Unlike other formats, the nodes in these graphs aren't labeled from 0 to n . Instead, each element in the matrix is a node, and the edges are defined by the problem description, not the input. In the given example, the problem description specifies that villages trade with directly adjacent villages. Thus, the edges are between squares within one step. You'll need to carefully analyze the problem to understand these types of graphs..

Theory Editor

Let's talk about

How are graphs presented in problems?

First input format: array of edges

Second input format: adjacency list

Third input format: adjacency matrix

Last input format: matrix