
Problem Statement
Given a directed acyclic graph with n nodes labeled from 0 to n-1, determine the smallest number of initial nodes such that you can access all the nodes by traversing edges. Return these nodes.
Examples
- Example 1:
-
Input:
n = 6, edges = [[0,1],[0,2],[2,5],[3,4],[4,2]] -
Expected Output:
[0,3]
- Justification: Starting from nodes 0 and 3, you can reach all other nodes in the graph. Starting from node 0, you can reach nodes 1, 2, and 5. Starting from node 3, you can reach nodes 4 and 2 (and by extension 5).
- Example 2:
- Input:
n = 3edges = [[0,1],[2,1]]
- Expected Output:
[0,2]
- Justification: Nodes 0 and 2 are the only nodes that don't have incoming edges. Hence, you need to start from these nodes to reach node 1.
- Example 3:
- Input:
n = 5edges = [[0,1],[2,1],[3,4]]
- Expected Output:
[0,2,3]
- Justification: Node 1 can be reached from both nodes 0 and 2, but to cover all nodes, you also need to start from node 3.
Constraints:
- 2 <= n <= 10^5
- 1 <= edges.length <= min(10^5, n * (n - 1) / 2)
- edges[i].length == 2
- 0 <= from<sub>i</sub>, to<sub>i</sub> < n
- All pairs (from<sub>i</sub>, to<sub>i</sub>) are distinct.
Solution
To solve the problem of determining the minimum number of vertices needed to reach all nodes in a directed graph, we focus on the concept of "in-degree" which represents the number of incoming edges to a node. In a directed graph, if a node doesn't have any incoming edges (in-degree of 0), then it means that the node cannot be reached from any other node. Hence, such nodes are mandatory starting points to ensure that every node in the graph can be reached. Our algorithm thus identifies all nodes with an in-degree of 0 as they are potential starting points to traverse the entire graph.
Step-by-Step ALgorithm
-
Initialization:
- Create a boolean array
hasIncomingEdgeof sizeninitialized tofalse. This array tracks whether a node has any incoming edges.
- Create a boolean array
-
Mark Nodes with Incoming Edges:
- For each edge in the
edgeslist, sethasIncomingEdge[edge[1]] = trueto indicate that the destination node of the edge has an incoming edge.
- For each edge in the
-
Identify Nodes without Incoming Edges:
- Initialize an empty list
resultto store nodes without incoming edges. - Iterate through all nodes from
0ton-1:- If
hasIncomingEdge[i] == false, add nodeito theresultlist.
- If
- Initialize an empty list
-
Return the Result:
- Return the
resultlist as the smallest set of vertices from which all other nodes are reachable.
- Return the
Algorithm Walkthrough:
Input:
- Number of Nodes (
n): 6 - Edges:
[[0,1], [0,2], [2,5], [3,4], [4,2]]
Execution Steps:
-
Initialization:
- Create
hasIncomingEdgearray of size6: hasIncomingEdge = [false, false, false, false, false, false]
- Create
-
Mark Nodes with Incoming Edges:
Process each edge and mark the destination node:
- Edge
[0,1]:hasIncomingEdge[1] = true- hasIncomingEdge = [false, true, false, false, false, false]
- Edge
[0,2]:hasIncomingEdge[2] = true- hasIncomingEdge = [false, true, true, false, false, false]
- Edge
[2,5]:hasIncomingEdge[5] = true- hasIncomingEdge = [false, true, true, false, false, true]
- Edge
[3,4]:hasIncomingEdge[4] = true- hasIncomingEdge = [false, true, true, false, true, true]
- Edge
[4,2]:hasIncomingEdge[2] = true(alreadytrue)- hasIncomingEdge = [false, true, true, false, true, true]
-
Identify Nodes without Incoming Edges:
-
Iterate over all nodes (
0to5) and check forhasIncomingEdge[i] == false:- Node
0:hasIncomingEdge[0] == false, so add0toresult. - Node
1:hasIncomingEdge[1] == true, skip. - Node
2:hasIncomingEdge[2] == true, skip. - Node
3:hasIncomingEdge[3] == false, so add3toresult. - Node
4:hasIncomingEdge[4] == true, skip. - Node
5:hasIncomingEdge[5] == true, skip.
- Node
-
Result list after iteration:
result = [0, 3]
-
-
Return the Result:
- The final result is
[0, 3].
- The final result is
Code
Here is the code for this algorithm:
Complexity Analysis
Time Complexity
1. Mark Nodes with Incoming Edges
- The first for loop iterates over all edges in the graph:
- Each edge updates the
hasIncomingEdgearray in constant time O(1).
- Each edge updates the
- If the number of edges is denoted by E, this operation takes O(E) time.
2. Identify Nodes Without Incoming Edges
- The second for loop iterates over all vertices in the graph:
- Checking the
hasIncomingEdgearray for each node takes constant time O(1).
- Checking the
- If the number of vertices is denoted by V, this operation takes O(V) time.
3. Overall Time Complexity
\text{Total Time Complexity} = O(E) + O(V)
Space Complexity
1. Boolean Array (hasIncomingEdge)
- The
hasIncomingEdgearray has a size equal to the number of vertices, n. - Space Requirement: O(V).
2. Result List
- The
resultlist stores nodes without incoming edges. - In the worst case (e.g., a graph with no incoming edges), all vertices will be added to the list.
- Space Requirement: O(V).
4. Overall Space Complexity
- The total space required is: O(V)
.....
.....
.....