Core Logic and Mathematical Principles
This chapter focuses on a derived tactical form of the Minimum Spanning Tree (MST) theory: Kruskal Reconstruction Tree (Kruskal Tree). This is an advanced data structure that transforms edge weight information of a graph into node attributes of a tree.
1. Dimensional Mapping from Algebraic Space to Topological Space
The traditional Kruskal algorithm merges two connected components by simply pointing one root node to another in the disjoint set. The core physical essence of the Kruskal Reconstruction Tree is: A virtual node is created during each merge, serving as a common parent for the root nodes of the two connected components.
This tree construction paradigm introduces a remarkably sophisticated algebraic isomorphic property:
- Node Count Reconstruction: The original graph has $N$ nodes, and after $N-1$ merges, the reconstruction tree will contain $N$ leaf nodes (corresponding to the nodes of the original graph) and $N-1$ non-leaf internal nodes (corresponding to the edges during merges), totaling $2N-1$ nodes.
- Edge Weight Attribute Nodeification: The edge weights of the original graph are assigned to each newly created virtual node. If Kruskal sorts by edge weight in ascending order (to find the MST), the reconstruction tree will naturally satisfy the max-heap property (i.e., the weight of any virtual node is greater than or equal to the weights of all virtual nodes in its subtree).
2. Spatial Transformation of the Bottleneck Path Problem
To solve the problem of "What is the minimum value of the maximum edge weight among all paths from point $u$ to point $v$" on the original graph, is essentially an MST bottleneck path problem. On the reconstruction tree, due to the max-heap property, the maximum edge weight on the path from leaf node $u$ to leaf node $v$ is elegantly transformed into the weight of the node corresponding to their Lowest Common Ancestor (LCA) in the reconstruction tree.
$$\text{Filter\_Edge\_Weight}(u, v) = \text{val}[\text{LCA}(u, v)]$$
This transformation reduces the complex graph path search directly into a static LCA query on the tree.
Algorithm Derivation and State Design
1. Dynamic Construction of the Reconstruction Tree
Define a counter node_idx = n to allocate identifiers for newly generated virtual nodes. When Kruskal encounters a valid edge $(u, v)$ with weight $w$:
- Find the root nodes of the disjoint set: $root_u = \text{find}(u), \, root_v = \text{find}(v)$.
- If $root_u \neq root_v$, trigger a reconstruction upgrade:
$$\text{node\_idx} \leftarrow \text{node\_idx} + 1$$
$$\text{val}[\text{node\_idx}] = w$$
- Establish directed parent-child edges in the reconstruction tree: connect $\text{node\_idx}$ to both $root_u$ and $root_v$.
- Update the disjoint set base: $\text{fa}[root_u] = \text{node\_idx}, \, \text{fa}[root_v] = \text{node\_idx}$.
2. Topology-Driven State Queries (Connectivity Reachability)
Classic Problem Description: Determine which nodes can be reached from point $v$ using only edges with weights $\le x$.
- State Derivation: In the reconstruction tree built in ascending order, due to the max-heap property, we can start from leaf node $v$ and perform upward binary jumps (using the
fa[v][k]matrix) until we reach the shallowest ancestor node $anc$ that satisfies $\text{val}[anc] \le x$. - Interval Transformation: At this point, all leaf nodes in the subtree rooted at $anc$ represent the set of reachable points from $v$ that satisfy the constraints. By running a DFS on the reconstruction tree to establish a DFS order (Euler tour), we can map the "all leaf nodes in the subtree" to a continuous array interval $[L[anc], R[anc]]$. Coupled with a persistent segment tree or a Fenwick tree, this can evolve into various complex interval retrieval problems within $O(\log N)$.
C++ Standard Source Code (Kruskal Reconstruction Tree Basic Framework Template)
#include <iostream>
#include <vector>
#include <algorithm>
const int MAXN = 100005;
const int MAXM = 300005;
struct Edge {
int u, v;
long long w;
bool operator<(const Edge& rhs) const { return w < rhs.w; }
};
Edge edges[MAXM];
int dsu_fa[MAXN * 2]; // The reconstruction tree node doubles, and the disjoint set array must allocate double space
long long val[MAXN * 2]; // Store the edge weight attributes of virtual nodes
std::vector<int> tr[MAXN * 2]; // Adjacency list of the reconstruction tree
// LCA doubling state
int depth[MAXN * 2];
int parent[MAXN * 2][20];
int find_set(int i) {
if (dsu_fa[i] == i) return i;
return dsu_fa[i] = find_set(dsu_fa[i]);
}
// DFS preprocess to build doubling ancestors of the reconstruction tree
void dfs_build(int u, int p, int d) {
depth[u] = d;
parent[u][0] = p;
for (int k = 1; k < 20; ++k) {
parent[u][k] = parent[parent[u][k-1]][k-1];
}
for (size_t i = 0; i < tr[u].size(); ++i) {
int v = tr[u][i];
dfs_build(v, u, d + 1);
}
}
// Find the highest ancestor reachable only through edges with weight <= max_w
int get_highest_ancestor(int v, long long max_w) {
for (int k = 19; k >= 0; --k) {
// If the ancestor exists and its associated edge weight still satisfies <= max_w, jump up
if (parent[v][k] != 0 && val[parent[v][k]] <= max_w) {
v = parent[v][k];
}
}
return v;
}
int main() {
std::ios_base::sync_with_stdio(false);
std::cin.tie(NULL);
int n, m, q;
if (!(std::cin >> n >> m >> q)) return 0;
for (int i = 0; i < m; ++i) {
std::cin >> edges[i].u >> edges[i].v >> edges[i].w;
}
// Core driver: sort edges in ascending order by weight
std::sort(edges, edges + m);
// Initialize the disjoint set base, space must be fully allocated as 2 * n
for (int i = 1; i < 2 * n; ++i) dsu_fa[i] = i;
int node_idx = n; // Virtual node numbering starts from n + 1
int edge_cnt = 0;
for (int i = 0; i < m; ++i) {
int root_u = find_set(edges[i].u);
int root_v = find_set(edges[i].v);
if (root_u != root_v) {
node_idx++;
val[node_idx] = edges[i].w; // New node takes over edge weight
// Build edges in the reconstruction tree: from the new node to the old roots of the two connected components
tr[node_idx].push_back(root_u);
tr[node_idx].push_back(root_v);
// Merge the disjoint set into the new virtual node
dsu_fa[root_u] = node_idx;
dsu_fa[root_v] = node_idx;
edge_cnt++;
if (edge_cnt == n - 1) break;
}
}
// Note: If the graph itself is not connected, the reconstruction tree will be a forest.
// To fully establish the doubling state, we must run DFS for each root of the connected components (i.e., points where dsu_fa[i] == i)
for (int i = node_idx; i >= 1; --i) {
if (depth[i] == 0) { // Not visited, indicating it's the root of a tree
int root = find_set(i);
dfs_build(root, 0, 1);
}
}
// Respond to reachability bottleneck queries
while (q--) {
int v;
long long x;
std::cin >> v >> x;
int ans_node = get_highest_ancestor(v, x);
// At this point, all leaf nodes (with IDs <= n) in the subtree rooted at ans_node are the desired reachable point set
std::cout << "Highest node ID: " << ans_node << ", Boundary weight: " << val[ans_node] << "\n";
}
return 0;
}
NOIP Practical Pitfall Guide
- Disjoint Set and Related Reconstruction Tree Array Space Not Allocated Double:
This is the most classic failure point for the Kruskal Reconstruction Tree. The original graph only has $N$ nodes, but the reconstruction tree will produce a total of $2N-1$ nodes. If contestants allocate
dsu_fa,val,deptharrays or the doubling matrixparentaccording to traditional inertia asMAXN, when the program executes tonode_idx > n, it will cause serious memory overflow leading to runtime crashes (RE) or bizarre wrong outputs. The iron rule must be: arrays related to the reconstruction tree node set must allocate double space. - Multiple Roots of Forest Not Handled Leading to Doubling Topology Gaps:
If the original graph given in the problem is not fully connected, Kruskal will form multiple independent trees (a reconstruction forest) after running. If contestants blindly write a single line
dfs_build(node_idx, 0, 1);in themainfunction, the depths of other unconnected isolated components' virtual nodes and leaf nodes will all remain 0, and the doubling ancestors cannot be established. Subsequent queries on these isolated components will lead to infinite loops or complete wrong outputs.
Classic NOIP/Luogu True Questions
1. Luogu P1967 [NOIP2013 Advanced Group] Truck Transportation
- Problem Description: Given a graph consisting of $N$ nodes and $M$ undirected edges, with weight limits on the edges. There are $Q$ queries, each providing a starting point $x$ and an endpoint $y$, asking for the maximum weight of goods that can be transported by the truck from $x$ to $y$ (i.e., the maximum value of the minimum edge weight along all paths).
- Essence and Core Idea of the Problem:
The most standard introductory question for the Kruskal Reconstruction Tree.
The requirement for the "maximum of the minimum values" belongs to the maximum spanning tree bottleneck path problem. The core tactic: sort the edge weights in descending order and construct the Kruskal Maximum Reconstruction Tree. Since it is a maximum reconstruction tree, its virtual nodes naturally satisfy the min-heap property (the higher up, the smaller the weight). After the tree is built, for each query $(x, y)$, if the two points are not connected in the disjoint set, directly output -1. If they are connected, the maximum weight the truck can transport is strictly equal to the weight of the node at their reconstruction tree's Lowest Common Ancestor (LCA),
val[LCA(x, y)]. Using the doubling LCA can perfectly solve this in $O(\log N)$ time.
2. Luogu P4768 [NOI2018] Return Trip
- Problem Description: A graph has $N$ nodes and $M$ edges. Each edge has two attributes: length $l$ and elevation $a$. Each day of rain has a water level $p$. If an edge's elevation $a > p$, the truck can pass through it smoothly (without consuming stamina); if $a \le p$, the edge gets flooded, and the truck cannot pass through and must walk. Walking through an edge consumes stamina equal to its length $l$. The truck driver wants to know the minimum stamina required to return to point 1 starting from point $v$.
- Essence and Core Idea of the Problem: This is a classic application of Kruskal Reconstruction Tree + Single Source Shortest Path. Core Breakdown Tactics: The necessary condition for the vehicle to pass freely is to only walk on edges where the elevation $a > p$. This essentially requires finding the set of reachable points from point $v$ that can be reached by only using edges with elevation $> p$.
- Base Layer Foundation: First, ignoring the rain restrictions, run Dijkstra's algorithm on the original graph with point 1 as the source point to find the physical shortest walking distance to point 1 for each point, denoted as $mindist[i]$.
- Space Reconstruction: Sort all edges by elevation from high to low (in descending order) and build the Kruskal Maximum Reconstruction Tree based on elevation attributes. The virtual nodes of the reconstruction tree record the elevation.
- Tree Retrieval: When given a specific water level $p$ and starting point $v$, use the doubling method to quickly jump upwards to find the shallowest ancestor node $anc$ whose elevation
val[anc] > p. At this point, all leaf nodes in the subtree rooted at $anc$ are the cities that can be reached by truck without charge. - Extreme Value Killing: To minimize the stamina consumed while walking, we only need to find the minimum value of $mindist[i]$ among the reachable leaf nodes. We can preprocess the tree with a single DFS to record the minimum $mindist$ value for all leaf nodes in the subtree rooted at
uusingmin_d[u]. During queries, simply outputmin_d[anc], with a single query time complexity of only $O(\log N)$.