Core Logic and Mathematical Principles
This chapter focuses on the derived tactical form of Minimum Spanning Tree (MST) theory: Kruskal Reconstruction Tree (Kruskal Tree). This is an advanced data structure that transforms the edge weight information of a graph into node attributes of a tree.
1. Dimensional Mapping from Algebraic Space to Topological Space
The traditional Kruskal algorithm, when merging two connected components, merely points one root node to another in the disjoint set. The core physical essence of the Kruskal Reconstruction Tree is: a virtual node is created during each merge, which serves as a common parent of the root nodes of the two connected components.
This tree-building paradigm brings about an exquisite algebraic isomorphism property:
- Node Count Reconstruction: The original graph has $N$ nodes. After $N-1$ merges, the reconstruction tree will contain $N$ leaf nodes (corresponding to the original graph's nodes) and $N-1$ non-leaf internal nodes (corresponding to the edges during merges), totaling $2N-1$ nodes.
- Edge Weight Attribute Nodeification: The edge weights of the original graph are assigned to each newly created virtual node. If Kruskal sorts by edge weight in ascending order (to find the MST), the reconstruction tree will naturally satisfy the max-heap property (i.e., the weight of any virtual node is greater than or equal to the weights of all virtual nodes in its subtree).
2. Spatial Transformation of the Bottleneck Path Problem
Solving the problem of "what is the minimum of the maximum edge weight among all paths from node $u$ to node $v$" in the original graph is essentially an MST bottleneck path problem. In the reconstruction tree, due to the max-heap property, the maximum edge weight on the path from leaf node $u$ to leaf node $v$ is elegantly transformed into the weight of the node at their Lowest Common Ancestor (LCA) in the reconstruction tree.
$$\text{Filter\textunderscore Edge\textunderscore Weight}(u, v) = \text{val}[\text{LCA}(u, v)]$$
This transformation flattens the complex graph path search directly into a static LCA query on the tree.
Algorithm Derivation and State Design
1. Dynamic Construction of the Reconstruction Tree
Define a counter node_idx = n to allocate numbers for newly generated virtual nodes. When Kruskal traverses an effective edge $(u, v)$ with weight $w$:
- Find the root nodes in the disjoint set: $root_u = \text{find}(u), \, root_v = \text{find}(v)$.
- If $root_u \neq root_v$, trigger a reconstruction upgrade:
$$\text{node\textunderscore idx} \leftarrow \text{node\textunderscore idx} + 1$$
$$\text{val}[\text{node\textunderscore idx}] = w$$
- Establish directed parent-child edges in the reconstruction tree: connect $ ext{node\textunderscore idx}$ to both $root_u$ and $root_v$.
- Update the base of the disjoint set: $\text{fa}[root_u] = \text{node\textunderscore idx}, \, \text{fa}[root_v] = \text{node\textunderscore idx}$.
2. Topologically Driven State Query (Reachability of Connected Components)
Classic Problem Description: Given a starting point $v$, which points can be reached by only passing through edges with weight $\ ext{\le} x$?
- State Derivation: In the reconstruction tree built in ascending order, due to the max-heap property, we can start from leaf node $v$ and exponentially jump upwards (using the
fa[v][k]matrix) until we reach the shallowest ancestor node $anc$ that satisfies $ ext{val}[anc] \le x$. - Interval Transformation: At this point, all leaf nodes in the subtree rooted at $anc$ form the reachable set of points from $v$ under the constraint. By running a DFS on the reconstruction tree to establish a DFS sequence (Euler tour), we can map "all leaf nodes in the subtree" to a continuous array interval $[L[anc], R[anc]]$. Coupled with a persistent segment tree or a Binary Indexed Tree, this can evolve into various complex interval query problems in $O(\log N)$ time.
C++ Standard Source Code (Basic Framework Template for Kruskal Reconstruction Tree)
#include <iostream>
#include <vector>
#include <algorithm>
const int MAXN = 100005;
const int MAXM = 300005;
struct Edge {
int u, v;
long long w;
bool operator<(const Edge& rhs) const { return w < rhs.w; }
};
Edge edges[MAXM];
int dsu_fa[MAXN * 2]; // The reconstruction tree nodes are doubled, and the disjoint set array must be allocated double space
long long val[MAXN * 2]; // Store the edge weight attributes of virtual nodes
std::vector<int> tr[MAXN * 2]; // Adjacency list of the reconstruction tree
// LCA doubling state
int depth[MAXN * 2];
int parent[MAXN * 2][20];
int find_set(int i) {
if (dsu_fa[i] == i) return i;
return dsu_fa[i] = find_set(dsu_fa[i]);
}
// DFS preprocess to build the doubling ancestors of the reconstruction tree
void dfs_build(int u, int p, int d) {
depth[u] = d;
parent[u][0] = p;
for (int k = 1; k < 20; ++k) {
parent[u][k] = parent[parent[u][k-1]][k-1];
}
for (size_t i = 0; i < tr[u].size(); ++i) {
int v = tr[u][i];
dfs_build(v, u, d + 1);
}
}
// Find the highest ancestor reachable by only passing through edges with weight <= max_w
int get_highest_ancestor(int v, long long max_w) {
for (int k = 19; k >= 0; --k) {
// If the ancestor exists and its associated original graph edge weight still satisfies <= max_w, jump upwards
if (parent[v][k] != 0 && val[parent[v][k]] <= max_w) {
v = parent[v][k];
}
}
return v;
}
int main() {
std::ios_base::sync_with_stdio(false);
std::cin.tie(NULL);
int n, m, q;
if (!(std::cin >> n >> m >> q)) return 0;
for (int i = 0; i < m; ++i) {
std::cin >> edges[i].u >> edges[i].v >> edges[i].w;
}
// Core driver: sort edges in ascending order of weights
std::sort(edges, edges + m);
// Initialize the disjoint set base, space must be allocated for 2 * n
for (int i = 1; i < 2 * n; ++i) dsu_fa[i] = i;
int node_idx = n; // Virtual node numbering starts from n + 1
int edge_cnt = 0;
for (int i = 0; i < m; ++i) {
int root_u = find_set(edges[i].u);
int root_v = find_set(edges[i].v);
if (root_u != root_v) {
node_idx++;
val[node_idx] = edges[i].w; // New node takes on edge weight
// Build edge in the reconstruction tree: point from new node to the old roots of both connected components
tr[node_idx].push_back(root_u);
tr[node_idx].push_back(root_v);
// Merge into the new virtual node in the disjoint set
dsu_fa[root_u] = node_idx;
dsu_fa[root_v] = node_idx;
edge_cnt++;
if (edge_cnt == n - 1) break;
}
}
// Note: If the graph itself is not connected, the reconstruction tree will be a forest.
// To completely establish the doubling state, we must run DFS for each root of connected components (i.e., points where dsu_fa[i] == i)
for (int i = node_idx; i >= 1; --i) {
if (depth[i] == 0) { // Not visited, indicating it is the root of some tree
int root = find_set(i);
dfs_build(root, 0, 1);
}
}
// Respond to reachability bottleneck queries
while (q--) {
int v;
long long x;
std::cin >> v >> x;
int ans_node = get_highest_ancestor(v, x);
// At this point, all leaf nodes in the subtree rooted at ans_node (numbered <= n) form the desired reachable point set
std::cout << "Highest node ID: " << ans_node << ", Boundary weight: " << val[ans_node] << "\n";
}
return 0;
}
NOIP Practical Pitfall Guide
- Disjoint set and related reconstruction tree arrays not allocated double space:
This is the classic point of failure in Kruskal's reconstruction tree. The original graph has only $N$ nodes, but the reconstruction tree will produce a total of $2N-1$ nodes. If contestants still allocate
dsu_fa,val,deptharrays, or the doubling matrixparentwith the traditional inertia ofMAXN, when the program executes tonode_idx > n, it will cause serious memory overflow, leading to runtime crashes (RE) or bizarre incorrect outputs. The iron rule must be: arrays involving the reconstruction tree node set must allocate double space. - Multiple roots of the forest not handled leading to doubling topology gaps:
If the original graph given in the problem is not fully connected, running Kruskal will create multiple independent trees (a reconstruction forest). If contestants blindly write a single line
dfs_build(node_idx, 0, 1);in themainfunction, then the depth of other unconnected isolated blocks' virtual nodes and leaf nodes will remain 0, and the doubling ancestors cannot be established. Subsequent queries for points within these isolated blocks will lead to infinite loops or complete misinterpretation.
Classic NOIP/Luogu Problems
1. Luogu P1967 [NOIP2013 Advanced Group] Truck Transport
- Problem Description: Given a graph consisting of $N$ nodes and $M$ undirected edges, with weight limits on the edges. There are $Q$ queries, each providing a starting point $x$ and an endpoint $y$, asking for the maximum weight of goods that can be transported by a truck from $x$ to $y$ (i.e., the maximum of the minimum edge weights on all paths).
- Essence and Core Idea of the Problem:
This is the standard opening question for the Kruskal reconstruction tree.
The requirement for "the maximum of the minimum values" belongs to the bottleneck path problem of the maximum spanning tree. The core tactic: sort the edge weights in descending order and construct the Kruskal maximum reconstruction tree. Since it is a maximum reconstruction tree, its virtual nodes naturally satisfy the min-heap property (the higher up, the smaller the weight). After the tree is built, for each query $(x, y)$, if the two points are ultimately not connected in the disjoint set, output -1 directly. If they are connected, the maximum weight that the truck can transport is exactly equal to the weight of the node at their Lowest Common Ancestor (LCA) in the reconstruction tree,
val[LCA(x, y)]. Utilizing the doubling LCA can perfectly handle this in $O(\log N)$ time.
2. Luogu P4768 [NOI2018] Return Trip
- Problem Description: A graph has $N$ nodes and $M$ edges. Each edge has two attributes: length $l$ and altitude $a$. Every day it rains and there is a water level $p$. If an edge has an altitude $a > p$, the truck can pass through it smoothly (without consuming stamina); if $a \le p$, the edge gets flooded, and the truck cannot pass through, requiring the driver to walk. Walking through an edge consumes stamina equal to the length $l$. The truck driver wants to know the minimum stamina required to return to point 1 starting from point $v$.
- Essence and Core Idea of the Problem: This is a classic application of the Kruskal reconstruction tree + single-source shortest path. Core Breakdown Tactic: The condition for the vehicle to pass for free is to only travel on edges where altitude $a > p$. This essentially seeks to find the reachable set of points that can be reached from starting point $v$ by only using edges with altitude $> p$.
- Base Layer: First, ignoring the rain restrictions, run Dijkstra's algorithm once on the original graph with point 1 as the source, to find the physical shortest walking distance $mindist[i]$ from each point to point 1.
- Space Reconstruction: Sort all edges by altitude from high to low (descending) and establish the Kruskal maximum reconstruction tree based on altitude attributes. The virtual nodes of the reconstruction tree record the altitude.
- Tree Query: When given a specific water level $p$ and starting point $v$, quickly jump upwards using doubling to find the shallowest virtual ancestor node $anc$ that satisfies
val[anc] > p. At this point, all leaf nodes in the subtree rooted at $anc$ are the cities that can be reached by truck for free. - Extreme Value Killing: To minimize the walking stamina consumption, we only need to find the minimum value of $mindist[i]$ among the leaf nodes that can be reached by truck. We can preprocess once on the tree using DFS to record
min_d[u], the minimum $mindist$ of all leaf nodes in the subtree rooted atu. During queries, simply outputmin_d[anc], and the single query time complexity is only $O(\log N)$.