NeFut Logo NeFut
Admin Login

Kruskal Reconstruction Tree: An Advanced Data Structure for Minimum Spanning Trees and Its Applications

Published at: 2026-05-29 01:15 Last updated: 2026-06-06 13:04
#algorithm #Data Structure #Graph

Core Logic and Mathematical Principles

This chapter focuses on a derived tactical form of the Minimum Spanning Tree (MST) theory: Kruskal Reconstruction Tree (Kruskal Tree). This is an advanced data structure that transforms edge weight information of a graph into node attributes of a tree.

1. Dimensional Mapping from Algebraic Space to Topological Space

The traditional Kruskal algorithm merges two connected components by simply pointing one root node to another in the disjoint set. The core physical essence of the Kruskal Reconstruction Tree is: A virtual node is created during each merge, serving as a common parent for the root nodes of the two connected components.

This tree construction paradigm introduces a remarkably sophisticated algebraic isomorphic property:

2. Spatial Transformation of the Bottleneck Path Problem

To solve the problem of "What is the minimum value of the maximum edge weight among all paths from point $u$ to point $v$" on the original graph, is essentially an MST bottleneck path problem. On the reconstruction tree, due to the max-heap property, the maximum edge weight on the path from leaf node $u$ to leaf node $v$ is elegantly transformed into the weight of the node corresponding to their Lowest Common Ancestor (LCA) in the reconstruction tree.

$$\text{Filter\_Edge\_Weight}(u, v) = \text{val}[\text{LCA}(u, v)]$$

This transformation reduces the complex graph path search directly into a static LCA query on the tree.


Algorithm Derivation and State Design

1. Dynamic Construction of the Reconstruction Tree

Define a counter node_idx = n to allocate identifiers for newly generated virtual nodes. When Kruskal encounters a valid edge $(u, v)$ with weight $w$:

  1. Find the root nodes of the disjoint set: $root_u = \text{find}(u), \, root_v = \text{find}(v)$.
  2. If $root_u \neq root_v$, trigger a reconstruction upgrade:

$$\text{node\_idx} \leftarrow \text{node\_idx} + 1$$

$$\text{val}[\text{node\_idx}] = w$$

  1. Establish directed parent-child edges in the reconstruction tree: connect $\text{node\_idx}$ to both $root_u$ and $root_v$.
  2. Update the disjoint set base: $\text{fa}[root_u] = \text{node\_idx}, \, \text{fa}[root_v] = \text{node\_idx}$.

2. Topology-Driven State Queries (Connectivity Reachability)

Classic Problem Description: Determine which nodes can be reached from point $v$ using only edges with weights $\le x$.


C++ Standard Source Code (Kruskal Reconstruction Tree Basic Framework Template)

#include <iostream>
#include <vector>
#include <algorithm>

const int MAXN = 100005;
const int MAXM = 300005;

struct Edge {
    int u, v;
    long long w;
    bool operator<(const Edge& rhs) const { return w < rhs.w; }
};

Edge edges[MAXM];
int dsu_fa[MAXN * 2]; // The reconstruction tree node doubles, and the disjoint set array must allocate double space
long long val[MAXN * 2]; // Store the edge weight attributes of virtual nodes
std::vector<int> tr[MAXN * 2]; // Adjacency list of the reconstruction tree

// LCA doubling state
int depth[MAXN * 2];
int parent[MAXN * 2][20];

int find_set(int i) {
    if (dsu_fa[i] == i) return i;
    return dsu_fa[i] = find_set(dsu_fa[i]);
}

// DFS preprocess to build doubling ancestors of the reconstruction tree
void dfs_build(int u, int p, int d) {
    depth[u] = d;
    parent[u][0] = p;
    for (int k = 1; k < 20; ++k) {
        parent[u][k] = parent[parent[u][k-1]][k-1];
    }
    for (size_t i = 0; i < tr[u].size(); ++i) {
        int v = tr[u][i];
        dfs_build(v, u, d + 1);
    }
}

// Find the highest ancestor reachable only through edges with weight <= max_w
int get_highest_ancestor(int v, long long max_w) {
    for (int k = 19; k >= 0; --k) {
        // If the ancestor exists and its associated edge weight still satisfies <= max_w, jump up
        if (parent[v][k] != 0 && val[parent[v][k]] <= max_w) {
            v = parent[v][k];
        }
    }
    return v;
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n, m, q;
    if (!(std::cin >> n >> m >> q)) return 0;

    for (int i = 0; i < m; ++i) {
        std::cin >> edges[i].u >> edges[i].v >> edges[i].w;
    }

    // Core driver: sort edges in ascending order by weight
    std::sort(edges, edges + m);

    // Initialize the disjoint set base, space must be fully allocated as 2 * n
    for (int i = 1; i < 2 * n; ++i) dsu_fa[i] = i;

    int node_idx = n; // Virtual node numbering starts from n + 1
    int edge_cnt = 0;

    for (int i = 0; i < m; ++i) {
        int root_u = find_set(edges[i].u);
        int root_v = find_set(edges[i].v);

        if (root_u != root_v) {
            node_idx++;
            val[node_idx] = edges[i].w; // New node takes over edge weight

            // Build edges in the reconstruction tree: from the new node to the old roots of the two connected components
            tr[node_idx].push_back(root_u);
            tr[node_idx].push_back(root_v);

            // Merge the disjoint set into the new virtual node
            dsu_fa[root_u] = node_idx;
            dsu_fa[root_v] = node_idx;

            edge_cnt++;
            if (edge_cnt == n - 1) break;
        }
    }

    // Note: If the graph itself is not connected, the reconstruction tree will be a forest.
    // To fully establish the doubling state, we must run DFS for each root of the connected components (i.e., points where dsu_fa[i] == i)
    for (int i = node_idx; i >= 1; --i) {
        if (depth[i] == 0) { // Not visited, indicating it's the root of a tree
            int root = find_set(i);
            dfs_build(root, 0, 1);
        }
    }

    // Respond to reachability bottleneck queries
    while (q--) {
        int v;
        long long x;
        std::cin >> v >> x;

        int ans_node = get_highest_ancestor(v, x);
        // At this point, all leaf nodes (with IDs <= n) in the subtree rooted at ans_node are the desired reachable point set
        std::cout << "Highest node ID: " << ans_node << ", Boundary weight: " << val[ans_node] << "\n";
    }

    return 0;
}

NOIP Practical Pitfall Guide

  1. Disjoint Set and Related Reconstruction Tree Array Space Not Allocated Double: This is the most classic failure point for the Kruskal Reconstruction Tree. The original graph only has $N$ nodes, but the reconstruction tree will produce a total of $2N-1$ nodes. If contestants allocate dsu_fa, val, depth arrays or the doubling matrix parent according to traditional inertia as MAXN, when the program executes to node_idx > n, it will cause serious memory overflow leading to runtime crashes (RE) or bizarre wrong outputs. The iron rule must be: arrays related to the reconstruction tree node set must allocate double space.
  2. Multiple Roots of Forest Not Handled Leading to Doubling Topology Gaps: If the original graph given in the problem is not fully connected, Kruskal will form multiple independent trees (a reconstruction forest) after running. If contestants blindly write a single line dfs_build(node_idx, 0, 1); in the main function, the depths of other unconnected isolated components' virtual nodes and leaf nodes will all remain 0, and the doubling ancestors cannot be established. Subsequent queries on these isolated components will lead to infinite loops or complete wrong outputs.

Classic NOIP/Luogu True Questions

1. Luogu P1967 [NOIP2013 Advanced Group] Truck Transportation

2. Luogu P4768 [NOI2018] Return Trip

  1. Base Layer Foundation: First, ignoring the rain restrictions, run Dijkstra's algorithm on the original graph with point 1 as the source point to find the physical shortest walking distance to point 1 for each point, denoted as $mindist[i]$.
  2. Space Reconstruction: Sort all edges by elevation from high to low (in descending order) and build the Kruskal Maximum Reconstruction Tree based on elevation attributes. The virtual nodes of the reconstruction tree record the elevation.
  3. Tree Retrieval: When given a specific water level $p$ and starting point $v$, use the doubling method to quickly jump upwards to find the shallowest ancestor node $anc$ whose elevation val[anc] > p. At this point, all leaf nodes in the subtree rooted at $anc$ are the cities that can be reached by truck without charge.
  4. Extreme Value Killing: To minimize the stamina consumed while walking, we only need to find the minimum value of $mindist[i]$ among the reachable leaf nodes. We can preprocess the tree with a single DFS to record the minimum $mindist$ value for all leaf nodes in the subtree rooted at u using min_d[u]. During queries, simply output min_d[anc], with a single query time complexity of only $O(\log N)$.
Original Source: local://11.4

[h] Back to Home