NeFut Logo NeFut
Admin Login

In-Depth Analysis of Minimum Spanning Tree Algorithms: Comparison and Implementation of Kruskal and Prim

Published at: 2026-05-27 09:17 Last updated: 2026-06-06 13:04
#algorithm #Graph #MST

Core Logic and Mathematical Principles

The essence of the Minimum Spanning Tree (MST) problem is: in a weighted undirected connected graph $G=(V, E)$ containing $N$ vertices and $M$ edges, select a set of edges $T \subseteq E$ such that $(V, T)$ forms a topologically connected tree, and the sum of the weights of all edges in the tree $\sum_{e \in T} w(e)$ achieves the global minimum value.

Within the NOIP framework, the Kruskal and Prim algorithms are constructed based on completely different algebraic principles of graph theory:

1. Edge Set Greedy Paradigm: Kruskal's Algorithm

The core mathematical foundation of Kruskal's algorithm is the Cut Property and Greedy Expansion.

2. Vertex Set Topological Expansion Paradigm: Prim's Algorithm

The core physical essence of Prim's algorithm is the segmentation expansion of blue and white point sets, which is structurally highly isomorphic to Dijkstra's algorithm.


Algorithm Derivation and State Design

1. Kruskal State Design

Define the Disjoint Set array fa[i].

  1. Edge Structure Encapsulation:
struct Edge {
    int u, v;
    long long w;
    bool operator<(const Edge& rhs) const { return w < rhs.w; }
};
  1. State Transition (Disjoint Set Cycle Check): For the sorted edge $e_i$, let $root_u = \text{find}(e_i.u), \, root_v = \text{find}(e_i.v)$.

$$\text{if } (root_u \neq root_v) \quad \text{fa}[root_u] = root_v, \,\, \text{edge\_cnt} \leftarrow \text{edge\_cnt} + 1$$

2. Prim Heap Optimization State Design

Define $dist[i]$ as the topological shortest distance from white point $i$ to the currently generated MST edge set tree surface.
The state stored in the priority queue is a tuple $(d, u)$, where $u$ is the node index and $d$ is the current $dist[u]$.
The top element of the min-heap drives the state transition equation:

$$dist[v] = \min(dist[v], w(u, v)) \quad (\forall v \in T, \, (u, v) \in E)$$

Physical Meaning: When the blue point set expands to include $u$, the shortest distance from the floating point $v$ to the tree surface may be refreshed by the newly included boundary edge $w(u, v)$.


C++ Standard Source Code (Kruskal Industrial Template)

#include <iostream>
#include <vector>
#include <algorithm>

const int MAXN = 100005;
const int MAXM = 300005;

struct Edge {
    int u, v;
    long long w;
    // Strictly sorted in non-decreasing order of edge weights
    bool operator<(const Edge& rhs) const {
        return w < rhs.w;
    }
};

Edge edges[MAXM];
int fa[MAXN];

// Disjoint Set Core: Path Compression
int find_set(int i) {
    if (fa[i] == i) return i;
    return fa[i] = find_set(fa[i]); // Key pitfall: path compression must be performed, otherwise chain trees will degrade complexity
}

// Merge connected components in the Disjoint Set
bool union_set(int i, int j) {
    int root_i = find_set(i);
    int root_j = find_set(j);
    if (root_i != root_j) {
        fa[root_i] = root_j;
        return true; // Successful merge, indicating previously disconnected, this edge is valid
    }
    return false; // Already connected, adding this edge will create a cycle, reject
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n, m;
    if (!(std::cin >> n >> m)) return 0;

    for (int i = 0; i < m; ++i) {
        std::cin >> edges[i].u >> edges[i].v >> edges[i].w;
    }

    // Core driver: quickly sort the entire edge set
    std::sort(edges, edges + m);

    // Initialize the base of the Disjoint Set
    for (int i = 1; i <= n; ++i) fa[i] = i;

    long long mst_weight = 0;
    int edges_selected = 0;

    for (int i = 0; i < m; ++i) {
        if (union_set(edges[i].u, edges[i].v)) {
            mst_weight += edges[i].w;
            edges_selected++;
            if (edges_selected == n - 1) break; // Early termination: already collected N-1 edges
        }
    }

    // Topological connectivity check
    if (edges_selected < n - 1) {
        std::cout << "orz\n"; // The graph itself is not connected, unable to construct a spanning tree
    } else {
        std::cout << mst_weight << "\n";
    }

    return 0;
}

NOIP Practical Pitfall Guide

  1. Blindly using Kruskal for dense grid graphs without considering data characteristics: If the given graph is strictly dense, such as a complete graph or a large cardinality grid graph, satisfying $V=5000, M \approx 1.2 \times 10^7$. If Kruskal's algorithm is used, its sorting overhead $M \log M$ will cause memory bloat and CPU time to directly trigger TLE. At this time, it is essential to switch to the naive Prim or heap-optimized Prim algorithm with $O(V^2)$ complexity, as Prim's relaxation state does not depend on the global sorting of the number of edges.
  2. Omitting Disjoint Set initialization or failing to perform path compression in deep recursion: In the main function of Kruskal, it is very easy to miss the initialization for (int i = 1; i <= n; ++i) fa[i] = i;. If not initialized, all fa will remain 0, causing find_set to instantly enter an infinite loop or memory overflow. Additionally, if the rolling assignment fa[i] = find_set(fa[i]) (path compression) is omitted in find_set, the disjoint set will degenerate into a long chain, making single merges degrade to $O(V)$, leading Kruskal's algorithm to degenerate to $O(MV)$ brute force, completely losing efficiency.

Classic NOIP/Luogu Problem Examples

1. Luogu P3366 [Template] Minimum Spanning Tree

2. Luogu P1194 Buying Gifts

Original Source: local://11.1

[h] Back to Home