In-Depth Analysis of Prim's Algorithm: Constructing Minim...

Core Logic and Mathematical Principles

Although Kruskal's algorithm performs excellently on sparse graphs, its sorting time overhead of $O(M \log M)$ can become catastrophic in complete graphs or extremely dense graphs (where the number of edges $M \approx N^2$). In contrast, the Prim's algorithm, based on the topological expansion of point sets, does not depend on the global sorting of edges for state transitions, thus providing a dominant time-space advantage in dense graph scenarios.

1. Cut Property and Partition Space

The correctness of Prim's algorithm is strictly based on the cut property of graph theory:

Mathematical Definition: Let the vertex set $V$ of an undirected connected graph $G=(V, E)$ be arbitrarily partitioned into two disjoint non-empty subsets: the set of vertices already included in the minimum spanning tree $S$ (blue points) and the set of vertices not yet included $V-S$ (white points).
Theorem Content: Among all crossing edges that connect $S$ and $V-S$ (i.e., one end in $S$ and the other in $V-S$), the edge with the minimum weight must belong to some minimum spanning tree of the graph.

2. Two Physical Implementations of State Transition

Depending on the scale of the graph, Prim's algorithm has two distinctly different state maintenance methods:

Scheme A: Naive Prim's Algorithm (for extremely dense graphs $M \approx N^2$)

Directly use an adjacency matrix to store the graph. In each round, perform a brute-force scan of $O(N)$ to find the point in the white set that is closest to the current tree. The total time complexity is strictly $O(N^2)$. When $M$ approaches $N^2$, this algorithm runs efficiently without the overhead of a priority queue and the constants associated with heap adjustments.

Scheme B: Heap-Optimized Prim's Algorithm (for regular dense graphs)

Utilize a min-heap (priority queue) to maintain the minimum distance from all white points to the current tree surface. Each extraction operation from the heap costs $O(\log N)$, and the cost of refreshing the edge weights of surrounding points is $O(M \log N)$, leading to a total time complexity of $O((N+M)\log N)$.

Algorithm Derivation and State Design

Design a dynamic array $dist[i]$, whose physical meaning must be strictly defined: it represents the absolute shortest distance from the node $i$, which has not yet joined the tree, to the currently established tree surface $S$.
Note: This is fundamentally different from Dijkstra's algorithm, where dist represents the "distance to the source point".

State Transition Equation

Let the newly included blue point in the tree be $u$, for all directly connected white points $v \in (V-S)$:

$$dist[v] = \min(dist[v], w(u, v))$$

Initial state: Choose any source point (usually point 1), set $dist[1] = 0$, and all other points $dist[i] = \infty$.
Marking: Use a boolean array vis[i] to indicate whether node $i$ has been included in the blue point set $S$.

C++ Standard Source Code (Two Industrial-Grade Templates)

Template 1: Naive Prim's Algorithm ($O(N^2)$ Ultimate Tool for Extremely Dense Graphs)

#include <iostream>
#include <vector>
#include <algorithm>

const long long INF = 0x3f3f3f3f3f3f3f3fLL;
const int MAXN = 5005;

long long graph[MAXN][MAXN]; // Adjacency matrix for dense graph storage
long long dist[MAXN];        // Store the shortest distance from white points to the current tree surface
bool vis[MAXN];              // Mark whether it has been included in the spanning tree (blue points)

void prim_dense(int n) {
    // Initialize physical boundaries
    for (int i = 1; i <= n; ++i) {
        dist[i] = INF;
        vis[i] = false;
    }

    dist[1] = 0; // Choose point 1 as the topological root of the spanning tree
    long long mst_weight = 0;
    int node_count = 0;

    for (int step = 1; step <= n; ++step) {
        int u = -1;
        long long min_d = INF;

        // O(N) brute force scan to find the closest white point to the current tree
        for (int i = 1; i <= n; ++i) {
            if (!vis[i] && dist[i] < min_d) {
                min_d = dist[i];
                u = i;
            }
        }

        // Topological connectivity check: if no valid point is found, the graph is disconnected
        if (u == -1) {
            std::cout << "orz\n";
            return;
        }

        vis[u] = true; // Mark as blue point
        mst_weight += min_d;
        node_count++;

        // Refresh the distance from all remaining white points to the tree surface using the newly added blue point u
        for (int v = 1; v <= n; ++v) {
            if (!vis[v] && graph[u][v] < dist[v]) {
                dist[v] = graph[u][v]; // Strictly follow: dist[v] = min(dist[v], w(u,v))
            }
        }
    }

    std::cout << mst_weight << "\n";
}

Template 2: Heap-Optimized Prim's Algorithm ($O(M \log N)$ General Template)

#include <iostream>
#include <vector>
#include <queue>

const long long INF = 0x3f3f3f3f3f3f3f3fLL;
const int MAXN = 100005;

struct Edge {
    int to;
    long long weight;
};

struct Node {
    long long d;
    int u;
    bool operator>(const Node& rhs) const { return d > rhs.d; }
};

std::vector<Edge> adj[MAXN];
long long dist[MAXN];
bool vis[MAXN];

void prim_heap(int n) {
    std::priority_queue<Node, std::vector<Node>, std::greater<Node> > pq;

    for (int i = 1; i <= n; ++i) {
        dist[i] = INF;
        vis[i] = false;
    }

    dist[1] = 0;
    pq.push(Node{0, 1});

    long long mst_weight = 0;
    int count = 0;

    while (!pq.empty()) {
        Node curr = pq.top();
        pq.pop();

        int u = curr.u;
        if (vis[u]) continue; // Intercept stale data in the heap

        vis[u] = true;
        mst_weight += curr.d;
        count++;

        for (size_t i = 0; i < adj[u].size(); ++i) {
            int v = adj[u][i].to;
            long long w = adj[u][i].weight;

            if (!vis[v] && w < dist[v]) {
                dist[v] = w;
                pq.push(Node{dist[v], v}); // Push better edge weights and nodes into the heap
            }
        }
    }

    if (count < n) std::cout << "orz\n";
    else std::cout << mst_weight << "\n";
}

NOIP Practical Pitfall Guide

Confusing Prim's and Dijkstra's Relaxation Equations: This is the most common mistake among beginners. Dijkstra's relaxation is based on path accumulation: dist[u] + w < dist[v]. In contrast, Prim maintains the distance from points to the tree surface, absolutely not requiring the accumulation of predecessor weights, its equation is strictly w < dist[v]. If mistakenly written in Prim as Dijkstra's accumulation form, what will be obtained is a single-source shortest path tree instead of a minimum spanning tree.
Initialization Fails to Avoid Overlapping Edges Damaging the Adjacency Matrix: When using the naive Prim algorithm (adjacency matrix), it is essential to perform overlap interception when reading edges: graph[u][v] = min(graph[u][v], w). If the min filter is not applied, larger overlapping edges read later will mercilessly overwrite smaller valid edges, destroying the algebraic foundation of the entire minimum spanning tree.

Classic NOIP/Luogu Problems

1. Luogu P1265 Highway Network

Problem Description: Given the coordinates of $N$ cities on a plane. Any two cities can build a road, and the construction cost equals the Euclidean distance between them. Due to financial constraints, the constructed road network must allow any two cities to be reachable, and the total cost must be minimized. Additionally, there is a special restriction: each city must prioritize connecting to the physically closest city. Find the minimum total cost. ($N \le 5000$)
Core Essence and Key Idea: This problem is a standard minimum spanning tree (MST) problem for a complete planar graph (dense graph). Since there are edges between every pair of cities, the number of edges $M = \frac{N(N-1)}{2} \approx 1.25 \times 10^7$. If using Kruskal's algorithm, just storing all edges would lead to huge memory overhead (MLE), not to mention the TLE caused by sorting ten million edges. Core Strategy: Use the naive Prim algorithm. Because the edge weights of the complete graph can be dynamically calculated at runtime (i.e., get_dist(i, j)), there is no need to explicitly build edges to store in the adjacency matrix, successfully reducing space complexity to $O(N)$. Utilizing the outer loop to advance, each time scan the points not yet included in the tree to calculate their minimum distance to the current tree, elegantly solving the problem with a time-space efficiency of $O(N^2)$.

2. Luogu P2504 [HAOI2006] The Clever Monkey

Problem Description: In a forest, there are $N$ trees, and the coordinates of each tree are given. There is a group of monkeys in the forest, each with a different maximum jumping distance. If the distance between two trees does not exceed a monkey's maximum jumping distance, the monkey can jump between the two trees. The monkeys want to know how many monkeys can traverse all the trees.
Core Essence and Key Idea: This is an MST maximum edge weight bottleneck problem. For a monkey to traverse all the trees, the graph formed by these trees must be connected under the jumping distance limit of the monkeys. According to the properties of the spanning tree, for the entire graph to be connected, the monkey's minimum jumping distance must be greater than or equal to the maximum edge weight in the minimum spanning tree of the graph. The core idea: dynamically calculate edge weights using the coordinates of points, and find the minimum spanning tree of this complete graph using Prim's algorithm. During the expansion of Prim, use a variable max_edge to continuously record and update the maximum edge weight among the $N-1$ edges included in the spanning tree. Finally, iterate through each monkey's jumping limit; if its jumping limit $\ge \text{max\_edge}$, the monkey can successfully traverse, incrementing the counter.

In-Depth Analysis of Prim's Algorithm: Constructing Minimum Spanning Trees in Dense Graphs