Core Logic and Mathematical Principles
The diameter and centroid of a tree are two core metrics that describe the asymmetry of tree topological structures.
1. Tree Diameter (Diameter)
The diameter of a tree is defined as the maximum length of all paths in the tree, i.e., $\max_{u, v \in V} \text{dist}(u, v)$.
- Mathematical Principle of Two DFS Method:
Starting from any node $x$, the farthest node reached during the first DFS must be one endpoint $u$ of some diameter. Then, starting from $u$, performing a second DFS will reach the other endpoint $v$ of that diameter.
Proof Sketch: Assume there exists a diameter $(p, q)$. The farthest point reached from $x$ is $u$. If $u$ is not on the diameter, then the path $\text{dist}(x, u) \ge \text{dist}(x, p)$. Combining with the acyclic connectivity of the tree, it is possible to construct a path longer than $(p, q)$, which contradicts the definition of $(p, q)$ as the diameter.
Limitation: This method can only handle non-negative edge weights trees. If negative edge weights exist, this greedy proof fails. - Mathematical Principle of Tree DP Method:
Choose any point as the root. For any node $u$, the longest path passing through $u$ is formed by concatenating the longest and second longest chains within the subtree of $u$. This method naturally supports negative edge weights.
2. Tree Centroid (Centroid)
The centroid of a tree is defined as a node $u$ such that when $u$ is removed, the size of the largest connected component (subtree) formed is minimized.
- Mathematical Properties:
- A tree can have at most two centroids. If there are two centroids, they must be directly connected by an edge, and the size of the largest subtree formed is exactly $\frac{N}{2}$.
- A node $u$ is a centroid if and only if: when $u$ is the root, the sizes of all its child nodes' subtrees $\text{siz}[v] \le \frac{N}{2}$, and $N - \text{siz}[u] \le \frac{N}{2}$.
- The sum of distances from all points in the tree to the centroid is minimized.
State Design and Algorithm Derivation
1. Tree DP to Find Diameter
Define $d_1[u]$ as the longest path length from $u$ to leaf nodes in the subtree rooted at $u$; $d_2[u]$ as the second longest path length.
The global diameter is denoted as $\text{ans}$.
For any child node $v \in \text{adj}[u]$ with edge weight $w$:
- State Transition Equation:
Before updating $d_1[u]$, attempt to concatenate to trigger the global maximum:
$$\text{ans} = \max(\text{ans}, d_1[u] + d_2[v] + w)$$
- Maintain the longest and second longest chains:
If $d_1[v] + w > d_1[u]$, then set $d_2[u] = d_1[u]$, $d_1[u] = d_1[v] + w$.
Otherwise, if $d_1[v] + w > d_2[u]$, then set $d_2[u] = d_1[v] + w$.
2. Centroid Derivation
Define $\text{siz}[u]$ as the size of the subtree rooted at $u$, and $\text{max\_part}[u]$ as the size of the largest connected component formed after removing $u$.
For any child node $v \in \text{adj}[u]$:
- Accumulate during post-order traversal: $\text{siz}[u] = 1 + \sum \text{siz}[v]$
- Local subtree maximum: $\text{max\_part}[u] = \max_{v \in \text{adj}[u]} (\text{siz}[v])$
- Consider the connected component size in the direction of the parent node:
$$\text{max\_part}[u] = \max(\text{max\_part}[u], N - \text{siz}[u])$$
- Centroid satisfies: $\text{max\_part}[\text{centroid}] = \min_{i \in V}(\text{max\_part}[i])$
C++ Standard Source Code
#include <iostream>
#include <vector>
#include <algorithm>
const int MAXN = 200005;
struct Edge {
int to;
long long weight;
};
std::vector<Edge> adj[MAXN];
int n;
// Diameter related variables
long long max_dia = 0;
long long d1[MAXN], d2[MAXN];
// Centroid related variables
int siz[MAXN];
int max_part[MAXN];
std::vector<int> centroids;
// Tree DP to find diameter (supports negative edge weights)
void dfs_diameter(int u, int fa) {
d1[u] = 0;
d2[u] = 0;
for (size_t i = 0; i < adj[u].size(); ++i) {
int v = adj[u][i].to;
long long w = adj[u][i].weight;
if (v == fa) continue;
dfs_diameter(v, u);
// Core pitfall: must update the global diameter before updating d1[u] to prevent double counting the longest chain from the same subtree
if (d1[v] + w > d1[u]) {
d2[u] = d1[u];
d1[u] = d1[v] + w;
} else if (d1[v] + w > d2[u]) {
d2[u] = d1[v] + w;
}
}
max_dia = std::max(max_dia, d1[u] + d2[u]);
}
// DFS to find centroid
void dfs_centroid(int u, int fa) {
siz[u] = 1;
max_part[u] = 0;
for (size_t i = 0; i < adj[u].size(); ++i) {
int v = adj[u][i].to;
if (v == fa) continue;
dfs_centroid(v, u);
siz[u] += siz[v];
max_part[u] = std::max(max_part[u], siz[v]);
}
// Consider the size of the connected component in the direction of the parent node
max_part[u] = std::max(max_part[u], n - siz[u]);
// Determine if it is a centroid
if (max_part[u] <= n / 2) {
centroids.push_back(u);
}
}
int main() {
std::ios_base::sync_with_stdio(false);
std::cin.tie(NULL);
if (!(std::cin >> n)) return 0;
for (int i = 1; i < n; ++i) {
int u, v;
long long w;
std::cin >> u >> v >> w;
adj[u].push_back(Edge{v, w});
adj[v].push_back(Edge{u, w});
}
// Calculate diameter
dfs_diameter(1, 0);
// Calculate centroid
dfs_centroid(1, 0);
std::cout << "Diameter: " << max_dia << "\n";
std::cout << "Centroid(s): ";
for (size_t i = 0; i < centroids.size(); ++i) {
std::cout << centroids[i] << " ";
}
std::cout << "\n";
return 0;
}
NOIP Practical Pitfall Guide
- Handling Negative Edge Weights in Two DFS Method for Diameter:
If the tree contains negative edge weights, the farthest point found in the first DFS may not actually be a diameter endpoint. In this case, the tree DP method must be used. Additionally, when calculating the longest and second longest chains, if one chain's data is mistakenly assigned to both $d_1$ and $d_2$, it will lead to double counting of an edge. - Omission of
n - siz[u]in Centroid Condition:
Many contestants only focus on the subtree sizes in the direction of child nodes (i.e.,max_part[u] = max(max_part[u], siz[v])), completely ignoring the part of the connected component formed by non-child subtrees when backtracking towards the root. Ifmax_part[u] = max(max_part[u], n - siz[u])is omitted, the calculated centroid will be incorrect, especially in chain-like or heavily skewed trees.
Classic NOIP/Luogu Problems
1. Luogu P1099 [NOIP2007 Advanced Group] Tree Network Core
- Problem Description:
In a given weighted tree, the "deviation" of a path is defined as the maximum distance from any point in the tree to that path. One needs to select a path on the diameter of the tree with a length not exceeding $s$ (which can be a single point) such that the deviation is minimized. Find this minimum deviation. - Core Idea and Essence of the Problem:
The physical core of the problem is based on the properties of tree diameter. First, extract any diameter using the two-pointer technique or DFS. It can be proven that the maximum deviation of points along the diameter will necessarily extend through the subtrees that do not pass through other points on the diameter. Preprocess the maximum distance not along the diameter for each point using tree DP along the diameter. Then, use a monotonic queue or two-pointer technique to maintain a sliding window of length not exceeding $s$ along the diameter sequence to quickly retrieve the maximum deviation value inside and outside the interval, controlling the complexity to $O(N)$.
2. Luogu P1395 Meeting
- Problem Description:
In a village, the housing structure forms a tree. A meeting center needs to be built such that the sum of distances from all residential points to the meeting center is minimized. If there are multiple such locations, output the one with the smallest index, along with the minimum distance sum. - Core Idea and Essence of the Problem:
This is a classic application of the tree centroid. According to the property 3 of the centroid: the point that minimizes the sum of distances from all points in the tree is the centroid. Therefore, the first step is to find the centroid through one round of DFS (if there are two centroids, take the one with the smaller index). The second step is to run another DFS with the chosen centroid as the root, summing all nodes' depths to get the total distance from each point to the centroid. The total time complexity is standard $O(N)$.