Core Logic and Mathematical Principles
The Lowest Common Ancestor (LCA) is defined as follows: in a rooted tree $T$, for any two nodes $u$ and $v$, the LCA is the deepest node that is an ancestor of both $u$ and $v$, denoted as $\text{LCA}(u, v)$.
The three core paradigms for solving LCA in NOIP each have their own mathematical essence in terms of time and space trade-offs:
1. Binary Lifting (Exponentiation Method)
Based on the idea of binary state compression. By preprocessing each node's $2^k$-th ancestor, we can optimize linear jumps from $O(N)$ to logarithmic jumps $O(\text{log} N)$.
- Mathematical Principle: Any decimal integer (here referring to the depth difference between two nodes) can be uniquely decomposed into a sum of powers of $2$. Using a recursive structure similar to the ST table, we can align depths and approach upwards in $O(\text{log} N)$ time.
2. Tarjan's Algorithm (Offline Disjoint Set Union)
An offline algorithm that combines depth-first search (DFS) and disjoint set union (DSU).
- Mathematical Principle: During the DFS traversal of the tree, each node at any given moment can be classified into three categories: visited and fully backtracked (marked as 2), currently visiting nodes in the stack and ancestors along the path (marked as 1), and unvisited (marked as 0). When the DFS is processing node $u$ and there is an offline query $(u, v)$, if $v$ has already been fully visited (marked as 2), then the LCA of $u$ and $v$ must be the root of the disjoint set containing $v$ (which must be some ancestor in the current execution stack).
3. RMQ Transformation Method (Euler Tour + ST Table)
A lossless mapping of the tree structure to a linear algebra maximum value problem.
- Mathematical Principle: In the Euler tour of the path, the first occurrences of nodes $u$ and $v$ are denoted as $\text{first}[u]$ and $\text{first}[v]$, respectively. The node with the minimum depth in the interval $[\text{first}[u], \text{first}[v]]$ is $\text{LCA}(u, v)$. By preprocessing the ST table in $O(N \text{log} N)$ time, we can achieve online queries in $O(1)$ time.
State Design and Algorithm Derivation
1. Binary Lifting
Define $fa[u][k]$ as the $2^k$-th ancestor of node $u$; $\text{dep}[u]$ is the depth of node $u$.
- State Transition Equation:
$$fa[u][k] = fa[fa[u][k-1]][k-1]$$
Physical Meaning: Jumping up $2^k$ steps is equivalent to first jumping up $2^{k-1}$ steps, and then from that position jumping up another $2^{k-1}$ steps.
- Boundary Conditions: $fa[u][0] = parent$. If jumping out of the root node, then $fa[u][k] = 0$.
- Query Algorithm:
- Assume $\text{dep}[u] \text{≥} \text{dep}[v]$. Use binary lifting to elevate $u$ to the same depth as $v$.
- If $u == v$, then $v$ is the LCA.
- Otherwise, both $u$ and $v$ attempt to jump upwards by $2^k$ steps. If $fa[u][k] \neq fa[v][k]$, then update $u = fa[u][k], v = fa[v][k]$.
- After the loop, $\text{LCA}(u, v) = fa[u][0]$.
2. RMQ Transformation Method
Define $\text{euler}[i]$ as the Euler sequence, and $\text{dep\_seq}[i] = \text{dep}[\text{euler}[i]]$ as the corresponding depth sequence at that position. Use the ST table to maintain the range minimum value: define $f[i][k]$ as the index of the Euler sequence corresponding to the minimum depth value in the depth sequence from position $i$ with length $2^k$.
- State Transition Equation:
$$pos\text{_}l = f[i][k-1], \quad pos\text{_}r = f[i + 2^{k-1}][k-1]$$
$$f[i][k] = \text{dep\_seq}[pos\text{_}l] < \text{dep\_seq}[pos\text{_}r] ? pos\text{_}l : pos\text{_}r$$
C++ Standard Source Code (Industrial Template for Binary Lifting)
#include <iostream>
#include <vector>
#include <algorithm>
const int MAXN = 500005;
const int MAXLOG = 21; // 2^20 > 500000, ensuring depth coverage
std::vector<int> adj[MAXN];
int dep[MAXN];
int fa[MAXN][MAXLOG];
// Preprocess depth and $2^0$-th ancestor
void dfs_lca(int u, int p) {
dep[u] = dep[p] + 1;
fa[u][0] = p;
// Dynamic programming state transition, preprocess the binary lifting array
for (int k = 1; k < MAXLOG; ++k) {
if (fa[u][k - 1] != 0) {
fa[u][k] = fa[fa[u][k - 1]][k - 1];
} else {
fa[u][k] = 0; // Fatal low-level error prevention: must set to 0 when jumping out of the root node
}
}
for (size_t i = 0; i < adj[u].size(); ++i) {
int v = adj[u][i];
if (v == p) continue;
dfs_lca(v, u);
}
}
// Online query for LCA
int query_lca(int u, int v) {
if (dep[u] < dep[v]) std::swap(u, v); // Always ensure u is deeper than v
// First phase: elevate u to the same depth as v
for (int k = MAXLOG - 1; k >= 0; --k) {
if (dep[fa[u][k]] >= dep[v]) {
u = fa[u][k];
}
}
if (u == v) return u;
// Second phase: synchronize upwards
for (int k = MAXLOG - 1; k >= 0; --k) {
if (fa[u][k] != fa[v][k]) { // Key: only jump when the ancestors differ
u = fa[u][k];
v = fa[v][k];
}
}
// Finally stop at the layer below LCA, return the parent node
return fa[u][0];
}
int main() {
std::ios_base::sync_with_stdio(false);
std::cin.tie(NULL);
int n, m, root;
if (!(std::cin >> n >> m >> root)) return 0;
for (int i = 1; i < n; ++i) {
int u, v;
std::cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
// Initialize physical boundary of root node depth, passing 0 as the parent of the root
dep[0] = 0;
dfs_lca(root, 0);
for (int i = 0; i < m; ++i) {
int u, v;
std::cin >> u >> v;
std::cout << query_lca(u, v) << "\n";
}
return 0;
}
NOIP Practical Pitfall Guide
- Reverse Loop Order in Binary Lifting Depth Elevation:
When aligning depths or synchronizing with two pointers upwards, the binary loop must go from large to small (i.e.,
for (int k = MAXLOG - 1; k >= 0; --k)). If written from small to large, the greedy strategy fails, and the target depth difference cannot be correctly assembled, leading to degraded query results or pointer deadlocks. - No Boundary Protection for Root Node Parent:
If the boundary of
fa[u][k - 1] == 0is not correctly handled indfs_lca, it may lead to accessingfa[0][k - 1]. Ifdep[0]or thefa[0]array is not cleared or contains dirty data, the entire binary lifting chain may collapse topologically, causing infinite loops or incorrect ancestor positioning.
Classic NOIP/Luogu Problems
1. Luogu P1967 [NOIP2013 Advanced Group] Truck Transportation
- Problem Description: Given an undirected graph with $N$ nodes and $M$ edges, each edge has a maximum load limit. There are $Q$ queries, each inputting two points $x$ and $y$, asking for the maximum value of the minimum edge weight among all paths from $x$ to $y$. If unreachable, output -1.
- Core Idea and Essence of the Problem: Maximum spanning tree + tree binary lifting. According to the properties of the maximum spanning tree (Kruskal), the path that has the "maximum minimum edge weight" between any two points must lie on the maximum spanning tree. After constructing the tree, the problem transforms into finding the minimum edge weight on the path from $x$ to $y$. Use binary lifting to solve for LCA, and add an attribute in the binary lifting state: define $w[u][k]$ as the minimum edge weight encountered while jumping $2^k$ steps upwards from node $u$. The state transition equation is structurally recursive: $w[u][k] = \text{min}(w[u][k-1], w[fa[u][k-1]][k-1])$. During the two phases of LCA solving, the minimum load value is updated synchronously, with a time complexity of $O((M + Q) \text{log} N)$.
2. Luogu P3128 [USACO15JAN] Max Flow P
- Problem Description: Given a tree with $N$ nodes and $K$ transportation paths. Each path $(u, v)$ increases the pressure value of all nodes on the path (including endpoints) by 1. After all transportation is completed, determine the maximum pressure value of any node on the tree.
- Core Idea and Essence of the Problem:
Differential on tree points.
If each path is modified using tree splitting or brute force, the complexity is unacceptable. Since this is a static full statistics, tree differential can be introduced. For the path $(u, v)$, use binary lifting to find $a = \text{LCA}(u, v)$. Execute on the differential array
diff:
$$\text{diff}[u] \leftarrow \text{diff}[u] + 1, \quad \text{diff}[v] \leftarrow \text{diff}[v] + 1$$
$$\text{diff}[a] \leftarrow \text{diff}[a] - 1, \quad \text{diff}[fa[a][0]] \leftarrow \text{diff}[fa[a][0]] - 1$$
After modification, run a bottom-up DFS traversal; the true pressure value of each node will be the sum of the differential values of its subtree. The overall time complexity is $O((N + K) \text{log} N)$.