Core Logic and Mathematical Principles
A tree structure essentially consists of $N$ nodes and $N-1$ edges, forming a connected undirected graph. In computer memory, the standard storage scheme for sparse trees is the chain-forward star or adjacency list (dynamic array). Due to the non-linear continuous memory characteristics of the tree's topological structure, advanced data structures like segment trees and binary indexed trees cannot be directly utilized for interval maintenance. Therefore, it is necessary to flatten the tree into a linear sequence using traversal techniques.
1. DFS Order
DFS order refers to the timestamp order in which nodes are first accessed during a depth-first traversal of the tree. For any node $u$, its entry timestamp is denoted as $ ext{dfn}[u]$, and the timestamp of the last accessed node in its subtree is denoted as $ ext{out}[u]$.
Mathematical property: The entire subtree of node $u$ corresponds to a continuous closed interval $[ ext{dfn}[u], ext{out}[u]]$ in the DFS order. The interval length is equal to the size of the subtree $ ext{siz}[u] = ext{out}[u] - ext{dfn}[u] + 1$. This property establishes an isomorphic mapping from "subtree operations on trees" to "interval operations on segment trees".
2. Euler Tour
There are two variants of the Euler tour, and the NOIP phase specifically refers to the "path Euler tour", which records a node in the sequence whenever it is visited, whether it is the first time entering the node or backtracking from its subtree. In a rooted tree, an edge will be traversed twice in both directions, resulting in a total sequence length of exactly $2N-1$.
Mathematical property: The first appearance of node $u$ is at position $ ext{first}[u]$, and the last appearance is at position $ ext{last}[u]$. The interval $[ ext{first}[u], ext{last}[u]]$ contains, and only contains, all traversal paths of node $u$ and its subtree nodes. The path Euler tour is often used in conjunction with standard ST tables to transform the lowest common ancestor (LCA) problem into an interval maxima problem (RMQ).
Algorithm Derivation and State Design
Define the topological structure of the tree as $G = (V, E)$.
1. Graph Storage Using Adjacency List
Using std::vector<int> G[N] or int head[N], to[M], nxt[M]. The NOIP industrial standard recommends using std::vector for non-extreme constant time problems. For weighted trees, the state structure is defined as:
struct Edge {
int to;
long long weight;
};
std::vector<Edge> adj[N];
2. DFS Order Transformation Equation
Define a global timestamp counter $ ext{tim}$. For the current node $u$, the bidirectional edge backtracking to the parent node is denoted as $fa$:
- Entry state: $ ext{tim} ightarrow ext{tim} + 1$, set $ ext{dfn}[u] = ext{tim}$, and map the current node back to the sequence $ ext{seq}[ ext{tim}] = u$.
- State transition: Traverse all child nodes satisfying $v ext{ in } ext{adj}[u] ext{ and } v eq fa$, recursively execute $ ext{DFS}(v, u)$.
- Exit state: $ ext{out}[u] = ext{tim}$.
3. Euler Tour Transformation Equation
Define a global timestamp counter $ ext{cur}$.
- Entry state: $ ext{cur} ightarrow ext{cur} + 1$, set $ ext{euler}[ ext{cur}] = u$, and record the first position $ ext{first}[u] = ext{cur}$.
- State transition: Traverse child nodes $v$. Each time backtracking from $v$ to $u$, explicitly trigger: $ ext{cur} ightarrow ext{cur} + 1$, set $ ext{euler}[ ext{cur}] = u$.
- Exit state: Record the final position $ ext{last}[u] = ext{cur}$.
Standard C++ Source Code
#include <iostream>
#include <vector>
#include <algorithm>
// Strictly limit the maximum number of nodes, adjust according to the problem's data range
const int MAXN = 200005;
// Adjacency list storage structure
std::vector<int> adj[MAXN];
// Arrays related to DFS order
int dfn[MAXN], out[MAXN], seq[MAXN], dfn_tim;
// Arrays related to Euler tour
int euler[MAXN * 2], first_pos[MAXN], last_pos[MAXN], euler_tim;
// Core traversal function
void dfs(int u, int fa) {
// Activate DFS order
dfn_tim++;
dfn[u] = dfn_tim;
seq[dfn_tim] = u;
// Plan A: Record the first entry into the Euler tour
euler_tim++;
euler[euler_tim] = u;
first_pos[u] = euler_tim;
for (size_t i = 0; i < adj[u].size(); ++i) {
int v = adj[u][i];
if (v == fa) continue; // Avoid fatal infinite loops: undirected graph must skip parent node
dfs(v, u);
// Plan B: Backtracking from the subtree, must push the current node u again
euler_tim++;
euler[euler_tim] = u;
}
// Close the DFS order subtree interval
out[u] = dfn_tim;
last_pos[u] = euler_tim;
}
int main() {
// Improve I/O efficiency, unbind standard streams
std::ios_base::sync_with_stdio(false);
std::cin.tie(NULL);
int n;
if (!(std::cin >> n)) return 0;
for (int i = 1; i < n; ++i) {
int u, v;
std::cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u); // Undirected tree must build edges bidirectionally
}
// Initialize counters and start DFS from root node 1
dfn_tim = 0;
euler_tim = 0;
dfs(1, 0);
// Output verification results (can be deleted in production environment)
for (int i = 1; i <= n; ++i) {
std::cout << "Node " << i << " Subtree Interval: [" << dfn[i] << ", " << out[i] << "]\n";
}
return 0;
}
NOIP Practical Pitfall Guide
- Stack Overflow Risk: When the tree depth is too large (e.g., chain graph with $N = 5 \times 10^5$), the system's default stack space is prone to collapse. In the Linux evaluation environment, be aware of the system's stack limits. If there are no special compilation commands, consider simulating DFS with a manual stack.
- Euler Tour Array Size Not Doubled:
Backtracking leads to duplicate enqueues, causing the maximum length of the Euler sequence to approach $2N$. If the
eulerarray and related statistical arrays only allocateMAXNsize, it will inevitably trigger out-of-bounds errors, resulting inSIGSEGVsegmentation faults when the data is a random tree or a high fan-out tree. - Undirected Edge Bidirectional Edge Construction Missing Parent Node Check:
In the
forloop traversing the adjacency list, if theif (v == fa) continue;check is not added, it will lead to an endless recursion between parent and child nodes, directly causing an infinite loop and stack overflow.
Classic NOIP/Luogu Problems
1. Luogu P3384 [Template] Heavy Chain Decomposition/Tree Chain Decomposition
- Problem Description: Given a tree with $N$ nodes, each node has an initial weight. Four operations need to be implemented: modify the weights of all nodes on the path from node $X$ to node $Y$; find the sum of weights of all nodes on the path from node $X$ to node $Y$; modify the weights of all nodes in the subtree rooted at $X$; find the sum of weights of all nodes in the subtree rooted at $X$.
- Core Idea and Essence of the Problem: The essence of subtree operations is the interval coverage of the DFS order. Since tree decomposition is essentially a type of priority DFS order based on heavy children, it guarantees that the DFS order of any subtree is a continuous interval $[ ext{dfn}[X], ext{dfn}[X] + ext{siz}[X] - 1]$. Therefore, flattening the tree converts subtree modifications and queries directly into standard segment tree interval addition and sum problems.
2. Luogu P3258 [JLOI2014] The Squirrel's New Home
- Problem Description: The squirrel needs to visit the nodes on the tree in the order of a given sequence of $N$ points. Each time it moves from the current point to the next point, the visit count of all nodes along the path (excluding the starting point) increases by $1$. Find the final visit count for each node.
- Core Idea and Essence of the Problem: Path modification problem. The traditional approach requires tree decomposition, but this problem is an offline static full count. The tree can be flattened, utilizing the difference array concept on the tree. For the path $(u, v)$, using LCA (which can be implemented through Euler tour + ST table in $O(1)$), perform operations on the difference array: $ ext{diff}[u] ightarrow ext{diff}[u] + 1$, $ ext{diff}[v] ightarrow ext{diff}[v] + 1$, $ ext{diff}[ ext{lca}] ightarrow ext{diff}[ ext{lca}] - 1$, $ ext{diff}[fa[ ext{lca}]] ightarrow ext{diff}[fa[ ext{lca}]] - 1$. Finally, perform a subtree sum from bottom to top (i.e., tree backtracking in DFS order) to obtain the actual frequencies.