NeFut Logo NeFut
Admin Login

Deep Dive into Tree Structures and Efficient Traversal Algorithms

Published at: 2026-05-27 09:17 Last updated: 2026-06-06 13:04
#algorithm #Tree #DFS

Core Logic and Mathematical Principles

A tree structure essentially consists of $N$ nodes and $N-1$ edges, forming a connected undirected graph. In computer memory, the standard storage scheme for sparse trees is the chain-forward star or adjacency list (dynamic array). Due to the non-linear continuous memory characteristics of the tree's topological structure, advanced data structures like segment trees and binary indexed trees cannot be directly utilized for interval maintenance. Therefore, it is necessary to flatten the tree into a linear sequence using traversal techniques.

1. DFS Order

DFS order refers to the timestamp order in which nodes are first accessed during a depth-first traversal of the tree. For any node $u$, its entry timestamp is denoted as $ ext{dfn}[u]$, and the timestamp of the last accessed node in its subtree is denoted as $ ext{out}[u]$.

Mathematical property: The entire subtree of node $u$ corresponds to a continuous closed interval $[ ext{dfn}[u], ext{out}[u]]$ in the DFS order. The interval length is equal to the size of the subtree $ ext{siz}[u] = ext{out}[u] - ext{dfn}[u] + 1$. This property establishes an isomorphic mapping from "subtree operations on trees" to "interval operations on segment trees".

2. Euler Tour

There are two variants of the Euler tour, and the NOIP phase specifically refers to the "path Euler tour", which records a node in the sequence whenever it is visited, whether it is the first time entering the node or backtracking from its subtree. In a rooted tree, an edge will be traversed twice in both directions, resulting in a total sequence length of exactly $2N-1$.

Mathematical property: The first appearance of node $u$ is at position $ ext{first}[u]$, and the last appearance is at position $ ext{last}[u]$. The interval $[ ext{first}[u], ext{last}[u]]$ contains, and only contains, all traversal paths of node $u$ and its subtree nodes. The path Euler tour is often used in conjunction with standard ST tables to transform the lowest common ancestor (LCA) problem into an interval maxima problem (RMQ).


Algorithm Derivation and State Design

Define the topological structure of the tree as $G = (V, E)$.

1. Graph Storage Using Adjacency List

Using std::vector<int> G[N] or int head[N], to[M], nxt[M]. The NOIP industrial standard recommends using std::vector for non-extreme constant time problems. For weighted trees, the state structure is defined as:

struct Edge {
    int to;
    long long weight;
};
std::vector<Edge> adj[N];

2. DFS Order Transformation Equation

Define a global timestamp counter $ ext{tim}$. For the current node $u$, the bidirectional edge backtracking to the parent node is denoted as $fa$:

  1. Entry state: $ ext{tim} ightarrow ext{tim} + 1$, set $ ext{dfn}[u] = ext{tim}$, and map the current node back to the sequence $ ext{seq}[ ext{tim}] = u$.
  2. State transition: Traverse all child nodes satisfying $v ext{ in } ext{adj}[u] ext{ and } v eq fa$, recursively execute $ ext{DFS}(v, u)$.
  3. Exit state: $ ext{out}[u] = ext{tim}$.

3. Euler Tour Transformation Equation

Define a global timestamp counter $ ext{cur}$.

  1. Entry state: $ ext{cur} ightarrow ext{cur} + 1$, set $ ext{euler}[ ext{cur}] = u$, and record the first position $ ext{first}[u] = ext{cur}$.
  2. State transition: Traverse child nodes $v$. Each time backtracking from $v$ to $u$, explicitly trigger: $ ext{cur} ightarrow ext{cur} + 1$, set $ ext{euler}[ ext{cur}] = u$.
  3. Exit state: Record the final position $ ext{last}[u] = ext{cur}$.

Standard C++ Source Code

#include <iostream>
#include <vector>
#include <algorithm>

// Strictly limit the maximum number of nodes, adjust according to the problem's data range
const int MAXN = 200005;

// Adjacency list storage structure
std::vector<int> adj[MAXN];

// Arrays related to DFS order
int dfn[MAXN], out[MAXN], seq[MAXN], dfn_tim;

// Arrays related to Euler tour
int euler[MAXN * 2], first_pos[MAXN], last_pos[MAXN], euler_tim;

// Core traversal function
void dfs(int u, int fa) {
    // Activate DFS order
    dfn_tim++;
    dfn[u] = dfn_tim;
    seq[dfn_tim] = u;

    // Plan A: Record the first entry into the Euler tour
    euler_tim++;
    euler[euler_tim] = u;
    first_pos[u] = euler_tim;

    for (size_t i = 0; i < adj[u].size(); ++i) {
        int v = adj[u][i];
        if (v == fa) continue; // Avoid fatal infinite loops: undirected graph must skip parent node

        dfs(v, u);

        // Plan B: Backtracking from the subtree, must push the current node u again
        euler_tim++;
        euler[euler_tim] = u;
    }

    // Close the DFS order subtree interval
    out[u] = dfn_tim;
    last_pos[u] = euler_tim;
}

int main() {
    // Improve I/O efficiency, unbind standard streams
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n;
    if (!(std::cin >> n)) return 0;

    for (int i = 1; i < n; ++i) {
        int u, v;
        std::cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u); // Undirected tree must build edges bidirectionally
    }

    // Initialize counters and start DFS from root node 1
    dfn_tim = 0;
    euler_tim = 0;
    dfs(1, 0);

    // Output verification results (can be deleted in production environment)
    for (int i = 1; i <= n; ++i) {
        std::cout << "Node " << i << " Subtree Interval: [" << dfn[i] << ", " << out[i] << "]\n";
    }

    return 0;
}

NOIP Practical Pitfall Guide

  1. Stack Overflow Risk: When the tree depth is too large (e.g., chain graph with $N = 5 \times 10^5$), the system's default stack space is prone to collapse. In the Linux evaluation environment, be aware of the system's stack limits. If there are no special compilation commands, consider simulating DFS with a manual stack.
  2. Euler Tour Array Size Not Doubled: Backtracking leads to duplicate enqueues, causing the maximum length of the Euler sequence to approach $2N$. If the euler array and related statistical arrays only allocate MAXN size, it will inevitably trigger out-of-bounds errors, resulting in SIGSEGV segmentation faults when the data is a random tree or a high fan-out tree.
  3. Undirected Edge Bidirectional Edge Construction Missing Parent Node Check: In the for loop traversing the adjacency list, if the if (v == fa) continue; check is not added, it will lead to an endless recursion between parent and child nodes, directly causing an infinite loop and stack overflow.

Classic NOIP/Luogu Problems

1. Luogu P3384 [Template] Heavy Chain Decomposition/Tree Chain Decomposition

2. Luogu P3258 [JLOI2014] The Squirrel's New Home

Original Source: local://9.1

[h] Back to Home