NeFut Logo NeFut
Admin Login

Heavy Path Decomposition: Efficient Algorithm and Implementation for Optimizing Tree Structures

Published at: 2026-05-27 09:17 Last updated: 2026-06-06 13:04
#algorithm #Tree #Data Structure

Core Logic and Mathematical Principles

Heavy Path Decomposition is an advanced technique for losslessly transforming tree topologies into linear structures. The core motivation is to break the discreteness of tree branches, allowing any two node paths or subtree intervals in the tree to be isomorphically mapped to at most $O(\log N)$ contiguous memory intervals, thus enabling dynamic maintenance at $O(\log N)$ level using segment trees or binary indexed trees.

1. Core Definitions

For any non-leaf node $u$ in the tree:

2. Mathematical Theorems and Complexity Proof

Theorem: The number of light edges traversed and the number of heavy chain disconnections when moving from any node $u$ towards the root node is strictly no more than $\log_2 N$. Proof: According to the definition of heavy son, when moving from node $u$ along a light edge to its parent node $f$, it must hold that $\text{siz}[u] \le \frac{1}{2} \text{siz}[f]$. This means that for every light edge traversed, the size of the current subtree at least doubles. Since the total number of nodes in the tree is $N$, this doubling operation can occur at most $\log_2 N$ times. Thus, it can be proven that any path in the tree can be decomposed into at most $O(\log N)$ segments of heavy chains.


State Design and Algorithm Derivation

Implementing heavy path decomposition requires two independent DFS processes.

1. First DFS (Preprocessing Topological Properties)

Define the state arrays:

State Transition Equations:

$$\text{siz}[u] = 1 + \sum_{v \in \text{adj}[u], v \neq \text{fa}[u]} \text{siz}[v]$$

$$\text{son}[u] = \arg\max_{v} \{\text{siz}[v] \mid v \in \text{adj}[u], v \neq \text{fa}[u]\}$$

2. Second DFS (Decomposing Chains and Determining Linear Sequences)

Define the state arrays:

Decomposition Strategy:

  1. Priority of Heavy Sons: When recursively traversing downwards, always first perform a deep traversal on the heavy son $\text{son}[u]$. This ensures that all nodes on the same heavy chain occupy a contiguous memory address in the $\text{dfn}$ sequence.
  2. State Transfer:
    • For the heavy son $\text{son}[u]$, its chain top directly inherits: $\text{top}[\text{son}[u]] = \text{top}[u]$.
    • For any light son $v$, it forms a new chain starting point, with its top being itself: $\text{top}[v] = v$.

C++ Standard Source Code

#include <iostream>
#include <vector>
#include <algorithm>

const int MAXN = 200005;

std::vector<int> adj[MAXN];
int fa[MAXN], dep[MAXN], siz[MAXN], son[MAXN];
int top[MAXN], dfn[MAXN], rnki[MAXN], tim;

// First DFS: Calculate fa, dep, siz, son
void dfs1(int u, int p) {
    fa[u] = p;
    dep[u] = dep[p] + 1;
    siz[u] = 1;
    son[u] = 0; // Initialize without heavy son

    int max_siz = 0;
    for (size_t i = 0; i < adj[u].size(); ++i) {
        int v = adj[u][i];
        if (v == p) continue;

        dfs1(v, u);
        siz[u] += siz[v];
        if (siz[v] > max_siz) {
            max_siz = siz[v];
            son[u] = v; // Update heavy son
        }
    }
}

// Second DFS: Calculate top, dfn, note the priority of heavy son
void dfs2(int u, int t) {
    top[u] = t;
    tim++;
    dfn[u] = tim;
    rnki[tim] = u; // Mapping reverse lookup: original tree node corresponding to linear index

    if (!son[u]) return; // Leaf node breakpoint

    // Fatal pitfall: must and can only first recursively process the heavy son to ensure continuity of the heavy chain sequence
    dfs2(son[u], t);

    for (size_t i = 0; i < adj[u].size(); ++i) {
        int v = adj[u][i];
        if (v == fa[u] || v == son[u]) continue;
        dfs2(v, v); // Light son forms a new chain, top is itself
    }
}

// Jump heavy chain to find LCA (industry standard, very low constant, beats doubling)
int lca_by_heavy_edge(int u, int v) {
    while (top[u] != top[v]) {
        if (dep[top[u]] < dep[top[v]]) {
            std::swap(u, v);
        }
        u = fa[top[u]]; // Jump the deeper node at the chain top to its parent node
    }
    return dep[u] < dep[v] ? u : v;
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n, root;
    if (!(std::cin >> n >> root)) return 0;

    for (int i = 1; i < n; ++i) {
        int u, v;
        std::cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    dep[0] = 0;
    tim = 0;

    dfs1(root, 0);
    dfs2(root, root);

    // Output the linear sequence positions after flattening for each point
    for (int i = 1; i <= n; ++i) {
        std::cout << "Node " << i << " Map to Segment-Tree Index: " << dfn[i] << "\n";
    }

    return 0;
}

NOIP Practical Pitfall Guide

  1. Confusion in Heavy Son Determination Conditions in Second DFS: When traversing the adjacency list in dfs2, it is easy to miss if (v == son[u]) continue;, leading to a second deep traversal on the heavy son. This not only completely destroys the linear physical address continuity of the heavy chain but also renders subsequent segment tree interval queries completely ineffective, potentially causing a SIGSEGV due to infinite loops.
  2. Uninitialized tim or son Arrays for Multiple Data Sets: The son array stores heavy son pointers. If there are multiple test data sets and the son array is not completely cleared, when encountering a leaf node that originally has no heavy son, it may read residual dirty data from the previous set, causing dfs2 to continue calling non-existent nodes, leading to illegal memory access.

Classic NOIP/Luogu Problems

1. Luogu P2590 [ZJOI2008] Tree Statistics

2. Luogu P2146 [NOI2015] Package Manager

Original Source: local://9.4

[h] Back to Home