NeFut Logo NeFut
Admin Login

In-Depth Understanding of Segment Trees: Principles, Implementation, and Common Pitfalls

Published at: 2026-05-29 00:44 Last updated: 2026-06-06 13:04
#algorithm #C++ #Segment Tree

Core Logic and Mathematical Principles

The Segment Tree is essentially a binary search tree based on the Divide and Conquer principle, used to maintain interval information. It divides a linear interval of length $N$ into several logarithmic sub-intervals, with each node representing a closed interval $[L, R]$.

Spatial Topological Structure

For the current node with index $u$, its left child has index $u \ll 1$ (i.e., $2u$), and its right child has index $u \ll 1 | 1$ (i.e., $2u + 1$). Leaf nodes satisfy $L = R$, representing a single point. The division point for non-leaf nodes is $M = (L + R) \gg 1$, with the left subtree maintaining $[L, M]$ and the right subtree maintaining $[M + 1, R]$.

To ensure that the index of the complete binary tree does not go out of bounds, the array space must be strictly allocated to $4N$.

Asymptotic Complexity Proof


State Design and Algorithm Derivation

1. Divide and Conquer Tree Construction

Define the recursive function build(u, l, r). If $l = r$, it indicates reaching a leaf node, directly read the original array data and assign:

$$tree[u] = a[l]$$

Otherwise, let $m = (l + r) \gg 1$, recursively construct the left and right subtrees, and finally execute pushup(u):

$$tree[u] = tree[u \ll 1] + tree[u \ll 1 | 1]$$

2. Point Update

Define the recursive function update(u, l, r, pos, val). Lock the branch by the binary position pos. If $pos \le m$, recurse into the left subtree; otherwise, recurse into the right subtree. When reaching the leaf node (i.e., $l = r = pos$), perform the modification:

$$tree[u] = val$$

During backtracking, it is necessary to call pushup(u) layer by layer along the update path to update the values of ancestor nodes.

3. Interval Binary Sum Query

Define the recursive function query(u, l, r, ql, qr). The current node interval $[l, r]$ and the query interval $[ql, qr]$ have three relationships:

  1. $[l, r] \subseteq [ql, qr]$: The current interval is completely included; directly return $tree[u]$.
  2. $[l, r] \cap [ql, qr] = \emptyset$: The current interval has no intersection with the query interval; return $0$.
  3. Partial intersection: Let $m = (l + r) \gg 1$. If $ql \le m$, accumulate the contribution from the left subtree: query(u << 1, l, m, ql, qr). If $qr > m$, accumulate the contribution from the right subtree: query(u << 1 | 1, m + 1, r, ql, qr).

C++ Standard Source Code (NOIP Style)

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

// Strictly follow NOIP/Linux environment specifications, disable universal headers, and use fast input for large data sets
inline long long read() {
    long long x = 0, f = 1;
    char ch = getchar();
    while (ch < '0' || ch > '9') {
        if (ch == '-') f = -1;
        ch = getchar();
    }
    while (ch >= '0' && ch <= '9') {
        x = (x << 3) + (x << 1) + (ch ^ 48);
        ch = getchar();
    }
    return x * f;
}

const int MAXN = 100005;
long long tree[MAXN << 2]; // Critical pitfall: the size of the segment tree array must be allocated to 4 times the original array
long long a[MAXN];

inline void pushup(int u) {
    tree[u] = tree[u << 1] + tree[u << 1 | 1];
}

void build(int u, int l, int r) {
    if (l == r) {
        tree[u] = a[l];
        return;
    }
    int m = (l + r) >> 1;
    build(u << 1, l, m);
    build(u << 1 | 1, m + 1, r);
    pushup(u);
}

void update(int u, int l, int r, int pos, long long val) {
    if (l == r) {
        tree[u] = val; // Found the target point, directly overwrite or accumulate
        return;
    }
    int m = (l + r) >> 1;
    if (pos <= m) {
        update(u << 1, l, m, pos, val);
    } else {
        update(u << 1 | 1, m + 1, r, pos, val);
    }
    pushup(u); // Must update ancestor nodes during backtracking, otherwise the query data will be disrupted
}

long long query(int u, int l, int r, int ql, int qr) {
    if (ql <= l && r <= qr) {
        return tree[u]; // The current interval is completely included, directly return
    }
    int m = (l + r) >> 1;
    long long res = 0;
    if (ql <= m) {
        res += query(u << 1, l, m, ql, qr);
    }
    if (qr > m) {
        res += query(u << 1 | 1, m + 1, r, ql, qr);
    }
    return res;
}

int main() {
    // Optimize standard stream performance
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = read();
    int m = read();
    for (int i = 1; i <= n; ++i) {
        a[i] = read();
    }

    build(1, 1, n);

    for (int i = 0; i < m; ++i) {
        int opt = read();
        if (opt == 1) {
            int pos = read();
            long long val = read();
            update(1, 1, n, pos, val);
        } else if (opt == 2) {
            int ql = read();
            int qr = read();
            printf("%lld\n", query(1, 1, n, ql, qr));
        }
    }
    return 0;
}

NOIP Practical Pitfall Guide

1. Array Out of Bounds and Insufficient Physical Space

The segment tree is a degenerate form of a full binary tree. In the worst case, when the linear interval length is $N$, the index of the last leaf node can approach $4N$. If only $2N$ or $3N$ space is allocated, the code will trigger a Segmentation Fault or deadlock when handling $N$ that is not a power of $2$ and data leans towards the right subtree when executing u << 1 | 1. Must strictly declare `MAXN << 2.

2. Operator Priority Disaster

It is common to use bitwise operations to improve efficiency when writing segment trees. Note that the priority of + and - is higher than that of << and >>. An erroneous example is int m = l + r >> 1; (equivalent to (l + r) >> 1, this line is correct). Another erroneous example is build(u << 1, l, m + 1); where if mistakenly written as u << 1 + 1, due to + priority, it actually executes u << (1 + 1) which is u << 2. When involving bitwise operations, parentheses must be forced, such as (u << 1) | 1.

3. Data Overflow

In interval sum problems, even if the original array is within the int range, the accumulated answer and the increment of single-point modifications can easily exceed $2^{31}-1$. The segment tree array tree[] and the return value of the query function must all be declared as `long long.


Classic NOIP/Luogu Problems

1. Luogu P1303 / Basic Adaptation: Point Modification and Interval Query Benchmark

2. Luogu P7075 [CSP-S 2020] Julian Date / Tree-shaped Data Retrieval Point

Original Source: local://6.1

[h] Back to Home