Core Logic and Mathematical Principles
The Segment Tree is essentially a binary search tree based on the Divide and Conquer principle, used to maintain interval information. It divides a linear interval of length $N$ into several logarithmic sub-intervals, with each node representing a closed interval $[L, R]$.
Spatial Topological Structure
For the current node with index $u$, its left child has index $u \ll 1$ (i.e., $2u$), and its right child has index $u \ll 1 | 1$ (i.e., $2u + 1$). Leaf nodes satisfy $L = R$, representing a single point. The division point for non-leaf nodes is $M = (L + R) \gg 1$, with the left subtree maintaining $[L, M]$ and the right subtree maintaining $[M + 1, R]$.
To ensure that the index of the complete binary tree does not go out of bounds, the array space must be strictly allocated to $4N$.
Asymptotic Complexity Proof
- Divide and Conquer Tree Construction (Build): Recursively touching all leaf nodes, the total number of nodes is $\sum_{i=0}^{\lceil \log_2 N \rceil} 2^i < 4N$, with a time complexity of $O(N)$.
- Point Update (Update): Directly from the root node to the leaf node, the path length is the height of the tree, with a time complexity of $O(\log N)$.
- Interval Binary Sum Query (Query): Any interval $[l, r]$ can be split into at most $2 \log N$ complete sub-intervals on the segment tree. Proof: At each level of the tree, at most two nodes can be partially covered and continue to recurse down; the other nodes are either completely included or completely excluded. The time complexity is strictly $O(\log N)$.
State Design and Algorithm Derivation
1. Divide and Conquer Tree Construction
Define the recursive function build(u, l, r). If $l = r$, it indicates reaching a leaf node, directly read the original array data and assign:
$$tree[u] = a[l]$$
Otherwise, let $m = (l + r) \gg 1$, recursively construct the left and right subtrees, and finally execute pushup(u):
$$tree[u] = tree[u \ll 1] + tree[u \ll 1 | 1]$$
2. Point Update
Define the recursive function update(u, l, r, pos, val). Lock the branch by the binary position pos. If $pos \le m$, recurse into the left subtree; otherwise, recurse into the right subtree. When reaching the leaf node (i.e., $l = r = pos$), perform the modification:
$$tree[u] = val$$
During backtracking, it is necessary to call pushup(u) layer by layer along the update path to update the values of ancestor nodes.
3. Interval Binary Sum Query
Define the recursive function query(u, l, r, ql, qr). The current node interval $[l, r]$ and the query interval $[ql, qr]$ have three relationships:
- $[l, r] \subseteq [ql, qr]$: The current interval is completely included; directly return $tree[u]$.
- $[l, r] \cap [ql, qr] = \emptyset$: The current interval has no intersection with the query interval; return $0$.
- Partial intersection: Let $m = (l + r) \gg 1$. If $ql \le m$, accumulate the contribution from the left subtree:
query(u << 1, l, m, ql, qr). If $qr > m$, accumulate the contribution from the right subtree:query(u << 1 | 1, m + 1, r, ql, qr).
C++ Standard Source Code (NOIP Style)
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
// Strictly follow NOIP/Linux environment specifications, disable universal headers, and use fast input for large data sets
inline long long read() {
long long x = 0, f = 1;
char ch = getchar();
while (ch < '0' || ch > '9') {
if (ch == '-') f = -1;
ch = getchar();
}
while (ch >= '0' && ch <= '9') {
x = (x << 3) + (x << 1) + (ch ^ 48);
ch = getchar();
}
return x * f;
}
const int MAXN = 100005;
long long tree[MAXN << 2]; // Critical pitfall: the size of the segment tree array must be allocated to 4 times the original array
long long a[MAXN];
inline void pushup(int u) {
tree[u] = tree[u << 1] + tree[u << 1 | 1];
}
void build(int u, int l, int r) {
if (l == r) {
tree[u] = a[l];
return;
}
int m = (l + r) >> 1;
build(u << 1, l, m);
build(u << 1 | 1, m + 1, r);
pushup(u);
}
void update(int u, int l, int r, int pos, long long val) {
if (l == r) {
tree[u] = val; // Found the target point, directly overwrite or accumulate
return;
}
int m = (l + r) >> 1;
if (pos <= m) {
update(u << 1, l, m, pos, val);
} else {
update(u << 1 | 1, m + 1, r, pos, val);
}
pushup(u); // Must update ancestor nodes during backtracking, otherwise the query data will be disrupted
}
long long query(int u, int l, int r, int ql, int qr) {
if (ql <= l && r <= qr) {
return tree[u]; // The current interval is completely included, directly return
}
int m = (l + r) >> 1;
long long res = 0;
if (ql <= m) {
res += query(u << 1, l, m, ql, qr);
}
if (qr > m) {
res += query(u << 1 | 1, m + 1, r, ql, qr);
}
return res;
}
int main() {
// Optimize standard stream performance
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n = read();
int m = read();
for (int i = 1; i <= n; ++i) {
a[i] = read();
}
build(1, 1, n);
for (int i = 0; i < m; ++i) {
int opt = read();
if (opt == 1) {
int pos = read();
long long val = read();
update(1, 1, n, pos, val);
} else if (opt == 2) {
int ql = read();
int qr = read();
printf("%lld\n", query(1, 1, n, ql, qr));
}
}
return 0;
}
NOIP Practical Pitfall Guide
1. Array Out of Bounds and Insufficient Physical Space
The segment tree is a degenerate form of a full binary tree. In the worst case, when the linear interval length is $N$, the index of the last leaf node can approach $4N$. If only $2N$ or $3N$ space is allocated, the code will trigger a Segmentation Fault or deadlock when handling $N$ that is not a power of $2$ and data leans towards the right subtree when executing u << 1 | 1. Must strictly declare `MAXN << 2.
2. Operator Priority Disaster
It is common to use bitwise operations to improve efficiency when writing segment trees. Note that the priority of + and - is higher than that of << and >>. An erroneous example is int m = l + r >> 1; (equivalent to (l + r) >> 1, this line is correct). Another erroneous example is build(u << 1, l, m + 1); where if mistakenly written as u << 1 + 1, due to + priority, it actually executes u << (1 + 1) which is u << 2. When involving bitwise operations, parentheses must be forced, such as (u << 1) | 1.
3. Data Overflow
In interval sum problems, even if the original array is within the int range, the accumulated answer and the increment of single-point modifications can easily exceed $2^{31}-1$. The segment tree array tree[] and the return value of the query function must all be declared as `long long.
Classic NOIP/Luogu Problems
1. Luogu P1303 / Basic Adaptation: Point Modification and Interval Query Benchmark
- Problem Description: Given a sequence of length $N$, support two operations: modify the value at a certain position; query the sum of elements in the interval $[L, R]$.
- Essential Nature of the Problem: Classic segment tree template.
- Core Problem-Solving Idea: Use a point update segment tree. Build the topological structure in $O(N)$ time using
build, maintain single points and ancestor chains with eachupdatetaking $O(\log N)$, and split the target interval into at most $2\log N$ segment tree nodes inquery, directly accumulating and returning, perfectly replacing the inefficient $O(N)$ brute force scan.
2. Luogu P7075 [CSP-S 2020] Julian Date / Tree-shaped Data Retrieval Point
- Problem Description: Given multiple queries, each asking for the specific date after $Q$ days from January 1, 4713 BC.
- Essential Nature of the Problem: Large simulation or non-trivial binary search. Under the segment tree framework, it can be abstracted as finding the first position in the value range that satisfies the prefix day count greater than or equal to $Q$.
- Core Problem-Solving Idea: Utilize the basic segment tree's interval binary sum logic. Use the number of days in each year/month as the base array to build the tree, with each node maintaining the total days for the corresponding time period. When faced with the query $Q$, there is no need to binary search externally; directly perform a "tree binary search" on the segment tree: if the total days in the left subtree $\ge Q$, the answer is in the left subtree, directly enter the left subtree; otherwise, subtract the left subtree days from $Q$ and enter the right subtree. The complexity of a single query is hardened down to $O(\log N)$.