Core Logic and Mathematical Principles
The Scanning Line algorithm is a core application of segment trees in two-dimensional geometric computations. The classic problem is: given $N$ axis-parallel rectangles in a Cartesian coordinate system, compute their union area or union perimeter.
1. Spatial Slicing and Dimensional Reduction
Directly solving the union area of two-dimensional shapes involves handling extremely complex boundary intersection cases. The core idea of the Scanning Line algorithm is dimensional reduction and divide-and-conquer: move a vertical line (the scanning line) from left to right (or a horizontal line from bottom to top).
The left and right vertical edges of the rectangles are abstracted as event points on the scanning line:
- Left edge (entering edge): the rectangle area begins, marked with $+1$.
- Right edge (exiting edge): the rectangle area ends, marked with $-1$.
If all left and right edges of the rectangles are sorted in ascending order of the $x$ coordinate, the entire two-dimensional plane is cut into $2N-1$ vertical non-intersecting strips. For the @@@MATH_BLOCK5@@@ strip, its width is $ abla x = x{i+1} - x_i$, and its height $H$ is the length union of all active vertical segments intercepted by the current scanning line. Thus, the area of that strip is:
$$ abla S = abla x \times H $$
The final total area is the algebraic sum of all strip areas: $S_{all} = \sum \nabla S$.
2. Discretization and Interval Continuity Mapping
Since the range of the $y$ coordinate is often extremely large (e.g., $10^9$) or is a real number, it cannot be directly used as an index for the segment tree array. We must extract, deduplicate, and sort all rectangle $y$ coordinates in ascending order, mapping them to a discretized array $raw_y$ of size $M$.
Critical Point: The nodes of the segment tree maintain intervals (segment lengths), not points. If there are $M$ unique $y$ values after discretization, they will form $M-1$ basic segments. If a leaf node $u$ of the segment tree represents the closed interval $[l, r]$, its actual vertical interval in geometry is $[raw_y[l], raw_y[r+1]]$. Therefore, the root node of the segment tree maintains the interval $[1, M-1]$.
State Design and Algorithm Derivation
1. Design of the Scan Line Event Structure
Each rectangle contributes two vertical segments, defined as ScanLine:
x: the horizontal coordinate of the segment (used for sorting to trigger scanning).y1,y2: the vertical coordinate interval of the segment (satisfying $y1 < y2$).mark: entering edge is $1$, exiting edge is $-1$.
2. Segment Tree Node State Design
Unlike basic segment trees, standard scanning lines do not require the standard pushdown lazy marking mechanism due to their unique symmetric additive property (entering edge $+1$, exiting edge $-1$). Each node $u$ maintains two core states:
cover[u]: the number of times the current interval is completely covered (ifcover[u] > 0, it indicates that the geometric interval represented by this node is fully activated).len[u]: the actual geometric segment length covered in the current interval.
3. Specialized pushup Control Logic
Since there is no pushdown, the correctness of a node's data is entirely determined by its own cover and the len of its child nodes. When executing update(u, l, r, ql, qr, val) causes cover[u] to change, we immediately recalculate len[u] through a specialized pushup:
$$len[u] = \begin{cases} raw_y[r+1] - raw_y[l], & \text{if } cover[u] > 0 \\ len[u \ll 1] + len[u \ll 1 | 1], & \text{if } cover[u] = 0 \text{ and } l < r \\ 0, & \text{if } cover[u] = 0 \text{ and } l = r \end{cases}$$
- Logical Explanation: If the current node
cover > 0, it indicates that the interval is fully covered, and its geometric span is directly calculated; ifcover == 0and it is not a leaf node, its geometric length is determined by the sum reported by its left and right child nodes; if it is a leaf node andcover == 0, the length is $0$.
C++ Standard Source Code (NOIP Style)
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
inline long long read() {
long long x = 0, f = 1;
char ch = getchar();
while (ch < '0' || ch > '9') {
if (ch == '-') f = -1;
ch = getchar();
}
while (ch >= '0' && ch <= '9') {
x = (x << 3) + (x << 1) + (ch ^ 48);
ch = getchar();
}
return x * f;
}
const int MAXN = 200005; // 2 rectangles contribute 4 edges, space must be opened to 4 times 2 * N
struct ScanLine {
long long x;
long long y1, y2;
int mark;
// Strictly sorted in ascending order of x coordinate, entering edges processed first (mark priority) to prevent boundary discontinuities
bool operator<(const ScanLine& b) const {
if (x != b.x) return x < b.x;
return mark > b.mark;
}
} line[MAXN << 1];
long long raw_y[MAXN << 1];
long long len[MAXN << 3]; // Critical pitfall: due to point doubling, segment tree space must be MAXN * 8
int cover[MAXN << 3];
// Specialized no-mark pushup mechanism
inline void pushup(int u, int l, int r) {
if (cover[u] > 0) {
len[u] = raw_y[r + 1] - raw_y[l]; // Map geometric physical interval
} else if (l == r) {
len[u] = 0; // Leaf node and not covered
} else {
len[u] = len[u << 1] + len[u << 1 | 1];
}
}
// Core geometric interval update: the input ql, qr must be the index range of the discretized array
void update(int u, int l, int r, int ql, int qr, int val) {
if (ql <= l && r <= qr) {
cover[u] += val;
pushup(u, l, r); // Cover state changes, immediately calculate the actual length of the current node
return;
}
int m = (l + r) >> 1;
if (ql <= m) update(u << 1, l, m, ql, qr, val);
if (qr > m) update(u << 1 | 1, m + 1, r, ql, qr, val);
pushup(u, l, r);
}
int main() {
int n = read();
int cnt = 0;
for (int i = 1; i <= n; ++i) {
long long x1 = read(), y1 = read(), x2 = read(), y2 = read();
line[++cnt] = {x1, y1, y2, 1};
raw_y[cnt] = y1;
line[++cnt] = {x2, y1, y2, -1};
raw_y[cnt] = y2;
}
// Discretize the y coordinates
sort(line + 1, line + cnt + 1);
sort(raw_y + 1, raw_y + cnt + 1);
int tot = unique(raw_y + 1, raw_y + cnt + 1) - (raw_y + 1);
// The segment tree maintains intervals, tot points correspond to tot - 1 closed intervals
// The root node maintains the interval [1, tot - 1]
long long ans = 0;
for (int i = 1; i < cnt; ++i) {
// Binary search to locate the index of the vertical coordinate interval in the discretized array
int ql = lower_bound(raw_y + 1, raw_y + tot + 1, line[i].y1) - raw_y;
int qr = lower_bound(raw_y + 1, raw_y + tot + 1, line[i].y2) - raw_y - 1; // Critical pitfall: the right boundary mapping must subtract 1
if (ql <= qr) {
update(1, 1, tot - 1, ql, qr, line[i].mark);
}
ans += len[1] * (line[i + 1].x - line[i].x);
}
printf("%lld\n", ans);
return 0;
}
NOIP Practical Pitfall Guide
1. Right Boundary Mapping of Segment Tree Not Subtracting 1 Causes Out-of-Bounds Calculation
After discretization, the leaf node $i$ of the segment tree represents the geometric interval $[raw_y[i], raw_y[i+1]]$. If a scanning line's vertical range is $[raw_y[A], raw_y[B]]$, the target interval for update in the segment tree is $[A, B-1]$. If it is mistakenly passed as $[A, B]$, when update touches the $B^{th}$ node, the pushup formula's raw_y[r+1] will access raw_y[B+1], resulting in serious logical out-of-bounds or reading garbage data.
2. Space Allocation Multiplication Factor Severely Underestimated
A regular segment tree allocates $4$ times the space. However, in the scanning line algorithm, inputting $N$ rectangles produces $2N$ edges and $2N$ vertical coordinate points. This means the base size of the original array has become $2N$. Therefore, the size of the segment tree array must be allocated to $2N \times 4 = 8N$. Only allocating $4N$ will lead to direct RE when building the tree or updating deep into the right subtree.
3. Concurrent Edge Handling Not Processed During Horizontal Coordinate Sorting
When two rectangles have overlapping horizontal coordinates (i.e., $x_1 = x_2$) and one is an entering edge ($+1$) while the other is an exiting edge ($-1$), if the exiting edge is mistakenly processed first, the corresponding len[1] of the segment tree will abruptly drop to $0$, followed by the entering edge raising it back up. This will cause a multiplication by an erroneous intermediate zero value when calculating $
abla x \times len[1]$. It is essential to enforce priority for entering edges when overloading < in the structure.
Classic NOIP/Luogu Problems
1. Luogu P5490 [Template] Scanning Line & Rectangle Area Union
- Problem Description: Calculate the total area of $N$ rectangles located in a Cartesian coordinate system after overlapping.
- Nature of the Problem: A standard foundational problem of the two-dimensional discretized scanning line algorithm.
- Core Problem-Solving Idea: As shown in the source code, the $y$ axis coordinates are extracted and discretized into the node intervals of the tree. The $x$ coordinate serves as the event point to drive the scanning line from left to right. Using a specialized segment tree without
pushdown, the problem is perfectly solved within $O(N \log N)$ complexity, mapping high-dimensional continuous space to low-dimensional discrete space.
2. Luogu P1856 [IOI1998] [USACO5.5] Rectangle Perimeter Picture
- Problem Description: Calculate the total outline perimeter of the overall shape formed by $N$ rectangles.
- Nature of the Problem: A higher-order application of the scanning line, requiring simultaneous maintenance of the length of covered intervals and the number of non-intersecting segments.
- Core Problem-Solving Idea: The perimeter consists of both horizontal and vertical edges. It can be handled using two scanning lines separately, or simultaneously resolved in a single vertical scan: the segment tree nodes must maintain both
len(covered length) andnum(the number of independent vertical segments contained in the current interval), as well asl_count,r_count(whether the left and right endpoints are covered). Each time the scanning line advances, the contribution of vertical edges is $|len_{new} - len_{old}|$, and the contribution of horizontal edges is $2 \times num \times \nabla x$.