Efficient Approach to Large-Scale Matrix Coverage Problem...

Core Logic and Mathematical Principles

In the large-scale matrix coverage problem, the value range often reaches the level of $10^9$, while the number of rectangles $N$ is relatively small (generally $N \leq 10^3$). If a full two-dimensional array is allocated directly, memory overflow is inevitable. The physical essence of this problem is finite partitioning and discrete topologization of continuous space.

The core logic consists of two layers of key mappings: The first layer is Coordinate Discretization. By deduplicating and sorting the x and y coordinates of the $N$ rectangles, the infinite, sparse continuous plane is partitioned into a maximum of $(2N) \times (2N)$ compact grid blocks. Each grid numerically represents a specific area of the matrix.

The second layer is 2D Difference Grid. On the discretized compact grid matrix, the inclusion-exclusion principle is used to maintain interval increments. The complexity of the rectangle covering operation is forcibly reduced from $O(X \times Y)$ to $O(1)$. Finally, a single $O(N^2)$ two-dimensional prefix sum scan restores the grid state, combined with the absolute coordinate differences before mapping to calculate the final covered area.

State Design and Algorithm Derivation

1. Discretized Grid Slicing Topology

Let the bottom-left corner coordinates of the $i^{th}$ rectangle be $(x1_i, y1_i)$, and the top-right corner coordinates be $(x2_i, y2_i)$. Collect all $x1_i, x2_i$ into set $X$, and all $y1_i, y2_i$ into set $Y$. Sort and deduplicate $X$ and $Y$. The sizes of the deduplicated sets are $M_x$ and $M_y$, respectively. The point $(i, j)$ in the discretized grid corresponds to the point $(X[i-1], Y[j-1])$ in the original absolute coordinate system. The key is the mapping of area represented by the grid: the cell $(i, j)$ in the discretized grid (i.e., the bottom-right vertex being the $i^{th}$ $X$ and the $j^{th}$ $Y$ discretized coordinates) represents an absolute area in physical space given by:

$$ \Delta S_{i, j} = (X[i] - X[i-1]) \times (Y[j] - Y[j-1]) $$

2. 2D Difference State Transition

For each input rectangle, first use binary search (std::lower_bound) to obtain the 1-indexed indices of its bottom-left and top-right corners in the discretized set, denoted as $(lx, ly)$ and $(rx, ry)$. The discretized grid interval covered by this rectangle is $x \in [lx+1, rx]$, $y \in [ly+1, ry]$. Perform four-point corrections on the 2D difference matrix $D$:

$$ D[lx+1][ly+1] \leftarrow D[lx+1][ly+1] + 1 $$

$$ D[rx+1][ly+1] \leftarrow D[rx+1][ly+1] - 1 $$

$$ D[lx+1][ry+1] \leftarrow D[lx+1][ry+1] - 1 $$

$$ D[rx+1][ry+1] \leftarrow D[rx+1][ry+1] + 1 $$

When restoring the state, use the two-dimensional prefix sum recurrence:

$$ S[i][j] = S[i-1][j] + S[i][j-1] - S[i-1][j-1] + D[i][j] $$

If $S[i][j] > 0$, it indicates that the discretized grid is covered, and $\Delta S_{i, j}$ is accumulated to the total area.

C++ Standard Source Code

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

// Rectangle structure definition
struct Rectangle {
    int x1, y1, x2, y2;
};

void solveMatrixCoverage() {
    int n;
    if (!(cin >> n)) return;

    vector<Rectangle> rects(n);
    vector<int> xs, ys;
    xs.reserve(2 * n);
    ys.reserve(2 * n);

    for (int i = 0; i < n; ++i) {
        cin >> rects[i].x1 >> rects[i].y1 >> rects[i].x2 >> rects[i].y2;
        xs.push_back(rects[i].x1);
        xs.push_back(rects[i].x2);
        ys.push_back(rects[i].y1);
        ys.push_back(rects[i].y2);
    }

    // Coordinate discretization preprocessing
    sort(xs.begin(), xs.end());
    xs.erase(unique(xs.begin(), xs.end()), xs.end());
    sort(ys.begin(), ys.end());
    ys.erase(unique(ys.begin(), ys.end()), ys.end());

    int mx = xs.size();
    int my = ys.size();

    // Critical pitfall: the size of the difference matrix must be allocated to the size of the discrete set + 2 to prevent index out of bounds during four-point corrections
    vector<vector<int>> d(mx + 2, vector<int>(my + 2, 0));

    for (int i = 0; i < n; ++i) {
        // Map absolute coordinates to discretized array indices (0-indexed)
        int lx = lower_bound(xs.begin(), xs.end(), rects[i].x1) - xs.begin();
        int rx = lower_bound(xs.begin(), xs.end(), rects[i].x2) - xs.begin();
        int ly = lower_bound(ys.begin(), ys.end(), rects[i].y1) - ys.begin();
        int ry = lower_bound(ys.begin(), ys.end(), rects[i].y2) - ys.begin();

        // Key point: transform rectangle coverage into grid interval endpoint corrections, noting the index offset
        d[lx + 1][ly + 1] += 1;
        d[rx + 1][ly + 1] -= 1;
        d[lx + 1][ry + 1] -= 1;
        d[rx + 1][ry + 1] += 1;
    }

    long long total_area = 0;
    vector<vector<int>> s(mx + 1, vector<int>(my + 1, 0));

    // Restore the two-dimensional prefix sum and calculate the area
    for (int i = 1; i <= mx; ++i) {
        for (int j = 1; j <= my; ++j) {
            s[i][j] = s[i - 1][j] + s[i][j - 1] - s[i - 1][j - 1] + d[i][j];

            // Key point: if the prefix sum is greater than 0, it indicates that the current discretized grid [i-1, i]x[j-1, j] is effectively covered
            if (s[i][j] > 0 && i < mx && j < my) {
                long long width = xs[i] - xs[i - 1];
                long long height = ys[j] - ys[j - 1];
                total_area += width * height; // Critical pitfall: large value range multiplication must be explicitly cast to long long
            }
        }
    }

    cout << total_area << "\n";
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    solveMatrixCoverage();

    return 0;
}

NOIP Practical Pitfall Guide

Area Multiplication Overflow and Type Truncation Even if the input rectangle coordinates have small indices after one-dimensional discretization (for example, $\leq 2000$), the total area statistics still involve absolute distances in the original large value range (up to $10^9$). If you directly write total_area += width * height, since width and height are not explicitly declared as long long, the compiler will prioritize multiplication as int, resulting in integer truncation overflow before assigning to total_area, leading to negative or erroneous large values.
Grid Mapping and Point Coordinate Misalignment The 2D difference maintains grid blocks (Cells) rather than grid points (Points). Many participants directly use the indices obtained from binary search as the matrix length and width during four-point corrections, writing d[lx][ly] += 1. This will lead to contamination of adjacent grids that should not be covered after the difference restoration. It is essential to clarify that the $i^{th}$ position after discretization and the $(i-1)^{th}$ position are separated by a grid. Therefore, the physical projection of the input interval $[lx, rx]$ must shift the difference grid boundaries to lx + 1 and rx, otherwise boundary overlapping counting errors will occur.

Classic NOIP/Luogu Problems

1. Luogu P2034 Rectangle Coverage / Luogu P5490 [Template] Scan Line

Problem Description: Given $N$ rectangles in the plane parallel to the coordinate axes, find the total coverage area of these rectangles. The rectangle coordinate value range is $[-10^9, 10^9]$, $N \leq 10^3$ (for the discretization and difference method) or $N \leq 10^5$ (for the scan line tree structure).
Problem Essence: Calculating the area of union of 2D intervals in a large value range.
Core Problem-Solving Idea: When $N \leq 10^3$, directly apply the “2D difference + discretization” introduced in this article. Discretize all x and y coordinates into a maximum compact matrix of $2000 \times 2000$. Use 2D difference $O(1)$ to mark, and finally sweep through the overall prefix sum. When $N$ expands to $10^5$, the space complexity of the 2D difference $O(N^2)$ will lead to MLE; at this point, the physical model needs to be upgraded: discretize one dimension (e.g., the X-axis), while treating the other dimension (Y-axis) as a scan line axis, dynamically maintaining a one-dimensional segment tree from left to right, reducing space to $O(N)$.

2. Luogu P4048 [AHOI2014] Knight Game

Problem Description: On a large matrix map, multiple knights cast spells in different coverage ranges, each spell having a designated damage area. Meanwhile, defensive towers intercept at specific points. Find the number of layers of coverage in certain key matrix areas.
Problem Essence: High-frequency online/offline queries for the number of overlapping layers in large value range multi-rectangle regions.
Core Problem-Solving Idea: Due to the extremely large coordinate range, it is not possible to allocate the original array. Since all spell release rectangles and query rectangles are known at the beginning or offline, all spell boundaries and query boundaries' $X, Y$ coordinates are extracted for global discretization. After constructing the compact grid, use the 2D difference array to perform four-point increment operations (weight +1) on the released spell areas. After preprocessing all spells, perform a two-dimensional prefix sum on the entire discretized matrix, where each position in the matrix represents the true layer count of that grid covered by spells. Finally, for the query rectangles, directly perform area summation or maximum value retrieval on the restored prefix sum matrix.

Efficient Approach to Large-Scale Matrix Coverage Problem Using 2D Difference and Discretization