Efficient Prefix Sum Algorithm and Its Application in 2D Matrices

Core Logic and Mathematical Principles

Prefix sum is a preprocessing technique that, at its core, applies the Inclusion-Exclusion Principle to discrete sums over a sequence or grid. By trading space for time, it reduces the cost of an interval or region sum query from $O(N)$ (one dimension) or $O(N \times M)$ (two dimensions) under brute-force enumeration to $O(1)$, after a one-time preprocessing pass of the same order.

The two ideas play different roles. The prefix sum array performs the "discrete integration": it stores cumulative totals measured from the origin. Each query then applies the Inclusion-Exclusion Principle, adding and subtracting stored totals to cancel the parts that lie outside the requested region.

In one dimension, the prefix sum is the cumulative sum of the first $i$ elements of a sequence. For a 1-indexed array $A$, the prefix sum array $S$ satisfies $S[0] = 0$ and:

$$S[i] = \sum_{k=1}^{i} A[k]$$

Using the discretized form of the Fundamental Theorem of Calculus, the contiguous subarray sum over interval $[l, r]$ can be directly computed via the difference of two prefix sums:

$$\sum_{k=l}^{r} A[k] = S[r] - S[l-1]$$
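The one-dimensional formulas above can be sketched in C++ as follows. This is a minimal illustration, not a fixed template: the function names `buildPrefix` and `rangeSum` are hypothetical, and the input vector is assumed to be 0-indexed, so `a[i - 1]` plays the role of $A[i]$.

```cpp
#include <cstddef>
#include <vector>

// Build a 1-indexed prefix sum array: s[0] = 0 and s[i] = A[1] + ... + A[i].
// Assumption: the input vector `a` is 0-indexed, so a[i - 1] stands for A[i].
std::vector<long long> buildPrefix(const std::vector<int>& a) {
    std::vector<long long> s(a.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        s[i] = s[i - 1] + a[i - 1];
    }
    return s;
}

// Sum of A[l..r] (1-indexed, inclusive), assuming 1 <= l <= r <= n.
long long rangeSum(const std::vector<long long>& s, int l, int r) {
    return s[r] - s[l - 1];
}
```

Because $S[0] = 0$ is stored explicitly, the query for $l = 1$ needs no special case.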

In two dimensions, the prefix sum extends to matrix region sums. Define $S[i][j]$ as the sum of all elements in the submatrix whose top-left corner is $(1,1)$ and whose bottom-right corner is $(i,j)$. The sum over any rectangle with top-left corner $(x_1, y_1)$ and bottom-right corner $(x_2, y_2)$ is the discrete analogue of a definite integral over a plane region. It can be assembled in constant time from four stored values by overlapping and trimming origin-anchored rectangles, which is exactly the Inclusion-Exclusion Principle.

State Design and Algorithm Derivation

1. State Recurrence for 2D Prefix Sum

Define state $S[i][j] = \sum_{u=1}^{i} \sum_{v=1}^{j} A[u][v]$, with $S[0][j] = S[i][0] = 0$. Suppose $S[i-1][j]$ (the upper rectangle), $S[i][j-1]$ (the left rectangle), and $S[i-1][j-1]$ (their overlap) are known. Adding the upper and left rectangles counts the overlap $S[i-1][j-1]$ twice, so it must be subtracted once; the current cell $A[i][j]$ belongs to neither rectangle and is added separately. By the Inclusion-Exclusion Principle, the recurrence is:

$$S[i][j] = S[i-1][j] + S[i][j-1] - S[i-1][j-1] + A[i][j]$$

2. Derivation of $O(1)$ Matrix Region Sum Query

Let $(x_1, y_1)$ be the top-left corner and $(x_2, y_2)$ the bottom-right corner of the query submatrix, with $1 \le x_1 \le x_2$ and $1 \le y_1 \le y_2$. The target sum equals the total of the rectangle anchored at $(1,1)$ with bottom-right corner $(x_2, y_2)$, minus the irrelevant upper region $[1, x_1-1] \times [1, y_2]$, minus the irrelevant left region $[1, x_2] \times [1, y_1-1]$. The top-left block $[1, x_1-1] \times [1, y_1-1]$ has now been subtracted twice and must be added back once. The sum of any submatrix is therefore:

$$\text{Sum}(x_1, y_1, x_2, y_2) = S[x_2][y_2] - S[x_1-1][y_2] - S[x_2][y_1-1] + S[x_1-1][y_1-1]$$
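As a quick sanity check of the formula above, consider a hypothetical $3 \times 3$ matrix (the values are chosen purely for illustration and do not come from any problem in this article), together with its prefix sum matrix:

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 3 & 6 \\ 5 & 12 & 21 \\ 12 & 27 & 45 \end{pmatrix}$$

Querying the bottom-right $2 \times 2$ block gives:

$$\text{Sum}(2, 2, 3, 3) = S[3][3] - S[1][3] - S[3][1] + S[1][1] = 45 - 6 - 12 + 1 = 28 = 5 + 6 + 8 + 9$$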

Template

2D Prefix Sum

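// Assumes a[][] and s[][] are 1-indexed long long arrays whose row 0 and column 0 hold zeros
// (global arrays are zero-initialized, which satisfies this).
// 1. Preprocessing phase: Build the 2D prefix sum array, time complexity O(N * M)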
for (int i = 1; i <= n; ++i) {
    for (int j = 1; j <= m; ++j) {
        s[i][j] = s[i - 1][j] + s[i][j - 1] - s[i - 1][j - 1] + a[i][j];
    }
}

// 2. Query phase: Use the Inclusion-Exclusion Principle to compute the sum of submatrix (x1, y1) to (x2, y2) in O(1)
long long ans = s[x2][y2] - s[x1 - 1][y2] - s[x2][y1 - 1] + s[x1 - 1][y1 - 1];
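For readers who want something they can compile directly, the template can be wrapped as below. This is one possible packaging under stated assumptions, not the only way to organize it: the struct name `PrefixSum2D` and its `query` method are hypothetical, and the input matrix `a` is assumed to be 1-indexed with an unused row 0 and column 0.

```cpp
#include <vector>

// Hypothetical wrapper around the 2D prefix sum template.
struct PrefixSum2D {
    int n, m;
    std::vector<std::vector<long long>> s;  // (n+1) x (m+1); row 0 and column 0 stay 0

    // Assumption: `a` has (n+1) x (m+1) entries and a[i][j] is meaningful
    // only for 1 <= i <= n, 1 <= j <= m.
    PrefixSum2D(const std::vector<std::vector<long long>>& a, int n, int m)
        : n(n), m(m), s(n + 1, std::vector<long long>(m + 1, 0)) {
        for (int i = 1; i <= n; ++i) {
            for (int j = 1; j <= m; ++j) {
                s[i][j] = s[i - 1][j] + s[i][j - 1] - s[i - 1][j - 1] + a[i][j];
            }
        }
    }

    // Sum of the submatrix with top-left (x1, y1) and bottom-right (x2, y2),
    // assuming 1 <= x1 <= x2 <= n and 1 <= y1 <= y2 <= m.
    long long query(int x1, int y1, int x2, int y2) const {
        return s[x2][y2] - s[x1 - 1][y2] - s[x2][y1 - 1] + s[x1 - 1][y1 - 1];
    }
};
```

Construction costs $O(N \times M)$ once; every later call to `query` reads four stored values.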

NOIP Practical Pitfall Guide

Preventing Data Overflow and Negative Modulo Results: Declare the prefix sum array and query variables as long long. When the Inclusion-Exclusion formula is evaluated under a modulus, its subtractions can yield a negative value, so normalize with (ans % MOD + MOD) % MOD.
Preventing Out-of-Bounds Access and Boundary Checks: Use 1-indexed arrays and keep row 0 and column 0 at 0 as natural sentinels. This avoids out-of-bounds reads and removes the need for if boundary checks inside the loops.

Even if every element satisfies $A[i][j] \le 10^5$, a $10^3 \times 10^3$ matrix can produce a prefix sum as large as $10^{11}$, far beyond the roughly $2.1 \times 10^9$ limit of a 32-bit int. Declare the prefix sum array s and every query result variable as long long. When the prefix sums are stored modulo a large number (e.g., $10^9+7$), the subtraction terms in the Inclusion-Exclusion formula mean that (s[x2][y2] - s[x1-1][y2] - s[x2][y1-1] + s[x1-1][y1-1]) % MOD can be negative. The cause is that once s has been reduced during preprocessing, it no longer stores true totals but only remainders, so a larger region may hold a smaller stored value than a region it contains; in C++ the % operator then returns a negative remainder for a negative left operand. Use (ans % MOD + MOD) % MOD to map the result into the range $[0, \text{MOD})$.
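The normalization step described above can be sketched as follows. The helper names `normalizeMod` and `queryMod` are hypothetical, and the sketch assumes that every entry of `s` has already been reduced into $[0, \text{MOD})$ during preprocessing.

```cpp
#include <vector>

const long long MOD = 1000000007LL;

// Map a possibly negative value into [0, MOD).
long long normalizeMod(long long x) {
    return (x % MOD + MOD) % MOD;
}

// Region sum modulo MOD. Assumption: each s[i][j] is already in [0, MOD),
// and row 0 / column 0 of s are zero sentinels.
long long queryMod(const std::vector<std::vector<long long>>& s,
                   int x1, int y1, int x2, int y2) {
    // The raw difference lies strictly between -2 * MOD and 2 * MOD,
    // so it fits easily in long long but may be negative.
    long long ans = s[x2][y2] - s[x1 - 1][y2] - s[x2][y1 - 1] + s[x1 - 1][y1 - 1];
    return normalizeMod(ans);
}
```

For instance, if the stored remainders are `s[1][1] = MOD - 1` and `s[1][2] = 0`, the raw difference for the single cell $(1, 2)$ is `-(MOD - 1)`, and normalization recovers the correct value 1.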

If you use 0-indexed arrays, the $i-1$ and $j-1$ terms in the recurrence read index $-1$ when $i=0$ or $j=0$. That is an out-of-bounds access: undefined behavior that may crash with a segmentation fault or silently read garbage. Guarding every access with an if (i > 0) check works, but it clutters both the build loop and every query and adds extra branches to the inner loop. The conventional approach is to use 1-indexed arrays and leave every element of row 0 and column 0 at zero as natural boundary sentinels.
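When the input data is naturally 0-indexed, one common way to keep the sentinel row and column is to shift only the prefix sum array by one. The sketch below illustrates that idea; the function name `buildFromZeroIndexed` is hypothetical, and the input is assumed to be a rectangular (non-ragged) matrix.

```cpp
#include <vector>

// Build an (n+1) x (m+1) prefix sum array from a 0-indexed n x m matrix.
// s[i + 1][j + 1] holds the sum of a[0..i][0..j]; row 0 and column 0 of s are zero,
// so the recurrence never needs a boundary check.
std::vector<std::vector<long long>> buildFromZeroIndexed(
        const std::vector<std::vector<int>>& a) {
    const int n = static_cast<int>(a.size());
    const int m = n ? static_cast<int>(a[0].size()) : 0;
    std::vector<std::vector<long long>> s(n + 1, std::vector<long long>(m + 1, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            s[i + 1][j + 1] = s[i][j + 1] + s[i + 1][j] - s[i][j] + a[i][j];
        }
    }
    return s;
}
```

The data keeps its original indexing, while queries against `s` use the same 1-indexed formula as before.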

Classic NOIP/Luogu Problems

1. Luogu P2280 [HNOI2003] Laser Bomb

Problem Description: A new type of laser bomb can destroy all targets within an $R \times R$ square. There are $N$ targets on the map, each with coordinates $(x,y)$ and value $v$. Find the maximum total value of targets that one bomb can destroy.
Problem Essence: Maximum region sum query within a fixed-size 2D window.
Core Solution: Since the map boundary is fixed (typically on the order of $5000 \times 5000$), a 2D grid can be established. Accumulate the value of each target at its coordinate point, then preprocess the entire map into a 2D prefix sum. The bomb covers a square of side length $R$, so the task is equivalent to querying the sum of every $R \times R$ submatrix of the grid. Iterate over all possible bottom-right corners $(i, j)$, compute each region sum in $O(1)$, and maintain the global maximum. Compared with re-summing every window cell by cell, which costs $O(M^2 \cdot R^2)$, the overall complexity drops to $O(M^2 + N)$, where $M$ is the map boundary size.
The input coordinates for this problem may be 0. To keep the 1-indexed sentinel convention, shift every target coordinate when reading input (i.e., $x = x+1, y = y+1$), translating the whole coordinate system into the range $[1, 5001]$. Additionally, when the side length $R$ exceeds the map size, use $\min(R, 5001)$ so that queries never index outside the prefix sum array.
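The approach described above might look roughly like the following. It is a sketch under several assumptions that are not part of the problem statement as quoted here: the function name `maxDestroyedValue` is hypothetical, each target is passed as `{x, y, v}` with non-negative `v`, and the largest coordinate `maxCoord` is a parameter so the grid size can be adjusted (the text above uses 5000).

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Assumptions: 0 <= x, y <= maxCoord for every target {x, y, v}, v >= 0, and R >= 1.
long long maxDestroyedValue(const std::vector<std::array<int, 3>>& targets,
                            int R, int maxCoord) {
    const int size = maxCoord + 1;  // shifted coordinates live in [1, size]
    std::vector<std::vector<long long>> s(size + 1,
                                          std::vector<long long>(size + 1, 0));

    // Accumulate values at shifted coordinates; several targets may share a cell.
    for (const auto& t : targets) {
        s[t[0] + 1][t[1] + 1] += t[2];
    }

    // Turn the value grid into its 2D prefix sum in place.
    for (int i = 1; i <= size; ++i) {
        for (int j = 1; j <= size; ++j) {
            s[i][j] += s[i - 1][j] + s[i][j - 1] - s[i - 1][j - 1];
        }
    }

    R = std::min(R, size);  // a window larger than the map covers the whole map
    long long best = 0;
    for (int i = R; i <= size; ++i) {      // (i, j) is the window's bottom-right corner
        for (int j = R; j <= size; ++j) {
            best = std::max(best,
                            s[i][j] - s[i - R][j] - s[i][j - R] + s[i - R][j - R]);
        }
    }
    return best;
}
```

Building the prefix sum in place avoids a second grid. At the full $5001 \times 5001$ size a long long grid occupies roughly 200 MB, so an actual submission may need a narrower element type if the total value is known to fit; that depends on the problem's constraints and memory limit.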

2. Luogu P1387 Largest Square

Problem Description: Given an $N \times M$ binary matrix of 0s and 1s, find the side length of the largest square consisting entirely of 1s.
Problem Essence: Verification with a 2D prefix sum (optionally combined with binary search), or dynamic programming.
Core Solution: Prefix sums can verify in $O(1)$ whether a submatrix consists entirely of 1s. Preprocess the 2D prefix sum of the binary matrix; an $L \times L$ submatrix is all 1s exactly when its region sum equals $L^2$. Enumerating the top-left corner $(i, j)$ and the side length $L$, with one $O(1)$ check each, costs $O(N M \min(N, M))$ overall. To avoid this cubic enumeration, binary search on $L$: if an all-1 square of side $L$ exists, then one of side $L-1$ also exists, so feasibility is monotonic. Each feasibility check scans all positions in $O(NM)$, giving a total of $O(N M \log(\min(N, M)))$.
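A compact sketch of the prefix sum plus binary search idea is given below. The function name `largestSquareByPrefixSum` is hypothetical, and the input is assumed to be a 0-indexed rectangular matrix containing only 0s and 1s.

```cpp
#include <algorithm>
#include <vector>

// Returns the side length of the largest all-1 square (0 if there is none).
int largestSquareByPrefixSum(const std::vector<std::vector<int>>& a) {
    const int n = static_cast<int>(a.size());
    const int m = n ? static_cast<int>(a[0].size()) : 0;

    // 1-indexed prefix sums with a zero sentinel row and column.
    std::vector<std::vector<long long>> s(n + 1, std::vector<long long>(m + 1, 0));
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            s[i][j] = s[i - 1][j] + s[i][j - 1] - s[i - 1][j - 1] + a[i - 1][j - 1];
        }
    }

    // Does any len x len window sum to len * len, i.e. contain only 1s?
    auto exists = [&](int len) {
        for (int i = len; i <= n; ++i) {
            for (int j = len; j <= m; ++j) {
                long long sum = s[i][j] - s[i - len][j] - s[i][j - len] + s[i - len][j - len];
                if (sum == static_cast<long long>(len) * len) return true;
            }
        }
        return false;
    };

    // Binary search for the largest feasible side length.
    int lo = 0, hi = std::min(n, m);
    while (lo < hi) {
        int mid = lo + (hi - lo + 1) / 2;  // round up so the loop always makes progress
        if (exists(mid)) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}
```

The search keeps the invariant that side length `lo` is feasible (length 0 trivially is), so the value returned is the largest side for which a window passes the $L^2$ test.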
A pure dynamic programming solution is asymptotically faster still. Let $f[i][j]$ be the side length of the largest all-1 square whose bottom-right corner is $(i, j)$; then $f[i][j] = \min(f[i-1][j], f[i][j-1], f[i-1][j-1]) + 1$ when $A[i][j] = 1$, and $f[i][j] = 0$ otherwise, which reduces the complexity to $O(N \cdot M)$. The advantage of the prefix sum approach is that it extends easily to variants where the matrix contains arbitrary weights rather than just 0s and 1s, such as "largest square with a bounded sum of weights."
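For comparison, the dynamic programming transition above can be sketched like this. The function name `largestSquareDP` is hypothetical, and the input is again assumed to be a 0-indexed rectangular matrix of 0s and 1s.

```cpp
#include <algorithm>
#include <vector>

// f[i][j] = side length of the largest all-1 square whose bottom-right corner
// is cell (i, j) in 1-indexed coordinates; row 0 and column 0 act as zero sentinels.
int largestSquareDP(const std::vector<std::vector<int>>& a) {
    const int n = static_cast<int>(a.size());
    const int m = n ? static_cast<int>(a[0].size()) : 0;
    std::vector<std::vector<int>> f(n + 1, std::vector<int>(m + 1, 0));
    int best = 0;
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            if (a[i - 1][j - 1] == 1) {
                f[i][j] = std::min({f[i - 1][j], f[i][j - 1], f[i - 1][j - 1]}) + 1;
                best = std::max(best, f[i][j]);
            }
            // Cells holding 0 keep f[i][j] = 0.
        }
    }
    return best;
}
```

The same zero sentinel row and column used for prefix sums also remove the boundary cases here.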
