Core Logic and Mathematical Principles
The Longest Increasing Subsequence (LIS) is a cornerstone of linear dynamic programming for single sequences. Given a sequence $A$ of length $N$, we need to select a subsequence $A_{p_1}, A_{p_2}, ext{...}, A_{p_k}$ such that the indices satisfy $p_1 < p_2 < ext{...} < p_k$ and the values are strictly increasing, i.e., $A_{p_1} < A_{p_2} < ext{...} < A_{p_k}$, and find the maximum value of $k$.
-
Naive $O(N^2)$ State Space: The optimal solution for each position $i$ only depends on the positions $j$ before it that have values smaller than it. By enumerating predecessors for state transitions, each state needs to scan $O(N)$ predecessors, resulting in a total complexity of $O(N^2)$.
-
Binary Search Optimization $O(N \log N)$ (Greedy Dimensional Reduction): The bottleneck of the naive algorithm is that it traverses a large number of redundant predecessor states to find the maximum value. A greedy strategy is introduced: if two increasing subsequences have the same length, the one with the smaller last element has a stronger ability to accommodate new elements (greater potential). Thus, we define a helper array $D$, where $D[len]$ represents the minimum last element among all increasing subsequences of length $len$.
-
Proof of Mathematical Monotonicity: The helper array $D$ has strict monotonicity, i.e., $D[1] < D[2] < ext{...} < D[len]$. Proof Sketch: If there exists $D[i] \ge D[i+1]$, according to the definition, the second last item of a subsequence of length $i+1$ must be strictly smaller than the last item $D[i+1]$, thus we can take its first $i$ items to form a subsequence of length $i$ with a smaller last element, whose last element must be smaller than $D[i+1] \le D[i]$, which contradicts the definition that $D[i]$ is the minimum last element of a subsequence of length $i$. Utilizing the strict monotonicity of the $D$ array, for each new element $A[i]$, we can directly locate and update the $D$ array through binary search (Binary Search) in $O(\log N)$ time, reducing the overall complexity to $O(N \log N)$.
State Design and Algorithm Derivation
1. Naive Algorithm $O(N^2)$
- State Design: $f[i]$ represents the length of the longest increasing subsequence ending at $A[i]$.
- Transition Equation:
$$f[i] = \max_{1 \le j < i, A[j] < A[i]} \{ f[j] \} + 1$$
The boundary condition is $f[i] = 1$ (itself forms an independent chain).
2. Greedy Dimensional Change $O(N \log N)$
-
State Design: Instead of maintaining who ends the sequence, we maintain global characteristics. $D[len]$ represents the minimum last value of increasing subsequences of length $len$. Simultaneously, we use a variable $g ext{\_len}$ to record the maximum length of the longest increasing subsequence known so far.
-
Transition and Update Logic: Sequentially scan each element $A[i]$ in the sequence:
-
Case A: If $A[i] > D[g ext{\_len}]$, it indicates that $A[i]$ can directly follow the current longest subsequence to form a longer subsequence. Set $g ext{\_len} = g ext{\_len} + 1$ and extend $D[g ext{\_len}] = A[i]$.
-
Case B: If $A[i] \le D[g ext{\_len}]$, it indicates that $A[i]$ cannot extend the maximum length, but it can optimize the last element of an existing subsequence of some length. In the monotonically increasing array $D[1 \dots g ext{\_len}]$, use binary search (in C++ as
lower_bound) to locate the first position $D[p]$ that is greater than or equal to $A[i]$ and replace it with the smaller, more promising $A[i]$. That is, $D[p] = A[i]$. -
Final Answer: The length of the global longest increasing subsequence is $g ext{\_len}$.
C++ Standard Code (NOIP Style)
The following code demonstrates the implementation of both $O(N^2)$ and $O(N \log N)$ algorithms in a single program, focusing on the engineering standards of binary optimization, verified by compiling with Linux g++ -O2.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
const int MAXN = 100005;
int a[MAXN];
int f[MAXN]; // State array for O(N^2)
int d[MAXN]; // Greedy incremental array for O(N log N)
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n;
if (!(cin >> n)) return 0;
for (int i = 1; i <= n; ++i) {
cin >> a[i];
}
// ==================== 1. Naive Algorithm O(N^2) ====================
int ans_n2 = 0;
for (int i = 1; i <= n; ++i) {
f[i] = 1;
for (int j = 1; j < i; ++j) {
if (a[j] < a[i]) { // Strictly increasing condition
f[i] = max(f[i], f[j] + 1);
}
}
ans_n2 = max(ans_n2, f[i]);
}
// ==================== 2. Binary Optimization O(N log N) ====================
int g_len = 0;
// Critical pitfall: d array stores values, not indices, and must maintain strict monotonicity
for (int i = 1; i <= n; ++i) {
if (g_len == 0 || a[i] > d[g_len]) {
// Case A: Larger than the last of all current longest increasing subsequences, directly append
g_len++;
d[g_len] = a[i];
} else {
// Case B: Binary search to find the first element >= a[i] and replace it
// d + 1 to d + g_len + 1 is the effective one-dimensional retrieval interval
int p = lower_bound(d + 1, d + g_len + 1, a[i]) - d;
d[p] = a[i]; // Greedy correction: replace the old larger element with a smaller last element
}
}
cout << "O(N^2) Ans: " << ans_n2 << "\n";
cout << "O(N log N) Ans: " << g_len << "\n";
return 0;
}
NOIP Practical Pitfall Guide
1. Confusion Between Strictly Increasing and Non-Strictly Increasing Binary Functions
- Phenomenon: The problem requires the longest strictly increasing subsequence ($A[i] > A[j]$), and contestants mistakenly use
upper_boundduring binary optimization. Or if the problem requires a non-decreasing increasing subsequence ($A[i] \ge A[j]$), they mistakenly uselower_bound. This can lead to logical errors in handling identical values, resulting in zero scores. - Avoidance Methods:
- For strictly increasing (no duplicates): use
lower_bound(to find the first position $ ge A[i]$ and replace it to prevent duplicates). - For non-strictly increasing (duplicates allowed): use
upper_bound(to find the first position $> A[i]$ and replace it, allowing duplicates to be behind).
2. Misunderstanding that the $D$ Array Stores the Final Longest Increasing Subsequence
- Phenomenon: Some contestants directly print the final updated $D$ array as the LIS itself when outputting the solution. This is a serious geometric conceptual error. The $D$ array only represents the "potential last values" corresponding to each length, and during updates, the relative order of its internal elements may have already deviated from the topological structure of the original sequence.
- Avoidance Methods: If the problem requires outputting the specific solution, a predecessor pointer array
pre[i]must be introduced, which maintains the actual connections of each element during the $O(N \log N)$ process, allowing backtracking from the end.