Efficient Calculation of Combinatorial Numbers: Linear De...

Core Logic and Mathematical Principles

In combinatorial counting problems at the NOIP level, it is often necessary to compute combinatorial numbers under the constraint of large number modulo (the modulus $M$ is usually a large prime such as $998244353$ or $10^9+7$):

$$C_n^m = \frac{n!}{m!(n-m)!}$$

Due to the involvement of division operations, it is not possible to directly take the modulus step by step for the numerator and denominator under modular arithmetic. The division must be transformed into multiplication, i.e., multiplying by the multiplicative inverse of the denominator:

$$C_n^m \equiv n! \cdot [m!]^{-1} \cdot [(n-m)!]^{-1} \pmod M$$

As long as we can efficiently compute the inverse of the factorial, we can respond to any combinatorial number query in $O(1)$ time.

If we independently use fast exponentiation (Fermat's Little Theorem) to solve the inverse for each query or each factorial, the single complexity would be $O(\log M)$. In extreme scenarios where $N \le 10^7$ and multiple queries are involved, this would lead to a degradation in total time complexity, resulting in TLE. Therefore, we must introduce the linear inverse backtracking technique for factorials.

Algorithm Derivation and State Design

To process all factorial inverses from $1$ to $N$ offline in $O(N)$ time, we need to utilize the reverse recursive relationship of the inverses.

Let $fact[i] = i! \bmod M$ be the factorial array, and $invFact[i] = [i!]^{-1} \bmod M$ be the factorial inverse array.

Based on the definition of factorial:

$$fact[i] = fact[i-1] \cdot i$$

We take the inverse on both sides of the equation:

$$[fact[i]]^{-1} \equiv [fact[i-1] \cdot i]^{-1} \pmod M$$

$$invFact[i] \equiv invFact[i-1] \cdot i^{-1} \pmod M$$

To achieve linearity without relying on single-point inverses, we multiply both sides of the equation by $i$:

$$invFact[i] \cdot i \equiv invFact[i-1] \cdot i^{-1} \cdot i \pmod M$$

$$invFact[i-1] \equiv invFact[i] \cdot i \pmod M$$

Algorithm Execution State Design

Forward Preprocessing: Recursively compute the standard factorial array $fact[i]$ from $1$ to $N$.
Single Breakthrough: Only for the maximum boundary $N$, call fast exponentiation once for the factorial $fact[N]$, calculating $invFact[N] = (fact[N])^{M-2} \bmod M$.
Reverse Topological Linear Recursion: Utilize the derived state transition equation to loop backward from $N$ to $1$:

$$invFact[i-1] = invFact[i] \cdot i \bmod M$$

During the reverse scan, we can also easily derive the inverse of individual elements (if required by the problem): $i^{-1} = invFact[i] \cdot fact[i-1] \bmod M$.

C++ Standard Source Code

#include <iostream>
#include <vector>

const int MOD = 998244353; // Common high-frequency prime modulus in production environments
const int MAXN = 5000000;  // Simulating the extreme data range for NOIP

long long fact[MAXN + 5];
long long invFact[MAXN + 5];

// Fast exponentiation using Fermat's Little Theorem
long long quick_power(long long base, long long exp) {
    long long res = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) res = (res * base) % MOD;
        base = (base * base) % MOD;
        exp >>= 1;
    }
    return res;
}

// Core of the bidirectional linear preprocessing for factorials and their inverses
void init_combinatorics(int n) {
    fact[0] = 1;
    invFact[0] = 1;
    for (int i = 1; i <= n; ++i) {
        fact[i] = (fact[i - 1] * i) % MOD;
    }

    // Key breakpoint: only compute fast exponentiation for the last element once
    invFact[n] = quick_power(fact[n], MOD - 2);

    // Reverse flatten the entire factorial inverse chain
    for (int i = n; i >= 1; --i) {
        invFact[i - 1] = (invFact[i] * i) % MOD; // Critical note: multiply by i, not inv[i]
    }
}

// O(1) response for combinatorial number queries
long long query_C(int n, int m) {
    if (m < 0 || m > n) return 0; // Boundary defensive interception
    return fact[n] * invFact[m] % MOD * invFact[n - m] % MOD;
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n = MAXN;
    init_combinatorics(n);

    int queries;
    if (std::cin >> queries) {
        while (queries--) {
            int qn, qm;
            std::cin >> qn >> qm;
            std::cout << query_C(qn, qm) << "\n";
        }
    }
    return 0;
}

NOIP Practical Pitfall Guide

Mistakes in the formula during reverse recursion When executing the reverse recursion invFact[i-1] = invFact[i] * i % MOD, some contestants may confuse the concepts in their minds and mistakenly write it as invFact[i-1] = invFact[i] * inv[i] % MOD. This will cause the inverse array to collapse directly. Always remember: the reverse recursion for factorial inverses multiplies the original number $i$, not the inverse of $i$.
Array boundary overflow and uninitialized $fact[0]$ When handling extreme cases such as $C_n^0$ or $C_n^n$, the calculation will involve seeking invFact[0]. If you forget to initialize fact[0] = 1 and invFact[0] = 1, or if you do not safely derive the state of $0$ in the reverse loop, these special combinatorial numbers will multiply by uninitialized garbage values in memory, leading to catastrophic logical failures.

Classic NOIP/Luogu Problems

1. Luogu P3197 [HNOI2008] Jailbreak

Problem Description: There are $M$ types of religions, and $N$ prisoners are lined up. Each prisoner chooses a religion. If adjacent prisoners have the same religion, a jailbreak will occur. Find how many possible violent schemes can lead to a jailbreak. The result should be taken modulo $100003$.
Essential Nature of the Problem: The total set minus the invalid set.
Core Solving Idea: The total number of schemes is $M^N$. The schemes that do not lead to a jailbreak (i.e., adjacent people have different religions) is $M \cdot (M-1)^{N-1}$. Therefore, the number of schemes that lead to a jailbreak is $M^N - M \cdot (M-1)^{N-1}$. Directly use fast exponentiation to solve. Note to add the modulus when performing subtraction to prevent negative results: (ans % MOD + MOD) % MOD.

2. Luogu P2822 Combinatorial Number Problem (NOIP 2016 Advanced Group)

Problem Description: Given $n, m$, and a fixed constant $k$, determine how many pairs $(i, j)$ satisfy $C_i^j$ being a multiple of $k$ for all $0 \le i \le n, 0 \le j \le \min(i, m)$. Multiple queries, $n, m \le 2000$.
Essential Nature of the Problem: Dynamic programming with Pascal's triangle and two-dimensional prefix sums.
Core Solving Idea: Note that the modulus $k$ in this problem is not a fixed large prime and may even be very small, so the above inverse algorithm cannot be used. Since $n, m \le 2000$ is relatively small, directly use the Pascal's triangle recurrence formula $C_i^j = (C_{i-1}^{j-1} + C_{i-1}^j) \bmod k$. If the result after computation is $0$, it means it is divisible by $k$. Then use a two-dimensional prefix sum to build a statistical matrix, allowing each query to be answered in $O(1)$ after preprocessing in $O(N^2)$ time.

Efficient Calculation of Combinatorial Numbers: Linear Derivation and Implementation of Factorial Inverses