NeFut Logo NeFut
Admin Login

Optimizing Random Walk Problems Using the Linearity of Expectation

Published at: 2026-05-26 02:45 Last updated: 2026-06-08 09:45
#C++ #Tutorial

Core Logic and Mathematical Principles

Mathematical expectation is the weighted average of all possible values of a discrete random variable. Let the possible values of the discrete random variable $X$ be $x_1, x_2, \dots, x_n$, with corresponding probabilities $p_1, p_2, \dots, p_n$. The mathematical expectation of $X$ is defined as:

$$E(X) = \sum_{i=1}^n x_i \cdot p_i$$

In combinatorial counting and probability problems at the NOIP level, the linearity of expectation is the most efficient tool for breaking complex state dependencies. For any two random variables $X$ and $Y$, whether independent or strongly coupled and not independent, it strictly holds that:

$$E(X + Y) = E(X) + E(Y)$$

When generalized to $n$ random variables, it becomes:

$$E\left( \sum_{i=1}^n X_i \right) = \sum_{i=1}^n E(X_i)$$

Analysis of the Breaking Mechanism

In scoring or cost statistics problems, the total score $X$ is often composed of many local actions (such as whether to score at each step or whether each pair of inverted pairs exists). Due to the complex subsequent influences between actions, directly solving the probability distribution of the total state is extremely difficult.

By utilizing the linearity of expectation, we can adopt the indicator variable decomposition method:

  1. Decompose the total contribution $X$ into the sum of $n$ local minimal costs: $X = X_1 + X_2 + \dots + X_n$.
  2. Here, $X_i$ is typically designed as a $0/1$ indicator random variable (i.e., success yields $1$, failure yields $0$).
  3. Thus, $E(X_i) = 1 \cdot P(X_i=1) + 0 \cdot P(X_i=0) = P(X_i=1)$.
  4. Bypassing the overall complex joint probability distribution, we only need to independently solve the absolute probability of each local action occurring, $P(X_i=1)$, and summing them gives the total expectation.

State Design and Algorithm Derivation

State Design for the Cost of Random Walk on a Sequence

Given a sequence or graph structure, we perform random steps on it to find the expected value of a certain algebraic indicator.

Taking the example of "the number of steps required to randomly hit a target". Let the success probability of transitioning from state $i$ to state $i+1$ be $p_i$, and failure returns to state $0$. If we directly design the global expectation, the states will have complex cyclic dependencies due to the "return".

Using the linearity of expectation, we define the state variable $4E[i]$ to represent the expected number of steps to first transition from state $i$ to $i+1$. According to the branching induction of single-step transitions:

From the definition of expectation, we can write the equation:

$$\Delta E[i] = p_i \cdot 1 + (1 - p_i) \cdot \left( 1 + \sum_{j=0}^{i-1} \Delta E[j] + \Delta E[i] \right)$$

Algebraically rearranging the equation, we move the $4E[i]$ term from the right side to the left side to factor out the coefficient:

$$\Delta E[i] - (1 - p_i) \Delta E[i] = p_i + (1 - p_i) \left( 1 + \sum_{j=0}^{i-1} \Delta E[j] \right)$$

$$p_i \cdot \Delta E[i] = 1 + (1 - p_i) \sum_{j=0}^{i-1} \Delta E[j]$$

$$\Delta E[i] = \frac{1 + (1 - p_i) \sum_{j=0}^{i-1} \Delta E[j]}{p_i}$$

Using the prefix sum array $sum[i] = \sum_{j=0}^{i-1} \Delta E[j]$ to optimize the accumulation, the state transition equation collapses to:

$$\Delta E[i] = \frac{1 + (1 - p_i) \cdot sum[i]}{p_i}$$

Ultimately, the total expected number of steps from state $0$ to the endpoint $N$ is given by $\sum_{i=0}^{N-1} \Delta E[i]$, perfectly flattening the multi-step dependencies to an $O(N)$ linear recurrence.


C++ Standard Source Code

#include <iostream>
#include <vector>

const int MOD = 998244353; // The expectation problem in number theory requires modular output with a large prime

// Fermat's Little Theorem for finding the modular inverse
long long quick_power(long long base, long long exp) {
    long long res = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) res = (res * base) % MOD;
        base = (base * base) % MOD;
        exp >>= 1;
    }
    return res;
}

long long get_inv(long long n) {
    return quick_power(n, MOD - 2);
}

// Solve the total expected number of steps under the random walk modulo
long long solve_expectation_walk(int n, const std::vector<long long> &p_num, const std::vector<long long> &p_den) {
    std::vector<long long> delta_E(n);
    long long prefix_sum = 0; // Maintain the current sum[i] state

    for (int i = 0; i < n; ++i) {
        // Calculate the modular value of the probability p_i = p_num[i] / p_den[i]
        long long p = p_num[i] * get_inv(p_den[i]) % MOD;
        long long one_minus_p = (MOD + 1 - p) % MOD;

        // Numerator part: 1 + (1 - p_i) * sum[i]
        long long numerator = (1 + one_minus_p * prefix_sum) % MOD;

        // Expected number of steps delta_E[i] = numerator / p_i
        delta_E[i] = numerator * get_inv(p) % MOD;

        // Dynamically update the prefix sum for the next state
        prefix_sum = (prefix_sum + delta_E[i]) % MOD;
    }

    long long total_expectation = 0;
    for (int i = 0; i < n; ++i) {
        total_expectation = (total_expectation + delta_E[i]) % MOD;
    }
    return total_expectation;
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n;
    if (std::cin >> n) {
        std::vector<long long> p_num(n), p_den(n);
        for (int i = 0; i < n; ++i) {
            std::cin >> p_num[i] >> p_den[i]; // Read the numerator and denominator of the probabilities
        }
        std::cout << solve_expectation_walk(n, p_num, p_den) << "\n";
    }
    return 0;
}

NOIP Practical Pitfall Guide

  1. Mistaking the product of expectations of non-independent random variables for the product of their expectations Contestants often fall into the fatal error of over-relying on the linearity property, believing that $E(X \cdot Y) = E(X) \cdot E(Y)$. Note: The multiplicative property of expectation only holds when $X$ and $Y$ are strictly independent. If there is a causal relationship between the two states, one must maintain it through conditional expectations or joint state transitions and never split the product directly.
  2. Negative denominators in modular expectation calculations When calculating (MOD + 1 - p) % MOD, if written directly as 1 - p % MOD, the result is likely to become negative in C++. This can lead to overflow when subsequently multiplied by the inverse, resulting in astronomical numbers. Always add MOD to force a correction to a non-negative value.

Classic NOIP/Luogu Problems

1. Luogu P1365 WJMZBMR Snowball Fight / Similar to Luogu P1654 OSU!

$$E(len)_{\text{new}} = p \cdot (E(len) + 1)$$

$$E(score)_{\text{new}} = E(score) + p \cdot (2 \cdot E(len) + 1)$$

Due to the single nature of the success length, satisfying linear relationships, we can achieve an accurate calculation of the total profit expectation of the quadratic term in $O(N)$ time.

2. Luogu P3802 Pure Snow

$$P = \frac{a_1}{N} \cdot \frac{a_2}{N-1} \dots \cdot \frac{a_7}{N-6} \cdot 7!$$

Therefore, the total expectation is simply $(N - 6) \cdot P$. There is no need to run any loop DP; a pure algebraic formula can break through in $O(1)$ time.

Original Source: Local_Import

[h] Back to Home