NeFut Logo NeFut
Admin Login

Deep Dive into Expectation Dynamic Programming: Algorithm Principles and Implementation

Published at: 2026-05-27 09:17 Last updated: 2026-06-06 13:04
#Dynamic Programming #DP #Expectation DP

Core Logic and Mathematical Principles

Expectation Dynamic Programming (Expectation DP) is a core algorithmic tool for dealing with problems characterized by "randomness in decision paths" or "uncertainty in subsequent states." Its primary distinction from conventional dynamic programming lies in the fact that state transitions are not deterministic choices (Max/Min), but rather weighted averages (summation reduction) of the costs of all possible subsequent states.

1. Transition Mechanism of Expectation DP

Let the current state be $u$, which has several discrete random transition paths leading to the set of subsequent states $to(u)$. For each subsequent state $v \in to(u)$, the probability of transition is $p(u \to v)$, and the cost of the transition itself (such as steps, expenses, weights) is $w(u \to v)$.

According to the definition of mathematical expectation and the law of total expectation, the expected total cost $f[u]$ from state $u$ to the endpoint satisfies the following algebraic equation:

$$f[u] = \sum_{v \in to(u)} p(u \to v) \cdot \Big( f[v] + w(u \to v) \Big)$$

Using the property of probability normalization $\sum p(u \to v) = 1$, this formula is often further simplified into a classic form that is easier for code implementation:

$$f[u] = \left( \sum_{v \in to(u)} p(u \to v) \cdot f[v] \right) + \sum_{v \in to(u)} p(u \to v) \cdot w(u \to v)$$

2. Why is Expectation DP Mostly Adopted in "Backward Push (Reverse Substitution)?"

In conventional DP, we are accustomed to "forward push" (pushing from the starting point to the endpoint). However, in Expectation DP, forward pushing often leads to logical collapse.


State Design and Algorithm Derivation

1. Backward Push on a DAG

If the state transition dependencies form a Directed Acyclic Graph (DAG), we can directly use reverse topological sorting (or memoized search) to fill the table starting from the endpoint $T$.

$$f[u] = \sum_{(u, v) \in E} p(u \to v) \cdot (f[v] + w(u \to v))$$

2. Expectation DP with "Self-Loops/Cycles" and Gaussian Elimination

Many expectation problems exhibit characteristics of "staying in place" or "state cycles" (for example: if the dice roll is incorrect, one stays in place, or one randomly walks on a graph). In this case, the state transition equation will exhibit cyclic dependencies, meaning that the algebraic expression for $f[u]$ contains $f[u]$ itself or other mutually dependent unknowns, making it impossible to fill the table in a straightforward manner.

Breakthrough Strategies:

  1. Simplification of Single-Point Self-Loops: If there is only a transition from the current state to itself (with probability $p_{\text{self}}$), perform algebraic substitution directly.

$$f[u] = p_{\text{self}} \cdot (f[u] + w_{\text{self}}) + \sum_{v \neq u} p(u \to v) \cdot (f[v] + w(u \to v))$$

Move all terms containing $f[u]$ to the left side, extract the coefficient $(1 - p_{\text{self}})$, and divide to eliminate the self-loop obstruction. 2. Strongly Connected Component Loops (Full Graph Walk): If states form complex network loops, abandon recursion and treat each state $f[u]$ as an unknown $x_u$. List $N$ linear equations based on the transition topological relationships, construct a matrix, and directly use Gaussian elimination (Gaussian Elimination) to solve it violently within $O(N^3)$ complexity.


C++ Standard Source Code

Core Implementation: Memoized Search for Backward Expectation DP

For typical expectation states with a divide-and-conquer structure, using memoized search can automatically complete the reverse topological flow.

#include <iostream>
#include <vector>
#include <iomanip>

struct Edge {
    int to;
    double prob; // Transition probability
    double weight; // Transition cost
};

const int MAXN = 100000;
std::vector<Edge> graph[MAXN + 5];
double f[MAXN + 5];
bool visited[MAXN + 5];
int target_node;

// Memoized search: calculate the expected total cost from u to target_node
 double get_expectation(int u) {
    if (u == target_node) return 0.0; // Endpoint boundary: expected cost to itself is 0
    if (visited[u]) return f[u];      // Intercept already computed states

    visited[u] = true;
double total_exp = 0.0;

    for (const auto &edge : graph[u]) {
        int v = edge.to;
        // Core backward push formula: p * (f[v] + weight)
total_exp += edge.prob * (get_expectation(v) + edge.weight);
    }

    f[u] = total_exp;
    return f[u];
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n, m, start_node;
    if (std::cin >> n >> m >> start_node >> target_node) {
        for (int i = 0; i < m; ++i) {
            int u, v;
            double p, w;
            std::cin >> u >> v >> p >> w;
            graph[u].push_back({v, p, w});
        }

        double ans = get_expectation(start_node);
        std::cout << std::fixed << std::setprecision(6) << ans << "\n";
    }
    return 0;
}

NOIP Practical Pitfall Guide

  1. Incorrectly initializing the endpoint state to a non-zero value or forward assigning initial values Some contestants are accustomed to forward thinking, initializing the starting point f[S] = 0 and then attempting to update backwards. This can lead to computed values diverging completely from the expected definition due to the inability to predict the global convergence value of subsequent nodes during branch calculations. It is essential to remember: the natural deterministic boundary of Expectation DP is at the endpoint, f[T] = 0 is the starting engine of recursion.
  2. Failure to handle unreachable states during Gaussian elimination leading to free variable collapse When dealing with graph theory random walk problems that require Gaussian elimination, there may be deadlock nodes that are impossible to reach the endpoint due to unidirectional connectivity or isolated nodes that cannot be reached from the starting point. The unknown equations corresponding to these nodes may cause the matrix determinant to be $0$ (resulting in multiple or no solutions). Before sending the equations to Gaussian elimination, a BFS/DFS must first be run to determine the forward and backward connectivity, removing invalid states from the equation set.

Classic NOIP/Luogu Problems

1. Luogu P4550 Collecting Stamps

$$f[i] = \frac{i}{n} f[i] + \frac{n-i}{n} f[i+1] + 1 \implies f[i] = f[i+1] + \frac{n}{n-i}$$

Using $f[i]$ as the algebraic coefficient to drive the state transitions of $g[i]$, we can push backward from $n$ to $0$, with $g[0]$ being the final solution.

2. Luogu P1850 [NOIP2016 Advanced Group] Change Classroom

Original Source: local://21.3

[h] Back to Home