Mastering Probability Theory and Dynamic Programming: Sta...

Core Logic and Mathematical Principles

Probability theory serves as the mathematical foundation for handling uncertainty in decision-making and expected dynamic programming (DP) in competitions. In NOIP-level assessments, the boundaries of probability theory are typically confined to discrete sample spaces.

1. Sample Space and Conditional Probability

Let $\Omega$ be a discrete finite sample space, and $A$ and $B$ be two events within it.

Conditional Probability: The probability of event $A$ occurring given that event $B$ has already occurred, denoted as $P(A \mid B)$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \text{ (if } P(B) > 0 \text{)}$$

In algorithm competitions, conditional probability is often used to correct transition coefficients during state progression.

2. Law of Total Probability

Let $B_1, B_2, \dots, B_n$ be a partition of the sample space $\Omega$ (i.e., the events are mutually exclusive and $\bigcup_{i=1}^n B_i = \Omega$, with $P(B_i) > 0$). Then for any event $A$, we have:

$$P(A) = \sum_{i=1}^n P(A \mid B_i) P(B_i)$$

Competition Graph Theory Boundary: The law of total probability corresponds to the current node's state probability being the weighted sum of all its predecessor nodes' probabilities in graph theory topological networks or DP state machines.

3. Bayes' Theorem

Based on conditional probability and the law of total probability, it is used to implement reverse probability inference for "causal reasoning":

$$P(B_j \mid A) = \frac{P(A \mid B_j) \times P(B_j)}{P(A)} = \frac{P(A \mid B_j) \times P(B_j)}{\sum_{i=1}^n P(A \mid B_i) P(B_i)}$$

Given that a specific result $A$ has occurred, Bayes' theorem can accurately compute the posterior probability that it was caused by a specific reason $B_j$.

State Design and Algorithm Derivation

In common probability dynamic programming (Probability DP), we typically need to maintain the absolute probability of a specific state occurring.

Probability State Design on Topological Graphs

Let $f[u]$ represent the probability of reaching the directed node $u$ in the graph. Suppose node $u$ has several incoming edges from the predecessor set $pre(u)$. For each predecessor node $v \in pre(u)$, the transition probability (branch coefficient) from $v$ to $u$ is $w(v, u)$, satisfying $\sum_{k \in post(v)} w(v, k) = 1$ (i.e., the outgoing edge probabilities are normalized).

According to the law of total probability, the state transition equation for node $u$ is:

$$f[u] = \sum_{v \in pre(u)} f[v] \times w(v, u)$$

Execution Sequence Determination:

If the graph structure is a Directed Acyclic Graph (DAG): Directly use topological sorting to strictly control the computation order, scanning and filling the table in linear time $O(N + M)$.
Boundary Conditions: Initialize the start node $S$ with $f[S] = 1$, while all other nodes have an initial probability of $f[u] = 0$.

C++ Standard Source Code

#include <iostream>
#include <vector>
#include <queue>
#include <cmath>

// Definition of edge structure
struct Edge {
    int to;
    double prob; // Transition conditional probability P(to | from)
};

const int MAXN = 100000;
std::vector<Edge> graph[MAXN + 5];
int in_degree[MAXN + 5];
double f[MAXN + 5]; // State array: stores the absolute probability of reaching each node

// Topological sort for DAG probability dynamic programming
void compute_probability_dp(int n, int start_node) {
    std::queue<int> q;

    // Initialize boundary: starting node probability is 1, others default to 0
    f[start_node] = 1.0;

    for (int i = 1; i <= n; ++i) {
        if (in_degree[i] == 0) {
            q.push(i);
        }
    }

    while (!q.empty()) {
        int u = q.front();
        q.pop();

        for (const auto &edge : graph[u]) {
            int v = edge.to;
            // Core transition: algebraic accumulation using the law of total probability
            f[v] += f[u] * edge.prob; 

            if (--in_degree[v] == 0) {
                q.push(v);
            }
        }
    }
}

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(NULL);

    int n, m, start_node;
    if (std::cin >> n >> m >> start_node) {
        for (int i = 0; i < m; ++i) {
            int u, v;
            double p;
            std::cin >> u >> v >> p;
            graph[u].push_back({v, p});
            in_degree[v]++;
        }

        compute_probability_dp(n, start_node);

        // Output the probability of the target endpoint
        for (int i = 1; i <= n; ++i) {
            std::cout << "Node " << i << " Prob: " << f[i] << "\n";
        }
    }
    return 0;
}

NOIP Practical Pitfall Guide

Failing to intercept conditions for isolated starting points or disconnected nodes during probability accumulation When performing topological sort DP, nodes with an in-degree of 0 may include not only the designated start node start_node, but also other isolated dead nodes that are completely disconnected from the main connected component. The initial probability of these dead nodes is f[u] = 0. If their garbage states are used to update successors without control, it may not affect numerical correctness, but if complex multiplication or division operations are involved (such as calculating ratios), it can easily lead to NaN (Not a Number) exceptions.
Logical failure in direct comparison of floating-point precision When determining the boundary for probability transitions (for example, checking if a certain branch probability is 0), it is absolutely forbidden to write if (edge.prob == 0.0). Due to the internal representation of floating-point numbers in computers using IEEE 754 binary approximation, a continuous multiplication resulting in 0 may become a very small value on the order of $10^{-17}$. It is necessary to introduce a small epsilon value eps (such as $10^{-9}$) for range interception: if (std::abs(edge.prob) < 1e-9).

Classic NOIP/Luogu Problems

1. Luogu P4316 Green Frog's Destination

Problem Description: Given a directed acyclic graph starting from node 1 to node $N$. Starting from the starting point, each time a node is reached, one outgoing edge is chosen with equal probability. Find the expected total length of the path to the endpoint.
Essential Nature of the Problem: The linear properties of expectation and the conditional probability transitions in a topological graph.
Core Solution Idea: Although this problem seeks the expectation, the transition coefficient at each step strictly depends on the out-degree of the current node. Let the out-degree of node $u$ be $out\_deg[u]$, then the conditional probability of taking any outgoing edge is $\frac{1}{out\_deg[u]}$. By either reversing the graph or applying forward topological DP, multiplying the weight of each edge by the absolute probability of reaching the current node, and performing total probability accumulation, the final result can be converged in linear time.

2. Luogu P2218 [HAOI2007] Covering Problem / Variant Luogu P5104 Red Packet Distribution

Problem Description: A person distributes a total amount of $w$ in red packets, which $n$ people will compete to grab. The amount each person grabs is uniformly distributed in the interval $(0, 1)$ of the remaining total amount. Find the expected value of the amount grabbed by the $k$-th person.
Essential Nature of the Problem: Conditional expectation probability derivation in continuous probability spaces.
Core Solution Idea: For the first person, the expected amount grabbed is clearly half of the remaining amount, i.e., $\frac{w}{2}$, leaving $\frac{w}{2}$. For the second person, facing the remaining $\frac{w}{2}$, the expected amount grabbed is also half of the remaining, i.e., $\frac{w}{4}$. Through mathematical induction, it is easy to capture the characteristic: after each person grabs, the expected remaining amount will halve. Therefore, the expected amount grabbed by the $k$-th person is precisely $\frac{w}{2^k}$. In the code, directly apply fast exponentiation to calculate the value of $2^k$ under floating-point representation and perform division.

Mastering Probability Theory and Dynamic Programming: State Transition and Bayesian Inference in Graph Theory