NeFut Logo NeFut
Admin Login

Inversion Pair Counting and Dynamic Frequency Statistics Using Fenwick Tree

Published at: 2026-05-27 09:17 Last updated: 2026-06-06 13:04
#C++ #Tutorial

Core Logic and Mathematical Principles

The Binary Indexed Tree (BIT) not only maintains explicit range sums but also serves as a Dynamic Frequency Counter to manage the weight space. Its core logic involves mapping the value range of the original array to the indices of the BIT, dynamically maintaining the prefix sums of frequencies to perform various high-dimensional ordinal statistics within $O(\log N)$ time.

1. Algebraic Transformation of Inversion Pairs

For a sequence $A$, if $i < j$ and $A[i] > A[j]$, then $(A[i], A[j])$ is called an inversion pair. When traversing the sequence $A$ from left to right in the weight space, let’s denote the current element being processed as $A[j]$:

$$\text{Count}(A[i] > A[j]) = \text{Query}(\text{MAX\_VAL}) - \text{Query}(A[j])$$

2. Geometric Mapping of Discretization

If the sequence elements $A[i] \in [-10^9, 10^9]$, the weight space is too large to be used directly as indices for the BIT. Since inversion pairs and frequency statistics only care about relative sizes (total order) and not absolute values, we must discretize the values to a strict closed interval $[1, N]$. Let the discretization mapping function be $\text{rank}(x)$, which must satisfy the following mathematical property:

$$A[i] < A[j] \iff \text{rank}(A[i]) < \text{rank}(A[j])$$


State Design and Algorithm Derivation

1. State Transition for Inversion Pair Counting

The state space is defined as a BIT tree[] of size $N$, initialized to all zeros. The algorithm proceeds as follows:

  1. Discretize the original array to obtain the mapping array R[i].
  2. Loop through R[i] from $1$ to $N$:
    • First, query the number of elements in the BIT greater than R[i]: $\Delta = i - 1 - \text{Query}(R[i])$ (where $i-1$ is the total number of elements currently inserted).
    • Accumulate $\Delta$ into the global answer $ans$.
    • Insert the current element into the BIT: $\text{Add}(R[i], 1)$.

2. Dynamic Interval Frequency Statistics (Multi-dimensional Partial Ordering)

In more complex variations, such as counting the number of element pairs within a range that satisfy certain constraints. Typically, this is approached with lossless differential thinking or by splitting queries into dynamic discrete events. For example: counting the number of pairs satisfying $A[i] \le A[j] + k$. The state transition equation becomes:

$$\text{Query}(A[j] + k)$$

Each time we reach $j$, we instantaneously define the acceptable upper bound for the weights, using the query function to extract the valid frequency slice in $O(\log N)$ time.


Standard C++ Source Code

Below is the complete C++ source code for solving inversion pairs and dynamic weight statistics, including hardcore discretization preprocessing and strict long long design to prevent overflow.

#include <iostream>
#include <vector>
#include <algorithm>

const int MAXN = 500005;

// Discretization auxiliary structure
struct Element {
    int val;
    int id;
    // Strict weak ordering overload, maintaining original position order for equal values (stable mapping)
    bool operator<(const Element& other) const {
        if (this->val != other.val) return this->val < other.val;
        return this->id < other.id;
    }
};

struct TreeArray {
    int tree[MAXN];
    int size;

    void init(int n) {
        this->size = n;
        std::fill(tree, tree + size + 1, 0);
    }

    inline int lowbit(int x) const {
        return x & (-x);
    }

    void add(int x, int val) {
        for (; x <= size; x += lowbit(x)) {
            tree[x] += val;
        }
    }

    int query(int x) const {
        int sum = 0;
        for (; x > 0; x -= lowbit(x)) {
            sum += tree[x];
        }
        return sum;
    }
};

Element a[MAXN];
int ranks[MAXN]; // Store discretized weight rankings (range in [1, n])

int main() {
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);

    int n;
    if (std::cin >> n) {
        for (int i = 1; i <= n; ++i) {
            std::cin >> a[i].val;
            a[i].id = i;
        }

        // Core of discretization: using standard library stable sort to create stable mapping
        std::sort(a + 1, a + n + 1);
        for (int i = 1; i <= n; ++i) {
            ranks[a[i].id] = i; 
        }

        TreeArray bit;
        bit.init(n);

        long long inversion_cnt = 0; // Critical pitfall: maximum inversion pairs can reach O(N^2), must lock long long storage

        for (int i = 1; i <= n; ++i) {
            int curr_rank = ranks[i];

            // Query the number of elements in the BIT with rank in [1, curr_rank]
            int smaller_or_equal = bit.query(curr_rank);

            // Critical pitfall: i - 1 is the total number of elements already inserted into the BIT, subtracting those not greater than itself gives the count strictly greater than itself
            inversion_cnt += (i - 1 - smaller_or_equal);

            // Insert the frequency of the current element into the BIT
            bit.add(curr_rank, 1);
        }

        std::cout << inversion_cnt << "\n";
    }

    return 0;
}

NOIP Practical Pitfall Guide

  1. Improper Discretization Leading to Altered Total Order When mapping weights, if the original sequence contains duplicate elements (e.g., 5, 5, 2), and participants use std::unique to remove duplicates directly during discretization, it can cause both 5s to map to the same weight index. In this case, when executing i - 1 - query(curr_rank) in the BIT, it must ensure that query counts the frequencies of "strictly less than or equal to the current weight". If the sorting comparator does not use id as a secondary key, causing the relative order of equal values to be inverted, it will lead to incorrect counting of inversion pairs for tied elements, directly resulting in WA.
  2. Counter Variable Not Using long long Leading to High-Order Truncation For data sizes of $N = 5 \times 10^5$, the worst case (completely reversed sequence) can lead to a total count of inversion pairs of $\frac{N(N-1)}{2} \approx 1.25 \times 10^{11}$. This number far exceeds the maximum capacity of a 32-bit signed integer int ($2.14 \times 10^9$). If inversion_cnt is declared as int, the judging machine will experience arithmetic high-order overflow when processing large datasets, leading to truncated, negative, or random results, losing all points for that test case.

Classic NOIP/Luogu Problems

1. Luogu P1908 Inversion Pairs

2. Luogu P5094 [USACO04OPEN] Moo Fest G / Treading

Original Source: local://5.4

[h] Back to Home