NeFut Logo NeFut
Admin Login

[CS.DS] Breaking the Barrier of Non-adaptive Clustering Algorithms

Published at: 2026-06-30 22:00 Last updated: 2026-07-01 09:21
#algorithm #optimization #Clustering

In recent years, recovering the underlying $k$-clustering of a set $U$ of $n$ points by asking pair-wise same-cluster queries has garnered significant interest. Given a query $S \subset U$ with $|S|=2$, the oracle returns "yes" if the points are in the same cluster and "no" otherwise. For adaptive algorithms, the query complexity is known to be $\Theta(nk)$, while non-adaptive algorithms are extremely limited: even for $k=3$, such algorithms require $\Theta(n^2)$ queries, matching the trivial upper bound.

However, non-adaptivity is highly desirable as it allows for parallel querying. To break the quadratic barrier for non-adaptive queries, we study a natural generalization of this problem to subset queries for $|S| > 2$, where the oracle returns the number of clusters intersecting $S$.

Previous work provided an $O(n)$ query adaptive algorithm, but the realm of non-adaptive algorithms remained completely unknown. This paper presents the first non-adaptive algorithms for clustering with subset queries. Our main result is a non-adaptive algorithm making $O(n \times \log k \times (\log k + \log \log n)^2)$ queries, improving to $O(n \times \log \log n)$ when $k$ is constant.

In addition to non-adaptivity, we consider practical aspects, such as enforcing a bound $s$ on the query size. We show that $\Theta(\max(n^2/s^2,n))$ queries are necessary and obtain algorithms making $\tilde{O}(n^2 k/s^2)$ queries for any $s \leq \sqrt{n}$ and $\tilde{O}(n^2/s)$ queries for any $s \leq n$. Finally, we achieve improved upper bounds when the clusters are roughly balanced and when the algorithm is allowed two rounds of adaptivity.

Blogger's Review: The non-adaptive clustering algorithms proposed in this paper hold significant theoretical importance, especially in terms of parallel processing and query efficiency. By significantly reducing query complexity, it advances the development of clustering algorithms and offers new insights for practical applications. Although there is still room for improvement, its results undoubtedly lay a foundation for research in this field.

Original Source: https://arxiv.org/abs/2409.10908

[h] Back to Home