NeFut

[CS.AI] FllumaOne: A Revolutionary Multimodal CAD Dataset with Executable Programs

Published at: 2026-06-17 22:00
Last updated: 2026-06-20 13:46
#algorithm #Open Source #CAD

Abstract

Parametric computer-aided design (CAD) records both final geometry and the ordered construction history that determines how a part can be edited. Datasets for editable CAD research should therefore expose modeling operations, parameters, and feature dependencies together with validated geometry.

We introduce FllumaOne, a code-native multimodal CAD dataset whose models are generated by executable Python programs in Flluma, a Qt/C++ OpenCASCADE-based CAD system. Each sample aligns its program with a structured feature tree, a training-oriented intermediate representation, STEP geometry, a surface point cloud, natural-language descriptions, metadata, and eight canonical visible-edge renderings.
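To make the per-sample alignment concrete, here is a minimal sketch of how one record and its modality-completeness check might be represented in Python. The class name, field names, and file-path layout are all hypothetical; the abstract lists the modalities but does not specify the on-disk schema.

```python
from dataclasses import dataclass, field
from typing import Any

# The abstract states eight canonical visible-edge renderings per sample.
EXPECTED_RENDERINGS = 8


@dataclass
class CadSample:
    """Hypothetical container for one sample; not the dataset's actual schema."""
    sample_id: str
    program: str                              # executable Python source
    feature_tree: dict[str, Any]              # structured feature tree
    intermediate_repr: list[dict[str, Any]]   # training-oriented IR
    step_path: str                            # path to the STEP geometry file
    point_cloud_path: str                     # path to the surface point cloud
    descriptions: list[str]                   # natural-language descriptions
    metadata: dict[str, Any] = field(default_factory=dict)
    rendering_paths: list[str] = field(default_factory=list)

    def missing_modalities(self) -> list[str]:
        """Return the names of modalities that are empty or incomplete."""
        checks = {
            "program": bool(self.program.strip()),
            "feature_tree": bool(self.feature_tree),
            "intermediate_repr": bool(self.intermediate_repr),
            "step": bool(self.step_path),
            "point_cloud": bool(self.point_cloud_path),
            "descriptions": bool(self.descriptions),
            "metadata": bool(self.metadata),
            "renderings": len(self.rendering_paths) == EXPECTED_RENDERINGS,
        }
        return [name for name, present in checks.items() if not present]
```

A record with all modalities present returns an empty list from `missing_modalities()`, which is the kind of signal a modality-completeness report could aggregate.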

The primary release, FllumaOne-100K, contains 100,000 accepted samples across four template-level complexity regimes. Each program is executed and its sample is retained only if it passes kernel geometry, solid validity, and export checks; release reports also record modality completeness and split-level duplicate tests.
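The abstract does not say how the split-level duplicate tests are implemented. One plausible approach, sketched below purely as an assumption, is to fingerprint each program after light whitespace normalization and report any fingerprint that appears in more than one split. The function names and the normalization rule are illustrative, not taken from the release.

```python
import hashlib


def program_fingerprint(source: str) -> str:
    """Hash a program after stripping trailing whitespace and blank lines."""
    lines = [line.rstrip() for line in source.splitlines()]
    canonical = "\n".join(line for line in lines if line)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cross_split_duplicates(
    splits: dict[str, dict[str, str]],
) -> list[tuple[str, str, str, str]]:
    """Find programs whose fingerprint occurs in two different splits.

    `splits` maps split name -> {sample_id: program source}.
    Returns (first_split, first_id, other_split, other_id) tuples.
    """
    seen: dict[str, tuple[str, str]] = {}  # fingerprint -> first occurrence
    duplicates = []
    for split_name, programs in splits.items():
        for sample_id, source in programs.items():
            fingerprint = program_fingerprint(source)
            first = seen.get(fingerprint)
            if first is None:
                seen[fingerprint] = (split_name, sample_id)
            elif first[0] != split_name:
                duplicates.append((first[0], first[1], split_name, sample_id))
    return duplicates
```

Text hashing only catches near-verbatim copies; two different programs that build the same geometry would need a geometric comparison instead.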

A Qwen2.5-Coder-1.5B LoRA baseline trained on 80,000 samples achieves 99.98% Python syntax validity, 99.97% Flluma build success, and 99.14% STEP-export validity on the held-out 10,000-sample test split. For the 9,909 predictions converted to surface point clouds, the mean normalized Chamfer Distance is 0.002124.
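For readers unfamiliar with the metric, the following is a minimal NumPy sketch of a normalized Chamfer Distance between a predicted and a reference point cloud. The abstract does not give the exact definition used, so two choices here are assumptions: both clouds are scaled by the reference cloud's bounding-box diagonal, and the distance is the sum of the two directional means of squared nearest-neighbor distances.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance using squared Euclidean distances."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances, shape (len(a), len(b)).
    diff = a[:, None, :] - b[None, :, :]
    sq = np.sum(diff * diff, axis=-1)
    # Mean nearest-neighbor distance in each direction, then summed.
    return float(sq.min(axis=1).mean() + sq.min(axis=0).mean())


def normalized_chamfer(pred: np.ndarray, ref: np.ndarray) -> float:
    """Chamfer Distance after scaling both clouds by the reference extent.

    Assumed normalization: center on the reference bounding box and divide
    by its diagonal, so the score is independent of the part's units.
    """
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    lo, hi = ref.min(axis=0), ref.max(axis=0)
    center = (lo + hi) / 2.0
    scale = np.linalg.norm(hi - lo)
    if scale == 0.0:  # degenerate reference (single point)
        scale = 1.0
    return chamfer_distance((pred - center) / scale, (ref - center) / scale)
```

The dense pairwise matrix is fine for a few thousand points per cloud; larger clouds would normally use a spatial index such as a k-d tree instead.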

The dataset supports conditioned CAD reconstruction, executable program synthesis, feature-tree prediction, B-Rep analysis, retrieval, design completion, and editable reverse engineering.

Blogger's Review: FllumaOne is a notable step for CAD research because every model is generated by an executable program, which keeps the construction history editable and opens new possibilities for design automation and reverse engineering. Its execution-time geometry, solid-validity, and export checks give the data a quality baseline that future research and development can build on.

Original Source: https://arxiv.org/abs/2606.17696
