OpenSUN3D

4th Workshop on Open-World 3D Scene Understanding

in conjunction with CVPR 2025, June 11 or June 12, in Nashville, USA.

Introduction

The ability to perceive, understand, and interact with arbitrary 3D environments is a long-standing research goal with applications in AR/VR, healthcare, and robotics. Current 3D scene understanding models are largely limited to low-level recognition tasks such as object detection or semantic segmentation, and do not generalize well beyond a pre-defined set of training labels. More recently, large vision-language models (VLMs), such as CLIP, have demonstrated impressive capabilities despite being trained solely on internet-scale image-text pairs. Initial works have shown that such models have the potential to extend 3D scene understanding beyond closed-set recognition: not only to open-set recognition, but also to reasoning about affordances, materials, activities, and properties of unseen environments. The goal of this workshop is to consolidate these efforts, and to discuss and establish clear task definitions, evaluation metrics, and benchmark datasets.
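To make the open-vocabulary idea concrete, the minimal sketch below scores a single image against free-form text queries with CLIP, so any label expressible in natural language can be recognized without a fixed training vocabulary. This is an illustrative example rather than workshop code: the model name, image path, and query strings are arbitrary assumptions, and 3D open-vocabulary methods typically lift such 2D image-text scores onto point clouds or meshes.

    # Minimal open-vocabulary recognition sketch using the CLIP package
    # (https://github.com/openai/CLIP). The image file "scene.jpg" and the
    # queries below are hypothetical placeholders.
    import clip
    import torch
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    # Arbitrary, user-defined queries: object labels, affordances, materials, ...
    queries = ["a wooden chair", "something you can sit on", "a metal surface"]

    image = preprocess(Image.open("scene.jpg")).unsqueeze(0).to(device)
    text = clip.tokenize(queries).to(device)

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
        # Cosine similarity between the image and each text query,
        # normalized to a distribution over the queries
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        scores = (image_features @ text_features.T).softmax(dim=-1)

    for query, score in zip(queries, scores[0].tolist()):
        print(f"{query}: {score:.3f}")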

Keynote Speakers

Paper Track

We accept two kinds of submissions: novel full 8-page papers, which will be published in the proceedings, and either 4-page extended abstracts or 8-page papers of novel or previously published work, which will not be included in the proceedings. Full papers must use the official CVPR 2025 template. Extended abstracts are not bound to the CVPR template; however, to avoid counting as a publication under double-submission policies, they should not exceed 4 pages when formatted with the CVPR template.

Challenge

We host a challenge based on the SceneFun3D dataset. More details will follow soon.

Organizers