Open☀️3D

Introduction

The ability to perceive, understand and interact with arbitrary 3D environments is a long-standing goal in research with applications in AR/VR, health, robotics and so on. Current 3D scene understanding models are largely limited to low-level recognition tasks such as object detection or semantic segmentation, and do not generalize well beyond a pre-defined set of training labels. More recently, large visual-language models (VLMs), such as CLIP, have demonstrated impressive capabilities trained solely on internet-scale image-language pairs. Some initial works have shown that these models have the potential not only to extend 3D scene understanding to open-set recognition, but also to offer additional applications such as affordances, materials, activities, and properties of unseen environments. The goal of this workshop is to bundle these efforts and to discuss and establish clear task definitions, evaluation metrics, and benchmark datasets.

Schedule

13:30 - 13:45	Welcome & Introduction
13:45 - 14:15	Keynote 1
14:15 - 14:45	Keynote 2
14:45 - 15:00	Winner Presentations
15:00 - 15:45	Poster Session & Coffee Break
15:45 - 16:15	Keynote 3
16:15 - 16:45	Keynote 4
16:45 - 17:30	Concluding Remarks

Keynote Speakers

Laura Leal-Taixé

Senior Research Manager at NVIDIA

Dr. Laura Leal-Taixé is a Senior Research Manager at NVIDIA and also an Adjunct Professor at the Technical University of Munich (TUM), leading the Dynamic Vision and Learning group. From 2018 until 2022, she was a tenure-track professor at TUM. Before that, she spent two years as a postdoctoral researcher at ETH Zurich, Switzerland, and a year as a senior postdoctoral researcher in the Computer Vision Group at the Technical University of Munich. She obtained her PhD from the Leibniz University of Hannover in Germany, spending a year as a visiting scholar at the University of Michigan, Ann Arbor, USA. She pursued her B.Sc. and M.Sc. in Telecommunications Engineering at the Technical University of Catalonia (UPC) in her native city of Barcelona. She went to Boston, USA to do her Master's thesis at Northeastern University with a fellowship from the Vodafone Foundation. She is a recipient of the Sofja Kovalevskaja Award of 1.65 million euros in 2017, the Google Faculty Award in 2021, and the ERC Starting Grant in 2022.

Krishna Murthy Jatavallabhula

PostDoc at MIT CSAIL

Krishna Murthy Jatavallabhula is a postdoc at MIT CSAIL with Antonio Torralba and Josh Tenenbaum. He received his PhD at Mila, advised by Liam Paull. His research focuses on designing structured world models for robots: rich, multisensory models of the physical world that enable robots and embodied AI systems to perceive, reason, and act as humans do. His work draws upon ideas from robotics, computer vision, graphics, and computational cognitive science, intertwining our understanding of the world with probabilistic inference and deep learning. His work has been recognized with PhD fellowship awards from NVIDIA and Google, and a best-paper award from IEEE RAL.

Georgia Chalvatzaki

Full Professor at Technical University of Darmstadt

Since April 2023, Georgia has been a Full Professor for Interactive Robot Perception & Learning at the Computer Science Department of the Technical University of Darmstadt and Hessian.AI. Before that, she was an Assistant Professor from February 2022 and an Independent Research Group Leader from March 2021, after receiving funding from the renowned Emmy Noether Programme (ENP) of the German Research Foundation (DFG). This project was awarded within the ENP Artificial Intelligence call of the DFG. In her research group, PEARL (previously iROSA), Dr. Chalvatzaki and her team propose new methods at the intersection of machine learning and classical robotics, taking the research for embodied AI robotic assistants one step further. The research in PEARL proposes novel methods for combined planning and learning to enable mobile manipulator robots to solve complex tasks in house-like environments, with a human in the loop of the interaction process.

Alex Bewley

Researcher at Google DeepMind

Alex Bewley is a Researcher at Google in Zurich, Switzerland, where he investigates novel approaches to machine learning and perception. Previously, he was a Postdoc at the Applied Artificial Intelligence Lab at the University of Oxford (formerly part of the Mobile Robotics Group) working with Ingmar Posner and Paul Newman. There, the scope of his research covered various domains including multi-task learning, unsupervised domain adaptation, visual attention, model introspection and interpretability. He completed his PhD research at the Queensland University of Technology (Australia) alongside the ARC Centre of Excellence for Robotic Vision. His PhD topic focused on the automatic detection and tracking of moving objects from video data with applications towards field robotics.

Related Works

Below is a collection of concurrent and related works in the field of open-set 3D scene understanding. Please feel free to get in touch to add other works as well.

and many more ...

Dates

Paper Track: We accept novel full 14-page papers for publication in the proceedings, as well as shorter 4-page extended abstracts or 14-page papers of novel or previously published work that will not be included in the proceedings. Full papers should use the official ECCV 2024 template. Extended abstracts are not subject to the ECCV rules and may use any template; however, so that they are not considered a publication under double-submission policies, they should be 4 pages in the CVPR template format.

Submission Portal: CMT
Paper Submission Deadline: August 12, 2024
Notification to Authors: August 19, 2024
Camera-ready submission: August 25, 2024