☀️ OpenSUN 3D 🌍

1st Workshop on Open-Vocabulary 3D Scene Understanding

in conjunction with ICCV 2023, Paris, France.

Room E06 - Oct. 3rd Tuesday Afternoon

Motivation 💡

The ability to perceive, understand, and interact with arbitrary 3D environments is a long-standing goal in both academia and industry, with applications in AR/VR as well as robotics. Current 3D scene understanding models are largely limited to recognizing a closed set of pre-defined object classes. Recently, large vision-language models such as CLIP have demonstrated impressive capabilities when trained solely on internet-scale image-language pairs. Initial works have shown that these models have the potential to extend 3D scene understanding beyond closed-set recognition, enabling additional applications such as predicting affordances, materials, activities, and properties of unseen environments. The goal of this workshop is to bring together these initially siloed efforts and to discuss and establish clear task definitions, evaluation metrics, and benchmark datasets.

Schedule ⏰

13:20 - 13:30 Welcome & Introduction
13:30 - 14:00 Keynote: Jen Jen Chung
14:00 - 14:30 Keynote: Vishal Patel
14:30 - 14:45 Oral Sessions / Challenge Winners
14:45 - 15:15 Keynote: Thomas Funkhouser
15:15 - 16:00 Poster Session & Coffee Break
16:00 - 16:30 Keynote: Angela Dai
16:30 - 17:00 Keynote: Manolis Savva
17:00 - 17:30 Panel Discussion

Invited Speakers 🧑‍🏫

Professor Vishal Patel

Johns Hopkins University

Vishal M. Patel is an associate professor of electrical and computer engineering and a member of the Vision and Image Understanding Lab. His research interests are focused on computer vision, machine learning, image processing, medical image analysis, and biometrics. Patel is an associate editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence journal and chairs the conference subcommittee of IAPR Technical Committee on Biometrics (TC4). He has received a number of awards including the 2021 IEEE Signal Processing Society (SPS) Pierre-Simon Laplace Early Career Technical Achievement Award, the 2021 NSF CAREER Award, the 2021 IAPR Young Biometrics Investigator Award (YBIA), the 2016 ONR Young Investigator Award, and the 2016 Jimmy Lin Award for Invention.

Professor Angela Dai

Technical University of Munich

Angela Dai is an assistant professor at the Technical University of Munich (TUM) where she leads the 3D AI Lab. Her research focuses on understanding how the 3D world around us can be modeled and semantically understood. Prof. Dai is the creator of the seminal ScanNet benchmark that sparked the development of numerous 3D scene understanding works.

Professor Manolis Savva

Simon Fraser University

Manolis Savva is an assistant professor in the School of Computing Science at Simon Fraser University, and a Canada Research Chair in Computer Graphics. His research focuses on analysis, organization and generation of 3D content. The methods that he works on are stepping stones towards holistic 3D scene understanding revolving around people, with applications in computer graphics, computer vision, and robotics. Prof. Savva contributed highly influential works towards embodied AI including Matterport and Habitat.

Professor Thomas Funkhouser

Google / Princeton University

Thomas Funkhouser is a full professor at Princeton University and a senior research scientist at Google. His research focuses on computer graphics, computer vision, and in particular 3D machine perception. In recent years, Professor Funkhouser has greatly impacted the field of 3D scene understanding.

Professor Jen Jen Chung

University of Queensland

Jen Jen Chung is an associate professor in Mechatronics within the School of Information Technology and Electrical Engineering at the University of Queensland. Her current research interests include perception, planning and learning for robotic mobile manipulation, algorithms for robot navigation through human crowds, informative path planning and adaptive sampling.

Important Dates 🗓️

  • Paper Track: We accept full 8-page papers presenting novel work, which will be published in the proceedings, and shorter 4-page extended abstracts of either novel or previously published work, which will not be included in the proceedings. All submissions shall follow the ICCV 2023 author guidelines.
    • Submission Portal: CMT
    • Paper Submission Deadline: July 31, 2023 (23:59 Pacific Time)
    • Notification to Authors: August 9, 2023
    • Camera-ready submission: August 21, 2023
  • Challenge Track:
    • Submission Portal: EvalAI
    • Data Instructions & Helper Scripts: GitHub
    • Dev Phase Start: July 13, 2023
    • Submission Portal Start: July 17, 2023
    • Test Phase Start: August 16, 2023
    • Test Phase End: September 30, 2023 (Winners are decided on this date)

Accepted Papers 📄

CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition
Deepti B. Hegde, Jeya Maria Jose Valanarasu, Vishal Patel

CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
Junbo Zhang, Runpei Dong, Kaisheng Ma

The Change You Want to See (Now in 3D)
Ragav Sachdeva, Andrew Zisserman

Learning to Prompt CLIP for Monocular Depth Estimation: Exploring the Limits of Human Language
Dylan Auty, Krystian Mikolajczyk

SAM3D: Segment Anything in 3D Scenes
Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images
Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic

OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data
Shiyang Lu, Haonan Chang, Eric P. Jing, Yu Wu, Abdeslam Boularias, Kostas Bekris

Challenge Results

We have published a technical report providing an overview of our workshop challenge, results, and the methods of the winning teams!

Top-3 ranking teams from our workshop challenge are listed below:

Rank 1
Hongbo Tian¹˒², Chunjie Wang¹, Xiaosheng Yan¹, Bingwen Wang¹, Xuanyang Zhang¹, Xiao Liu¹
¹PICO, ByteDance, Beijing      ²Beijing University of Posts and Telecommunications
Method: -      mAP (↑): 6.08      AP_50 (↑): 14.08      AP_25 (↑): 17.67

Rank 2: VinAI-3DIS
Phuc Nguyen¹, Khoi Nguyen¹, Anh Tran¹, Cuong Pham¹
¹VinAI Research
Method: GitHub      mAP (↑): 4.13      AP_50 (↑): 12.14      AP_25 (↑): 39.41

Rank 3
Zhening Huang¹, Xiaoyang Wu², Xi Chen², Hengshuang Zhao², Lei Zhu³, Joan Lasenby¹
¹University of Cambridge      ²HKU      ³HKUST (Guangzhou)
Method: -      mAP (↑): 2.67      AP_50 (↑): 5.06      AP_25 (↑): 13.98


Program Committee

  • Alexander Hermans (RWTH)
  • Alexey Nekrasov (RWTH)
  • Ayush Jain (CMU)
  • Dávid Rozenberszki (TUM)
  • Francis Engelmann (ETH)
  • Ji Hou (Meta)
  • Jonas Schult (RWTH)
  • Or Litany (NVIDIA)
  • Songyou Peng (ETH)
  • Yujin Chen (TUM)