3rd Computer Vision for Metaverse Workshop
CV4Metaverse 2024

21 August 2024: Author notifications sent
26 August 2024: Camera-ready deadline [strict]

Co-located with ECCV 2024, Milan, Italy
29 September 2024 - 9:00 to 13:00

Room: Amber 1

Workshop Overview

In the ever-growing areas of Augmented Reality (AR), Virtual Reality (VR), and the expansive Metaverse, computer vision brings together the digital and physical worlds seamlessly. Its ability to understand and interpret visual information pushes these immersive technologies to new levels, enhancing user experiences, driving creative innovations, and exploring new frontiers.

In parallel, Natural Language Processing (NLP) is pivotal for deciphering human language and enabling applications such as translation and summarization. Large Language Models (LLMs) can now converse at a human level, drastically enhancing human-machine interaction. As CLIP and other multimodal foundation models show, textual information plays a significant role in understanding visual data. These large models may therefore contribute significantly to improving AR, VR, and the Metaverse, enabling hands-free navigation, voice-based commands, and immersive communication between avatars.

Challenge

The first Metaverse Apartment Retrieval Challenge will be held through this workshop.

Join the Google Group!

Join the Challenge!

Important Dates

Challenge Start: 25 March 2024
Private Test Published: 1 July 2024
Final Submission Deadline: 30 July 2024
Winners Announcement: 29 September 2024

Workshop Topics

Acknowledging the substantial advances language models have made across many domains, the third edition of the CV4Metaverse workshop aims to integrate language-based and pure computer vision techniques to advance the field. Topics of interest include, but are not limited to, the following:

Scene Understanding:
- Methods, algorithms, and systems for scene understanding to enable environmental interaction use cases in 3D scenes.
- Modeling the virtual/augmented environment (depth estimation, 3D reconstruction, object detection and tracking, multimedia understanding, etc.).
Metaverse Applications:
- Applications that use machine learning techniques to assist Metaverse users.
- Newly developed Metaverse datasets that can foster related research.
Cross-Modal Applications:
- Using other data modalities, such as text, to facilitate or create new applications in 3D scenes or the Metaverse.