2025 WORKSHOP PROGRAM
This year’s AI City Challenge workshop will take place on Monday, October 20th as a full-day workshop of ICCV 2025.
Location: Hawai’i Convention Center, Honolulu, Hawai’i
Virtual attendance of our workshop will be available. We will share the link soon.
Please find the tentative workshop schedule below (Hawaii Time, GMT-10):
- Monday, October 20, 2025
- Workshop
- www.aicitychallenge.org
07:30 AM – 08:00 AM
Breakfast
08:00 AM – 08:30 AM
Opening – Workshop Overview
08:30 AM – 09:15 AM
Keynote
Speaker: Prof. Laura Leal-Taixé
09:15 AM – 10:00 AM
Keynote
Speaker: Prof. Frank Wang
10:00 AM – 10:15 AM
Morning Coffee Break
10:15 AM – 11:05 AM
Paper Presentations and Q&A – Track 1
(1) Paper_ID 21: Multi-Camera 3D Object Tracking via 3D Point Clouds and Re-Identification
(2) Paper_ID 12: DepthTrack: Cluster Meets BEV for Multi-Camera Multi-Target 3D Tracking
(3) Paper_ID 26: Online 3D Multi-Camera Perception through Robust 2D Tracking and Depth-based Late Aggregation
(4) Paper_ID 23: VGCRTrack: Multi-Camera 3D Tracking with View-Aware Geometric Center Refinement
(5) Paper_ID 6: MCBLT: Multi-Camera Multi-Object 3D Tracking in Long Videos
11:05 AM – 12:05 PM
Paper Presentations and Q&A – Track 2
(1) Paper_ID 4: TrafficInternVL: Understanding Traffic Scenarios with Vision–Language Models
(2) Paper_ID 34: Multi-Agent Cooperation for Traffic Safety Description and Analysis
(3) Paper_ID 14: TrafficVILA: Scaling Vision-Language Models to High-Resolution Video Understanding for Traffic Safety Analysis
(4) Paper_ID 13: TrafficInternVL: Spatially-Guided Fine-Tuning with Caption Refinement for Fine-Grained Traffic Safety Captioning and Visual Question Answering
(5) Paper_ID 27: TrafficVILA: A Multimodal Framework for Traffic Safety Description and Analysis
(6) Paper_ID 22: Domain-Aware Enhancements to Vision-Language Models for Urban Traffic Safety Question Answering
(7) Paper_ID 42: STER-VLM: Spatio-Temporal With Enhanced Reference Vision-Language Models
(8) Paper_ID 16: Task-Specific Dual-Model Framework for Comprehensive Traffic Safety Video Description and Analysis
12:05 PM – 01:30 PM
Lunch
01:30 PM – 02:15 PM
Paper Presentations and Q&A – Track 3
(1) Paper_ID 3: Warehouse Spatial Question Answering with LLM Agent: 1st Place Solution of the 9th AI City Challenge Track 3
(2) Paper_ID 20: Multimodal and Multi-task Fusion for Spatial Reasoning
(3) Paper_ID 9: SmolRGPT: Efficient Spatial Reasoning for Warehouse Environments with 600M Parameters
(4) Paper_ID 11: Prompt-Guided Spatial Understanding with RGB-D Transformers for Fine-Grained Object Relation Reasoning
(5) Paper_ID 41: TinyGiantVLM: A Lightweight Vision-Language Architecture for Spatial Reasoning under Resource Constraints
02:15 PM – 03:20 PM
Paper Presentations and Q&A – Track 4
(1) Paper_ID 39: A Lightweight and Data-Centric Framework for Real-Time Object Detection in Fisheye Camera
(2) Paper_ID 5: Enhanced Fisheye Object Detection via YOLO Ensemble Learning and Weighted Box Fusion
(3) Paper_ID 35: Boosting Fisheye Detection with Augmentations and Ensembles
(4) Paper_ID 17: Data Augmentation Is All You Need For Robust Fisheye Object Detection
(5) Paper_ID 7: A Unified Detection Pipeline for Robust Object Detection in Fisheye-Based Traffic Surveillance
(6) Paper_ID 8: Augmentation, Distillation and Optimization: A Practical Pipeline for Fisheye Object Detection on Edge Devices
(7) Paper_ID 25: A Real-time Vehicle Detection Pipeline with Data-centric Enhancements and Multi-stage DETR Distillation
(8) Paper_ID 32: Efficient and Distortion-Aware Fisheye Object Detection for Edge Devices
(9) Paper_ID 33: Real-Time Object Detection on Edge Devices: A Fisheye Specific DFINE
03:20 PM – 03:35 PM
Afternoon Coffee Break
03:35 PM – 03:55 PM
Paper Presentations and Q&A – Independent
(1) Paper_ID 15: EKI-GAN: Context-Aware Vehicle Trajectory Forecasting with Vehicle Factors and Environmental Information at Signalized Intersections
(2) Paper_ID 18: Hierarchical Multi-Modal Fusion for Roadside VRU Detection: Method Complementarity Under Sparse Label Constraints
03:55 PM – 04:15 PM
Award Ceremony
04:30 PM – 05:30 PM
Poster Session (29 posters)
05:30 PM
Adjourn