2026 Challenge Track Description

Track 6: Cross-City Object Detection (Milestone Project Hafnia)

  • Overview 

Track 6 focuses on fine-grained object detection in real-world traffic imagery under geographic domain shift. Participants are asked to train models on data from one city and evaluate them on hidden benchmarking data that includes both source-city samples and samples from a distinct target city with different visual characteristics, scene layouts, viewpoints, and environmental conditions.

In this context, “fine-grained object detection” means detecting more specific object categories, such as distinguishing between car, van, pickup truck, truck, motorcycle, bicycle, trailer, and person, rather than using broader categories such as vehicle. This remains an object detection task: the required outputs are bounding boxes and class labels, not pixel-level segmentation masks.

The track is designed to study cross-city generalization, a challenging setting that remains underexplored due to the difficulty of obtaining large-scale, compliant, privacy-preserving real-world data from multiple locations.

Track 6 is powered by Milestone Project Hafnia, which provides access to anonymized real-world traffic data through a managed Training-as-a-Service platform. Participants do not receive direct access to the full training or benchmarking datasets. Instead, training and benchmarking are conducted through Hafnia platform workflows that support privacy-conscious and compliance-aware experimentation.

Track 6 and the associated registration information will be released on May 18, 2026. 

  • Registration and access 

On the Hafnia Track 6 community page, users will find an overview, all relevant links, and registration information:

https://community.hafnia.milestonesys.com/home/clubs/ai-city-challenge-track-6-omnhs/overview

Applications will be reviewed by the organizers. Access is typically granted within 3–4 days of registration, provided that the application is valid.

Participation is limited to 200 individual Hafnia accounts. Access will be granted on a first-come, first-served basis after validation of the application. Additional seats may be considered depending on demand.

Participants must register using an email address associated with an official organization, such as a university, research lab, or company. Addresses from personal or commercial email providers may not be accepted for registration.

Participants may join as individuals or as part of a team. Each Hafnia account must correspond to a real individual participant. Teams must not request more accounts than the number of actual team members. Misrepresentation of team size or account ownership may lead to disqualification.

Access may be revoked for inactive accounts, including accounts with no login or no experiments for an extended period, or for accounts that violate challenge or platform rules.

To help improve the platform, participants may be invited to complete a short survey about their experience with Milestone Project Hafnia. This feedback will support future research events and potential commercial use cases.

Should you have any questions, you can reach out to us at:

info@comms.hafnia.milestonesys.com.

  • Data 

The track is based on a subset of a large-scale real-world traffic dataset curated through Milestone Project Hafnia.

The training data, which comprise a train split and a validation split, contain approximately:

    • 13,000 frames
    • 150,000 annotated object instances

The hidden benchmarking set contains a comparable number of frames and annotated instances. The benchmarking set includes both images from the training city and images from a new target domain. At a high level, participants should expect the benchmark to test robustness to domain shift across factors such as city, viewpoint, road layout, visual conditions, and object distributions. Specific benchmark details will remain hidden.

The data are extracted from real-world traffic video streams and include diverse viewpoints, roadway types, camera perspectives, and imaging conditions. Images are primarily provided at high resolution, with visual characteristics representative of real-world static traffic cameras.

The benchmark includes the following 10 object classes:

Class ID    Class Name
0           Vehicle.Car
1           Vehicle.Pickup Truck
2           Vehicle.Single Truck
3           Vehicle.Combo Truck
4           Vehicle.Heavy Duty Vehicle
5           Vehicle.Trailer
6           Vehicle.Motorcycle
7           Vehicle.Bicycle
8           Vehicle.Van
9           Person
    • Car – A standard passenger vehicle primarily designed to transport a small number of people (typically up to 5–7 passengers). Includes sedans, hatchbacks, coupes, and SUVs used for personal transportation. Police cars are also included.
    • Pickup Truck – A light-duty vehicle with an enclosed passenger cabin and an open cargo bed at the rear. Designed for transporting both passengers and goods.
    • Single Truck – A truck consisting of a single rigid frame with one main body section, without an attached trailer (i.e., 2-, 3-, or 4+-axle single-unit trucks). Includes delivery trucks, box trucks, and rigid cargo trucks.
    • Combo Truck – A truck composed of a tractor unit pulling one or more trailers. Also referred to as an articulated truck or semi-truck combination (i.e., 3-, 4-, 5-, or 6+-axle semi-trucks and twin-trailer semi-trucks).
    • Heavy Duty Vehicle – A large commercial or industrial vehicle designed for demanding transport or operational tasks. Includes construction vehicles, dump trucks, cement mixers, tractors, and other specialized heavy machinery operating on roads.
    • Trailer – A non-motorized vehicle designed to be towed by another vehicle for transporting goods, equipment, or materials. Includes cargo trailers, container trailers, and utility trailers. This class mostly appears together with vehicles that are not trucks, such as cars or pickup trucks. Trailers may also transport other vehicles; in that case they have been labelled as well.
    • Motorcycle – A two-wheeled motorized vehicle designed for one or two riders. Includes scooters, mopeds, and standard motorcycles. People on motorcycles will not be included in the vehicle bounding box.
    • Bicycle – A human-powered or electric-assisted two-wheeled vehicle operated using pedals. Includes standard bicycles and e-bikes. People on bicycles will not be included in the vehicle bounding box.
    • Van – A medium-sized vehicle designed for transporting goods or groups of people. Typically characterized by a box-shaped body and larger cargo or passenger capacity than a standard car.
    • Person – A human individual visible in the scene, regardless of posture or activity. Includes standing, sitting, walking, running, or partially visible individuals, provided there is sufficient visual information to identify them as a person. In addition, people inside or on a vehicle, or inside a building (even if visible through windows or doors), have been labelled as persons. People or faces appearing in the scene but not corresponding to a physical pedestrian or driver have also been annotated (e.g., advertisements featuring real people on vehicles, bus stops, or billboards). However, graphic representations of people, such as drawings, paintings, or sculptures, have not been labelled.
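
The class table above can be carried into training or inference code as a simple lookup. The following is a minimal sketch: the IDs and names are copied from the table, while the variable names and the `coarse_category` helper are hypothetical and not part of the Hafnia tooling.

```python
# Class IDs and names copied from the Track 6 class table.
TRACK6_CLASSES = {
    0: "Vehicle.Car",
    1: "Vehicle.Pickup Truck",
    2: "Vehicle.Single Truck",
    3: "Vehicle.Combo Truck",
    4: "Vehicle.Heavy Duty Vehicle",
    5: "Vehicle.Trailer",
    6: "Vehicle.Motorcycle",
    7: "Vehicle.Bicycle",
    8: "Vehicle.Van",
    9: "Person",
}

# Reverse lookup (name -> ID), e.g. for turning annotation names into integer labels.
NAME_TO_ID = {name: class_id for class_id, name in TRACK6_CLASSES.items()}


def coarse_category(class_id: int) -> str:
    """Hypothetical helper: collapse a fine-grained class to 'Vehicle' or 'Person'."""
    return TRACK6_CLASSES[class_id].split(".")[0]
```

A coarse grouping like this can be useful for error analysis (for example, separating vehicle-type confusions from missed detections), but predictions must still use the 10 fine-grained classes.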

Annotations are provided as axis-aligned 2D bounding boxes; the format is detailed in the Hafnia dataset format documentation ( https://github.com/milestone-hafnia/hafnia/blob/main/docs/dataset.md ). Objects are annotated when they are at least partially visible, including partially occluded or truncated objects.

Although Hafnia datasets may support additional metadata such as tracking information in other contexts, Track 6 is a single-image object detection task. Tracking information and temporal cues will not be available for the challenge dataset and must not be used at inference time. 

  • Data Access

Participants will not be able to directly download, view, or extract the full training or benchmarking datasets.

The full training data will be accessible only through managed Hafnia training jobs. The platform will block direct access to the full dataset.

Participants will have access to:

    • A small downloadable sample dataset for local development
    • Dataset statistics and metadata
    • Documentation describing the dataset format
    • Platform-based access to the full training split through managed experiments

The sample dataset may be used locally for the purpose of Track 6 to understand the visual content, validate data loading pipelines, adapt training code, and test Docker compatibility.

The full benchmarking data will remain hidden. Benchmarking will be performed in two steps once the functionality becomes available: first, inference will run inside the Hafnia platform on the hidden benchmarking data; second, the generated prediction files will be submitted to the official AI City Challenge evaluation system, where evaluation against the ground truth and ranking will take place.

  • Task

Given a single input image, participants must detect objects belonging to the 10 predefined Track 6 classes and assign each detection the correct class label.

The task is formulated as a single-image fine-grained object detection problem. In this context, fine-grained means that models must distinguish between specific object categories, such as different types of vehicles, rather than detecting only a broad vehicle category. This is still an object detection task, not an instance segmentation task: predictions must consist of bounding boxes and class labels, not pixel-level masks.

No temporal information, tracking information, or video sequence information may be used at inference time.

The main challenge is not only to achieve strong detection accuracy, but to build models that generalize across geographic domains, including different cities, camera viewpoints, road layouts, visual conditions, and object distributions.

  • Training Workflow

Training will be performed through the Hafnia Training-as-a-Service platform.

Participants will prepare and upload a trainer package containing their code, model definition, training command, configuration files, and required environment information. The platform will then execute the training job on the selected dataset split using the selected compute resources.

Participants may start training after they have been accepted into the platform and granted access to Track 6.

A quickstart guide and example trainer package for training object detectors will be provided through the Hafnia platform to help participants prepare their training workflow.

For the AI City Challenge, each participant account will receive 30,000 Hafnia credits. These credits can be used to run training jobs on the available GPU tiers throughout Track 6. GPU availability, credit cost, and detailed compute configuration will be described in the Hafnia documentation.

Participants should expect that lower-tier GPUs will provide approximately 400–500 hours of training, depending on the final credit cost. Higher-tier GPUs will consume credits faster and may have lower availability, meaning that jobs using larger GPUs may take longer to start.
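
The figures above imply a rough lower-tier credit rate, which can help with budgeting. The sketch below only rearranges the numbers stated in this description (30,000 credits, 400–500 hours); the 4x multiplier for a higher tier is an invented example, not a published price, and actual costs will be those given in the Hafnia documentation.

```python
TOTAL_CREDITS = 30_000  # per-account allocation stated in this description


def implied_credits_per_hour(total_credits: float, hours: float) -> float:
    """Credit rate implied by spending the whole budget over `hours` of training."""
    return total_credits / hours


def hours_available(total_credits: float, credits_per_hour: float) -> float:
    """Training hours a budget buys at a given credit rate."""
    return total_credits / credits_per_hour


# 400-500 hours on a lower-tier GPU implies roughly 60-75 credits per hour.
low_rate = implied_credits_per_hour(TOTAL_CREDITS, 500)   # 60.0
high_rate = implied_credits_per_hour(TOTAL_CREDITS, 400)  # 75.0

# Hypothetical higher tier at 4x the lower-tier rate (assumption for illustration only).
hypothetical_rate = 4 * high_rate
print(hours_available(TOTAL_CREDITS, hypothetical_rate))  # 100.0
```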

Participants may choose among the available GPU tiers while credits remain available. Credits are consumed when used and will not reset or be automatically replenished.

Each participant may run one experiment at a time.

Uploaded training materials are subject to platform limits. The total upload package, including trainer files, model files, Docker-related files, and associated materials, must fit within a 2 GB upload limit.
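
Before uploading, it may help to check the package size locally. The following is a minimal sketch, assuming the limit applies to the summed size of the files in the package directory and that "2 GB" means 2 × 10^9 bytes (the stricter reading). Both are assumptions, and the function names are illustrative rather than part of any Hafnia tool.

```python
from pathlib import Path

UPLOAD_LIMIT_BYTES = 2 * 10**9  # assumption: decimal GB, stricter than 2 GiB


def package_size_bytes(package_dir: str) -> int:
    """Sum the sizes of all regular files under package_dir, recursively."""
    root = Path(package_dir)
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def fits_upload_limit(package_dir: str, limit_bytes: int = UPLOAD_LIMIT_BYTES) -> bool:
    """Return True if the package directory is within the assumed upload limit."""
    return package_size_bytes(package_dir) <= limit_bytes
```

If the platform measures a compressed archive instead, the uncompressed total computed here is a conservative estimate.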

If a training job is stopped or fails before completion, participants will not receive a trained model artifact from that incomplete run. Restarting from checkpoints is not currently supported as a platform feature. Participants may download completed model artifacts and use them as the basis for a new training job, where applicable.

  • Pretrained Models, Ensembles, and External Data

Pretrained models are allowed.

External datasets are not allowed for training within the Track 6 challenge workflow. Participants may bring models that were pretrained outside the challenge, including models pretrained on their own data, but the challenge training stage must use the Track 6 data provided through Hafnia.

Ensembles are allowed, provided that they can be executed as a single inference pipeline within the platform constraints.

Very large models may be limited by platform constraints, including upload size, model size, runtime, memory, and available compute resources. Participants should review the platform documentation before preparing large models or complex inference pipelines.

  • Benchmarking and submission

Benchmarking functionality is expected to become available in early June 2026.

Once benchmarking is enabled, participants will upload their inference package to Hafnia. The package will include the trained model weights, inference source code, Docker environment, configuration files, and any required runtime parameters.

Benchmarking will run in two steps. First, the Hafnia platform will run inference on the hidden benchmarking data and generate prediction files in the format required by the official AI City Challenge evaluation system. Second, those prediction files will be submitted to the AI City Challenge evaluation and ranking website, where evaluation against the hidden ground truth and ranking will take place.

Participants will manually download the generated prediction files from Hafnia and submit them to the official AI City Challenge evaluation and ranking website.

Automatic transfer from Hafnia to the AI City Challenge evaluation system will not be part of the Track 6 workflow.

  • Evaluation

The primary evaluation will be based on object detection performance on the hidden benchmarking data.

Ranking is expected to be based on standard object detection metrics, including mAP, IoU thresholds, class-level averaging, and class/city-level performance. The final ranking is expected to use a single aggregate score combining these evaluation components.
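
For reference, intersection over union (IoU) between two axis-aligned boxes can be computed as in the minimal sketch below. The `(x_min, y_min, x_max, y_max)` box layout is an assumption made for illustration; the official evaluation code and its box convention may differ.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection is usually counted as a match for a ground-truth box of the same class when their IoU meets the chosen threshold.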

The benchmarking set will include both source-city and target-city samples, with emphasis on robust performance under cross-city domain shift.
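
The exact aggregate score has not been published. Purely as an illustration of how class-level and city-level averaging could combine, the sketch below averages per-class AP within each city and then averages across cities. The function names, the equal default weighting, and the AP values in the usage lines are all assumptions, not the official formula or real results.

```python
from typing import Dict, Optional


def city_map(ap_per_class: Dict[str, float]) -> float:
    """Macro-average of per-class AP values for one city."""
    return sum(ap_per_class.values()) / len(ap_per_class)


def aggregate_score(
    ap_by_city: Dict[str, Dict[str, float]],
    city_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of per-city mAP; equal weights unless specified (assumption)."""
    city_scores = {city: city_map(aps) for city, aps in ap_by_city.items()}
    if city_weights is None:
        city_weights = {city: 1.0 for city in city_scores}
    total_weight = sum(city_weights[city] for city in city_scores)
    weighted = sum(city_weights[city] * score for city, score in city_scores.items())
    return weighted / total_weight


# Made-up AP values, for illustration only.
example = {
    "source": {"Vehicle.Car": 0.75, "Person": 0.5},
    "target": {"Vehicle.Car": 0.5, "Person": 0.25},
}
print(aggregate_score(example))  # 0.5
```

Averaging per city first keeps a large source-city sample from masking weak target-city performance, which matches the stated emphasis on cross-city robustness.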

  • Model Ownership and Research Use

Participants remain responsible for any rights in the models they train, and no ownership is transferred to the organizers.

Trained model weights may be downloaded after completed training jobs throughout the duration of Track 6. Models and results obtained through Track 6 are intended for research purposes only and are not authorized for commercial use.

In accordance with the AI City Challenge rules, awarded models must be reproducible and made publicly available through GitHub. Award candidates will be required to release the source code and materials needed to reproduce their submitted approach by the open-source deadline listed below.

To ensure fairness and rule compliance, organizers may inspect submitted training and inference materials, including code and configuration files. Challenge results may be used by the organizers to report, communicate, and promote the challenge and platform outcomes. Organizers will not claim ownership of participants’ models or source code.

Bringing your own data into Hafnia is not available as part of Track 6. Participants may see references to bring-your-own-data functionality in general Hafnia documentation, but this functionality is outside the scope of the AI City Challenge Track 6 workflow.

  • Privacy and Compliance

A key motivation of Track 6 is to enable research on real-world traffic data while preserving privacy and compliance.

The challenge data are anonymized. Participants do not receive direct access to the full training or benchmarking corpus. Instead, full-data training and benchmarking are performed through managed platform workflows that prevent direct dataset extraction.

This setup enables privacy-conscious benchmarking and compliance-aware experimentation on real-world traffic imagery.

  • Dataset Citation

Participants using the Track 6 dataset in papers, reports, or other research outputs should cite the dataset as follows:

@misc{hafnia-dataset-eccv-cross-city,
author = {Milestone Systems},
title = {Hafnia Dataset: ECCV Cross City Object Detection Dataset},
url = {https://mdi.milestonesys.com/datasets/070b3bd5-0266-4446-a318-052c993558ef},
note = {Part of the Hafnia project},
version = {0.0.1},
address = {Copenhagen, Denmark},
year = {2024},
month = {dec},
}

  • Important dates

      • May 18, 2026 – Track 6 page released
      • May 18, 2026 – Registration information released
      • May 18, 2026 – Hafnia access process opens
      • May 18, 2026 – Sample dataset released
      • Early June 2026 – Benchmarking and inference functionality expected to become available
      • June 10, 2026 – Q&A with Milestone Project Hafnia
      • July 10, 2026 – Challenge submission
      • July 24, 2026 – Workshop paper due
      • August 1, 2026 – Acceptance notification
      • August 7, 2026 – Open source by award candidates
      • August 15, 2026 – Camera-ready papers due
      • September 8–9, 2026 – Presentations and awards at ECCV

  • Prizes

Prizes for Track 6 will be provided through the AI City Challenge organization, with NVIDIA as the challenge organizer and prize provider. Prize amounts, categories, eligibility, and award conditions will be disclosed by the challenge organization.

  • Organizers

Track 6 is organized by:

    • Milestone Project Hafnia
    • Universidad Autónoma de Madrid

The overall AI City Challenge is organized by NVIDIA.

Track organizers:

    • Fulgencio Navarro — Milestone Project Hafnia
    • Rafael Martin — Milestone Project Hafnia
    • Peter Christiansen — Milestone Project Hafnia
    • Juan Carlos SanMiguel — Universidad Autónoma de Madrid
    • Álvaro García-Martín — Universidad Autónoma de Madrid

  • Contact

For Track 6 and Hafnia-related questions, please contact:

info@comms.hafnia.milestonesys.com

Additional registration and community information will be available through the Hafnia Track 6 community page:

https://community.hafnia.milestonesys.com/home/clubs/ai-city-challenge-track-6-omnhs/overview