Frequently Asked Questions

General Challenge Questions

1. We would like to participate. What do we need to do?

Please fill out this participation intent form, listing your institution, your team, and the tracks you will participate in. Simply follow the instructions and submit the form.

2. I am interested only in submitting a paper but not in the Challenge. Can I do that?

Yes. Please make sure to submit your paper by the submission deadline.

3. How large can a team be?

There are no restrictions on team size.

4. What are the rules for downloading the data set?

A participation agreement is made available before the data is shared. You must accept the agreement and submit your response before you are given access to the dataset.

5. Can I use any available data set to train models for detecting vehicles in this Challenge?

Yes. There are no constraints on the models or approaches used to perform your tasks. You are free to use whatever method you deem best.

6. What are the prizes?

This information is shared in the Awards section.

7. Will we need to submit our code?

Winning teams will need to submit their code for verification purposes so that organizers can ensure that the tasks were performed by algorithms and not humans.

8. How will the submissions be evaluated?

The submission formats for each track are detailed on the Data and Evaluation page.

9. When is the deadline to submit our final evaluation results?

Evaluation results are due on Apr 5 at 5 pm Pacific. Please see the updated timeline. The submission system will reopen a few days later to allow teams to submit additional results, but those results will not be considered when choosing the winning teams for the challenge.

10. Some of the videos have objects that are tiny at the far end. Will you provide us with the region of interest (ROI) for each video sequence?

While ROIs will not be provided for the videos, teams can safely ignore small distant objects or vehicles on side-streets in all videos.

11. Will all teams need to submit a paper in mid-February, even though the challenge does not end until April?

Please see the updated timeline. Teams that have successfully submitted results to at least one track by the submission deadline should also submit a paper. Accepted papers are not limited to those of winning teams.

12. How long should submitted research/challenge papers be?

Both research and challenge papers should be 6-8 pages in length and follow the CVPR format.

13. Is it possible to submit a description of our solution as the paper submission? That would mean no quantitative results, since we do not have ground truth for the test data. Can we leave the experimental results out and fill in the values after you announce them?

Yes, a paper discussing only a solution is acceptable.

Users will be able to see their own local scores as soon as they start submitting. Additionally, there will be 1 day between the release of the challenge leaderboard and the paper deadline, which will allow teams to include results in their papers before submitting. An updated timeline for the challenge has been posted here.

14. When is the deadline for CVPRW paper submission for review? We only see the deadline for camera-ready paper submission on the webpage, but no deadline is given for paper submission for review.

The paper deadline is April 6, and the submission should be camera ready at review time: papers will be reviewed in a very short time frame, and if a paper is accepted, the authors will have only 1 day to send the final version. This is why we ask for a camera-ready version by April 6. We believe this is better than moving the paper deadline all the way back to the original deadline.

Track 1 Questions

1. Is there a training set of speed estimation and multiple-sensor tracking for us to fine-tune the performance? It looks like UA-DETRAC does not provide ground truths for the two tasks.

No such training set will be provided. Teams are free to use any other datasets available on the Web that could help train their models.

2. What are the metrics to evaluate the performance of speed estimation and multiple-sensor tracking? Is ground truth data available for all the objects in the experimental videos?

The evaluation strategy and metrics are detailed on the Data and Evaluation page. Tracks 1 and 3 are evaluated based on a subset of ground-truth control vehicles, whose speed and location information was collected during the experiment. Track 2 videos have been manually annotated by a third party.

3. Are the box coordinates (ymin, xmin) measured from the video's upper-left or bottom-left corner (for which point do we have y=0)? Does pixel counting start from 0 or from 1? Are the box coordinates integers or floating-point numbers?

The bounding box coordinates are integers. Similar to the VOC2012 challenge format, bounding box coordinates (xmin, ymin, xmax, ymax) are computed from the top-left corner of the image, and counting starts at 0, i.e., a bounding box sitting on the left and top edges of the image would have xmin and ymin equal to 0.  We have clarified the evaluation page to reflect this.
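
For illustration, here is a minimal Python sketch of that convention (the helper below is hypothetical and not part of any official toolkit; clamping to width - 1 and height - 1 is our assumption about in-bounds boxes):

    # VOC2012-style convention described above: 0-indexed integer
    # coordinates measured from the top-left corner of the image.
    def clamp_box(xmin, ymin, xmax, ymax, width, height):
        """Clamp a bounding box to the image bounds (hypothetical helper)."""
        xmin = max(0, min(int(xmin), width - 1))
        ymin = max(0, min(int(ymin), height - 1))
        xmax = max(0, min(int(xmax), width - 1))
        ymax = max(0, min(int(ymax), height - 1))
        return xmin, ymin, xmax, ymax

    # A box sitting on the left and top edges of a 1920x1080 frame:
    print(clamp_box(0, 0, 150, 90, 1920, 1080))  # -> (0, 0, 150, 90)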

4. The ground-truth speed values provided by the DETRAC dataset, which will be used for training, essentially consist of the moving object's tracked displacement. Is it this value (in pixels) that you will evaluate in the Track 1 task, or do you require the actual speed (km/h) of the vehicle?

The response variable is the actual speed in mi/h (not km/h). We have clarified the evaluation page to reflect this.
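
If your pipeline estimates speed in km/h, a minimal conversion sketch in Python (the constant is the standard definition of the international mile):

    KM_PER_MILE = 1.609344  # kilometers per international mile

    def kmh_to_mph(speed_kmh):
        """Convert an estimated speed from km/h to mi/h for submission."""
        return speed_kmh / KM_PER_MILE

    print(round(kmh_to_mph(100.0), 2))  # -> 62.14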

5. What do you mean by normalized RMSE (NRMSE)?

RMSE scores range between 0 and infinity. Min-max normalization reduces the range to [0, 1], providing a relative RMSE score for each team in relation to the scores obtained by other teams. NRMSE is computed as

    \[ NRMSE_i = \frac{RMSE_i-RMSE_{min}}{RMSE_{max}-RMSE_{min}} \]

where \(RMSE_{min}\) and \(RMSE_{max}\) are the minimum and maximum RMSE values among all teams, respectively. We have clarified the evaluation page to reflect this.
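
For clarity, a minimal Python sketch of this min-max normalization (the function name and the per-team score list are illustrative, not taken from the official evaluation code):

    def normalized_rmse(rmse_scores):
        """Min-max normalize per-team RMSE values to the range [0, 1]."""
        rmse_min, rmse_max = min(rmse_scores), max(rmse_scores)
        if rmse_max == rmse_min:
            return [0.0 for _ in rmse_scores]  # all teams tied
        return [(r - rmse_min) / (rmse_max - rmse_min) for r in rmse_scores]

    print(normalized_rmse([4.2, 7.5, 12.1]))  # -> [0.0, 0.4177..., 1.0]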

6. In the submission system, what does the local S score mean? Is it the actual score calculated from the evaluation equation?

The S scores in the submission system display relative scores rather than actual scores. Even if we displayed the actual score for the first submission, a second submission would change the first submission's S score to 1 or 0. The same logic applies to the global S scores.

Track 2 Questions

1. In the instructions for the submission format, there is an ambiguity concerning the <timestamp> value. Does it refer to the time at which an anomaly starts (is detected) or the time at which the anomaly score is highest? Moreover, what about the duration of an anomaly? Is this information of interest during the evaluation process? Is it incorporated in some way in the aforementioned timestamp value?

Teams should indicate only the starting point of the anomaly, e.g., the moment when a vehicle first hits another vehicle or runs off the road. The duration of the anomaly does not need to be reported.

2. Concerning videos with multiple anomalies, if a second anomaly occurs while the first anomaly is still in progress, should we identify it as a new anomaly?

No, only one anomaly should be reported. We have clarified the evaluation page to reflect this. In particular, if a second anomaly happens within 2 minutes of the first, it should be counted as the same anomaly as the first (i.e., a multi-car pileup is treated as one accident).
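
To illustrate this rule, here is a hedged Python sketch that merges detected anomaly start times falling within the 2-minute window (hypothetical post-processing on your own detections, not part of the evaluation code):

    def merge_anomalies(start_times, window=120.0):
        """Drop anomaly start times (in seconds) that fall within `window`
        seconds of the previously reported anomaly."""
        merged = []
        for t in sorted(start_times):
            if not merged or t - merged[-1] > window:
                merged.append(t)
        return merged

    # A pileup at 310 s, 50 s after the first collision, is reported once:
    print(merge_anomalies([260.0, 310.0, 600.0]))  # -> [260.0, 600.0]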

3. Finally, in the submission file, should we send only the anomalies we detect, or the top 100 scores corresponding to the most likely abnormal events? Could it be fewer or more?

You should not submit more than 100 predicted anomalies. The evaluation strategy is designed to penalize false positives. As such, you should only submit your N highest-confidence results, where N <= 100.
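
As a sketch of that filtering step in Python (the tuple layout (video_id, timestamp, confidence) is an assumption based on the submission format discussed in the next question):

    def top_n_predictions(predictions, n=100):
        """Keep at most `n` predictions, ranked by confidence (descending).

        `predictions` is assumed to be a list of
        (video_id, timestamp, confidence) tuples.
        """
        ranked = sorted(predictions, key=lambda p: p[2], reverse=True)
        return ranked[:n]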

4. For the “timestamp”, what kind of format should we follow? On your evaluation page, you mention “<timestamp> is the relative time, in seconds, from the start of the video (e.g., 12.3456)”, but in your testing data you give the result “2 260 0.86”. Which one is correct, “260” or “12.3456”?

The timestamp 260 refers to 260.0 seconds from the onset of the video, i.e. 4 minutes and 20 seconds into the video. If you look at video 2, you will see a car stopping on the side of the road around 4 minutes and 20 seconds into the video.
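
For illustration, a small Python sketch that interprets such a timestamp and formats one result line (the line layout mirrors the “2 260 0.86” example above; anything beyond that is an assumption):

    def describe_timestamp(seconds):
        """Convert a relative timestamp (seconds from video start) to minutes and seconds."""
        minutes, secs = divmod(seconds, 60)
        return f"{int(minutes)} min {secs:.1f} s into the video"

    def submission_line(video_id, timestamp, confidence):
        """Format one Track 2 result as '<video_id> <timestamp> <confidence>'."""
        return f"{video_id} {timestamp:g} {confidence:.2f}"

    print(describe_timestamp(260))        # -> "4 min 20.0 s into the video"
    print(submission_line(2, 260, 0.86))  # -> "2 260 0.86"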

5. If our confidence scores are all binary values, e.g., 0 or 1 for each frame, how do you handle this case?

The confidence score should be between 0 and 1. It is not currently used in the evaluation but may be used in the future. As such, it would be beneficial to include confidence scores if possible.

Track 3 Questions

1. In the challenges of multiple-object tracking, usually all the tracking methods under comparison need to be based on the same set of noisy detection results in order to be fair. Will there be any shared detection results provided, or should teams use their own detectors for testing?

Shared detection results will not be provided. The challenge is a holistic one and does not focus on tracking quality alone. As such, teams may use any detection and tracking methods they choose.

2. Will information regarding the specific starting and ending time in real world for videos in Track 3 be provided? Such information is needed to properly perform multi-sensor tracking.

Track 3 asks that all vehicles that pass through all 4 checkpoints at least once be identified and tracked through those points. It does not assume that the vehicles pass through those points within the same trip, going from one location to another. As such, synchronization between recording locations or exact recording start and end times will not be needed to successfully solve this task.
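
One way to picture this requirement (a Python sketch under the assumption that your own tracker produces consistent vehicle IDs per location; no such API is provided by the challenge):

    def vehicles_through_all_checkpoints(sightings):
        """Return vehicle IDs observed at every checkpoint at least once.

        `sightings` maps checkpoint name -> set of vehicle IDs your tracker
        observed there (4 checkpoints in Track 3).
        """
        checkpoint_sets = list(sightings.values())
        if not checkpoint_sets:
            return set()
        return set.intersection(*checkpoint_sets)

    sightings = {
        "loc1": {"car_a", "car_b", "car_c"},
        "loc2": {"car_a", "car_c"},
        "loc3": {"car_a", "car_c", "car_d"},
        "loc4": {"car_a", "car_c"},
    }
    print(vehicles_through_all_checkpoints(sightings))  # -> {'car_a', 'car_c'}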

3. What kind of cars are we looking for (a car with a green label, for example)? Are there any distinct characteristics, or was the choice completely random?

There are no distinct characteristics of the vehicles in question. As stated in the Track 3 description, the only requirement is that the vehicle must pass through each checkpoint (sensor location) at least once.

4. Are there more than 100 valid trackable moving cars? Should we submit the 100 we are most confident about?

The evaluation strategy is designed to penalize false positives. As such, you should only submit your N highest-confidence results, where N <= 100.