2021 Evaluation System
The 2021 evaluation system is available here.
Please use the “Sign Up” link in the upper-right corner to create an evaluation account. The system requires an email address as the username and allows you to choose your own password. We request that you use the same email in the evaluation system as one of the team member emails in your original dataset request; this email will be used to validate your request and to grant you results-submission access. After filling out your team details and submitting your evaluation account request, you will receive an email asking you to verify the account. Once you verify the account, you can log in to the evaluation system. However, you will not be able to submit results until an administrator approves your submission access. Please allow 24-48 hours for this approval to be completed.
When your account has been approved for submitting results, the “Add” button under the appropriate Track tab on the Submissions page will be enabled and you will be able to submit to that track. Note that submissions may not be enabled for all tracks at the same time, so the “Add” button may be present for some tracks but not others. When submitting results, teams can choose to submit to either the “Public” or the “General” leaderboard. As the name suggests, the Public leaderboard will be shared with the public and published in our workshop summary paper. To submit to the Public leaderboard and compete for the challenge prizes, teams may not use external data when building their prediction models for any of the tracks, and they must submit their code, models, and any labels they created on the training datasets to the competition organizers before the end of the challenge. Alternatively, teams can submit to the General leaderboard, which also includes results from Public leaderboard submissions.
Several submission limits are imposed in the challenge. Each team may submit up to 5 results per track per day (Pacific time), and no more than 20 results in total for tracks 2, 3, and 5, and 10 results in total for tracks 1 and 4. These limits include both Public and General submissions. Note also that the results displayed in the submission tables and on the leaderboard are computed on a 50% subset of the test data; after the competition deadline, the system will automatically display evaluation results based on the full test set. Before the competition deadline, the leaderboard displays only the top 3 results and your current rank (if you are not in the top 3); at the end of the competition, it will show the entire list of best team submissions, including team names (please choose a descriptive team name).