GQA: Visual Reasoning in the Real World

Welcome to the GQA Challenge 2020!

Overview
Guidelines
FAQ

Please note the updated timeline. The timeline is currently tentative; we will finalize and announce it as soon as the competition begins. Given an image and a natural language question about the image, the task is to provide an accurate natural language answer. The challenge will be hosted on EvalAI.

November 2020:

GQA Challenge 2020 launched!

January 24, 2021:

Submission deadline at 23:59:59 UTC

March 2021:

Winners' announcement

The GQA dataset, with more than 110K images and 22M questions, is available on the download page. Each image is associated with a scene graph of the image's objects, attributes and relations. Each question is associated with a structured representation of its semantics: a functional program that specifies the reasoning steps that have to be taken to answer it. All annotations on the training and validation sets are publicly available.

Many of the GQA questions involve multiple reasoning skills, spatial understanding and multi-step inference, and are therefore generally more challenging than those in previous visual question answering datasets used in the community. We made sure to balance the dataset, tightly controlling the answer distribution for different groups of questions, in order to prevent educated guesses using language and world priors. The dataset is complemented with a suite of new metrics that test not only the accuracy, but also the consistency, validity and plausibility of models' responses, shedding much more light on their behavior.

After the challenge deadline, all challenge participant results on the test split will be made public on the test leaderboard.

The GQA challenge started in November 2020. This year's challenge begins with a development phase in November, followed by the test phase in January. It will end on January 24, 2021, at 23:59 UTC, with winners being announced in March 2021.

Note that in this year's challenge we will also release new and updated test and challenge question sets in November. When attempting the challenge, please make sure to answer each question in the dataset independently, based only on the question and its corresponding image.

Evaluation

Results must be submitted to the evaluation server by the challenge deadline. Competitors will be evaluated according to the metrics described on the evaluation page. We encourage participants to first submit to the validation phase to make sure they understand the submission procedure, as it is identical to the test and challenge submission procedure. Note that the validation and challenge evaluation servers do not have public leaderboards.

Submission

To enter the competition, first you need to create an account on EvalAI. We allow people to enter our challenge either privately or publicly. Any submissions to the challenge phase remain private and will be considered to be participating in the challenge. For submissions to the test phase, only those that were submitted before the challenge deadline and posted to the public leaderboard will be considered to be participating in the challenge.

Before uploading your results to EvalAI, you will need to create a JSON file which provides an answer to each question in submission_all_questions.json, and conforms to the following format:

results = [result]
result = {
    "questionId": str,
    "prediction": str
}
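
As a minimal sketch of producing a file in the format above, assuming your model's answers are held in a Python dictionary that maps question IDs to answers. The function names, the sample IDs and the output path are hypothetical and not part of the challenge tooling:

```python
import json


def build_results(predictions):
    """Convert a {questionId: answer} mapping into the list-of-records format."""
    # Both fields are cast to str because the format above declares them as strings.
    return [
        {"questionId": str(qid), "prediction": str(answer)}
        for qid, answer in predictions.items()
    ]


def write_results(predictions, path):
    """Serialize the records to a JSON file that can then be uploaded."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_results(predictions), f)


# Made-up question IDs and answers, for illustration only.
example_predictions = {"q001": "yes", "q002": "table"}
results = build_results(example_predictions)
```

Calling write_results(example_predictions, "results.json") would then write the list to disk. Casting IDs to strings guards against numeric IDs being serialized as JSON numbers.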

To submit your JSON file to the GQA evaluation servers, click on the Submit tab on the GQA Challenge 2020 EvalAI page. Select the phase (Validation, Test or Challenge) and the JSON file to upload, fill in the required fields (e.g. method name and method description), and finally click Submit. After the file is uploaded, the evaluation server will begin processing. To view the status of your submission, please go to the My Submissions tab and choose the phase to which the results file has been uploaded. Please be patient, as the evaluation may take quite some time to complete. If the status of your submission is Failed, please check the stderr file for the corresponding submission.
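
Because evaluation can take a while and daily uploads are limited, it may be worth checking a results file locally before submitting. The sketch below is one possible check, not the server's actual validation logic: the function name is hypothetical, and it assumes you have already collected the expected question IDs (for example from submission_all_questions.json, whose internal layout is not described here).

```python
def check_results(results, expected_ids):
    """Return a list of problems found; an empty list means no problems were detected."""
    problems = []
    seen = set()
    for i, record in enumerate(results):
        # Each record must be a dict carrying both required keys.
        if not isinstance(record, dict) or not {"questionId", "prediction"} <= set(record):
            problems.append(f"entry {i}: missing questionId or prediction")
            continue
        # Both values are declared as strings in the submission format.
        if not isinstance(record["questionId"], str) or not isinstance(record["prediction"], str):
            problems.append(f"entry {i}: questionId and prediction must be strings")
            continue
        if record["questionId"] in seen:
            problems.append(f"entry {i}: duplicate questionId {record['questionId']}")
        seen.add(record["questionId"])

    expected = set(expected_ids)
    missing = expected - seen
    if missing:
        problems.append(f"{len(missing)} question(s) have no prediction")
    unknown = seen - expected
    if unknown:
        problems.append(f"{len(unknown)} prediction(s) refer to unknown question IDs")
    return problems
```

An empty return value only means that this local check found nothing wrong. The evaluation server may apply additional rules.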

After evaluation is complete and the server shows a status of Finished, you will have the option to download your evaluation results by selecting Result File for the corresponding submission. The result file will contain the aggregated accuracy for the corresponding phase. If you want your submission to appear on the public leaderboard, please submit to the Test phase and check the box under Show on Leaderboard for the corresponding submission.

Please limit the number of entries to the challenge evaluation server to a reasonable number, e.g., one entry per paper. To avoid overfitting, the number of submissions per user is limited to 1 upload per day (according to the UTC time zone) and a maximum of 5 submissions per user. It is not acceptable to create multiple accounts for a single project to circumvent this limit. The exception to this is if a group publishes two papers describing unrelated methods; in this case, both sets of results can be submitted for evaluation. The Validation phase, however, allows for 10 submissions per day.

Resources

The download page contains links to all GQA images and questions and, for the train/val splits, the associated annotations as well. Please specify any and all external data used for training in the method description when uploading results to the evaluation server.

We provide an offline evaluation script as well as baselines for the GQA dataset. To download the GQA evaluation script, please visit our evaluation page. Baselines can be found in our GitHub repo.

For any questions or suggestions regarding the GQA challenge or dataset, please contact dorarad@cs.stanford.edu. In case of technical questions related to EvalAI, please post a message on the GQA Challenge forum. For further clarity, we answer some common questions below:

Q: What do I do if I want to make my test results public and participate in the challenge?
A: Making your results public (i.e., visible on the leaderboard) on the test phase implies that you are participating in the challenge.

Q: What do I do if I want to make my test results public, but I do not want to participate in the challenge?
A: We do not allow this option.

Q: What do I do if I want to participate in the challenge, but I do not want to make my test results public yet?
A: Submit results to the challenge phase, which was created for this scenario.

Q: When will I find out my accuracies on the challenge split?
A: We will reveal challenge results some time after the deadline. Results will first be announced in March 2021.

Q: Can I participate from more than one EvalAI team in the GQA challenge?
A: No, you are allowed to participate from one team only.

Q: Can I add other members to my EvalAI team?
A: Yes, you can add other members to your group.

Q: Is the daily/overall submission limit for each user or for the whole team?
A: The submission limit is for the whole team.