Each team can submit forecasts for the validation set and the test set (the test set will be released 24 hours before the end of the competition). A forecast is the probability that an invitee will accept the invitation and answer the question.
The submission file has 4 columns: the first 3 columns identify an invitation, and the last column is the predicted probability. Please do NOT change the order of rows (i.e., do not re-sort them)! You can use sample_submission.txt as a format reference.
Format of submitted files
All participants need to add their forecasts as the 4th column of the invitation dataset (Dataset 8, the validation set: invite_info_evaluate.txt). Columns are separated by a tab character ('\t').
[Qxxx, Mxxx, D3-H4, Score]
Please note: keep the original order of rows and do NOT re-sort the rows of the validation or test set. The four columns are:
1. Question ID in an invitation, in the format of ‘Qxxx’.
2. ID of the invitee (or expert) in the invitation, in the format of ‘Mxxx’.
3. Invitation creation time, in the format of ‘D3-H4’.
4. Forecast of the probability that an invitee will accept an invitation and answer the question. The predicted value is a probability between 0 and 1.
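The column layout above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function name `append_forecasts`, the `predict` callback, and the output file name in the usage note are hypothetical, and the sketch assumes the input file has exactly three tab-separated columns per row. The creation time string is passed through unchanged.

```python
def append_forecasts(invite_lines, predict):
    """Yield submission rows: the three original columns plus a score.

    invite_lines: iterable of tab-separated lines (question ID, invitee ID,
                  creation time), in the original file order.
    predict:      hypothetical callback returning a probability in [0, 1].
    """
    for line_no, line in enumerate(invite_lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore a trailing empty line, if any
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(fields)}")
        question_id, invitee_id, created = fields
        score = float(predict(question_id, invitee_id, created))
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"line {line_no}: score {score} is outside [0, 1]")
        # Rows are emitted in input order; nothing is sorted.
        yield f"{question_id}\t{invitee_id}\t{created}\t{score:.6f}"
```

A possible usage, assuming the validation file is in the working directory and `my_model` is your own scoring function: open invite_info_evaluate.txt for reading and an output file for writing, then write each row from `append_forecasts(infile, my_model)` followed by a newline. Because the generator streams rows one at a time, the original row order is preserved by construction.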
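We use AUC to evaluate the submitted file. In its standard rank-based form:
$$\mathrm{AUC} = \frac{\sum_{i \in \text{positive}} \mathrm{rank}_i - \frac{M(M+1)}{2}}{M \times N}$$
In the equation, M and N are the numbers of positive and negative samples, respectively, and rank_i is the position of the i-th positive sample when all samples are sorted by predicted probability in ascending order.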
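For checking forecasts locally, here is a minimal Python sketch of a rank-based AUC computation. It assumes labels are 1 for an accepted-and-answered invitation and 0 otherwise, and it assigns tied scores their average rank. Both are assumptions for illustration; the organizers' evaluation script may treat ties differently. The function name `auc_from_ranks` is hypothetical.

```python
def auc_from_ranks(labels, scores):
    """Rank-based AUC: (sum of positive ranks - M(M+1)/2) / (M * N)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    m = sum(1 for y in labels if y == 1)  # number of positive samples
    n = len(labels) - m                   # number of negative samples
    if m == 0 or n == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    # Sort sample indices by score, ascending, then assign 1-based ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # assumption: ties share the mean rank
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    positive_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (positive_rank_sum - m * (m + 1) / 2) / (m * n)


# Made-up example: the two positives sit at ranks 2 and 4 out of 4.
print(auc_from_ranks([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Since AUC depends only on the ordering of the scores, rescaling the forecasts monotonically does not change the result; only their relative order matters.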
BAAI-Zhihu Expert Finding