In recent years, deep learning has made great progress in several fundamental tasks such as image recognition and speech recognition, but there are still many problems to be explored in the field of video content understanding.
Douyin is a short-form video app launched in China in 2016. Its parent company ByteDance has been working on technologies that can better understand and recommend short videos.
A picture is worth a thousand words. A single picture contains a large amount of information that is difficult to describe in a few words, and richer media such as short videos contain even more. ByteDance currently holds a very large collection of short video clips, together with the behavior data of the hundreds of millions of users who watched them. It therefore has enough video content and user behavior data to predict users' preferences for short videos.
This challenge provides multi-modal video features, including visual, text, and audio features, as well as user interaction data such as clicks, likes, and follows. Each participant needs to model users' interests from the video features and the user interaction dataset, and then predict the users' click behavior on another video dataset.
The leaderboard scores the results submitted by participants using the method described on the evaluation webpage.
The challenge asks participants to predict, for each user and each given video in the test dataset, the probability that the user finishes watching the video and the probability that the user likes it.
We use AUC (area under the ROC curve) as the challenge metric. The higher the AUC, the higher the ranking.
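As an illustration of the metric, the sketch below computes AUC for a single binary target from predicted probabilities, using the rank-based (Mann-Whitney) formulation: AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the sample labels and scores are hypothetical and chosen only for illustration. The official scoring procedure, including how results for the two predicted behaviors are combined, is the one described on the evaluation webpage and may differ from this sketch.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (0/1) and real-valued scores.

    Uses the rank-sum (Mann-Whitney U) identity; tied scores share the
    average of the ranks they occupy.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    rank_sum_pos = 0.0  # sum of ranks held by positive examples
    n_pos = 0
    i = 0
    while i < n:
        # Find the run [i, j) of examples sharing the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # 1-based ranks i+1 .. j average to (i + 1 + j) / 2.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both positive and negative labels")
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


# Hypothetical example: 1 = the user liked the video, 0 = the user did not.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
```

Because AUC depends only on the ordering of the scores, any monotonic rescaling of the predicted probabilities leaves it unchanged. What matters for the ranking is how well a submission orders positive examples above negative ones.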
A large-scale dataset with hundreds of millions of records.
A relatively small-scale dataset with tens of millions of records.
If you have any questions, please post them on the discussion webpage. For questions about the datasets, please send an email to AI-Labfirstname.lastname@example.org. For questions about submission, please send an email to email@example.com.
Short Video Understanding Challenge
Sponsor: ICME 2019 & ByteDance Inc.