The Conversational Intelligence Challenge
Call for human evaluators to enter pre-NIPS human evaluation round
Donate your dialogue to chatbot research
We encounter conversational artificial intelligence more and more often, from chatbots on messaging apps to personal assistants and voice control interfaces. While dialogue systems are becoming more widespread, a great deal of work remains to make conversational intelligence more sophisticated.
To create dialogue systems, we need conversations. So this year, the Conference on Neural Information Processing Systems (NIPS) is sponsoring an open competition called The Conversational Intelligence Challenge aimed at creating a chatbot that can hold an intelligent conversation with a human partner. The event will produce evaluations of state-of-the-art dialogue systems and an open-source data set for the future training of end-to-end systems.
Traditionally, chatbot technology was developed using sets of conversational rules, which were fine-tuned manually. Now, given the availability of large collections of real human conversations on the web, it is becoming increasingly feasible to generate rules automatically. However, the results depend on the quantity and quality of the data used to train models. This is why every volunteer human evaluator is critical for the success of the competition.
“During the first round of human evaluation, we collected 2,500 dialogues. From this data we learned that people engage less in a conversation with a bot than in a chat with another person. When two people chat, they often exchange short messages, which they can interpret correctly. Bots, by contrast, do not have a rich enough vocabulary, and they still lack understanding,” says Mikhail Burtsev, one of the challenge organizers and the head of the Laboratory of Neural Networks and Deep Learning at the Moscow Institute of Physics and Technology (MIPT). “We see that dialogue quality ratings do not depend exclusively on whether bots understand messages correctly and respond adequately. Other factors are at play. To improve the ability of bots to maintain a meaningful conversation, we need to know what those factors are and how to measure them. Our aim in the pre-NIPS round of evaluation is to collect at least 3,000 more dialogues, so that we can draw statistically sound conclusions and improve the tools for creating bots.”
The pre-NIPS human evaluation round will be held remotely from Nov. 20 to Dec. 8, 2017, via the @ConvaiBot on messaging platforms. During these weeks leading up to the NIPS conference, teams and volunteers will chat with the bots and evaluate the quality of the teams’ solutions, which will then be adjusted over the tuning round. The final ratings of the submissions will be presented on Dec. 8, during the competition session at NIPS. Please join our effort to create an open data set for the development of the next generation of conversational AI solutions and donate your dialogue via the @ConvaiBot on Telegram or Facebook Messenger.
Six teams from universities and companies around the world are taking part in the competition finals; their members represent the University of Wroclaw, MIPT, McGill University, KAIST, AIBrain, Crosscert, UMass Lowell’s Text Machine Lab, Trinity College, the Hong Kong Polytechnic University, and Fudan University.
The competition is organized by MIPT, Université de Montréal, McGill University, and Carnegie Mellon University in partnership with Facebook, Flint Capital, IVADO, Maluuba, and Element AI.