Monday, December 21, 2015

AI Machines Understanding Stories

Now AI Machines Are Learning to Understand Stories

Face and speech recognition is now child’s play for the most advanced AI machines. But understanding stories is much harder. That looks set to change.

arXiv Org | December 14, 2015

Artificial-intelligence techniques are taking the world by storm. Last year, Google’s DeepMind research team unveiled a machine that had taught itself to play arcade video games. Earlier this year, a team of Chinese researchers demonstrated a face-recognition system that outperforms humans, and last week, the Chinese internet giant Baidu revealed a single speech-recognition system capable of transcribing both English and Mandarin Chinese.
Two factors have made this possible. The first is a better understanding of many-layered neural networks and how to fine-tune them for specific tasks. The second is the creation of the vast databases necessary to train these networks.

The team then asked human annotators to read the synopses for each movie. The annotators then had to formulate a number of questions about each paragraph they read, along with the answer to each one. On average, they wrote five questions per paragraph. They also had to highlight the section of the text that provided the answer to each question.
Finally, Tapaswi and co asked the annotators to read each question and answer and come up with four wrong answers to create a multiple-choice quiz. The resulting database contains over 7,000 questions about roughly 300 films.
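
To make the annotation procedure concrete, here is a minimal Python sketch of how one such quiz item might be represented and scored. It is an illustration only, not the MovieQA team's actual data format: the names `QuizItem`, `choices`, and `accuracy` are hypothetical, and the example question is made up rather than taken from the dataset.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class QuizItem:
    """One multiple-choice question (hypothetical layout, not the real format)."""
    question: str
    correct_answer: str
    wrong_answers: List[str]  # the four deceiving answers written by annotators
    evidence: str             # the synopsis passage the annotator highlighted

    def choices(self, rng: random.Random) -> List[str]:
        """Return all five candidate answers in shuffled order."""
        options = [self.correct_answer, *self.wrong_answers]
        rng.shuffle(options)
        return options


def accuracy(items: List[QuizItem], predictions: List[str]) -> float:
    """Fraction of items whose predicted answer equals the correct answer."""
    if not items:
        return 0.0
    hits = sum(pred == item.correct_answer
               for item, pred in zip(items, predictions))
    return hits / len(items)


# A made-up item for illustration; it does not come from the dataset.
example = QuizItem(
    question="Why does the captain turn the ship around?",
    correct_answer="A storm blocks the route ahead",
    wrong_answers=[
        "The crew mutinies",
        "The ship runs out of fuel",
        "He receives new orders",
        "A passenger falls ill",
    ],
    evidence="With a storm blocking the route ahead, the captain turns back.",
)

if __name__ == "__main__":
    print(example.choices(random.Random(0)))
    print(accuracy([example], ["The crew mutinies"]))
```

Under this layout, a system that picks one of the five answers at random would be expected to score about 20 percent, which gives a floor against which any story-understanding method can be compared.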

<more at; related links: (MovieQA: Understanding Stories in Movies through Question-Answering. Makarand Tapaswi, Yukun Zhu, Rainer Stiefelhagen, Antonio Torralba, Raquel Urtasun, and Sanja Fidler. arXiv:1512.02902v1 [cs.CV]. Submitted December 9, 2015. [Abstract: We introduce the MovieQA dataset which aims to evaluate automatic story comprehension from both video and text. The dataset consists of 7702 questions about 294 movies with high semantic diversity. The questions range from simpler "Who" did "What" to "Whom", to "Why" and "How" certain events occurred. Each question comes with a set of five possible answers; a correct one and four deceiving answers provided by human annotators. Our dataset is unique in that it contains multiple sources of information -- full-length movies, plots, subtitles, scripts and for a subset DVS. We analyze our data through various statistics and intelligent baselines. We further extend existing QA techniques to show that question-answering with such open-ended semantics is hard. We plan to create a benchmark with an active leader board, to encourage inspiring work in this challenging domain.]) and (2014 in Computing: Breakthroughs in Artificial Intelligence. The past year saw progress in developing hardware and software capable of human feats of intelligence. December 29, 2014)>
