ACM International Conference on
Multimedia Retrieval,
Hong Kong, Jun. 5 - 8, 2012



Keynote 1: Cortically-Coupled Computing for Media Retrieval

by Paul Sajda, Professor of Biomedical Engineering, Columbia University

Keynote 2: Aggregating Local Image Descriptors for Large-scale Image Retrieval and Classification
by Cordelia Schmid, INRIA Research Director, Head of the LEAR project-team, Grenoble, France

Keynote 3: The Road to Pervasive Multimedia Search and Multimodal Interaction
by Hsiao-Wuen Hon, Managing Director, Microsoft Research Asia, Beijing, China


Paul Sajda
Dept of Biomedical Engineering
Columbia University
New York, U.S.A.
Title: Cortically-Coupled Computing for Media Retrieval

Abstract: Our visual systems are amazingly complex multimedia information processing machines. Using our brain's visual system we can recognize objects at a glance, under varying pose, illumination, and scale, and are able to rapidly learn and recognize new configurations of objects and exploit relevant context even in highly cluttered scenes. However, our brains are subject to fatigue and have difficulty finding patterns in high-dimensional feature spaces that are often useful representations for multimedia data. In this talk I will describe our work in developing a synergistic integration of human visual processing and computer vision via a novel brain-computer interface (BCI). Our approach, which we term cortically-coupled computer vision (C3Vision), uses non-invasively measured neural signatures from the electroencephalogram (EEG) that are indicative of user intent, interest, and high-level, subjective, and rapid reactions to visual and multimedia data. I will describe several system designs for C3Vision and current applications that are being developed for government and commercial use.

Biography: Paul Sajda is Professor of Biomedical Engineering and Radiology at Columbia University and Director of the Laboratory for Intelligent Imaging and Neural Computing (LIINC). His research focuses on neural engineering, neuroimaging, computational neural modeling and machine learning applied to image understanding. Prior to Columbia he was Head of the Adaptive Image and Signal Processing Group at the David Sarnoff Research Center in Princeton, NJ. He received his B.S. in Electrical Engineering from MIT and his M.S. and Ph.D. in Bioengineering from the University of Pennsylvania. He is a recipient of the NSF CAREER Award and the Sarnoff Technical Achievement Award, and is a Fellow of the IEEE and the American Institute of Medical and Biological Engineering (AIMBE). He is also the Editor-in-Chief of the IEEE Transactions on Neural Systems and Rehabilitation Engineering and a member of the IEEE Technical Committee on Neuroengineering. He has been involved in several technology start-ups and is a co-founder and Chairman of the Board of Neuromatters, LLC, a neurotechnology research and development company.

Cordelia Schmid
INRIA Research Director
Head of the LEAR project-team
Grenoble, France
Title: Aggregating Local Image Descriptors for Large-scale Image Retrieval and Classification

Abstract: We address the problems of large-scale image retrieval and classification. In both cases an appropriate image representation is important. We present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual-words approach for any given vector dimension.

Biography: Cordelia Schmid holds an M.S. degree in Computer Science from the University of Karlsruhe and a Doctorate from the Institut National Polytechnique de Grenoble (INPG). Her doctoral thesis received the best thesis award from INPG in 1996. Dr. Schmid was a post-doctoral research assistant in the Robotics Research Group of Oxford University in 1996-1997. Since 1997 she has held a permanent research position at INRIA Grenoble Rhône-Alpes, where she is a research director and directs the INRIA team called LEAR, for LEArning and Recognition in Vision. Dr. Schmid is the author of over a hundred technical publications. She has been an Associate Editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (2001-2005) and for the International Journal of Computer Vision (since 2004), and she has been program chair of the 2005 IEEE Conference on Computer Vision and Pattern Recognition and of the 2012 European Conference on Computer Vision. In 2006, she was awarded the Longuet-Higgins prize for fundamental contributions in computer vision that have withstood the test of time. She is a Fellow of the IEEE.

Hsiao-Wuen Hon
Managing Director
Microsoft Research Asia
Beijing, China

Title: The Road to Pervasive Multimedia Search and Multimodal Interaction

Abstract: Thanks to tremendous progress in multimedia and natural user interface (voice, vision, touch, pen, etc.) technologies, we are entering a new era of pervasive multimedia and multimodal experience. This experience, realized by a diverse array of devices, including mobile phones, PCs and TVs, plus persistent cloud services, will have a revolutionary impact on how people consume content and information. Many of our long-awaited scenarios in artificial intelligence and "information at your fingertips" will be fulfilled. Technologies that enable the general public to accumulate and disseminate human knowledge in multimedia form through natural multimodal user interfaces will be critical to improving life and well-being throughout the world. In this talk, I would like to use some recent technological advances in related areas to illustrate the excitement and opportunities in front of us. At the same time, most upcoming technical challenges call for multi-disciplinary innovation beyond the current makeup. Thus, I will advocate how the community can exploit multi-disciplinary innovation fully to lead this mission.

Biography: Hsiao-Wuen Hon is the Managing Director of Microsoft Research Asia, located in Beijing, China. Founded in 1998, Microsoft Research Asia has since become one of the best research centers in the world; MIT Technology Review called it “the hottest computer science research lab in the world.” Dr. Hon oversees the lab’s research activities and collaborations with academia in Asia Pacific.
An IEEE Fellow and a Distinguished Scientist of Microsoft, Dr. Hon is an internationally recognized expert in speech technology. He serves on the editorial board of Communications of the ACM. Dr. Hon has published more than 100 technical papers in international journals and at conferences. He co-authored a book, Spoken Language Processing, which is used as a graduate-level textbook and reference book on speech technology at many universities around the world. Dr. Hon holds three dozen patents in several technical areas.
Dr. Hon has been with Microsoft since 1995. He joined Microsoft Research Asia in 2004 as a Deputy Managing Director, responsible for research in Internet search, speech and natural language, systems, wireless and networking. In addition, from 2005 to 2007 he founded and managed the Search Technology Center (STC), which is responsible for development of the Microsoft Internet search product (Bing) in Asia Pacific.
Prior to joining Microsoft Research Asia, Dr. Hon was a founding member and architect of the Natural Interactive Services Division at Microsoft Corporation. Besides overseeing all architectural and technical aspects of the award-winning Microsoft® Speech Server product (Frost & Sullivan's 2005 Enterprise Infrastructure Product of the Year Award, Speech Technology Magazine’s 2004 Most Innovative Solutions Award and VSLive! 2004 Editors Choice Award), the Natural User Interface Platform and the Microsoft Assistance Platform, he was also responsible for managing and delivering statistical learning technologies and advanced search. Dr. Hon joined Microsoft Research as a senior researcher in 1995 and has been a key contributor to Microsoft's SAPI and speech engine technologies. He previously worked at Apple Computer, where he led research and development for Apple's Chinese Dictation Kit.
Dr. Hon received a Ph.D. in Computer Science from Carnegie Mellon University and a B.S. in Electrical Engineering from National Taiwan University.


