Improvements in network bandwidth, along with dramatic drops in digital storage and processing costs, have resulted in the explosive growth of multimedia (combinations of text, image, audio, and video) resources on the Internet and in digital repositories. A suite of computer technologies delivering speech, image, and natural language understanding can automatically derive descriptive metadata for such resources. Difficulties for end users ensue, however, from the tremendous volume and varying quality of automated metadata for multimedia information systems. This lecture surveys automatic metadata creation methods for dealing with multimedia information resources, using broadcast news, documentaries, and oral histories as examples. Strategies for improving the utility of such metadata are discussed, including computationally intensive approaches, leveraging multimodal redundancy, folding in context, and leaving precision-recall tradeoffs under user control. Interfaces built on automatically generated metadata are presented, illustrating the use of video surrogates in multimedia information systems. Traditional information retrieval evaluation is discussed through the annual National Institute of Standards and Technology TRECVID forum, with experiments on exploratory search extending the discussion beyond fact-finding to the broader, longer-term search activities of learning, analysis, synthesis, and discovery.
Table of Contents: Evolution of Multimedia Information Systems: 1990-2008 / Survey of Automatic Metadata Creation Methods / Refinement of Automatic Metadata / Multimedia Surrogates / End-User Utility for Metadata and Surrogates: Effectiveness, Efficiency, and Satisfaction