Home Page of MUSCLE WP5 (former WP6 and WP10)
Welcome to the web page of the 6th WorkPackage of the
MUSCLE (Multimedia Understanding through Semantic Computation and Learning) Network of Excellence, sponsored by the European Commission. The WorkPackage's objective is to foster multi-disciplinary research on
Cross-Modal Integration for Performance Improving in Multimedia.
The Big Picture: MUSCLE Project Overview
Thanks to the convergence of several strands of scientific and technological progress, we are witnessing unprecedented opportunities for the creation of a knowledge-driven society. Databases are acquiring large amounts of complex multimedia documents, networks allow fast and almost ubiquitous access to an abundance of resources, and processors have the computational power to run sophisticated and demanding algorithms. However, progress is hampered by the sheer amount and diversity of the available data. As a consequence, access can only be efficient if it is based directly on content and semantics, and extracting and indexing these is only feasible if done automatically.
MUSCLE aims at creating and supporting a pan-European Network of Excellence to foster close collaboration between research groups in multimedia data mining on the one hand and machine learning on the other, in order to make breakthrough progress towards the following objectives:
- Harnessing the full potential of machine learning and cross-modal interaction for the (semi-)automatic generation of metadata with high semantic content for multimedia documents.
- Applying machine learning for the creation of expressive, context-aware, self-learning, and human-centered interfaces that will be able to effectively assist users in the exploration of complex and rich multimedia content.
- Improving interoperability and exchangeability of heterogeneous and distributed (meta)data by enabling data descriptions of high semantic content (e.g. ontologies, MPEG-7 and XML schemata) and inference schemes that can reason about these at the appropriate levels.
- Contributing, through dissemination, training and industrial liaison, to the distribution and uptake of the technology by relevant end-users such as industry, education, and the service sector. In particular, close interactions with other IPs and NoEs in this and related activity fields are planned.
- Facilitating, by accomplishing the above, broad and democratic access (i.e. access that does not require special expertise) to information and knowledge for all European citizens (e.g. e-Education, enriched cultural heritage).
MUSCLE WorkPackage 5: Multimodal Processing and Interaction
In multimedia analysis, most tools are devoted to a single modality, with the other modalities treated as illustrations or complementary components. For example, web search engines do not use images, image retrieval systems barely mix textual and visual descriptions, and video processing is usually done separately on sound and on images. One main reason for this is that the different media belong to different and sometimes widely separated scientific fields. However, even without learning, the performance of multimedia analysis and understanding systems (especially in terms of robustness) can be greatly enhanced by combining different modalities. Thus one of the goals of this NoE is to develop algorithms and systems that process several different media present in the same multimedia document. This requires strong collaboration between many research groups. Examples of modalities to integrate include all possible combinations of:
- Vision/Video and Speech/Audio.
- Image/video and text or speech/audio and text.
- Multiple-cue versions of vision and/or speech.
- Vision (or speech) and tactile.
- Other semantic information or metadata.
These combinations of modalities can be either of the cross-interaction type or of the cross-integration type. Interaction implies an information reaction-diffusion among modalities, with feedback control of one modality by the others. Integration involves exploiting heterogeneous information cumulatively from various modalities and fusing data or features to improve performance. Our work addresses research on the theory and applications of multimedia analysis approaches that improve robustness and performance through cross-modal interaction and/or integration. Its general research objectives include several scientific and technological goals, which can be grouped into the following categories:
- State-of-the-Art Evaluation and Roadmap
- Cross-Modal Interaction in Multimedia Problems
- Cross-Modal Integration for Multimedia Analysis and Recognition
- Dissemination of Results
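As a rough illustration of the cross-integration idea described above (fusing information from several modalities), the following minimal Python sketch shows two common fusion styles: feature-level concatenation and decision-level weighted averaging. It is not a method taken from the WorkPackage itself; the function names, the modality names, the class labels, and the scores are all hypothetical and chosen only for the example.

```python
from typing import Dict, List


def early_fusion(audio_features: List[float], video_features: List[float]) -> List[float]:
    """Feature-level integration: concatenate per-modality feature vectors."""
    return list(audio_features) + list(video_features)


def late_fusion(scores: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> Dict[str, float]:
    """Decision-level integration: weighted average of per-modality class scores."""
    total = sum(weights[modality] for modality in scores)
    fused: Dict[str, float] = {}
    for modality, class_scores in scores.items():
        for label, score in class_scores.items():
            fused[label] = fused.get(label, 0.0) + weights[modality] * score / total
    return fused


# Hypothetical scores from an audio classifier and a video classifier
# for the same clip (assumed values, not real results).
scores = {
    "audio": {"speech": 0.5, "music": 0.5},
    "video": {"speech": 1.0, "music": 0.0},
}
weights = {"audio": 1.0, "video": 1.0}

fused = late_fusion(scores, weights)
best = max(fused, key=fused.get)
```

In this toy case the audio classifier alone cannot decide between the two labels, while the fused score favours the label that the video modality supports. Cross-interaction, where one modality steers the processing of another through feedback, would need a more elaborate sketch and is not shown here.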
During the first 18 months of the research activities of the NoE, the work in this WP will focus on specific problem areas that exemplify the philosophies of cross-media interaction and/or integration. Afterwards, during the second phase of the NoE, we also envision research efforts to develop a unified treatment that will encompass many specific cases.