Document Integration

DI: Document Integration

University of Fribourg, Rolf Ingold (IP leader)
HES-SO Fribourg, Rudolf Scheurer
HES-SO Lausanne, Pierre Kueffer

Context and Goals
The main objective of the IM2 Document Integration project (IM2.DI) is to align various types of documents with video and speech recordings. Different kinds of static documents used during a meeting, whether distributed on paper or projected on a screen, are analyzed and compared with video and speech data so that the different modalities can be linked. In other words, the main goal of IM2.DI is to bridge the gap between non-temporal documents and temporal media.

For that purpose, the various documents used during meetings (agenda, reports, transparencies, participants' notes, etc.) are stored in a repository, then analyzed and indexed according to their physical and logical structures. We explicitly assume that all documents are available in electronic form, most often as PDF, which serves as the pivot representation for the analysis.

Research Issue
During the first two years, research activities concentrate on two major aspects of document alignment, complemented by user evaluation:

  • Document content alignment: The goal of this task is to align the textual content of documents with speech transcripts in order to detect citations, references, or more general thematic links. For that purpose, annotations resulting from various segmentations are produced. An obvious application of this alignment is a system that automatically links document parts with audio-video extracts of a meeting.
  • Document image alignment: The goal is to match low-resolution document images (such as video captures of projected slides) with the electronic form of the document available in the repository. Document images extracted from videos are analyzed in order to recover the temporal information contained in the presentation of the document (mainly slideshows). Various methods are developed to associate timestamps with state changes of projected documents (such as a new slide or a new list item in an animation) and to identify the documents inside the repository.
  • User evaluation: Document annotations are evaluated quantitatively by comparing them with a manual ground truth. We also want to assess the produced annotations qualitatively by setting up a demonstrator. An application called the Meeting Minutes Authoring Tool is intended to confirm the relevance of the produced annotations.
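As an illustration of the document content alignment task described above, the following is a minimal sketch of thematic linking, assuming a simple bag-of-words cosine similarity between document segments and timestamped transcript segments. The function names, the threshold value, and the similarity measure are hypothetical choices made for this example; the page does not state which segmentation or matching techniques the project actually uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(doc_segments, transcript_segments, threshold=0.2):
    """Link each transcript segment to its most similar document segment.

    doc_segments: non-empty list of strings (e.g. paragraphs or slides).
    transcript_segments: list of (start_sec, end_sec, text) tuples.
    threshold: hypothetical cut-off below which no link is created.
    Returns a list of (doc_index, start_sec, end_sec, score) tuples.
    """
    doc_vecs = [Counter(tokenize(s)) for s in doc_segments]
    links = []
    for start, end, text in transcript_segments:
        vec = Counter(tokenize(text))
        scores = [cosine(vec, d) for d in doc_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            links.append((best, start, end, scores[best]))
    return links
```

Each returned link pairs a document segment with a time interval of the recording, which is the kind of annotation a browser could use to jump from a document part to the matching audio-video extract.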
Further activities include:
  • Document-centric meeting environment: Set up a flexible and adequate environment to record meetings efficiently and to capture all the documents handled during a meeting, while minimizing technical assistance.
  • Recording document-centric meetings: Collect a corpus of meetings that is useful for our research on document alignment. For this reason, we capture meetings that deal with documents that are projected (video), discussed (speech), or both.
  • Design and implementation of document analysis web services: Our local database is fed with the output of the document-centric meeting room (audio, video, and document-related data) and with the annotations produced by document analysis algorithms. The goal of this task is to define and test a distributed framework for collecting and processing these data and annotations.
  • Prototype of a meeting organizer application;
  • Meeting capture tool, linked with the meeting organizer;
  • Document-centric meeting recordings (press reviews), with manual transcription of meetings and document annotations;
  • Thematic alignment using a combination of segmentation techniques;
  • Slide change detection and slide identification (image matching with the document database);
  • Minimal set of document analysis algorithms wrapped up as web services;
  • Prototype of the Meeting Minutes Authoring Tool integrating document alignment annotations.
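The slide change detection item in the list above can be illustrated with a minimal sketch, assuming simple frame differencing on grayscale video frames. The function names, the frame representation (nested lists of pixel intensities), and the threshold are assumptions made for this example and are not taken from the project's actual methods.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two same-sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def detect_slide_changes(frames, fps, threshold=10.0):
    """Return timestamps (in seconds) where consecutive frames differ strongly.

    frames: list of grayscale frames, each a list of rows of ints in 0..255.
    fps: frame rate of the capture, used to convert frame index to seconds.
    threshold: hypothetical cut-off on the mean absolute difference.
    """
    changes = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            changes.append(i / fps)
    return changes
```

A lower threshold would also flag smaller state changes, such as a new list item appearing in an animation, at the cost of more false alarms from video noise. Identifying which slide is shown would still require matching each detected frame against the documents in the repository.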


Quarterly status reports
Available on the local site (password protected).

Last modified 2006-02-03 15:52
