Natural Language Access to Visual Data: Dealing with Space and Movement


Author(s) : Thomas Rist, Gerd Herzog, Elisabeth André
Publisher : N/A
Publication Date : 1989
ISSN : N/A
Abstract : When vision/image understanding is combined with natural language generation, the following issues arise: How can we extract information concerning time, space and movement from visual data, and how should this information be represented to permit the generation of natural language descriptions? In this paper, we report on practical experience gained in the project Vitra (VIsual TRAnslator). In order to design an interface between image understanding and natural language systems, we have examined different domains of discourse and communicative situations. Two natural language systems are presented, each combined with a concrete vision system. To give an impression of the input data supplied to the NL systems, the levels of image analysis are briefly sketched. Using the four orientation-dependent relations right, left, in front of and behind, we demonstrate how a purely geometrical description can be transformed into a propositional description of the spatial arrangement. After that, we present our approach to event recognition. In contrast to previous approaches, we do not start from a completely analyzed image sequence. Rather, events are to be represented in such a way that they can be simultaneously recognized and described in natural language as the scene progresses.
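
To make the step from a geometrical to a propositional description more concrete, the following is a minimal illustrative sketch in Python. It is not the method used in Vitra: the function name `spatial_relation`, the 2D coordinate convention (x to the right, y up, heading measured counterclockwise in degrees), and the 90-degree sector boundaries are all assumptions chosen for illustration.

```python
import math


def spatial_relation(ref_pos, ref_heading_deg, obj_pos):
    """Classify obj_pos relative to an oriented reference object.

    Hypothetical helper: the sector boundaries at +/-45 and +/-135 degrees
    are an assumption for illustration, not taken from the paper.
    """
    dx = obj_pos[0] - ref_pos[0]
    dy = obj_pos[1] - ref_pos[1]
    # Angle of the object as seen from the reference's facing direction.
    bearing = math.degrees(math.atan2(dy, dx)) - ref_heading_deg
    # Normalize to the half-open interval [-180, 180).
    bearing = (bearing + 180.0) % 360.0 - 180.0
    if -45.0 <= bearing <= 45.0:
        return "in front of"
    if 45.0 < bearing < 135.0:
        return "left"
    if -135.0 < bearing < -45.0:
        return "right"
    return "behind"


def describe(obj_name, ref_name, ref_pos, ref_heading_deg, obj_pos):
    """Return a proposition (relation, located object, reference object)."""
    relation = spatial_relation(ref_pos, ref_heading_deg, obj_pos)
    return (relation, obj_name, ref_name)
```

For example, `describe("ball", "player", (0, 0), 90, (1, 0))` treats the player as facing along the positive y axis, so a ball at (1, 0) is classified as "right" of the player. Because the relations are orientation-dependent, changing only the heading changes the resulting proposition.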