|
Abstract : |
A Standard Generalized Markup Language (SGML) document has a document type definition (DTD) that specifies the allowed structures for the document. The basic components of a DTD are element declarations that contain, for each element, a content model, i.e., a regular expression that defines the allowed content for this element. The SGML standard requires that the content models of element declarations be unambiguous in the following sense: a content model is ambiguous if an element or character string occurring in the document instance can satisfy more than one primitive token in the content model without look-ahead. Bruggemann-Klein and Wood have studied the unambiguity of content models, and they have presented an algorithm that decides whether a content model is unambiguous. In this paper we present a disambiguation algorithm that, based on the work of Bruggemann-Klein and Wood, transforms an ambiguous content model into an unambiguous one by generalizing the language. We also present some experimental results obtained with our implementation of the algorithm in connection with an automatic DTD generation tool. |
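
To make the ambiguity condition concrete, the following is a minimal sketch of a decision test, not the algorithm of this paper. It assumes the standard characterization that a content model is unambiguous exactly when its position (Glushkov) automaton is deterministic, i.e., no two distinct occurrences of the same element name compete as the next token. The tuple encoding and the helper names (`sym`, `cat`, `alt`, `star`, `opt`, `is_unambiguous`) are hypothetical, and SGML's `&` and `+` operators are left out for brevity.

```python
from itertools import count

# Hypothetical tuple encoding of a content model (an assumption of this sketch).
def sym(name): return ("sym", name)   # a primitive token, e.g. an element name
def cat(r, s): return ("cat", r, s)   # SGML "r, s"
def alt(r, s): return ("alt", r, s)   # SGML "r | s"
def star(r): return ("star", r)       # SGML "r*"
def opt(r): return ("opt", r)         # SGML "r?"

def is_unambiguous(model):
    """Return True if no two occurrences of one name compete as the next token."""
    ids = count()
    label = {}    # position -> element name
    follow = {}   # position -> positions that may come directly after it

    def walk(node):
        # Returns (nullable, first positions, last positions); fills `follow`.
        kind = node[0]
        if kind == "sym":
            p = next(ids)
            label[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "cat":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "alt":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "star":
            _, f, l = walk(node[1])
            for p in l:
                follow[p] |= f      # the body may repeat
            return True, f, l
        if kind == "opt":
            _, f, l = walk(node[1])
            return True, f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(model)

    def names_distinct(positions):
        names = [label[p] for p in positions]
        return len(names) == len(set(names))

    # Deterministic iff the start set and every follow set have distinct names.
    return names_distinct(first) and all(names_distinct(s) for s in follow.values())

b, c, d = sym("b"), sym("c"), sym("d")
ambiguous = alt(cat(b, c), cat(b, d))   # ((b, c) | (b, d))
factored = cat(b, alt(c, d))            # (b, (c | d))
print(is_unambiguous(ambiguous), is_unambiguous(factored))
```

In the first model an initial `b` can satisfy either occurrence of `b`, so a parser would need look-ahead to choose; the second model accepts the same documents with a single `b` token. Note that this particular pair is language-preserving, whereas the disambiguation described in the abstract may generalize the language.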