|
Abstract : |
Approximate inference of finite state machines from sparse labeled examples has been proved NP-hard when an adversary chooses the target machine and the training set [Ang78, KV89, PW89]. We have, however, empirically found that DFA's are approximately learnable from sparse data when the target machine and training set are selected at random. 1 A Greedy Learning Algorithm Trakhtenbrot and Barzdin described the following polynomial time algorithm for constructing the smallest DFA consistent with a complete labeled training set [TB73]. The input to the algorithm is the prefix-tree acceptor which directly embodies the training set. This tree is collapsed into a smaller graph by merging all pairs of states that represent compatible mappings from string suffixes to labels. The algorithm contains two nested loops. In the outer loop, each node i is visited in breadth-first order starting at the tree's root. In the inner loop, each node j between the root and node i \Gamma 1 is evaluated for compatibility with node i by comparing the labels in the subtrees rooted at i and j. If every corresponding label is the same, the transition from i's parent to i is altered to point at node j instead. The node i and its descendents are then inaccessible This downloadable postscript document has been reformatted from the COLT-92 original (which was put together with scissors and paste). It is still owned by the ACM, whose, |