Error-driven learning
Error-driven learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to minimize some error feedback. It is a type of reinforcement learning.
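As a deliberately minimal illustration of this idea, and not any particular algorithm from the literature, the sketch below fits a single value by repeatedly acting, observing an error signal and adjusting so as to reduce that error; the target, learning rate and loop structure are assumptions made purely for the example.

```python
def error_driven_fit(target, steps=100, lr=0.1):
    """Toy error-driven learner: act, observe the error, reduce it."""
    w = 0.0                      # current estimate (the learner's "action")
    for _ in range(steps):
        error = w - target       # error feedback from the environment
        w -= lr * error          # adjust the estimate to shrink the error
    return w

print(error_driven_fit(3.0))     # approaches 3.0 as the error shrinks
```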
Brill's Transformation-based Error-Driven Learning is a symbolic machine learning technique which has been successfully applied, inter alia, to part-of-speech tagging (Brill, 1995).
Figure 1: Transformation-based Error-Driven Learning
The general scheme of Brill's algorithm is as follows:
- Unannotated text is passed through an initial state annotator, which may be simple (such as assigning NN or NNP to every word irrespective of context) or relatively complex (a stochastic n-gram tagger).
- The output of the initial state annotator is then compared to a reference corpus, and the system searches for the transformation that gives the greatest local improvement in a user-defined, application-specific error function.
- A transformation specifies a triggering environment and a rewrite rule.
- Once a rule has been chosen, it is applied to the current annotation of the corpus, generating a new and slightly changed annotation, which is in turn compared to the reference corpus.
- The above process is iterated until no further improvement can be achieved. The sequence of transformations learnt by the system is the final output of the learning process.
- We now have a sequence of transformations which can be applied to any text that has been passed through the initial state annotator (a minimal sketch of this learning loop is given below).
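To make the loop concrete, the sketch below implements transformation-based error-driven learning for part-of-speech tagging on a toy example. The corpus, the three-tag tag set and the single rule template ("change tag a to tag b when the preceding word is w") are assumptions made purely for illustration; Brill's tagger uses a much richer set of templates and a large reference corpus.

```python
# Toy reference corpus of (word, correct_tag) pairs.
reference = [
    ("dogs", "NN"), ("chase", "VB"), ("the", "DT"), ("cat", "NN"),
    ("cats", "NN"), ("chase", "VB"), ("the", "DT"), ("dog", "NN"),
]
words = [w for w, _ in reference]
gold = [t for _, t in reference]
tagset = sorted(set(gold))
vocab = sorted(set(words))


def initial_annotator(words):
    """Very simple initial state annotator: tag every word as NN."""
    return ["NN"] * len(words)


def error(tags):
    """Application-specific error function: number of incorrect tags."""
    return sum(1 for t, g in zip(tags, gold) if t != g)


def apply_rule(tags, rule):
    """A transformation: a rewrite rule (from_tag -> to_tag) plus its
    triggering environment (the preceding word is prev_word)."""
    from_tag, to_tag, prev_word = rule
    return [
        to_tag if i > 0 and t == from_tag and words[i - 1] == prev_word else t
        for i, t in enumerate(tags)
    ]


def learn():
    tags = initial_annotator(words)
    transformations = []
    while True:
        best_rule, best_err = None, error(tags)
        # Greedy search over all instantiations of the template for the
        # transformation giving the greatest improvement on the reference.
        for a in tagset:
            for b in tagset:
                for w in vocab:
                    err = error(apply_rule(tags, (a, b, w)))
                    if err < best_err:
                        best_rule, best_err = (a, b, w), err
        if best_rule is None:      # no further improvement: stop
            break
        tags = apply_rule(tags, best_rule)
        transformations.append(best_rule)
    return transformations         # the ordered sequence of transformations


print(learn())
# e.g. [('NN', 'DT', 'chase'), ('NN', 'VB', 'cats'), ('NN', 'VB', 'dogs')]
```

On this toy corpus the first transformation chosen is the one that fixes the most tags (the determiner rule corrects two at once), and the remaining errors are picked off one per iteration, mirroring the greedy local search described above.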
This scheme is sufficiently general to apply to many tasks, including Chinese word segmentation, although each task will require the choice of a suitable set of rule templates, which serve to define the space of transformations to be searched by the algorithm. Performance will hinge crucially on the choice of appropriate templates.
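As an illustration of what rule templates might look like in code (the template names and contexts below are assumptions for this sketch, loosely modelled on the contextual templates used for part-of-speech tagging in Brill, 1995), each template maps a corpus position to a triggering environment, and the learner instantiates candidate transformations "change tag a to tag b in environment e" from the values it observes:

```python
# Hypothetical rule templates: each maps a position i in the corpus to the
# triggering environment from which candidate transformations are built.
templates = {
    "previous tag":  lambda words, tags, i: tags[i - 1] if i > 0 else None,
    "following tag": lambda words, tags, i: tags[i + 1] if i + 1 < len(tags) else None,
    "previous word": lambda words, tags, i: words[i - 1] if i > 0 else None,
    "current word":  lambda words, tags, i: words[i],
}
```

A task such as Chinese word segmentation would need its own templates (for example, contexts over neighbouring characters rather than word tags), which is why the choice of templates is so critical.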