Ontology-based Information Extraction
Ontology-based information extraction (OBIE) is the use of ontologies and their specifications to drive, or inform, the information extraction process. The terms and concepts in the source ontology or ontologies form the basis for term matching when tagging text documents. OBIE is a form of knowledge extraction in which the ontology serves as the knowledge base.
Web: http://wiki.opensemanticframework.org/index.php/Ontology-based_Information_Extraction
OBIE is now available via a variety of plug-ins to the GATE system, and is becoming more common in other general text processing (NLP) systems.
As used in OSF, best practices for OBIE include ensuring that:
- All ontology concepts have a definition, which is included in the extraction basis
- All ontology concepts have alternative labels, which are included in the extraction basis
- Where appropriate, ontology concepts have hidden labels to account for common misspellings, which are also included in the extraction basis
- Inferencing is used as appropriate during the extraction (tagging) process.
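
The practices above can be illustrated with a minimal Python sketch. It is not the OSF or GATE implementation; the `Concept` class, the `ex:` URIs, the `broader` link, and the `tag` function are all hypothetical names chosen for illustration. The sketch assumes preferred, alternative, and hidden labels all feed one case-insensitive matching index, and that "inferencing" is reduced to walking up a single parent link per concept. Definitions are stored with each concept but are not used for matching here; a fuller system might use them for disambiguation.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    """Hypothetical, simplified stand-in for an ontology concept."""
    uri: str
    pref_label: str
    definition: str = ""
    alt_labels: List[str] = field(default_factory=list)
    hidden_labels: List[str] = field(default_factory=list)  # e.g. common misspellings
    broader: Optional[str] = None  # URI of the parent concept, if any


def build_label_index(concepts):
    """Map every label (preferred, alternative, hidden) to its concept URI."""
    index = {}
    for c in concepts:
        for label in [c.pref_label, *c.alt_labels, *c.hidden_labels]:
            index[label.lower()] = c.uri
    return index


def ancestors(uri, by_uri):
    """Very simple inferencing: collect all broader concepts of a URI."""
    found = []
    current = by_uri[uri].broader
    while current is not None and current not in found:
        found.append(current)
        current = by_uri[current].broader
    return found


def tag(text, concepts):
    """Tag whole-word label matches in text and attach inferred broader concepts."""
    by_uri = {c.uri: c for c in concepts}
    index = build_label_index(concepts)
    if not index:
        return []
    # Try longer labels first so "dogg" is preferred over "dog".
    labels = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    tags = []
    for m in pattern.finditer(text):
        uri = index[m.group(1).lower()]
        tags.append({
            "text": m.group(1),
            "start": m.start(),
            "end": m.end(),
            "uri": uri,
            "inferred": ancestors(uri, by_uri),
        })
    return tags


# Illustrative data only; these concepts are made up for the example.
concepts = [
    Concept("ex:Animal", "animal", definition="A living organism that feeds on organic matter."),
    Concept("ex:Dog", "dog", definition="A domesticated canid.",
            alt_labels=["canine"], hidden_labels=["dogg"], broader="ex:Animal"),
]
tags = tag("The dogg chased a canine.", concepts)
for t in tags:
    print(t["text"], "->", t["uri"], "inferred:", t["inferred"])
```

In this sketch, the misspelling "dogg" is caught only because it was registered as a hidden label, and "canine" only because it is an alternative label; both resolve to the same concept, and the broader concept is attached through the parent link rather than through any direct text match.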