OpenDMAP
Open source DMAP software
http://opendmap.sourceforge.net/

OpenDMAP is an ontology-driven, rule-based concept analysis and information extraction system. Unlike traditional parsers, OpenDMAP does not rely on a lexicon that maps each word to all of its possible meanings. Instead, each concept is associated with phrasal patterns that are used to recognize that concept. OpenDMAP processes texts to recognize concepts and relationships defined in a knowledge base. It uses Protégé knowledge bases to provide an object model of the concepts that might be found in a text. Protégé models concepts as classes that participate in abstraction and packaging hierarchies, and models relationships as class-specific slots.
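The idea of attaching phrasal patterns to ontology concepts, instead of looking words up in a lexicon, can be illustrated with a minimal Python sketch. This is not OpenDMAP's API or its pattern syntax: the `Concept` class, the `[slot]` notation, the `recognize` function, and the example proteins and cell components are all hypothetical names chosen for illustration, and slot fillers are restricted to literal phrases to keep the example short.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A toy ontology class (hypothetical, not a Protégé or OpenDMAP type)."""
    name: str
    slots: dict = field(default_factory=dict)     # slot name -> filler concept name
    patterns: list = field(default_factory=list)  # phrasal patterns for this concept


# Illustrative knowledge base: two simple concepts and one relationship
# whose slots must be filled by instances of the other concepts.
ONTOLOGY = {
    "protein": Concept("protein", patterns=["Bax", "p53"]),
    "cell-component": Concept("cell-component", patterns=["mitochondria", "nucleus"]),
    "transport": Concept(
        "transport",
        slots={"cargo": "protein", "destination": "cell-component"},
        patterns=["[cargo] translocation to the [destination]"],
    ),
}


def pattern_to_regex(concept, pattern):
    """Compile a phrasal pattern into a regex.

    Literal words match themselves; a [slot] token matches any literal
    pattern of the concept that the slot is constrained to.
    """
    parts = []
    for token in pattern.split():
        if token.startswith("[") and token.endswith("]"):
            slot = token[1:-1]
            filler = ONTOLOGY[concept.slots[slot]]
            alternatives = "|".join(re.escape(p) for p in filler.patterns)
            parts.append(f"(?P<{slot}>{alternatives})")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def recognize(text):
    """Return every concept whose pattern matches, with its slot fillers."""
    found = []
    for concept in ONTOLOGY.values():
        for pattern in concept.patterns:
            for match in pattern_to_regex(concept, pattern).finditer(text):
                found.append({
                    "concept": concept.name,
                    "text": match.group(0),
                    "slots": match.groupdict(),
                })
    return found


for hit in recognize("Bax translocation to the mitochondria was observed."):
    print(hit)
```

Because the `transport` pattern mixes literal words with slots that are typed by the ontology, every match is built from ontology elements: the sentence above yields a `protein`, a `cell-component`, and a `transport` instance whose `cargo` and `destination` slots are filled by those two concepts.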
Citations (1)

[1] An open-source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression.
Authors: Hunter, L, Z Lu, J Firby, WA Baumgartner, Jr., HL Johnson, PV Ogren, KB Cohen
Publication info: BMC Bioinformatics 2008
Cited by: Jack Park 10:23 PM 3 February 2013 GMT
URL:

Excerpt

Background: Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering.
Results: OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from 0.26 to 0.72 (precision 0.39 to 0.85, recall 0.16 to 0.85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances.
Conclusion: OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction.
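For readers unfamiliar with the precision, recall and F-score figures quoted in the excerpt, the following Python sketch shows how the three measures relate. It assumes the balanced F-score (F1, the harmonic mean of precision and recall), which the excerpt does not state explicitly. The function name and the counts in the usage line are hypothetical and are not taken from the paper's evaluations.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall and balanced F-score from raw counts.

    Assumes F1 = 2PR / (P + R); returns 0.0 where a ratio is undefined.
    """
    predicted = true_positives + false_positives   # everything the system extracted
    actual = true_positives + false_negatives      # everything in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 correct extractions, 2 spurious, 8 missed.
p, r, f = precision_recall_f1(8, 2, 8)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a system with high precision but low recall still receives a modest F-score.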