Wikipedia Collection Reader
The JULIE Lab Wikipedia Collection Reader reads Wikipedia articles from a database, parses the raw wikitext, and composes a cleansed document text. The original document structure is retained as UIMA annotations; the appropriate annotation types are defined in our UIMA type system, version 2.6.8 or higher. To parse wikitext, the reader uses the Java Wikipedia Library (JWPL) Parser developed by the Ubiquitous Knowledge Processing Lab (TU Darmstadt).
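The idea of composing a cleansed text while keeping the structure as offset-based annotations can be illustrated with a small, self-contained sketch. It is not the reader's implementation and uses neither JWPL nor UIMA: the `Span` class, the `cleanse` function, and the labels "Title" and "Paragraph" are hypothetical names chosen for illustration, and only headings, simple links, and bold/italic markers are handled.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    """Stand-off annotation over the cleansed text (hypothetical stand-in
    for a UIMA annotation; not a type from the JULIE Lab type system)."""
    kind: str   # "Title" or "Paragraph" (illustrative labels only)
    begin: int  # inclusive character offset in the cleansed text
    end: int    # exclusive character offset in the cleansed text


# "== Heading ==" lines: the same run of '=' opens and closes the heading.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1$")
# "[[target|label]]" or "[[target]]": keep the label, else the target.
LINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]")
# Runs of two or more apostrophes mark italics/bold in wikitext.
QUOTES = re.compile(r"'{2,}")


def strip_inline(line):
    """Remove the inline markup this sketch knows about."""
    line = LINK.sub(r"\1", line)
    return QUOTES.sub("", line)


def cleanse(wikitext):
    """Return (cleansed_text, spans) for a small subset of wikitext."""
    parts, spans, offset = [], [], 0
    for raw in wikitext.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # blank lines carry no text in this sketch
        match = HEADING.match(raw)
        kind = "Title" if match else "Paragraph"
        text = strip_inline(match.group(2) if match else raw)
        spans.append(Span(kind, offset, offset + len(text)))
        parts.append(text)
        offset += len(text) + 1  # +1 for the newline joining the parts
    return "\n".join(parts), spans


if __name__ == "__main__":
    sample = "== History ==\n'''Jena''' is a city in [[Thuringia|central Germany]].\n"
    cleansed, annotations = cleanse(sample)
    for span in annotations:
        print(span.kind, repr(cleansed[span.begin:span.end]))
```

Because each span stores offsets into the cleansed text rather than the markup, the structure survives even though the wikitext syntax itself is discarded, which is the same stand-off principle UIMA annotations follow.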