SIREn Semantic Information Retrieval Engine

SIREn: Efficient semi-structured Information Retrieval for Lucene

http://siren.sindice.com/index.html (AGPL license)

Querying graph-structured data (RDF) is commonly done with dedicated systems called triplestores, which are typically built on DBMS backends. In Sindice, however, we needed something much more scalable than a DBMS, with features expected of a typical Web search engine, such as top-k query processing, real-time updates, full-text search, and indexes distributed over shards.

While Lucene has long offered these capabilities, it was not designed for large semi-structured document collections (or documents with very different schemas). For this reason we developed SIREn (Semantic Information Retrieval Engine), a Lucene plugin that overcomes these shortcomings and efficiently indexes and queries RDF, as well as any textual document with an arbitrary number of metadata fields.
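To make the "very different schemas" problem concrete, here is a toy Python sketch of one way to index RDF triples so that predicates need no per-field setup: each term's postings record which predicates of which subject contain it. This is only an illustration under stated assumptions, not SIREn's actual data structure or API; the sample triples and the names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not taken from SIREn.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice Smith"),
    ("ex:alice", "ex:worksOn", "semantic search"),
    ("ex:bob", "foaf:name", "Bob Jones"),
    ("ex:bob", "dc:description", "search engine developer"),
]


def build_index(triples):
    """Map each lowercased term to {subject: set of predicates containing it}."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        for term in obj.lower().split():
            index[term][subject].add(predicate)
    return index


def search(index, term, predicate=None):
    """Return sorted subjects matching term, optionally limited to one predicate."""
    postings = index.get(term.lower(), {})
    return sorted(
        subject
        for subject, predicates in postings.items()
        if predicate is None or predicate in predicates
    )


index = build_index(TRIPLES)
print(search(index, "search"))                # full-text match across all predicates
print(search(index, "search", "ex:worksOn"))  # match restricted to one predicate
```

Because the predicate lives inside the postings rather than in a separately declared index field, a subject with a previously unseen predicate can be added without changing the index layout. A real engine would also need ranking, compression, and incremental updates, none of which this sketch attempts.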
