A merge engine has the difficult task of deciding whether a new resource (topic) brought into the map is essentially the same as another resource already in the map. Let us use the example of a structured conversation as a topic map (it really is one!). Consider this Question and two Answers (an Issue followed by two responses, known as Positions):
What are the known causes of climate change?
Carbon dioxide in the upper atmosphere is a cause of climate change
Climate change is caused by CO2
This is a non-trivial merge situation: on visual inspection, a human reader sees that both answers say the same thing, but a merge engine has to work that out from two differently worded sentences. Reduced to a simple triple, they both say this:
{ CO2, Cause, Climate Change }
The final conversation should reduce the noise by merging the two answers, while being certain to give credit to both participants in the conversation.
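As a minimal sketch of that merge step, assume each Position has already been reduced to a normalized triple (that reduction is the hard part and is not shown here). The class names, the `merge_positions` function, and the participant names "alice" and "bob" are hypothetical, chosen only for illustration. Positions with an identical triple collapse into one topic that keeps every original wording and every contributor:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical normalized triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass
class Position:
    """One answer in the conversation, already reduced to a triple."""
    text: str
    author: str
    triple: Triple


@dataclass
class MergedPosition:
    """A single topic standing for every answer that makes the same claim."""
    triple: Triple
    texts: List[str] = field(default_factory=list)
    authors: List[str] = field(default_factory=list)


def merge_positions(positions: List[Position]) -> List[MergedPosition]:
    """Collapse positions with identical triples, keeping all contributors."""
    merged: Dict[Triple, MergedPosition] = {}
    for p in positions:
        topic = merged.setdefault(p.triple, MergedPosition(p.triple))
        topic.texts.append(p.text)          # keep each original wording
        if p.author not in topic.authors:   # credit each participant once
            topic.authors.append(p.author)
    return list(merged.values())


# Assumed input: both answers were reduced to the same triple upstream.
answers = [
    Position("Carbon dioxide in the upper atmosphere is a cause of climate change",
             "alice", ("CO2", "Cause", "Climate Change")),
    Position("Climate change is caused by CO2",
             "bob", ("CO2", "Cause", "Climate Change")),
]
result = merge_positions(answers)
```

Keying on the triple rather than on the sentence text is what lets two differently worded answers land on the same topic. The sketch is only as good as the normalization that produces the triples.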
A different and still non-trivial task, much closer to the well-researched database record linkage problem, is that of determining whether, say, "J.Smith" is the same person as "John Smith". If the topic map contains a topic about a particular "John Smith", then every reference to a name close to that one becomes subject to merge behaviors that reconcile the name against all other identity properties available to the map.
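One way to picture such a merge behavior is sketched below. It is not a full record linkage method: name closeness is approximated by matching family names and treating an initial as compatible with any given name that starts with it. The property key `email` and the function names are assumptions made for the example. A close name alone is not treated as sufficient; at least one other shared identity property must agree and none may conflict.

```python
import re
from typing import Dict, List


def name_tokens(name: str) -> List[str]:
    """Split a name on whitespace and periods, ignoring case."""
    return [t.casefold() for t in re.split(r"[\s.]+", name) if t]


def names_compatible(a: str, b: str) -> bool:
    """True if family names match and given names/initials do not clash."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb or ta[-1] != tb[-1]:
        return False
    # "j" is compatible with "john"; "jane" is not compatible with "john".
    return all(x.startswith(y) or y.startswith(x)
               for x, y in zip(ta[:-1], tb[:-1]))


def should_merge(existing: Dict[str, str], incoming: Dict[str, str]) -> bool:
    """Reconcile a close name against the other identity properties."""
    if not names_compatible(existing["name"], incoming["name"]):
        return False
    shared = (existing.keys() & incoming.keys()) - {"name"}
    if any(existing[k] != incoming[k] for k in shared):
        return False          # a conflicting property blocks the merge
    return bool(shared)       # require at least one corroborating property


# Hypothetical topic and incoming references.
john = {"name": "John Smith", "email": "jsmith@example.org"}
same_person = {"name": "J.Smith", "email": "jsmith@example.org"}
other_person = {"name": "J.Smith", "email": "jane@example.org"}
name_only = {"name": "J.Smith"}
```

Here `should_merge(john, same_person)` holds, while `other_person` is rejected because the email conflicts and `name_only` is rejected because nothing corroborates the name match.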
We believe that the UIMA platform is ideal for making merge decisions, in much the same way that it took part in Watson's monumental Question Answering demonstrations on free text and hints. Thus, in a SolrDrWatson project, in which a topic map is slated to perform the memory and knowledge organization functions of a Watson-like agent, the fact that the topic map itself already needs Watson-like features to tend the map is a major reason for the proposed architecture.