Semantic inference, or inference over the Semantic Web, is a process by which new data is derived from existing data and added to a dataset. That’s why it’s so powerful: no extra data has to be collected to produce new knowledge and insights. These insights come in the form of new relationships, providing connections in the data that were previously unobserved.
Inference relies on two tools: ontologies and rules. The former describes the structural model of the data, such as how it’s layered into classes and sub-classes, and the latter dictates the laws that the data must obey. If those sound as if they overlap, that’s because they do: they share a great deal of functionality. An ontology will tell you that the countries in a continent are a subset of all countries, and a rule will tell you that a city in a country must also be in that country’s continent.
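To make the country and continent example concrete, here is a minimal sketch in plain Python. It is not tied to any particular reasoner or graph database: the triple layout, the predicate names (`subClassOf`, `type`, `inCountry`, `inContinent`) and the `infer` function are all assumptions made for illustration.

```python
# Facts are (subject, predicate, object) triples. All names are illustrative.
facts = {
    ("EuropeanCountry", "subClassOf", "Country"),
    ("France", "type", "EuropeanCountry"),
    ("France", "inContinent", "Europe"),
    ("Paris", "inCountry", "France"),
}


def infer(facts):
    """Apply both inference steps repeatedly until no new triples appear."""
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # Ontology-style step: a member of a subclass is also
                # a member of the superclass.
                if p == "type" and p2 == "subClassOf" and s2 == o:
                    new.add((s, "type", o2))
                # Rule-style step: a city in a country is also in that
                # country's continent.
                if p == "inCountry" and p2 == "inContinent" and s2 == o:
                    new.add((s, "inContinent", o2))
        if new <= known:
            return known
        known |= new


# Only the triples that were not in the original data.
inferred = infer(facts) - facts
```

Here `inferred` holds two new triples: France is a `Country`, and Paris is in Europe. Neither was stated in the original data; both follow from it.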
These are very simple cases, but the value of inference extends much further. Deeply interconnected data in a graph database can yield insights that would be missed without significant additional effort.
When dealing with certain types of financial fraud, the first step is often finding a single malicious person or organisation, but targeting them alone is rarely enough to stop the scheme altogether. If a bank were to look at the fraudster’s financial transactions, a number of properties, such as value, frequency, and time, could tip the bank off to others in the fraud ring. From this information, we can infer who their accomplices are and who are innocent bystanders who happened to sell them something on eBay. The addition to the database (the inferred information) is therefore the tagging of individuals as criminal or innocent. This chain of inference can then continue until the whole gang is identified, at which point you have the capacity to tackle the problem properly by shutting down the unit as a whole.
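The chain of inference described above can be sketched as a simple propagation over transactions. This is a toy illustration only: the account names, the sample data, the thresholds in `is_suspicious`, and the `tag_accounts` function are all hypothetical, and a real system would weigh far more evidence before tagging anyone.

```python
from collections import deque

# Hypothetical data: (sender, receiver, amount, transfers in the last month).
transactions = [
    ("mallory", "trent", 9500, 12),
    ("mallory", "ebay_seller", 40, 1),
    ("trent", "oscar", 9800, 9),
    ("oscar", "grocer", 25, 3),
]


def is_suspicious(amount, count, min_amount=5000, min_count=5):
    """Assumed rule: large and frequent transfers suggest collusion."""
    return amount >= min_amount and count >= min_count


def tag_accounts(transactions, known_fraudster):
    """Start from one known fraudster and follow suspicious links outward."""
    suspects = {known_fraudster}
    queue = deque([known_fraudster])
    while queue:
        current = queue.popleft()
        for sender, receiver, amount, count in transactions:
            if current not in (sender, receiver):
                continue
            other = receiver if sender == current else sender
            if other not in suspects and is_suspicious(amount, count):
                # Each newly inferred suspect becomes a starting point
                # for the next round of inference.
                suspects.add(other)
                queue.append(other)
    everyone = {acct for s, r, _, _ in transactions for acct in (s, r)}
    return {a: "criminal" if a in suspects else "innocent" for a in everyone}


tags = tag_accounts(transactions, "mallory")
```

With this sample data, `trent` and `oscar` are tagged alongside `mallory`, while `ebay_seller` and `grocer` stay tagged as innocent because their transactions do not match the assumed pattern.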
You can read about how we tackled this problem for real here.