In this episode we asked Ian, BCS Lovelace Medallist and OST co-founder, how LLMs work and how to make them more useful for business.
To listen along, check out our podcast episode below, available on YouTube.
My name is Ian Horrocks, I'm one of the founders of OST, Oxford Semantic Technologies, and I'm a professor at the University of Oxford where I've been working in the area of knowledge representation and reasoning for, well, more decades than I’d care to mention.
The basis for LLMs is just next-word prediction. We're all familiar with typing messages into our iPhone and it predicting what the next word is going to be. Well, LLMs in essence are just a massively scaled-up version of that, where the model is so huge that the context can be quite large, even thousands of words. So instead of just looking at the last two or three words that you typed in, it's looking at the last thousand or even tens of thousands of words of our conversation, and then it's predicting what comes next. If you make the context big enough, as we've all now seen, interacting with these large language models can magically seem like interacting with a real human being. The problem is that, like real human beings, large language models tend to make mistakes, get things wrong, and even just completely make things up when they don't really know the answer.
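To make the idea concrete, here is a minimal sketch of next-word prediction using simple word counts over a tiny corpus rather than a neural network; the corpus, the two-word context, and the greedy choice of the most frequent continuation are all illustrative simplifications of what an LLM does at vastly greater scale.

```python
from collections import Counter, defaultdict

# Toy next-word prediction: count which word follows each two-word
# context in a small corpus, then always predict the most frequent
# continuation. An LLM does the same job, but with a learned neural
# network and a context of thousands of tokens instead of two words.

corpus = (
    "the cat sat on the mat . the cat ate the fish . "
    "the cat sat by the fire ."
).split()

CONTEXT = 2  # LLMs look back thousands of tokens; we use just two words
counts = defaultdict(Counter)
for i in range(len(corpus) - CONTEXT):
    context = tuple(corpus[i : i + CONTEXT])
    counts[context][corpus[i + CONTEXT]] += 1

def predict_next(context: tuple) -> str:
    """Return the word most often seen after this context."""
    return counts[context].most_common(1)[0][0]

print(predict_next(("the", "cat")))  # -> "sat" (seen twice vs "ate" once)
```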
Already, a lot of LLM systems that are being used in practice are in fact integrated with a layer of rules that try to avoid [the LLM] completely making things up. One way that rules-based AI can be used in conjunction with large language models is this idea called ‘RAG’, or retrieval augmented generation. Basically what happens here is that users interact with a large language model using natural language, with all its benefits, but instead of the answers coming directly from the large language model, the large language model formulates a query against some sort of data or knowledge base, and the answer to that query is then sent back to the user in order to answer their question. The advantage of this is that the answer to the user's query is coming from a carefully curated, large-scale knowledge resource rather than from some statistically generated large language model where we can't be quite sure about the accuracy of the answers.
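As a rough illustration of this pattern, the sketch below takes a user question, asks a hypothetical LLM helper to turn it into a SPARQL query, runs that query against a knowledge graph's SPARQL endpoint, and returns the results. The endpoint URL, port, datastore name, and `generate_sparql` helper are assumptions for illustration, not a specific product API.

```python
import requests  # assumes a SPARQL endpoint reachable over HTTP

# Minimal RAG sketch: the LLM's only job is to translate the user's
# natural-language question into a query; the answer itself comes from
# the curated knowledge base, not from the model's own text generation.

SPARQL_ENDPOINT = "http://localhost:12110/datastores/demo/sparql"  # assumed

def generate_sparql(question: str) -> str:
    """Hypothetical LLM call returning a SPARQL query. A real system
    would prompt the model with the question plus a description of the
    knowledge graph's schema; hard-coded here so the sketch is runnable
    without model access."""
    return """
        SELECT ?product ?price WHERE {
            ?product a :Product ; :hasPrice ?price .
        } ORDER BY ?price LIMIT 1
    """

def answer(question: str) -> dict:
    query = generate_sparql(question)       # LLM formulates the query
    response = requests.get(                # the knowledge base answers it
        SPARQL_ENDPOINT,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
    )
    response.raise_for_status()
    return response.json()                  # curated facts, not free text

print(answer("What is our cheapest product?"))
```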
RDFox combines our research on the underlying theory of how to do reasoning at scale, developing algorithms and data structures, with decades of experience in how to engineer these systems to optimally exploit those algorithms and data structures. This allows us to deal with applications at a scale that would once have been unimaginable, and with a performance that will really surprise most people. I really suggest that you download RDFox and give it a try. I guarantee you'll be amazed at the results!
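For readers new to rule-based reasoning, here is a toy sketch of the core idea behind materialisation, the forward-chaining style of reasoning RDFox performs: rules are applied to the data repeatedly until no new facts can be derived. The facts and the single grandparent rule are illustrative only; RDFox applies this idea incrementally over datasets many orders of magnitude larger.

```python
# Facts as (subject, predicate, object) triples.
facts = {
    ("alice", "hasParent", "bob"),
    ("bob", "hasParent", "carol"),
}

def apply_grandparent_rule(triples):
    """Rule: ?x hasParent ?y, ?y hasParent ?z  =>  ?x hasGrandparent ?z."""
    derived = set()
    for (x, p1, y1) in triples:
        for (y2, p2, z) in triples:
            if p1 == p2 == "hasParent" and y1 == y2:
                derived.add((x, "hasGrandparent", z))
    return derived

# Forward chaining: repeat until a fixpoint, i.e. no rule application
# produces any fact we do not already have.
while True:
    new = apply_grandparent_rule(facts) - facts
    if not new:
        break
    facts |= new

print(("alice", "hasGrandparent", "carol") in facts)  # True
```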
Stay tuned for future podcast episodes, where we ask Ian more about how RDFox was created and the applications of KRR in various industries.
Check out our other interview series, ‘Meet the Founders’, where we asked our founders about their journey in bringing OST to life.
The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible, high-performance reasoning was possible for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Science Enterprises (OSE) and Oxford University Innovation (OUI).