This tutorial is a follow-up to my article about fully interconnected construction projects that automatically adjust for changes. I encourage you to read it first to understand the whole ecosystem in which this should live.
Ideally, you should have some basic understanding of using graphs as a data structure, either labelled property graphs or, preferably, the Resource Description Framework (RDF).
In order to follow along you should also download and install a free trial version of the RDFox Knowledge Graph and Semantic Reasoner. Just download and unzip it where you want it (in my case /Applications on a Mac), copy the license file to that folder, and add the folder to your PATH (Google is your friend). This tutorial was written against RDFox version 6.2, although newer versions are available.
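On a Mac, those steps could look roughly like this (the archive name, license file name, and paths below are just examples; adjust them to the version and location you chose):

    # unzip the downloaded archive into the folder of your choice (example paths)
    unzip ~/Downloads/RDFox-macOS-x86_64-6.2.zip -d /Applications
    # copy the license file into the RDFox folder
    cp ~/Downloads/RDFox.lic /Applications/RDFox-macOS-x86_64-6.2/
    # make the RDFox executable available on your PATH
    export PATH="$PATH:/Applications/RDFox-macOS-x86_64-6.2"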
Everything will be run through a simple script file that you execute from the terminal of your computer, so don’t be scared of this as it’ll help out a lot!
We will be using this simple Industry Foundation Classes (IFC) model of a flow system. We will export it to RDF using the Flow Systems Ontology (FSO). This can be accomplished using the LD-BIM tool or the IFC-LBD converter. The necessary triples can also be found in this repository, where all the resources that we create throughout this tutorial are available.
In the image below you can see examples of resources modelled with their classes, Systems and Components, and sub-classes. The sub-classes Return System/Supply System and Terminal/Flow Controller/Segment/Fitting/Flow Moving Device allow us to be more specific when identifying the resources. In linked data, multiple classes can be assigned to an instance, so a Terminal is implicitly also a Component; this follows from its superclass. You can follow the links to read the definitions of each of these classes.
Next, we have the relationships between the instances, the so-called object properties. A couple of examples are: 'has component', which is used to assign a component to a system, and 'supplies fluid to'/'returns fluid to', which are sub-properties of the more generic 'feeds fluid to'.
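To make this concrete, a tiny excerpt of such triples could look like the following Turtle (the instance IRIs are made up, and I assume the fso: namespace https://w3id.org/fso#):

    @prefix fso:  <https://w3id.org/fso#> .
    @prefix inst: <https://example.com/inst#> .

    inst:heatingSupplySystem a fso:SupplySystem ;      # implicitly also a fso:System
        fso:hasComponent inst:radiator1 .

    inst:radiator1 a fso:Terminal .                    # implicitly also a fso:Component

    inst:pipeSegment1 a fso:Segment ;
        fso:suppliesFluidTo inst:radiator1 .           # sub-property of 'feeds fluid to'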
You get all these class assignments directly with the IFC-LBD parser, so let’s stop here for a while and take a look at what we have so far.
The easiest way to do testing with RDFox is through script files. To get started, create a new file flow-system-tutorial.script in a directory of your choice. Add the following content to it:
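A minimal sketch of such a script could look like this (the exact commands may vary slightly between RDFox versions):

    dstore create flow-system-tutorial
    active flow-system-tutorial
    set output out
    dstore delete flow-system-tutorial
    quit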
This will create a store named flow-system-tutorial, activate it, declare that outputs should be logged directly to the console, remove the store again, and quit. To execute it, run the following command from your terminal (for this to work, the RDFox folder must be on your PATH and the license file in place, as described above).
You should now see the following in your console:
With that, we are ready! Save the file fso-abox.ttl to the folder you are working in and add the following between the parts of the script where we activate and delete the store:
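A sketch of that addition, still assuming the fso: namespace https://w3id.org/fso#, is an import statement followed by a simple test query, for example one that lists all terminals:

    # make the fso: prefix available to queries in the shell
    prefix fso: <https://w3id.org/fso#>

    # load the exported FSO triples
    import fso-abox.ttl

    # test query: list all terminals in the model
    SELECT ?terminal WHERE { ?terminal a fso:Terminal }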
This time when you execute it, in addition to what we had before, you should see the following:
Let’s try and change the query to return all Components.
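That is, something along these lines:

    SELECT ?component WHERE { ?component a fso:Component }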
This will not return any results, as we haven't explicitly associated any instances with the class 'Component'. However, this classification can be inferred from the superclass hierarchy. In order for RDFox to understand this, we need to import the FSO ontology fso.ttl and load its axioms. So, after loading the triples, we import and parse the axioms.
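In the script, that amounts to something like the following, placed after the import of fso-abox.ttl:

    # load the FSO ontology (TBox) and turn its OWL axioms into rules
    import fso.ttl
    importaxioms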
RDFox will perform RDFS and OWL reasoning and infer that all instances of subclasses of Component are in fact also Components themselves, and we will end up with 85 results corresponding to all Terminals, Segments, Fittings, etc.
In the file boundary-conditions.ttl I have manually described some properties of the systems and components. These are specified as simple datatype properties and include the power output of the terminal/radiator, the diameter of pipes and fittings, and the fluid temperature of the systems. I have also specified the type of fluid in the systems and added the file fluids.ttl that describes the thermal properties of the fluids at different temperatures.
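As an illustration, the boundary conditions are simple literal-valued triples of roughly this shape (the ex: properties and instance IRIs below are placeholders for the IRIs actually used in the files):

    @prefix ex:   <https://example.com/properties#> .
    @prefix inst: <https://example.com/inst#> .
    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

    inst:radiator1 ex:designPowerOutput "1000.0"^^xsd:decimal .         # W
    inst:pipeSegment1 ex:innerDiameter "0.02"^^xsd:decimal .            # m
    inst:heatingSupplySystem ex:fluidTemperature "70.0"^^xsd:decimal ;  # deg C
        ex:hasFluid inst:water .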
Now we are ready to write some rules. They are described in the declarative logic programming language Datalog, which differs slightly from SPARQL but uses similar patterns. In the example below I have tried to colour code the various elements and use a pseudo syntax. The rule states that we want to infer a temperature difference for every green element where the following holds: green is a Terminal and belongs to two Systems, blue and yellow; blue is a Return System and yellow is a Supply System; blue has temperature tR and yellow has temperature tF (both in orange), and the absolute difference between them is the requested temperature difference.
In Datalog syntax, this looks as shown below.
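A sketch of that rule, written with the placeholder ex: properties introduced above (the rule file in the repository may use other IRIs), could be:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix fso: <https://w3id.org/fso#> .
    @prefix ex:  <https://example.com/properties#> .

    # infer the fluid temperature difference over every terminal
    [?terminal, ex:fluidTemperatureDifference, ?dt] :-
        [?terminal, rdf:type, fso:Terminal],
        [?returnSystem, fso:hasComponent, ?terminal],
        [?returnSystem, rdf:type, fso:ReturnSystem],
        [?returnSystem, ex:fluidTemperature, ?tR],
        [?supplySystem, fso:hasComponent, ?terminal],
        [?supplySystem, rdf:type, fso:SupplySystem],
        [?supplySystem, ex:fluidTemperature, ?tF],
        BIND(ABS(?tF - ?tR) AS ?dt) .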
To test that this works, let's import the boundary-conditions and fluids files in the script and also load the rules file terminal-temperature-difference.dlog. In my example, I put the rules in a subfolder RULES. Let's edit the script with the following and test that it works by executing a query that returns the new fluid temperature difference over the terminals:
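With the placeholder prefixes from above, the additions to the script could look roughly like this:

    # placeholder prefix for the boundary-condition and derived properties
    prefix ex: <https://example.com/properties#>

    # boundary conditions and fluid properties
    import boundary-conditions.ttl
    import fluids.ttl

    # the first rule
    import RULES/terminal-temperature-difference.dlog

    # test query: the inferred temperature difference per terminal
    SELECT ?terminal ?dt WHERE { ?terminal ex:fluidTemperatureDifference ?dt }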
We can see from the terminal output that the files are loaded correctly and we get the expected result with our two terminals now each having a fluid temperature difference.
We can always ask RDFox to explain how a certain implicit fact was derived. In later versions of RDFox, this explanation can even be explored visually!
This will give us the following explanation:
If we read the proof from the bottom up we can analyse why it was derived. In the actual data we use IRIs instead of colours, so for simplicity, here is the mapping:
Using this mapping, the explanation (read bottom to top) shows us the following:
Since we can always get an explanation of why a certain fact was derived, this is a white-box AI approach. The following sections will simply add more rules of this kind, so if you are mainly reading to understand the technology you can jump ahead to the Dataset Investigation or Changes sections.
Let's extend this with a new rule terminal-flow-simple.dlog that uses a simple formula to calculate each terminal's volume flow from the desired power output and the temperature difference. This rule uses the result of the previous rule, so we now have a "rule depth" of 2. We specify the flow in m³/s to remain in SI units and thereby accept that we get a really small decimal number.
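A sketch of such a rule, again with placeholder property IRIs and simplifying how the fluid's density and specific heat capacity are looked up from the fluids file, could be:

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix fso: <https://w3id.org/fso#> .
    @prefix ex:  <https://example.com/properties#> .

    # volume flow [m3/s] = power [W] / (density [kg/m3] * specific heat [J/(kg K)] * dT [K])
    [?terminal, ex:volumeFlow, ?q] :-
        [?terminal, ex:designPowerOutput, ?power],
        [?terminal, ex:fluidTemperatureDifference, ?dt],
        [?supplySystem, rdf:type, fso:SupplySystem],
        [?supplySystem, fso:hasComponent, ?terminal],
        [?supplySystem, ex:hasFluid, ?fluid],
        [?fluid, ex:density, ?rho],
        [?fluid, ex:specificHeatCapacity, ?cp],
        BIND(?power / (?rho * ?cp * ?dt) AS ?q) .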
We load the rule and execute a test query to make sure that we have a result:
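For example (still with the placeholder IRIs):

    import RULES/terminal-flow-simple.dlog
    SELECT ?terminal ?flow WHERE { ?terminal ex:volumeFlow ?flow }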
In the next rule, we first define that every time we have a supplies fluid to or returns fluid to relationship we also have an indirectly supplies/returns fluid to relationship. We then use this to describe that any component actually indirectly supplies/returns fluid to the components further out in the system.
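Here is a sketch of the supply side (the return side is analogous, and the 'indirectly' properties are placeholders of mine):

    @prefix fso: <https://w3id.org/fso#> .
    @prefix ex:  <https://example.com/properties#> .

    # base case: a direct connection is also an indirect one
    [?a, ex:indirectlySuppliesFluidTo, ?b] :-
        [?a, fso:suppliesFluidTo, ?b] .

    # recursive case: keep following the connections further out in the system
    [?a, ex:indirectlySuppliesFluidTo, ?c] :-
        [?a, ex:indirectlySuppliesFluidTo, ?b],
        [?b, fso:suppliesFluidTo, ?c] .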
The segment in the image above has GlobalId 3ToSAz1uv2RhyV07DgOoVk, so we can test the rule with the query below. The results we expect are 40 components that are indirectly fed by the segment.
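How the GlobalId is exposed in the RDF depends on the converter; since the instance IRIs produced by the LBD tools typically contain it, one rough way to write the test is to filter on the IRI string:

    SELECT ?component WHERE { ?segment ex:indirectlySuppliesFluidTo ?component . FILTER(CONTAINS(STR(?segment), "3ToSAz1uv2RhyV07DgOoVk")) }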
The next rule uses an aggregate function to summarise the volume flow for each segment. This is described as the total volume flow of all the terminals that are indirectly supplied by the particular segment.
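Using RDFox's aggregation syntax, this could be sketched as follows; note that I give the segment flow its own placeholder property so the rule stays stratified (RDFox does not allow recursion through aggregation), and a second head atom converts the result to m³/h:

    @prefix ex: <https://example.com/properties#> .

    # the flow through a segment is the sum of the flows of all terminals it indirectly supplies
    [?segment, ex:segmentVolumeFlow, ?flowPerSecond],
    [?segment, ex:segmentVolumeFlowHourly, ?flowPerHour] :-
        AGGREGATE(
            [?segment, ex:indirectlySuppliesFluidTo, ?terminal],
            [?terminal, ex:volumeFlow, ?terminalFlow]
            ON ?segment
            BIND SUM(?terminalFlow) AS ?flowPerSecond
        ),
        BIND(?flowPerSecond * 3600 AS ?flowPerHour) .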
We get the expected flow both in m³/s and m³/h.
The next rule simply calculates the inner cross-sectional area of each segment from the pipe diameter. It depends only on inputs that are explicitly given and hence it has a "rule depth" of 1.
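With the diameter assumed to be given in metres, a sketch could be:

    @prefix ex: <https://example.com/properties#> .

    # inner cross-sectional area [m2] = pi * (d / 2)^2
    [?segment, ex:innerCrossSectionalArea, ?area] :-
        [?segment, ex:innerDiameter, ?d],
        BIND(3.14159265 * (?d / 2) * (?d / 2) AS ?area) .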
We load the rule and confirm that it works.
With an area and a volume flow, we can calculate the velocity of the fluid in each of the segments.
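That is simply the volume flow divided by the area, here sketched with the placeholder properties from the previous rules:

    @prefix ex: <https://example.com/properties#> .

    # velocity [m/s] = volume flow [m3/s] / cross-sectional area [m2]
    [?segment, ex:fluidVelocity, ?v] :-
        [?segment, ex:segmentVolumeFlow, ?q],
        [?segment, ex:innerCrossSectionalArea, ?area],
        BIND(?q / ?area AS ?v) .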
Let’s take a deeper look at the data model that we have now created. Writing the command info in our script will give the following stats:
We can see that the store has 707 explicit facts. This includes our FSO ABox triples, the boundary conditions, fluid properties and the FSO ontology. However, we can also see that the store has 3,158 facts in total. Our store expanded by more than a factor of 4 because we described some rules for inferring implicit knowledge.
We can also write the command info rulestats to get some information about the rules we have loaded. This will print the following:
As explained in the documentation, the rules have been divided into components (eight in this case) according to how they depend on each other. Rules in components with higher indexes only depend on rules in components with lower indexes. We can use info rulestats print-rules to see all rules and which group they belong to. Some rules come from the FSO ontology and the rest are those we have explicitly defined in this tutorial.
The deepest derived property we have is the velocity, which depends on 1) the inner cross-sectional area and 2) the segment volume flow. 1) is derived from the inner pipe diameter, and 2) is derived from the indirectly supplied terminals and their individual volume flows. Those volume flows in turn depend on the temperature difference of the systems the terminals belong to and the desired power output. The explain command yields the following:
Now let’s make a change and see what happens throughout the system.
Changing the power output can be handled with the query below:
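A sketch of such an update, here raising the design power output of one specific terminal to an arbitrary new value (the instance IRI and properties are the placeholders used throughout this tutorial):

    prefix inst: <https://example.com/inst#>

    DELETE { ?terminal ex:designPowerOutput ?oldPower }
    INSERT { ?terminal ex:designPowerOutput 2000.0 }
    WHERE  { ?terminal ex:designPowerOutput ?oldPower . FILTER(?terminal = inst:radiator1) }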
And in the console we can confirm that the update query took 0.003 seconds.
Let’s try a change that will have even deeper effects by making an update query that changes the temperature difference for all terminals.
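Since the temperature difference is derived from the system temperatures, one way to sketch this is an update that replaces the fluid temperatures of all supply and return systems, which in turn changes the inferred temperature difference for every terminal (the values here are arbitrary examples):

    DELETE { ?system ex:fluidTemperature ?oldTemp }
    INSERT { ?system ex:fluidTemperature ?newTemp }
    WHERE  { VALUES (?type ?newTemp) { (fso:SupplySystem 60.0) (fso:ReturnSystem 40.0) }
             ?system a ?type ; ex:fluidTemperature ?oldTemp }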
Again we can confirm that the model updates almost immediately.
One can argue that this type of calculation could also be handled in traditional programming code, but what is demonstrated here is completely implementation agnostic and can in principle be executed on any system. The difference is that the logic is captured as part of the data model and thereby formalized for reuse in the future. This is the key feature of linked data. The data, schema, and logic coexist, and everything is well described so that future generations can relatively easily pick up a dataset and immediately make use of it.
Of course, this method is limited to arithmetic calculations but it would be interesting to test a setup with more complex things such as simulations. I could imagine an approach where events are emitted when some derived property needs to be updated which then automatically triggers a simulation service that will return the resulting new states so the graph can be updated.
Mads Holten Rasmussen is Business Development Director at NIRAS and holds a Ph.D. in Linked Building Data. He also holds an M.Sc. in Architectural Engineering with a specialization in energy and indoor climate simulations and HVAC design, and worked in this field early in his career.
NIRAS and Oxford Semantic Technologies have a partnership.
The team behind Oxford Semantic Technologies started working on RDFox in 2011 at the Computer Science Department of the University of Oxford with the conviction that flexible and high-performance reasoning was a possibility for data-intensive applications without jeopardising the correctness of the results. RDFox is the first market-ready knowledge graph designed from the ground up with reasoning in mind. Oxford Semantic Technologies is a spin-out of the University of Oxford and is backed by leading investors including Samsung Venture Investment Corporation (SVIC), Oxford Sciences Enterprises (OSE) and Oxford University Innovation (OUI).