We are interested in everything that language contains – prepositions, relational clauses, layered complex verbs, anaphora, relative pronouns, and …
Here are some particular areas where Orion stands out.
Relations As Objects
Treating relations as objects seems unremarkable – natural language does it all the time. Even so, as far as we know, Orion is the only semantic technology that treats relations as objects – that is, a relation can itself be operated on by other relations, without limit. Most other formalisms follow computer languages: a relation is a way of connecting one or two objects, and cannot itself be an object in another relation.
A car is an assembly relation holding together a lot of spare parts – take away the windscreen wiper and it is still a car – take away the assembly relation and it is a pile of spare parts.
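The idea can be pictured with a minimal sketch in Python. The class names and the structure are invented for illustration, not Orion's internal representation: the point is only that a relation is itself a node, so another relation can take it as an operand.

```python
# Minimal sketch: relations as first-class objects.
# Class names are invented for illustration.

class Node:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

class Relation(Node):
    # A relation is itself a Node, so it can be an operand
    # of another relation, without limit.
    def __init__(self, name, *operands):
        super().__init__(name)
        self.operands = list(operands)

wheel, engine, wiper = Node("wheel"), Node("engine"), Node("wiper")

# The car is an assembly relation holding the parts together.
assembly = Relation("assembly", wheel, engine, wiper)

# Take away the windscreen wiper: still a car, because the
# assembly relation survives.
assembly.operands.remove(wiper)

# Another relation can operate on the assembly relation itself,
# e.g. asserting that the assembly no longer holds.
disassembled = Relation("negate", assembly)

print(len(assembly.operands))    # parts remaining
print(disassembled.operands[0])  # the assembly relation as operand
```

In most programming formalisms the `negate` step is impossible, because a relation is an edge, not a node.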
Logical and Existential Control
“He can’t swim” – natural language is full of existential control (about 5%, actually), to the point where we don’t notice how pervasive and powerful it is – the statement “Section 9 is void” means that everything in Section 9 disappears in a puff of smoke. A document becomes like a piece of machinery, with doors opening or closing depending on the state of the reader and their world. Without the interweaving of logical and existential control, natural language would be very limited in its expressiveness.
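A toy sketch of that effect, assuming nothing about Orion's mechanism: give each statement an existence flag, so that one statement ("Section 9 is void") can switch off everything another section asserts.

```python
# Toy sketch of existential control: statements exist or not,
# and one statement can switch off the existence of others.
# The document contents are invented for illustration.

class Statement:
    def __init__(self, text):
        self.text = text
        self.exists = True

document = {
    "Section 8": [Statement("The tenant may keep pets.")],
    "Section 9": [Statement("The landlord may raise rent annually.")],
}

def make_void(section):
    # "Section 9 is void": everything in it ceases to exist.
    for stmt in document[section]:
        stmt.exists = False

make_void("Section 9")

in_force = [s.text for sec in document.values() for s in sec if s.exists]
print(in_force)  # only Section 8's statement remains in force
```

The machinery image in the text goes further: the flag would flip depending on the reader and their world, not just on a one-off declaration.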
Prepositions
These humble words help to create a narrative flow, and each of them can have a wide variety of meanings (which makes them very hard to handle).
At their simplest –
The cat sat on the mat.
OK, so “on” goes with “sat” to make a collocated verb. But then it has multiple meanings – “He sat on the report”, “She sat on the board of ABC Corp”.
A more complex example, showing how a preposition can control the narrative:
He saved three people from drowning at Bondi.
He saved three people from drowning at the cost of his own life.
Grammar alone doesn’t work – the meaning of the words and phrases has to be understood, and that means dense modelling and thousands of special cases.
For a semantic system to be reliable, it needs vast, detailed modelling, and the ability to sit at a point in the text and figure out, from a whole lot of possibilities, what the meaning is.
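A crude sketch of that kind of selection, choosing the sense of “at” in the two examples above by the semantic type of its complement. The tables are invented toy data; real coverage is exactly the dense modelling and thousands of special cases the text describes.

```python
# Crude sketch: choose the sense of "at" from the semantic type
# of its complement. The two tables are invented toy data.

COMPLEMENT_TYPE = {
    "Bondi": "place",
    "the cost of his own life": "price",
}

SENSES_OF_AT = {
    "place": "location",   # "at Bondi": where it happened
    "price": "exchange",   # "at the cost of ...": what it cost him
}

def sense_of_at(complement):
    ctype = COMPLEMENT_TYPE.get(complement)
    return SENSES_OF_AT.get(ctype, "unresolved")

print(sense_of_at("Bondi"))                     # location
print(sense_of_at("the cost of his own life"))  # exchange
```

The two sentences are grammatically identical up to the complement of “at”, which is why a lookup on meaning, not structure, is needed to separate them.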
Get It the First Time
The system needs a way of handling areas which are not already modelled, or not fully modelled – ways of filling in unknown words and concepts. Some of this can be done automatically: a lookup on Google or Wikipedia and parsing of the dictionary or encyclopedia entry, or building a preliminary structure for a concept (we call these wordgroups). The system cannot expect to be spoonfed concepts in a convenient order, from simplest to most complex, so it is continually altering its connections as new information arrives (a reasonable definition of knowledge).
As an example, it finds the wordgroup “nuclear power station” and, knowing nothing about nuclear power or power stations, it creates a wordgroup “nuclear power station”. It then encounters “nuclear power” as a wordgroup, makes it a member of the noun “power”, and adjusts “nuclear power station” so it is a combination of “nuclear power” and “station”. It then reads “other power stations will be closed down” – the word “other” cannot start a wordgroup, so it takes “power station” as a wordgroup, goes back, and adjusts “nuclear power station” to be a child of the new wordgroup “power station”.
The approach is radically different from “big data” or AI approaches, where the system is intended to learn from thousands of events that occurred in the past. With most corporate knowledge held in text, if you don’t get it the first time, you are not going to get it, because it is not going to be repeated for the slow learners.
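The restructuring in the example can be sketched as follows. This is a simplified parts model under invented data structures, not Orion's, and it flattens the member/child distinction in the text; it shows only how an earlier wordgroup is adjusted each time a new, smaller wordgroup arrives.

```python
# Sketch of incremental wordgroup restructuring, following the
# "nuclear power station" example. Data structures are invented.

class Wordgroup:
    def __init__(self, words):
        self.words = words   # e.g. ("nuclear", "power", "station")
        self.parts = []      # smaller wordgroups it is built from

groups = {}

def is_sublist(small, big):
    n = len(small)
    return any(tuple(big[i:i + n]) == tuple(small)
               for i in range(len(big) - n + 1))

def add(words):
    g = Wordgroup(words)
    groups[words] = g
    # Re-examine existing groups: connections are continually
    # altered as new information arrives.
    for other in groups.values():
        if other is not g and is_sublist(words, other.words):
            other.parts.append(g)
    return g

# 1. "nuclear power station" arrives first, with no prior knowledge.
nps = add(("nuclear", "power", "station"))

# 2. "nuclear power" arrives; "nuclear power station" is adjusted.
np = add(("nuclear", "power"))

# 3. "power station" arrives; it is adjusted again.
ps = add(("power", "station"))

print([p.words for p in nps.parts])
```

Each `add` call revisits what is already known, which is the point of the example: the structure is corrected in place rather than waiting for thousands of repetitions.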
Why Not Machine Learning?
People learn language from nothing – why can’t machines do the same? Machines have neural nets, right? Machine-based neural nets are equivalent to networks of resistors, where the value of each resistor has been adjusted (or “learned”) to give the desired output. Human neural nets are made up of neurons – high-gain switchable amplifiers with potentially thousands of inputs. If you can’t see the difference, try making a cell phone using only resistors, with no transistors. The note on Lane Following shows how plastic the human version is. Orion instead builds an active, switchable structure to represent the words in text.
For more differences, visit our technical website.
The basic difference between Orion and other semantic technologies is the approach: we demand high reliability, which means no short cuts, which means a great deal of structural detail – detail that may seem unnecessary until it is needed, and that a person can remember but a computer which took a short cut cannot.