Ontologies And The Semantic Web

By: NMK Created on: April 8th, 2006

The annual British Computer Society Roger Needham Lecture, given by Ian Horrocks at the Royal Society on 7th December 2005, offered some key insights into the reasoning behind - and challenges of - the semantic web, reports Deirdre Molloy...

Some of this talk – the annual British Computer Society Roger Needham Lecture delivered by Professor Ian Horrocks at the Royal Society on 7 December 2005 – went a little over my head, but other parts percolated the grey matter and I learned a lot more than I usually do at public events...

By Deirdre Molloy

By way of introduction, Dr Peter Kay, Assistant Director of Microsoft Research (UK) mentioned that the late Professor Roger Needham (1935-2003), in whose honour the lecture is held, created the concept of “clumps”, the “clusters” of today.

Horrocks began by asking what the semantic web is, and answered by first looking at some of the problems and limitations of the current web.

The current web isn’t much more than distributed hypermedia. The information on it is designed to be consumed by human beings but is not accessible to automated processes.

Tim Berners-Lee’s original vision of the web was more ambitious, Horrocks explained, proffering this TBL quote about the WWW to support his assertion: “A set of connected applications forming a consistent, logical web of data… given well-defined meaning.”

The messy, syntactic web

It’s hard work using this “syntactic web”, as Horrocks termed it. A search for his colleague Alan Rector on Google Images turns up images of someone called ‘Reverend Alan’. What else is hard or impossible to find using the syntactic web? Answers to complex queries involving background knowledge; information held in data repositories, e.g. travel enquiries, prices, goods and services; the results of human genome experiments.

It’s also hard to find and use web services. For example, given a DNA sequence, you might want to identify its genes, determine the proteins, and so on. It’s very difficult, if not impossible, to get a coherent pathway to this information.

The problem here lies in the fact that mark-up is all about presenting the information, not about the semantic content. The web page is very accessible to us but very difficult for a machine to understand. Even worse, text is sometimes buried inaccessibly inside images and graphics.

Delegating complex tasks to web agents

So what is the proposed solution? To add semantic annotation to web resources. In the case of the pictures, semantic information along the lines of: “Dr Alan Rector is a Professor of Computer Science at the University of Manchester”
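To make the idea concrete, here is a minimal sketch of what such annotation could look like as subject-predicate-object statements, and how a program could use it to tell the two Alans apart. The identifiers, property names and file names are hypothetical and chosen only for illustration; they are not from the lecture or from any real vocabulary.

```python
# Hypothetical annotations stored as (subject, predicate, object) triples.
# None of these identifiers come from a real vocabulary.
TRIPLES = {
    ("AlanRector", "type", "Professor"),
    ("AlanRector", "field", "ComputerScience"),
    ("AlanRector", "worksAt", "UniversityOfManchester"),
    ("photo1.jpg", "depicts", "AlanRector"),
    ("ReverendAlan", "type", "Clergyman"),
    ("photo2.jpg", "depicts", "ReverendAlan"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return {
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def images_of_professors_at(university):
    """Find images that depict someone annotated as a professor at `university`."""
    professors = {s for s, _, _ in match(p="type", o="Professor")}
    staff = {s for s, _, _ in match(p="worksAt", o=university)}
    wanted = professors & staff
    return sorted(img for img, _, who in match(p="depicts") if who in wanted)


print(images_of_professors_at("UniversityOfManchester"))
```

A keyword search sees only the string “Alan”; a query over annotations like these can ask for pictures of a professor at a particular university, which excludes the photo of the clergyman.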

What does giving semantic annotation to web resources mean? Horrocks explained that one approach is external agreement on the meaning of annotations, for instance the Dublin Core Metadata Initiative. But this approach has limited flexibility and extensibility, and only a limited number of things can be expressed. The alternative is to use ontologies to specify the meaning of annotations.

He then elaborated on the role of ontology in information science. An ontology is an engineering artefact. It consists of a vocabulary used to describe (a particular view of) some domain, together with an explicit specification of the intended meaning of that vocabulary.

Horrocks listed some applications of ontologies currently in operation, such as e-Science (bioinformatics) and medicine, where they are used to build and maintain terminologies such as SNOMED, NCI and GALEN. Ontologies are also used for organising complex and semi-structured information, such as that held by the UN, NASA, Ordnance Survey, General Motors and Lockheed Martin.

What are ontology languages?

Next he turned to the Semantic Web itself, and began by considering ‘ontology languages’. If ontology languages are to serve their purpose for the web, they need to be agreed standards. One of the first to be agreed was RDF Schema, but it’s very weak and doesn’t allow us to describe many things. It’s also very high order, difficult to explain and difficult to provide reasoning support for.

Two languages were developed to address the deficiencies and problems of RDF. OIL was developed by a group of largely European researchers, and DAML-ONT was developed by a group of largely US-based researchers. The two efforts merged to produce DAML+OIL, which was submitted to the W3C and became the basis for OWL.

Both were based on the same underlying description logics, a family of logic-based knowledge representation formalisms. These are descendants of semantic networks and KL-ONE. They describe the domain in terms of concepts (classes), roles (properties, relationships) and individuals. Operators allow complex concepts to be composed from simpler ones, and names can be given to complex concepts.
E.g. “happy parent” = Parent-child-smart-fit.
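The “happy parent” shorthand is hard to reconstruct from my notes. One plausible reading, and it is only an assumption, is “a parent all of whose children are smart or fit”. The sketch below shows how concepts, roles and individuals combine under that reading, using plain sets over a tiny made-up domain. Every name in it is hypothetical.

```python
# A tiny made-up interpretation: individuals, named concepts (sets of
# individuals) and roles (sets of pairs). All names are hypothetical.
INDIVIDUALS = {"mary", "john", "ann", "bob", "carl"}
CONCEPTS = {
    "Parent": {"mary", "john"},
    "Smart": {"ann"},
    "Fit": {"bob"},
}
ROLES = {
    "hasChild": {("mary", "ann"), ("mary", "bob"),
                 ("john", "ann"), ("john", "carl")},
}


def atom(name):
    """Extension of a named concept."""
    return set(CONCEPTS.get(name, set()))


def and_(a, b):
    """Conjunction: individuals in both concepts."""
    return a & b


def or_(a, b):
    """Disjunction: individuals in either concept."""
    return a | b


def all_(role, concept):
    """Universal restriction: every role filler of x lies in `concept`."""
    return {
        x for x in INDIVIDUALS
        if all(y in concept for (s, y) in ROLES[role] if s == x)
    }


# Assumed reading: HappyParent = Parent AND (all hasChild: Smart OR Fit)
happy_parent = and_(atom("Parent"),
                    all_("hasChild", or_(atom("Smart"), atom("Fit"))))
print(sorted(happy_parent))
```

Here mary qualifies because both her children are smart or fit, while john does not because carl is neither. Note that this only evaluates a concept against known data; a description logic reasoner works on the definitions themselves.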

Reasoning & logic

Description logics are distinguished by their formal semantics (which are typically model theoretic) and by being decidable fragments of first-order logic (FOL), often contained in C2. [note: at this point I was totally lost and I must apologise if my recap contains errors! But Horrocks gradually began to talk in less specialist language]. They are also distinguished by the provision of inference services: decision procedures for key problems (satisfiability, subsumption, etc) and highly optimised implemented systems.

But why description logics? OWL exploits the results of 15 years of DL research and is based on well-defined (model theoretic) semantics. Its formal properties are well understood (complexity; decidability). We know the reasoning algorithms, and there is an implemented system – Cerebra.

Why all the strange names? Description logics are a family of KR (knowledge representation) formalisms, mainly distinguished by the operators they make available. The available operators are indicated by letters in the name, e.g. S, H, O, I, N. OWL also had to come up with a web-friendly syntax.
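As a small illustration of the naming scheme, the sketch below expands a description logic name letter by letter. The meanings given for each letter are the ones conventionally used in the description logic literature, not something taken from the lecture, and the function name is made up for this example.

```python
# Conventional meanings of the letters mentioned above. These glosses come
# from the general DL literature, not from the lecture itself.
DL_LETTERS = {
    "S": "basic logic ALC extended with transitive roles",
    "H": "role hierarchy",
    "O": "nominals (classes defined by listing individuals)",
    "I": "inverse roles",
    "N": "unqualified number restrictions",
}


def expand(name):
    """Expand a DL name such as 'SHOIN' into the features it denotes.

    Raises KeyError for letters this small table does not cover.
    """
    return [DL_LETTERS[letter] for letter in name]


for feature in expand("SHOIN"):
    print("-", feature)
```

Read this way, a name like SHOIN is simply a feature list, which is why the family produces so many odd-looking acronyms.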

Ontologies for the working web

In turn Horrocks explained why they chose ontology reasoning. Given the key role of ontologies in many applications, it’s essential to provide tools and services that help users design and maintain high quality ontologies that are meaningful, i.e. all named classes can have instances.
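To give a flavour of what such a check involves, here is a deliberately simplified sketch. It is not a description logic reasoner and not the tableaux approach used by real systems: it only follows explicitly stated subclass links and flags any named class that ends up beneath two classes declared disjoint, since such a class could never have an instance. The ontology and all class names are invented for the example.

```python
# Invented toy ontology: stated subclass links and disjointness axioms.
SUBCLASS_OF = {
    "Animal": set(),
    "Plant": set(),
    "Dog": {"Animal"},
    "Cat": {"Animal"},
    "Triffid": {"Plant", "Animal"},  # modelling mistake, on purpose
}
DISJOINT = {frozenset({"Animal", "Plant"})}


def ancestors(cls):
    """All classes reachable from `cls` by following stated subclass links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific):
    """True if `general` is `specific` or one of its stated ancestors."""
    return general == specific or general in ancestors(specific)


def classes_without_possible_instances():
    """Named classes that fall under two classes declared disjoint."""
    bad = []
    for cls in SUBCLASS_OF:
        supers = ancestors(cls) | {cls}
        if any(pair <= supers for pair in DISJOINT):
            bad.append(cls)
    return sorted(bad)


print(subsumes("Animal", "Dog"))
print(classes_without_possible_instances())
```

A real reasoner would also derive subsumptions that follow from class definitions rather than being stated outright, which is where the decision procedures mentioned earlier come in.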

What were the research challenges? The first was increasing expressive power, which involved complex role inclusion axioms; concrete domains / data types; database style keys; and rule language extensions.

The second challenge was improving scalability, requiring optimisation techniques; reduction to disjunctive datalog; and hybrid DL-DB systems. Tools and infrastructure were another challenge, as were design methodologies (based on foundational ontologies).

Q&A with the audience:

The question of the commercial barriers to asking high-quality questions on the WWW was raised, and the delegate wondered if the commercial cost of annotating the source info would be met. In turn, he continued, some people and organisations won’t want the source information so annotated to then be given away for free, suggesting that it won’t find its way onto the open web.

Horrocks responded that he hoped the process will bootstrap itself, and interest in ontology annotation will grow and become a groundswell.

Elsewhere in the audience David Holsworth noted that Google’s translation service from French into English had recently improved, and said this indicated improving semantic algorithms.

[NB:Notes on the lecture can also be downloaded here]

Further thoughts on this lecture can be found on the Beers & Innovation blog

About Professor Ian Horrocks: 
Ian is Professor in the School of Computer Science at the University of Manchester. His primary research interest is Knowledge Representation; in particular ontologies and ontology languages, tableaux algorithms for Description Logics (DLs), optimisation techniques for such algorithms, and the application of all of the above to e-Science and the Semantic Web. He was a member of the W3C Web Ontology Working Group that developed the OWL language (now a W3C recommendation), and is an author of the SWRL (Semantic Web Rule Language) proposal. For more information visit his web page
