WoDOOM 2012: A brief review of the First International Workshop on Debugging Ontologies and Ontology Mappings


I travelled to Galway (Ireland) in early October for the First International Workshop on Debugging Ontologies and Ontology Mappings, or WoDOOM 2012 for short, which was co-located with EKAW 2012. With around 20 attendees and 4 speakers, the half-day workshop was fairly small, but it was definitely an interesting start for, hopefully, more workshops to come.

The invited speaker was Bijan Parsia, who gave a rather awesome talk laying out the landscape of what we generally refer to as ‘errors’ in OWL ontologies. We can categorise errors into logical and non-logical errors. Logical errors include the ‘classical’ errors such as incoherence and inconsistency, wrong entailments, missing entailments, but also less obvious problems such as tautologies and ‘concept idleness’. Non-logical errors are problems that we might not think of straight away when we talk about debugging; these include wrong naming of concepts and properties, structural irregularities, and performance problems.

The first research paper, by Valentina Ivanova, Jonas Laurila Bergman, Ulf Hammerling and Patrick Lambrix, dealt with the debugging of ontology alignments based on an interesting use case (ToxOntology, an ontology describing toxicological information about food). The main idea was to validate mappings based on the structural relations of concepts in the ontology. Valentina also demoed a prototype of the RepOSE tool, which nicely combines the “accept/reject” task of debugging alignments with a graph-based user interface (see screenshot below), making the job slightly less painful.

[Screenshot: debugging ontology alignments in the RepOSE tool]

Next up was Tu Anh Nguyen from the Open University, who presented her work on justification-based debugging using patterns and natural language. The approach taken to measuring the cognitive complexity of justifications is very appealing: They first identified a set of frequently occurring patterns in justifications – subsets of justifications with at most 4 axioms – using justifications from around 500 ontologies. The 50 most frequent patterns were then translated into natural language and evaluated via a Mechanical Turk-style web service by presenting the ‘rule’ to a user, then asking them to decide whether a given entailment followed from that rule. This is quite close to what we did in our complexity study, but with the advantage that the natural language rules could be presented to a much wider audience than our DL/OWL Manchester syntax patterns. The result of the user study was a ranking of the most frequent rules, which can be used to rank the complexity of OWL justifications – at least in their natural language form. It would obviously be interesting to find out whether the complexity measure translates directly to Manchester syntax as used in Protégé, for example.

And finally, I presented my paper “Declutter your justifications”, which deals with grouping multiple justifications based on their structural similarities. My talk followed on quite nicely from Tu Anh’s presentation, as she basically solved the problem of “obvious proof steps” using her natural language approach to testing justification sub-patterns. The slides for my presentation are available here.

In summary, this first WoDOOM turned out really well, and the papers presented were very interesting. I also have to admit that I was very pleased with the rate of 75% female speakers/first authors, which is pretty awesome. I’m hoping that we’ll have some more papers next year, as at least two of this year’s took a very similar approach to debugging (justifications!) – especially given that Bijan highlighted several types of errors which are currently not considered in most debugging approaches.

[Photo of Galway by Phalinn Ooi, cc-licensed]

Reading material: “Interactive ontology revision”

This is the second in a series of blog posts on “interesting explanation/debugging papers I have found in the past few months and that I think are worth sharing”. I’m quite good with catchy titles!

Nikitina, N., Rudolph, S., Glimm, B.: Interactive ontology revision. J. Web Sem. 12: 118-130 (2012) [PDF]

This paper follows a semi-automated approach to ontology repair/revision, with a focus on factually incorrect statements rather than logical errors. In the ontology revision process, a domain expert inspects the ontology axioms one by one, deciding for each whether it is correct (the axiom is accepted) or incorrect (the axiom is rejected). Each decision thereby has consequences for other axioms, as they can be either automatically accepted (if they follow logically from the accepted axioms) or automatically rejected (if they violate the accepted axioms). Rather than showing the axioms in random order, the proposed system determines the impact a decision has on the remainder of the axioms (using some ranking function) and gives higher priority to high-impact items in order to minimize the number of decisions a user has to make in the revision process. This process is quite similar to Baris Sertkaya’s FCA-based ontology completion approach, which employs the same “accept/decline” strategy.
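To make the accept/reject propagation concrete, here is a minimal sketch of such a revision loop. The `entails` toy reasoner (a transitive closure over atomic subsumptions encoded as pairs) and the `oracle` stand-in for the domain expert are my own illustrative assumptions, not the paper's actual algorithm:

```python
def entails(premises, ax):
    """Toy 'reasoner': (a, b) encodes the atomic subsumption a SubClassOf b;
    an axiom is entailed iff it lies in the transitive closure of the premises."""
    closure = set(premises)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return ax in closure

def revise(axioms, oracle):
    """Interactive revision: every axiom ends up accepted or rejected.
    oracle(ax) -> True iff the domain expert deems the axiom correct."""
    accepted, rejected = set(), set()
    for ax in axioms:  # the paper reorders this list so high-impact axioms come first
        if entails(accepted, ax):
            accepted.add(ax)            # follows from accepted axioms -> auto-accept
        elif any(entails(accepted | {ax}, r) for r in rejected):
            rejected.add(ax)            # would re-entail a rejected axiom -> auto-reject
        elif oracle(ax):
            accepted.add(ax)
        else:
            rejected.add(ax)
    return accepted, rejected
```

With the axioms A ⊑ B, B ⊑ C, A ⊑ C, C ⊑ D and an expert who rejects C ⊑ D, only three of the four axioms need a manual decision: A ⊑ C is accepted automatically once the first two are in.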

The authors also introduce “decision spaces”, a data structure which stores the results of reasoning over a set of axioms if an axiom is accepted or declined; using this auxiliary structure saves frequent invocation of a reasoner (83% of reasoner calls were avoided in the revision tool evaluation). Interestingly, this concept on its own would make for a good addition to OWL tools for users who have stated that they would like a kind of preview to “see what happens if they add, remove or modify an axiom” while avoiding expensive reasoning.
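The caching idea on its own is easy to sketch. The following is a heavily simplified stand-in (plain memoisation of entailment checks, keyed by the current axiom set) – the decision spaces in the paper additionally propagate accept/decline consequences between axioms, which is where the real savings come from:

```python
class DecisionSpace:
    """Memoises entailment checks so that repeated 'what happens if' queries
    over the same axiom set never invoke the (expensive) reasoner twice."""

    def __init__(self, reasoner):
        self._reasoner = reasoner       # reasoner(axioms, ax) -> bool
        self._cache = {}
        self.reasoner_calls = 0

    def check(self, axioms, ax):
        key = (frozenset(axioms), ax)
        if key not in self._cache:
            self.reasoner_calls += 1    # cache miss: one real reasoner call
            self._cache[key] = self._reasoner(axioms, ax)
        return self._cache[key]
```

Wrapping any reasoner this way gives exactly the kind of cheap "preview" of add/remove/modify effects mentioned above, as long as the axiom set is hashable and queries repeat.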

[Screenshot: the revision tool described in the paper]

Conceptually, this approach is elegant, straightforward, and easily understandable for a user: See an axiom, make a yes/no decision, repeat, eventually obtain a “correct” ontology. In particular, I think the key strengths are that 1) a human user makes the decisions whether something is correct or not, 2) these decisions are as easy as possible (a simple yes/no), and 3) the tool (shown in the screenshot above) reduces the workload (both in terms of “click count” and cognitive effort, see 2)) for the user. In order to debug unwanted entailments, e.g. unsatisfiable classes, the set of unwanted consequences can be initialised with those “errors”. The accept/decline decisions are then made in order to remove those axioms which lead to the unwanted entailments.

On the other hand, there are a few problems I see with using this method for debugging: First, the user has no control over which axioms to remove or modify in order to repair the unwanted entailments; in some way this is quite similar to automated repair strategies. Second, I don’t think there can be any way of the user actually understanding why an entailment holds as they don’t get to see the “full picture”, but only one axiom after another. And third, using the revision technique throughout the development process, starting with a small ontology, may be doable, but debugging large numbers of errors (for example after conversion from some other format into OWL or integrating some other ontology) seems quite tedious.

Reading material: “Direct computation of diagnoses for ontology debugging.”

After my excursion into the world of triple stores, I’m back with my core research topic, which is explanation for entailments of OWL ontologies for the purpose of ontology debugging and quality assurance. Justifications have been the most significant approach to OWL explanation in the past few years, and, as far as I can tell, the only approach that was actually implemented and used in OWL tools. The main focus of research surrounding justifications has been on improving the performance of computing all justifications for a given entailment, while the question of “what happens after the justifications have been computed” seems to have been neglected, bar Matthew Horridge’s extensive work on laconic and precise justifications, justification-oriented proofs, and later the experiments on the cognitive complexity of justifications. Having said that, in the past few months I have come across a handful of papers which cover some interesting new(ish) approaches to debugging and repair of OWL entailments. As a memory aid for myself and as a summary for the interested but time-pressed reader, I’m going to review some of these papers in the next few posts, starting with:

Shchekotykhin, K., Friedrich, G., Fleiss, P., Rodler, P.: Direct computation of diagnoses for ontology debugging. arXiv 1–16 (2012) [PDF]

The approach presented in this paper is directly related to justifications, but rather than computing the set of justifications for an entailment, which is then repaired by removing or modifying a minimal hitting set of those justifications, the diagnoses (i.e. the minimal hitting sets) are computed directly. The authors argue that justification-based debugging is feasible for small numbers of conflicts in an ontology, whereas large numbers of conflicts, and potentially diagnoses, pose a computational challenge. The problem description is quite obvious: For a given set of justifications, there can be multiple minimal hitting sets, which means that the ontology developer has to decide which set to choose in order to obtain a good repair.

Minor digression: What is a “good” repair?

“Good repair” is an interesting topic anyway. Just to clarify the terminology: by a repair for a set of entailments E we mean a subset R of an ontology O s.t. the entailments in E do not hold in O \ R; this set R has to be a hitting set of the set of all justifications for E. Most work on justifications generally assumes that a minimal repair, i.e. one containing a minimal number of axioms, is a desirable repair; such a repair would involve high-power axioms, i.e. axioms which occur in a large number of justifications for the given entailment or set of entailments. Some also consider the impact of a repair, i.e. the number of relevant entailments not in E that get lost when modifying or removing the axioms in the repair; a good repair then has to strike a balance between minimal size and minimal impact.

Having said that, we can certainly think of a situation where a set of justifications share a single axiom, i.e. they have a hitting set of size 1, while the actual “errors” are caused by other “incorrect” axioms within the justifications. Of course, removing this one axiom would be a minimal repair (and potentially also one of minimal impact), but the actual incorrect axioms would still be in the ontology – worse even, a correct one would have been removed instead. The minimality of a repair matters as far as users are concerned, as they should only have to inspect as few axioms as possible; yet, as we have just seen, user effort might have to be increased in order to find a repair which preserves content, which seems to have higher priority (although I like to refer to the anecdotal evidence of users “ripping out” parts of an ontology in order to remove errors, and to some expert systems literature which says that users prefer an “acceptable, but quick” solution over an ideal one!). Metrics such as cardinality and impact can only be guidelines, while the final decision as to what is correct and incorrect wrt the domain knowledge has to be made by a user. Thus, we can say that a “good” repair is one which preserves as much wanted information as possible while removing all unwanted information, but which at the same time requires as little user effort (i.e. axioms to inspect) as possible. One strategy for finding such a repair while taking into account other wanted and unwanted entailments is diagnoses discrimination, which is described below.
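The repair-as-hitting-set definition is easy to play with in code. Below is a brute-force sketch (exponential, for illustration only; the axiom names are placeholders) that enumerates all minimal hitting sets of a set of justifications:

```python
from itertools import combinations

def minimal_hitting_sets(justifications):
    """All minimal hitting sets (candidate repairs) of a set of justifications.
    Candidates are enumerated by increasing size, so any superset of an
    already-found hitting set can be skipped immediately."""
    axioms = sorted(set().union(*justifications))
    hitting = []
    for size in range(1, len(axioms) + 1):
        for cand in combinations(axioms, size):
            s = set(cand)
            if any(h <= s for h in hitting):        # superset of a smaller repair
                continue
            if all(s & j for j in justifications):  # hits every justification
                hitting.append(s)
    return hitting
```

For the two justifications {ax1, ax2} and {ax1, ax3}, this yields the size-1 repair {ax1} and the size-2 repair {ax2, ax3} – exactly the situation above: if ax1 is in fact correct, the minimal repair is the wrong one.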

Now, back to the paper.

In addition to the ontology axioms and the computed conflicts, the user also specifies background knowledge (those axioms which are guaranteed to be correct) and sets of positive (P) and negative (N) test cases, such that the resulting ontology O entails all axioms in P and does not entail the axioms in N (an “error” in O is either incoherence/inconsistency or entailment of an arbitrary axiom in N, i.e. the approach is not restricted to logical errors). Diagnoses discrimination (dd) makes use of the fact that different repairs can have different effects on an ontology, i.e. removing repairs R1 and R2 would lead to O1 and O2, respectively, which may have different entailments. A dd strategy would be to ask a user whether the differing entailments* of O1 and O2 are wanted or unwanted, which leads to the entailments being added to the set P or N. Based on whether the entailments of O1 or O2 are considered wanted, repair R1 or R2 can be applied.
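A sketch of that discrimination loop in code – the `consequences` function (here a toy transitive closure over atomic subsumptions encoded as pairs) and the `oracle` stand-in for the user are my own illustrative assumptions, not the paper's implementation:

```python
def discriminate(ontology, diagnoses, consequences, oracle):
    """Narrow down a set of diagnoses (candidate repairs) by querying the user
    about entailments on which the repaired ontologies differ.
    consequences(axioms) -> finite set of entailments (e.g. atomic subsumptions);
    oracle(entailment) -> True if the entailment is wanted (added to P),
    False if it is unwanted (added to N)."""
    P, N = set(), set()
    candidates = list(diagnoses)
    while len(candidates) > 1:
        ents = [consequences(ontology - d) for d in candidates]
        # entailments that hold under some candidate repairs but not others
        differing = set().union(*ents) - set.intersection(*ents)
        if not differing:
            break                      # remaining repairs are indistinguishable
        q = sorted(differing)[0]       # pick a differing entailment to ask about
        if oracle(q):
            P.add(q)
            candidates = [d for d, e in zip(candidates, ents) if q in e]
        else:
            N.add(q)
            candidates = [d for d, e in zip(candidates, ents) if q not in e]
    return candidates, P, N
```

For the toy ontology {A ⊑ B, B ⊑ C} with the two diagnoses {A ⊑ B} and {B ⊑ C}, a single oracle question (“should A ⊑ B hold?”) is enough to settle on removing B ⊑ C.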

With this in mind, the debugging framework uses an algorithm to directly compute minimal diagnoses rather than the justifications (conflict sets). The resulting debugging strategy leads to a set of diagnoses which do not differ wrt the entailments in the respective repaired ontologies, and which are then presented to the user. When taking into account the sets of wanted and unwanted entailments P and N, rather than just presenting a diagnosis without context, this approach seems fairly appealing for interactive ontology debugging, in particular given the improved performance compared to justification-based approaches. On the other hand, while justifications require more “effort” than being presented directly with a diagnosis, they also give a deeper insight into the structure of an ontology. In my work on the “justificatory structure” of ontologies, I have found that there exist relationships between justifications (e.g. overlaps of size >1, structural similarity) which add an additional layer of information to an ontology. We can say that they not only help with repairing an ontology, but also potentially support the user’s understanding of it (which, in turn, might lead to more competence and confidence in the debugging process).

* I presume this is always based on some specification for a finite entailment set here, e.g. atomic subsumptions.

Videolectures from ISWC 2011 online

The video lectures from ISWC 2011 have been online for a while, so if you’re interested in checking out our talks, you can do this here: http://videolectures.net/iswc2011_research/

The Manchester talks are:

Try and sit through the first 5 minutes of my talk – I *will* slow down eventually.

List of helpful OWL API Tools


The OWL API is a Java interface for creating and modifying OWL ontologies, an essential (or, the essential) component of any OWL tool. Mike Bergman compiled a list of tools that make use of the OWL API, including the popular ontology editor Protégé with its plugins, various OWL reasoners, and software of the more exotic variety: Thirty OWL API Tools – www.mkbergman.com

[Photo by Hans Bernhard (Schnobby) (Own work) [CC-BY-SA-3.0 or GFDL], via Wikimedia Commons]

‘MANCHustifications’

At this year’s International Semantic Web Conference ISWC 2011, Manchester will be heavily present with 4 papers in the research track, of which 2 focus on justifications.

In the first one, which we presented in similar form at DL, we discuss our user study on the cognitive complexity of OWL justifications. It is quite interesting that, despite the fairly large body of work on explanation for entailments, there have only been few attempts at analysing the effectiveness of explanation approaches with regard to how understandable they are to the average OWL user. Starting with fine-grained justifications (Kalyanpur, Parsia, Cuenca Grau, DL 2006), which were then defined formally as laconic & precise justifications (Horridge, Parsia, Sattler, ISWC 2008 – won the best paper award at the conference), there has been a line of research dealing with making justifications easier to understand by removing superfluous parts (i.e. parts of the axioms in the justification that are not necessary for the entailment to hold). The notion of (non-)laconicity is based on the assumption that superfluous parts distract the user and therefore make it harder to understand why the entailment holds – which, intuitively, seems sound. Moving away from distracting parts only, we want to have a general picture of how easy or difficult justifications are to understand, and why. These ideas are captured in a complexity model (again, Horridge, Parsia, Sattler) which considers certain properties of a justification and the respective entailment, and gives us a score for the cognitive complexity, or hardness, of the justification. The considerations behind this model, the related cognitive issues, and the validation of the model are discussed in the paper “The Cognitive Complexity of OWL Justifications”, which we’re presenting at ISWC in October.

The second paper is part of my own PhD research and deals with “The justificatory structure of the NCBO BioPortal Ontologies”. Again, this is a topic which has hardly been touched yet by other researchers who deal with explanation in the form of minimal entailing subsets (i.e. MinAs, or MUPS in the case of unsatisfiable classes, justifications… maybe we should simply call them MEHs – minimal entailment-having subsets?). While we generally focus on a) finding efficient mechanisms for computing all MEHs, errmm, justifications, or b) analysing the cognitive complexity of individual justifications, there is only a small body of work that looks at multiple justifications. This seems an obvious step, since we know that considering individual justifications for an entailment in isolation does not give us the full picture of why an entailment holds in an ontology. The consequences can be only partial understanding, ineffectual repair attempts, or over-repair (removing or modifying more than necessary). Further, we even know that those multiple justifications have relations between them, as they can share axioms, entail each other, be subsets of one another (if we consider justifications for multiple entailments), etc. To which degree multiple justifications occur in an ontology, and what relations there are between them, can tell us more about the ontology than the simple metrics we see in ontology editors – in the paper I call it ‘making implicit structure explicit’. An analysis of the prevalence of multiple justifications and their relations in a set of BioPortal ontologies is the focus of the paper, which, again, will be turned into a talk at ISWC.

And, in an amusing move, we have had the research track session which contains the two talks named after us: the ISWC organisers decided to call it “MANCHustifications”. You know where you can find us.

Axiomatic Richness – is your ontology full fat or skimmed?

The term ‘axiomatic richness’ is used in various places to talk about a certain property of an OWL ontology, mostly meaning ‘how much do we say about a particular concept’. Axiomatically rich ontologies are in some way considered better and more interesting than axiomatically lean ones. There is, however, no clear definition of the term. A quick google search for ‘axiomatic richness’ throws up only a few distinct sources that attempt to answer the question ‘what makes an (OWL) ontology axiomatically rich?’. In what follows, I discuss some of the main points of the papers and blog posts I have found.

‘Possibility of Deriving Inferences’

Robert Stevens and Sean Bechhofer discuss the term in their post on the OntoGenesis blog:

The axiomatic richness of an [ontology] refers to the level of axiomatisation that is present. […] A lack of axiomatic richness limits the possibility of deriving inferences from an [ontology]. […] Axiomatic richness could be measured in a number of ways. Hayes for example, in the Naive Physics Manifesto, discusses density. […]

(from http://ontogenesis.knowledgeblog.org/257, 2010)

It also states that in order to be axiomatically rich, the information in the ontology has to be “in a form amenable to machine processing”; plain text descriptions, such as in a SKOS vocabulary, are not sufficient.

This states that axiomatic richness is somehow related to the inferential potential in the ontology, but doesn’t give any further hints as to how we can measure axiomatic richness, or how we can tell whether ontology A is ‘richer’ than ontology B.

‘Large Number of Justifications’

Further down the list of search results, I happened to stumble across my own paper about the Justificatory Structure of OWL ontologies (OWLED 2010), where I state that

[…] taxonomic ontologies containing only trivial axioms of the form (A SubClassOf: B) are commonly regarded as axiomatically weak. A simple indicator for axiomatic richness could be a large average number of justifications for entailments.

(from http://owl.cs.manchester.ac.uk/explanation/owled2010/JustStructure_OWLED2010.pdf, 2010)

“Could be” – nothing definitive here either. Many justifications (on average) for the entailments in the ontology simply means that there are many reasons why a certain entailment holds (entailment in the sense of asserted and inferred axioms that satisfy the entailment relationship with the ontology – blog post on this issue to follow soon, potentially including and discussing reviews from my DL workshop paper). While this might be an indicator of redundancy in the ontology (for which we haven’t got a definition either), the number of justifications alone doesn’t tell us much about how much we say about a particular concept, which is usually the focus when talking about axiomatic richness.

We could probably extend this guess to say “a concept A is axiomatically rich if there are 1) many justifications for 2) entailments of the form A SubClassOf B or EquivalentClasses(A,B)”, i.e. entailments that somehow define the concept. (Counter) examples might follow.

Using ‘Expressive’ Constructors

Mikel Egana Aranguren’s thesis is a rich (haha) source of information about axiomatic richness. I found this quote quite interesting:

The OWL version of the Gene Ontology […] is implemented exploiting a rigorous formalism (OWL), but a limited fragment of the expressivity of OWL is used in axioms. On the other hand, the OBO version of the Sequence Ontology […] is axiomatically rich (e.g. symmetric properties and intersections of classes can be found in the ontology).

(from http://www.sindominio.net/~pik/thesis.pdf, 2009)

He also claims that “bio-ontologies represent biological knowledge in a limited, lean and not rigorous manner”.

A similar assumption is made in Martin Hepp’s description of “A Methodology for Deriving OWL Ontologies from Products and Services Categorization Standards”

[…] the semantic richness needed for most business scenarios will come from the usage of the huge collection of properties.

(from http://is2.lse.ac.uk/asp/aspecis/20050152.pdf, 2005)

Well. I see the point in this argument (similar to the one I made above, i.e. we can’t really say much if we only use atomic subsumptions in our ontology), but I disagree with the statement that expressivity = axiomatic richness. In many of our experiments, we have found that expressivity doesn’t really tell us much about how ‘complex’ the ontology is – reasoner performance, number and size of justifications, etc., do not correlate with the types of constructors found in the ontology (to a certain extent, obviously). Just using the constructors in some way to define a concept doesn’t necessarily make the ontology ‘richer’. Trust me, Son, I’ve seen some of those allegedly weak EL++ ontologies that could have made “the strongest man on earth whimper like a frightened kitten”.

Ontology Design Patterns

Robert Stevens and Mikel Egana Aranguren mention the term again in their paper “Applying Ontology Design Patterns in Bio-Ontologies”. They claim that Ontology Design Patterns (ODP)

[…] have already brought benefits in terms of axiomatic richness and maintainability […]

(from http://www.springerlink.com/content/d2lp476v0p281q73, 2008)

They refer to two more papers dealing with ODP in bio-ontologies, which I won’t cover here.

Locality Based Modules (LBM)

My (current and past) office neighbour Chiara Del Vescovo and Thomas Schneider drop a hint at defining axiomatic richness in a WoMo workshop talk:

[…] extract all (relevant) LBMs in order to […] draw conclusions on characteristics of an ontology:[…] What is the axiomatic richness of O?

(from http://www.informatik.uni-bremen.de/~ts/talks/1005_dl+womo.pdf, 2010)

Unfortunately, the slides don’t go into detail, and I don’t remember any discussions from the talk, so I can’t say much about this.

Non-Trivial Entailments

Yet another explanation from Manchester can be found in Pavel Klinov’s and Bijan Parsia’s paper on “Implementing an Efficient SAT Solver for Probabilistic DL”:

For axiomatically weak TBoxes, where almost all subsumptions can be discovered by traversing the concept hierarchy […]. More complex TBoxes may have non-trivial entailments involving concept expressions on both left-hand and right-hand sides […]

(from http://www4.in.tum.de/~schulz/PAPERS/STS-IWIL-2010.pdf, 2010)

To clarify, I assume that ‘non-trivial entailments’ means subsumptions that are inferred, not asserted, and whose justifications involve GCIs. This sounds similar to my statement above about ‘many complex’ justifications for entailments.

Conclusion

Scio me nihil scire – I know that I know nothing. I do however quite like the idea of relating axiomatic richness to the number and type of reasons (i.e. justifications) I have for an (some, all?) entailment of the ontology. We could certainly use a formal definition (or multiple ones, depending on which aspect is most relevant to the developer, the domain, the application…) which allows us to think of the same things when talking about ‘axiomatic richness’ and comparing ontologies. To be continued…