On Visualizing the Semantic Web in MS Office


 

 

Christian Fillies,

Semtation GmbH, Germany, cfillies@semtalk.com

York Sure,

Institute AIFB, University of Karlsruhe, Germany, sure@aifb.uni-karlsruhe.de

 

 

 


Abstract

 

This article shows how existing RDFS-based Semantic Web knowledge services can already be integrated into the creation of semantic nets in MS Visio with SemTalk. We present two examples: looking up object names in WordNet and using Ontobroker as an inference engine on ECCMA [11] ontologies. The article also shows how different two-dimensional visualization techniques, such as DAMLVisio [5] or simple Visio shapes, can be applied to semantic web models in order to address a broad range of users with very different expectations of the notation.

Keywords: RDFS, Web Services, Visio, Ontobroker

1.  Introduction

SemTalk [1], using a Microsoft Visio front-end, offers an easy-to-use editor for semantic web ontologies and processes. Using an open, graphically configurable meta model, Visio can easily be adapted to different model worlds such as CASE tools and organizational models. With the help of Microsoft Office XP SmartTags, these models allow users to use semantic webs during their daily work with other MS Office products such as Word, Excel or Outlook.

 

The whole idea of the semantic web is to share common, more or less formalized, knowledge via the Internet [2]. While the main idea was to establish a common knowledge platform for machines, we are focusing on people exchanging their ideas in a (graphical) knowledge network. A key issue for sharing knowledge is the use of a common terminology, e.g. in the form of ontologies [3]. Although common practices for developing and using ontologies already exist [4], ontologies are, due to their complex nature, far from being a commodity and require substantial tool support during development.

 

Most of the existing online and offline glossaries can be viewed with browsers that visualize exactly one node in a network. Some tools generate hyperbolic trees or similar visualizations on demand. In our experience of communicating knowledge, complex problems are better understood through manually created diagrams that describe a specific scene or scenario. People prefer a mixture of drawing tools and modelling tools, which gives them the flexibility to use the advantages of both. This article shows how different two-dimensional visualization techniques such as DAMLVisio [5] or UML [6] can be applied to semantic web models using Visio for multiple audiences. Our goal is to integrate existing RDF(S) [7,8] based semantic web knowledge services tightly into the creation process of semantic networks and business processes. We cover two examples: looking up object names in WordNet [9] and using Ontobroker [10] as an inference engine to process ECCMA ontologies.

2. Architecture

SemTalk works on an RDF(S)-like XML data structure. Diagramming information and object-oriented features such as methods and states have been added to RDF(S). The structure is also optimized for basic inferences such as inheritance and graph traversals. An object engine provides a COM API so that the engine can be used within MS Office products. For the graphical presentation of models we use MS Visio for two reasons: (i) the tool is widely used in industry, so people are used to it, and (ii) it is easily extensible through an API.
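The basic inferences mentioned above, inheritance and graph traversal, can be illustrated with a minimal Python sketch. It is not SemTalk's actual data structure or its COM API; the `ClassNode` type, the helper functions and the vehicle classes are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    """A hypothetical class in an RDF(S)-like model (not SemTalk's real type)."""
    name: str
    superclasses: list = field(default_factory=list)  # direct subClassOf targets
    attributes: set = field(default_factory=set)      # attributes declared locally


def all_superclasses(node):
    """Return the names of all direct and indirect superclasses (graph traversal)."""
    seen = []
    stack = list(node.superclasses)
    while stack:
        current = stack.pop()
        if current.name not in seen:
            seen.append(current.name)
            stack.extend(current.superclasses)
    return seen


def inherited_attributes(node):
    """Collect local attributes plus everything inherited along subClassOf links."""
    result = set(node.attributes)
    visited = set()
    stack = list(node.superclasses)
    while stack:
        current = stack.pop()
        if current.name in visited:
            continue
        visited.add(current.name)
        result |= current.attributes
        stack.extend(current.superclasses)
    return result


# Illustrative model: taxi -> car -> vehicle
vehicle = ClassNode("vehicle", attributes={"owner"})
car = ClassNode("car", superclasses=[vehicle], attributes={"doors"})
taxi = ClassNode("taxi", superclasses=[car])
```

In this sketch, `inherited_attributes(taxi)` yields both attributes declared further up the hierarchy, although the class itself declares none.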

Figure 1: Architecture Overview of SemTalk

 

The SemTalk object engine is used to define semantics, in other words a meta model, for existing Visio shapes. You can graphically define which shapes are allowed to be connected with each other. SemTalk supplies the infrastructure to define complete modeling methods inside Visio. Such methods exist, for example, for DAML, for Enterprise Resource Planning (ERP) product modeling and for Business Process Modelling (BPM). SemTalk has several interfaces to CASE tools such as Rational Rose and to BPM tools. There is also a simple report generator that creates HTML tables, using XSL for formatting.

3. Notation for Semantic Webs

Given the very broad audience, we want people to be able to read our models without learning a notation. Our best experiences have been with the very simple bubble notation shown in some of the figures below. It is important to label most of the links and not to rely on graphical encodings known from graphical languages such as Entity Relationship diagrams or UML.

 

For readers with a technical background, a more complex notation with various shape types can be used. Examples are the DAML notation and a user interface for a product configuration engine.

One of the great advantages of using Visio is that it contains a large collection of predefined and extendable shapes. The shapes correspond quite naturally to classes. Using pictures improves the acceptance of the models, which is an important success factor in Knowledge Management.

4. WordNet

WordNet®, which was developed by the Cognitive Science Laboratory at Princeton University under the direction of Professor George A. Miller, is a huge online lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets. SemTalk uses WordNet via Dan Brickley’s RDF(S) web service for WordNet 1.6 [12].

 

 

 

 

Figure 2: A small vehicle model built from WordNet

 

The models are built incrementally with the help of external model repositories. Once you have used a class name in a model, you can look for related objects in external repositories and integrate them into your model (Figure 3). Using an external glossary ensures that people are talking about the same thing, with a well-defined Uniform Resource Name (URN) to identify objects and related hyperlinks to access their definitions. The other benefit of such ontologies is that users get hints for related objects or subclasses to use in the model.

 

Figure 3: Subclasses of vehicle offered by WordNet

 

The objects remember their origin and can be refreshed (or replicated) from their external data source once the source has changed. In a very similar way, you can link one class to another class living in an external model which was created using SemTalk and which is published on a web server. This technology results in a web of hyperlinked models based on RDF(S) as a common standard.
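The idea that objects remember their origin and can be refreshed from it may be sketched as follows. This is only an illustration under stated assumptions: the `ImportedClass` type, the `refresh` helper, the `urn:example:` identifier and the dictionary standing in for an external repository are all hypothetical and do not reflect how SemTalk or the WordNet web service are implemented.

```python
from dataclasses import dataclass


@dataclass
class ImportedClass:
    """A model object that remembers where it was imported from (hypothetical)."""
    name: str
    origin: str       # assumed URN of the external definition
    definition: str   # local copy of the external definition


def refresh(obj, fetch):
    """Re-read the definition from the object's origin.

    `fetch` is any callable mapping an origin to its current definition;
    a real implementation would call the external web service here.
    Returns True if the local copy changed.
    """
    current = fetch(obj.origin)
    changed = current != obj.definition
    obj.definition = current
    return changed


# Stand-in for an external repository (illustrative data only).
repository = {"urn:example:wordnet#vehicle": "first version of the definition"}
vehicle = ImportedClass(
    "vehicle",
    "urn:example:wordnet#vehicle",
    repository["urn:example:wordnet#vehicle"],
)
```

Calling `refresh(vehicle, repository.get)` reports a change only after the entry in the stand-in repository has been modified.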

5. Ontobroker

Finding knowledge on the semantic web can be done more intelligently than by just looking up words in a dictionary, even if the dictionary is based on RDF(S). In addition to web services such as WordNet and the SemTalk internal indexing, inference engines like Ontobroker (see footnote 1) can provide a new dimension of reasoning capabilities.

 

Ontobroker exploits knowledge models and data from different sources to answer queries. For that purpose it evaluates axioms contained in the knowledge models to derive new knowledge or to check the consistency of the available knowledge. It runs as a middleware system and thus may be used by a variety of applications as an information-delivering base. Ontobroker is already used by the W3C, as a reference implementation (SiLRI [13]), as an inferencing tool for semantics for the Web.

 

Ontobroker can be accessed as a web service, similar to WordNet. SemTalk stores models as knowledge bases on a webspace reachable by Ontobroker. A query in Ontobroker's query language F-Logic [14] and a list of possibly relevant knowledge bases are sent to an Ontobroker server. The server returns a list of XML-encoded solutions to the query. Each variable binding in a solution can be a reference to an object in a knowledge base. The user can insert objects into SemTalk directly from the URNs provided in the result set.
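The exchange described above (a query plus a list of knowledge bases goes out, XML-encoded solutions with variable bindings come back) might look roughly like the following Python sketch. The actual Ontobroker request and response formats are not given in this article, so the XML layout, the element and attribute names, the `urn:example:` identifiers and both helper functions are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical response layout; the real Ontobroker XML format is not
# specified in this article.
SAMPLE_RESPONSE = """
<solutions>
  <solution>
    <binding variable="X">urn:example:kb1#car</binding>
    <binding variable="Y">urn:example:kb1#vehicle</binding>
  </solution>
  <solution>
    <binding variable="X">urn:example:kb1#taxi</binding>
    <binding variable="Y">urn:example:kb1#car</binding>
  </solution>
</solutions>
"""


def build_request(query, knowledge_bases):
    """Bundle an F-Logic query with the knowledge bases it should run against."""
    return {"query": query, "knowledge_bases": list(knowledge_bases)}


def parse_solutions(xml_text):
    """Turn the (assumed) XML result into a list of variable-binding dicts."""
    root = ET.fromstring(xml_text.strip())
    solutions = []
    for solution in root.findall("solution"):
        bindings = {
            b.get("variable"): (b.text or "").strip()
            for b in solution.findall("binding")
        }
        solutions.append(bindings)
    return solutions


request = build_request("FORALL X,Y<-X::Y.", ["http://example.org/kb1.xml"])
solutions = parse_solutions(SAMPLE_RESPONSE)
```

Each dictionary in `solutions` maps a query variable to a URN, which corresponds to the references a user could insert into a model.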

Figure 4 shows the query interface in SemTalk. A user selects one or more ontologies and a query. The result set consists of all possible solutions for the F-Logic query “FORALL X,Y<-X::Y.”, which means that X is a (direct or indirect) subclass of Y.

In the example we are using a subset (Farming & Fishing & Forestry & Wildlife &..) from a large ontology named Universal Standard Products and Services Classification (UNSPSC) developed by ECCMA.

The Electronic Commerce Code Management Association (ECCMA) is a not-for-profit, unbiased, membership organization that oversees the management and development of the UNSPSC Code. The UNSPSC is a new classification first developed in the summer of 1998. Both the Dun & Bradstreet Standard Product and Services Classification (SPSC) and the United Nations Development Program (UNDP) United Nations Common Coding System (UNCCS) were used in its development. The UNSPSC currently covers 56 industry segments from electronics to chemical, to medical, to educational services, to automotive to fabrications, etc.

A good overview of this and similar content standards can be found in [18].
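The meaning of the query “FORALL X,Y<-X::Y.” introduced above, namely all pairs where X is a direct or indirect subclass of Y, can be made concrete with a small Python sketch that computes the same result set over a toy hierarchy. This is not how Ontobroker evaluates F-Logic; the function name and the class names are illustrative assumptions and are not taken from the UNSPSC.

```python
def subclass_solutions(direct_subclass_of):
    """Enumerate all (X, Y) where X is a direct or indirect subclass of Y.

    `direct_subclass_of` maps a class name to the list of its direct
    superclasses. Already recorded pairs are skipped, so cycles terminate.
    """
    solutions = set()
    for start in direct_subclass_of:
        stack = list(direct_subclass_of[start])
        while stack:
            parent = stack.pop()
            if (start, parent) not in solutions:
                solutions.add((start, parent))
                stack.extend(direct_subclass_of.get(parent, []))
    return sorted(solutions)


# Illustrative hierarchy (hypothetical class names).
model = {"taxi": ["car"], "car": ["vehicle"], "bicycle": ["vehicle"]}
pairs = subclass_solutions(model)
```

Here `pairs` contains the indirect pair ("taxi", "vehicle") in addition to the three direct subclass links.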

 

 

Figure 4: Administration of Ontobroker

 

The mission of SemTalk is to empower end users to publish models to the semantic web and to exploit the knowledge from their desktop applications. Since we do not expect end users to learn and use an object-oriented logical language like F-Logic, we have built a query-by-example-style graphical interface, shown in Figure 5.

 

 

Figure 5: A graphical F-Logic Query

 

Queries are defined in the same way users already know from reading the models. Existing models can be used as patterns. Some parts of the model are marked as variables (single uppercase character). The graphical query is then translated to F-Logic and sent to Ontobroker.
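The translation step described above can be sketched in Python as follows. This is not SemTalk's translator: the edge representation, the relation names `subClassOf` and `instanceOf`, and the use of `AND` and `->>` in the generated text are assumptions about a plausible F-Logic rendering. Only the `::` subclass operator and the single-uppercase-character convention for variables come from the text.

```python
def is_variable(label):
    """A single uppercase character marks a variable, as described in the text."""
    return len(label) == 1 and label.isupper()


def to_flogic(edges):
    """Translate (subject, relation, object) edges of a pattern into a query string.

    Assumed mapping: subClassOf -> '::', instanceOf -> ':',
    any other relation -> 'subject[relation->>object]'.
    """
    variables = []
    atoms = []
    for subject, relation, obj in edges:
        for label in (subject, obj):
            if is_variable(label) and label not in variables:
                variables.append(label)
        if relation == "subClassOf":
            atoms.append(f"{subject}::{obj}")
        elif relation == "instanceOf":
            atoms.append(f"{subject}:{obj}")
        else:
            atoms.append(f"{subject}[{relation}->>{obj}]")
    return f"FORALL {','.join(variables)} <- {' AND '.join(atoms)}."


# A pattern with one fixed class and two variables (hypothetical names).
query = to_flogic([("X", "subClassOf", "vehicle"), ("X", "hasPart", "Y")])
```

A pattern consisting of a single subclass edge between two variables yields the subclass query used earlier, apart from whitespace.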

6. DAML

DAML is the DARPA Agent Markup Language [15]. The goal of the DAML effort is to develop a language and tools to facilitate the concept of the semantic web. DAML is basically a much richer layer on top of RDF(S). For SemTalk we are using the DAMLVisio shapes developed by John Flynn [5].

 

What SemTalk currently does for DAML is:

 

  • driven by the meta-model, it checks whether you can use a connector between any two objects, e.g. you cannot use "SubClassOf" between two instances.
  • it keeps consistency between multiple visualizations of the same object.
  • navigation etc.

 

 

Figure 6: Subset of the DAML meta-model in SemTalk

 

 

 

 

Figure 7: DAML Notation in SemTalk

 

The (RDF(S) compatible) classes which you will find in SemTalk are just used for the DAML meta model (Figure 6). The SemTalk object engine controls which pairs of shapes may be connected. E.g. a “subClassOf” connection is not allowed between a class and a property.
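The kind of check the object engine performs can be sketched in Python as a lookup in a table of permitted connections. The table below is a small illustrative assumption and not the actual DAML meta-model of Figure 6; the type names and the `can_connect` helper are likewise hypothetical.

```python
# Illustrative subset of permitted (source type, connector, target type)
# triples. This is an assumption, not the real SemTalk meta-model.
ALLOWED_CONNECTIONS = {
    ("Class", "subClassOf", "Class"),
    ("Instance", "HasClass", "Class"),
}


def can_connect(source_type, connector, target_type):
    """Return True if the meta-model permits this connector between the two shapes."""
    return (source_type, connector, target_type) in ALLOWED_CONNECTIONS
```

With this table, a "subClassOf" connector is accepted between two classes but rejected between two instances or between a class and a property.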

Figure 7 is a DAML drawing built with SemTalk. A DAML class is represented as an instance of the SemTalk class "DAML#Class"; DAML instances are instances of the SemTalk class "DAML#Instance". "DAML#HasClass" is just a link between two SemTalk instances. This implies that basic inferences like inheritance of attributes are not available for DAML in the dialogs.

 

Compared to this, the integration of RDF(S) is much deeper. When we read an RDF(S) file, we create SemTalk classes which can be used as:

 

  • classes for Visio masters
  • business objects in business processes
  • ...

 

SemTalk classes implement a subset of RDF(S) (e.g. no subPropertyOf, binary properties only).

 

The current DAML Visio shape set does not cover the DAML extension DAML+OIL [17].  As soon as a proposal for a depiction of DAML+OIL is available this can be implemented in SemTalk by extension of the graphical Meta Model and by adding the new Visio shapes.

7. Using the Semantic Web from MS Office XP

 

If you read this text in Internet Explorer 6.0 or Word XP, you will notice a SmartTag on the word SmartTag in this sentence, as shown in Figure 8.

 

Figure 8: A SmartTag action menu

 

SmartTags are recognized while you are typing. The SemTalk SmartTag Recognizer finds any class in the models of the user's choice. The “HTML” option opens an HTML representation of the model.

“Make Hyperlink” converts the SmartTag into a static hyperlink for users who do not have IE. Static hyperlinks are early bound (in the context of the author), while SmartTags use late binding (in the context of the reader). Late binding is more interesting since it may translate terms from one context to another on the fly.
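The difference between recognizing a term, binding it early and binding it late can be illustrated with a short Python sketch. It does not use the Office SmartTag API; the function names, the dictionaries standing in for the author's and the reader's models, and the example.org URLs are assumptions for illustration.

```python
import re


def recognize_terms(text, class_names):
    """Return the class names that occur as whole words in the text."""
    return [
        name for name in class_names
        if re.search(r"\b" + re.escape(name) + r"\b", text, re.IGNORECASE)
    ]


def make_hyperlink(term, author_models):
    """Early binding: the target is fixed once, in the author's context."""
    return author_models[term]


def resolve_smart_tag(term, reader_models):
    """Late binding: the target is looked up when the text is read."""
    return reader_models.get(term)


# Hypothetical model indexes for the author and the reader.
author_models = {"SmartTag": "http://example.org/author/model.html#SmartTag"}
reader_models = {"SmartTag": "http://example.org/reader/model.html#SmartTag"}

text = "SmartTags are recognized; a SmartTag links a term to a model."
terms = recognize_terms(text, ["SmartTag", "Ontology"])
static_link = make_hyperlink("SmartTag", author_models)
late_link = resolve_smart_tag("SmartTag", reader_models)
```

The static link always points into the author's model, whereas the late-bound lookup returns whatever the reader's own models define for the same term.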

If users have SemTalk installed on their machine they may also edit the underlying Visio model in order to add new information to the model or change the text based on linked objects from the model.

SmartTag technology is recognized by most users as one of the key enablers for knowledge management because it is proactive. It points the user to the semantic model, which describes in an easily understandable way what the text, email or sheet is talking about. This technology (i) helps technical writers, for example, to be sure they use the right term in their context, or (ii) shows a business process that describes how to proceed.

A concrete usage scenario for this approach is to ensure the consistency of technical documentation in the IT department of a bank [16]. 

8. Summary

Using SemTalk, models are able to give context to keywords. They also create a starting point to understand and communicate process information. As the availability of semantic web knowledge sources increases, the need for reliable and scalable inference engines such as Ontobroker becomes obvious.

The Visio editor enables a wide range of users to use and understand models. Integrating the technology into daily work processes increases the acceptance, and thus the usefulness, of the models.

9. References

 

[1]

Fillies, C.; Weichhardt, F.: SemTalk: A RDFS Editor for Visio 2000. Position Paper, ICCS 2001, 9th International Conference on Conceptual Structures / Semantic Web Working Symposium (SWWS)

[2]

Berners-Lee, T., Hendler, J., and Lassila, O. The Semantic Web. Scientific American, May 2001. "A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities". See http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html

[3]

Gruber, T. (1995). Towards principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies, (43):907–928.

[4]

S. Staab, H.-P. Schnurr, R. Studer, and Y. Sure. Knowledge processes and ontologies. IEEE Intelligent Systems, Special Issue on Knowledge Management, 16(1), January/February 2001.

[5]

John Flynn. DAMLVisio Shapes, cf. http://www.daml.org/visiodaml/

[6]

Booch, G., Rumbaugh, J. and Jacobson, I. (1999) The unified modeling language user guide, Addison-Wesley, Reading Mass.

[7]

W3C. RDF Schema Specification. http://www.w3.org/TR/PR-rdf-schema/, 1999.

[8]

O. Lassila and R. Swick. Resource description framework (RDF). model and syntax specification. Technical report, W3C, 1999. W3C Recommendation. http://www.w3.org/TR/REC-rdf-syntax.

[9]

WordNet, cf. http://www.cogsci.princeton.edu/~wn/

[10]

S. Decker, M. Erdmann, D. Fensel, and R. Studer. Ontobroker: Ontology Based access to Distributed and Semi-Structured Information. In R. Meersman et al., editors, Database Semantics: Semantic Issues in Multimedia Systems, pages 351–369. Kluwer Academic Publisher, 1999.

[11]

cf. http://eccma.org/unspsc/

[12]

Dan Brickley. RDF(S) web service for WordNet 1.6, cf. http://xmlns.com/2001/08/wordnet/

[13]

S. Decker, D. Brickley, J. Saarela, J. Angele. A Query Service for RDF. Query Languages 98, W3C Workshop.

[14]

M. Kifer, G. Lausen, and J. Wu. Logical foundations of object-oriented and frame-based languages. Journal of the ACM, 42, 741–843, (1995).

[15]

DARPA Agent Markup Language (DAML), cf. http://www.daml.org

[16]

Fillies, C., Wood-Albrecht, G., Weichhardt, F., A Pragmatic Application of the Semantic Web Using SemTalk. WWW2002, May 7-11, 2002, Honolulu, Hawaii, USA ACM 1-5811-449-5/02/0005

[17]

DAML+OIL ontology markup language, cf. http://www.daml.org/2001/03/reference.html, March 2001

[18]

Dörr, M., Guarino, N., Fernández López, M., Schulten, E., Stefanova, M., Tate, A., State of the Art in Content Standards. OntoWeb Deliverable 3.1. www.ontoweb.org/download/deliverables/D3.1.pdf, November 2001

 



Footnote 1: Ontobroker is available from Ontoprise GmbH, cf. http://www.ontoprise.com