Pragmatic Applications of the Semantic Web Using SemTalk
Christian Fillies
SC4 Solution Clustering, Falkensee, Germany cfillies@semtalk.com
Frauke Weichhardt
Beratung im Netz, Potsdam, Germany fweichhardt@fweichhardt.de
Gay Wood-Albrecht
Bonapart Solutions, USA wood-albrecht@mindspring.com
Dietmar Wikarski
Fachhochschule Brandenburg, Brandenburg, Germany wikarski@fh-brandenburg.de
Summary
The Semantic Web is a new layer of the Internet that enables distributed modeling of the contents of existing web pages. Semantic webs store not only text; like whiteboard models, they also capture the most relevant associated terms or keywords. Compared to standardized ontologies, semantic webs offer powerful new search strategies. “Ambient” intelligent applications and agents can use this knowledge network in various ways.
SemTalk, which uses Microsoft Visio as its front end, offers an easy-to-use editor for Semantic Web ontologies and processes. Thanks to an open, graphically configurable meta model, Visio can easily be adapted to different modeling worlds such as ARIS EPCs (event-driven process chains, German: EPKs) or Bonapart process and organizational models. With the help of Microsoft Office XP SmartTags, these models allow users to create semantic webs as a by-product of their daily work with other MS Office products such as Word, Excel or Outlook.
This paper presents two applications of this technology:
1. An ontology project: department-wide information modeling at the Credit Suisse bank. The emphasis was both on linguistic standardization and on the development of a centralized description of all of the decentralized applications found in the organization. Local knowledge management teams were able to take immediate advantage of the terms and solutions created by the modeling teams.
2. A BPM project: distributed process modeling at the Bausparkasse Deutscher Ring, a German financial institution. Several groups of students from the Fachhochschule Brandenburg, a university of applied sciences, explored how to develop an industry-specific Semantic Web and apply it to business process modeling.
1 Introduction
The next-generation Internet, or Semantic Web, is a new layer of the Internet that enables distributed modeling of the contents of existing web pages. Semantic webs store not only text but also whiteboard-like models or frameworks that include the most relevant associated terms. Compared to standardized ontologies, semantic webs offer powerful new search strategies. “Ambient” or embedded intelligent applications and agents can use this knowledge network in various ways.
The Semantic Web is still in its initial stages. The increasing number of web pages about semantic webs points to enormous possibilities for its further development. Even though concrete applications are still very rare, the definition of XML-based languages such as RDF, RDFS and DAML+OIL by the W3C suggests growing interest. It is therefore likely that an ever-increasing number of Semantic Web applications will be seen in the near future.
Based on our recent experience, we predict that this new technology will spread first within the Intranets of larger, distributed enterprises, since there is a continuous demand to align knowledge management structures between different areas of the enterprise. Creating and fine-tuning these knowledge structures can easily be accomplished using Semantic Web technologies. A necessary prerequisite is the creation of a central vocabulary of ontologies and processing concepts.
SemTalk, which uses Microsoft Visio as its front end, offers an easy-to-use editor for Semantic Web ontologies and processes. Thanks to an open, graphically configurable meta model, Visio can easily be adapted to different modeling worlds such as ARIS EPCs (event-driven process chains, German: EPKs) or Bonapart process and organizational models. With the help of Microsoft Office XP SmartTags, these models allow users to create semantic webs as a by-product of their daily work with other MS Office products such as Word, Excel or Outlook.
The following sections describe two practical applications of Semantic Web technology. The goal of the first project was to create a department-wide information model within Credit Suisse that provides both linguistic standardization and a department-specific description of all of the decentralized applications used by the different departments.
The second project involved distributed process modeling at the Bausparkasse Deutscher Ring, a German financial institution. Several groups of students from the Fachhochschule Brandenburg explored how to develop and apply an industry-specific Semantic Web.
2 Comprehensive Departmental Information Modelling at Credit Suisse
In the context of this project, several workshops were held to create the basic repository for a growing visual glossary. This glossary was intended as a possible basis for a knowledge management system. The results of the workshops were summarized in the form of conceptual models, which were then published on the Intranet.
2.1 Assumptions
In today's large-scale enterprises, a variety of vocabularies is common because of rapid technological change and the integration of many smaller companies or departments into larger conglomerates. This is particularly true in the IT area, where there is an abundance of architecture descriptions, strategy papers, technology concepts and similar documents. The knowledge contained in these documents is often strongly bound to the vocabulary of individuals and is therefore difficult to consolidate. Homonyms, words that sound the same but have different meanings, are particularly frequent. In a field as young as IT, supposed synonyms are also emerging whose meanings differ considerably from department to department.
2.2 Project Goals
In this project, an infrastructure and a practically usable base vocabulary were to be prepared from existing, linguistically standardized documents. Glossaries and models were to be represented as flexibly as possible in reusable forms so that they could easily be inserted into technical applications such as document management and content management systems. A further application is automatic document classification.
The emphasis in this project was on both linguistic standardization and the population of a central glossary to be used by people designing or managing department-specific peripheral applications. The goal was not dogmatic control or centralized specification management, but rather to create awareness of available terms and solutions at the level of the local knowledge manager or member of the modeling team. To ensure that use of the glossary became a permanent part of everyday practice, a general awareness of context had to be created. This was achieved most effectively by using contexts that were already available, such as integrating standard office applications, most importantly Office XP, into the preparation of fundamental definitions.
From the start, the project requirements demanded that the glossary published on the Intranet be in a form suitable for many different types of users. Complicated technical notations such as UML diagrams were therefore not acceptable.
It was hoped that this project would produce a benchmark on which future knowledge management systems could be modeled. “Bootstrapping” such a system is always a very complex undertaking. If too little content is available initially, the system will not be used sufficiently and therefore will not begin to develop a life of its own. A complete ontology of all objects existing in the enterprise, however, is not desirable either: the world is constantly changing, and the language of the enterprise needs to reflect these changes.
Success depends on being able to publish a glossary with sufficient content and basic graphic definitions to encourage users to use and update it as appropriate. This requires technology that is easy to use and integrated with standard office applications. As with the creation and indexing of textual web pages, this works best if the system appeals to the users' own need to participate in the process.
Within the scope of this project, only the creation and modeling of a glossary were required.
2.3 Semantic Web as a Knowledge Management System
The glossary consists of terms with definition texts and synonym/homonym relationships. In addition, explicit relationships are defined between the terms, together with their classification under superordinate and subordinate terms. The formalized representation is presented as a model. In order to store information models flexibly, both topic maps and W3C-recommended RDFS, which is based on XML standards, are generated.
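To make the structure described above more concrete, the following minimal sketch shows how a glossary term with a definition, a superordinate term and synonyms could be written as RDFS in RDF/XML. It is an illustration under stated assumptions, not SemTalk's actual output format: the `gloss` namespace, the `synonym` property, the helper `term_to_rdfs` and the sample terms are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Hypothetical namespace for glossary-specific properties (assumption).
GLOSS = "http://example.org/glossary#"

ET.register_namespace("rdf", RDF)
ET.register_namespace("rdfs", RDFS)
ET.register_namespace("gloss", GLOSS)


def term_to_rdfs(root, name, definition, broader=None, synonyms=()):
    """Append one glossary term to `root` as an rdfs:Class element."""
    cls = ET.SubElement(root, f"{{{RDFS}}}Class", {f"{{{RDF}}}ID": name})
    ET.SubElement(cls, f"{{{RDFS}}}label").text = name
    ET.SubElement(cls, f"{{{RDFS}}}comment").text = definition
    if broader:
        # Superordinate term, expressed as a subclass relationship.
        ET.SubElement(cls, f"{{{RDFS}}}subClassOf",
                      {f"{{{RDF}}}resource": f"#{broader}"})
    for syn in synonyms:
        # Synonyms use the hypothetical gloss:synonym property.
        ET.SubElement(cls, f"{{{GLOSS}}}synonym").text = syn
    return cls


# Illustrative sample terms (not taken from the project glossary).
root = ET.Element(f"{{{RDF}}}RDF")
term_to_rdfs(root, "Account", "A record of the holdings of a customer.")
term_to_rdfs(root, "SavingsAccount", "An interest-bearing account.",
             broader="Account", synonyms=["Deposit account"])

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

Only the standard library is used here; a dedicated RDF library would be the more robust choice for anything beyond a sketch.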
SemTalk is used as the graphical editor. With the help of SemTalk and RDFS, the models can be stored as individual HTML web pages on the Intranet, with all of their hyperlinks embedded. This type of knowledge representation requires no central maintenance of the model, yet it provides a coordinated approval mechanism for the core terms that are used.

Figure 1: View of a SemTalk Model
Consistency between different partial models is ensured during the modeling process by the SemTalk consistency wizard. The wizard points out which terms are already used in another model. Instead of modeling the same term again, a hyperlink to the reference term is created. The wizard uses index tables created by the SemTalk RDFS crawler, which builds a directory of the knowledge available in selected areas of the Intranet, the Internet and file systems.
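The consistency check described above can be sketched as a lookup against an index table. The sketch below assumes that the index maps lower-cased term names to the models that define them; the function `check_consistency`, the index layout and the file names are hypothetical and only illustrate the idea, not SemTalk's internal data structures.

```python
def check_consistency(new_terms, index, current_model):
    """Return {term: [other models]} for terms already defined elsewhere.

    `index` maps lower-cased term names to the models that define them
    (an assumed layout for the crawler's index table).
    """
    hits = {}
    for term in new_terms:
        others = [m for m in index.get(term.lower(), []) if m != current_model]
        if others:
            hits[term] = others
    return hits


# Illustrative index, as a crawler might have produced it.
index = {
    "customer": ["crm.rdfs"],
    "account": ["banking.rdfs", "crm.rdfs"],
}

suggestions = check_consistency(["Account", "Branch"], index, "banking.rdfs")
print(suggestions)
```

For each hit, a modeling tool could then offer to insert a hyperlink to the existing definition instead of creating a second copy of the term.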
These index tables are also used to interface with MS Office. The SemTalk SmartTag analyzes text while the user is writing and marks words that are already contained in the glossary as reference terms or synonyms. Synonyms that are found can be replaced by the corresponding reference terms if necessary. The definition of a detected word is available with a single click, which leads either to the Visio model or to the HTML representation. This results in substantial savings in the otherwise complex manual revision of texts.
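The recognition step can be illustrated with a small sketch that scans a text for glossary words and optionally replaces synonyms with their reference terms. This is not the SmartTag API: the functions `tag_text` and `replace_synonyms` and the sample glossary are hypothetical, and the sketch only handles single-word terms.

```python
import re


def tag_text(text, reference_terms, synonyms):
    """Return (start, end, word, reference_term) for each glossary hit.

    `synonyms` maps a synonym to its reference term.
    """
    lookup = {t.lower(): t for t in reference_terms}
    lookup.update({s.lower(): ref for s, ref in synonyms.items()})
    hits = []
    for match in re.finditer(r"[A-Za-z]+", text):
        ref = lookup.get(match.group().lower())
        if ref:
            hits.append((match.start(), match.end(), match.group(), ref))
    return hits


def replace_synonyms(text, hits):
    """Replace detected synonyms with their reference terms."""
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, word, ref in sorted(hits, reverse=True):
        if word.lower() != ref.lower():
            text = text[:start] + ref + text[end:]
    return text


# Illustrative glossary content.
sentence = "The client opens an account."
hits = tag_text(sentence, ["customer", "account"], {"client": "customer"})
revised = replace_synonyms(sentence, hits)
print(hits)
print(revised)
```

In an office integration, each hit would additionally carry a link to the model or HTML page that defines the reference term.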
The SemTalk tool suite also produces pointers to revised documents and text passages.
Specific models can also be created, for example to represent detailed connections between individual documents. If these connections are not to be included in the general glossary, SemTalk can be run on the workstation during text revision. Models of individual documents or of specialty areas extend the general glossary with specialized components. Each time a term is reused, it is placed in the context of the existing terms. Searches using general headings will therefore also include new models and glossaries containing specialized terms.
If new terms for the general glossary emerge during document revision, they are added after review.
Knowledge management systems are usually created initially in workshops, often supplemented by expert interviews. Significant savings can be realized if the Concept Composer from the company TextTech is used to extract useful terminology:
- The Concept Composer searches large quantities of text (source text plus collocations). Together with SemTalk, relevant technical terms can be identified, as well as appropriate collocations.
- The Concept Presenter, which provides a graphical interface on the Intranet, can be integrated into the HTML viewer of SemTalk.

Figure 2: The Interface to Concept Composer
Different versions of definitions, the associated synonyms and homonyms, and text passages can be managed with the SemTalk Glossary. The SemTalk Glossary also serves as the interface between SemTalk and the Concept Composer.
2.4 Project Bootstrapping Methods
- Creation of a central, prioritized list of terms to be defined.
- Scanning of the text of 100 representative documents with the Concept Composer (TextTech). The results consist of a hit list of important technical terms, an infrastructure for looking up passages in the text, and collocations that show which word pairs are frequently found together. The Concept Composer was initially used externally as an ASP solution.
- Execution of three workshops of 3-5 days each, with up to five experts.
During the workshops the SemTalk Glossary is used for the documentation and administration of definitions.

Figure 3: SemTalk Glossary
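The collocation results mentioned in the bootstrapping list can be illustrated with a simple co-occurrence count. The sketch below is not the Concept Composer's algorithm, which is not described here; it assumes a plain sliding window over lower-cased words, ignores sentence boundaries and stop words, and uses a hypothetical function name and sample text.

```python
import re
from collections import Counter


def collocations(text, window=2, min_count=2):
    """Count word pairs that occur within `window` words of each other."""
    words = re.findall(r"[a-z]+", text.lower())
    pairs = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1:i + 1 + window]:
            if other != word:
                # Sort the pair so that word order does not matter.
                pairs[tuple(sorted((word, other)))] += 1
    return [(pair, n) for pair, n in pairs.most_common() if n >= min_count]


# Illustrative sample text.
sample = "Credit risk report. Credit risk model. Market risk report."
pairs = collocations(sample, window=1)
print(pairs)
```

A production tool would typically add stop-word filtering and a statistical significance measure instead of the raw count threshold used here.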
At the end of each workshop day, the scenarios discussed during the day are modeled graphically in SemTalk. The resulting graphic models are crucial in simplifying the subsequent discussions: relationships are easy to visualize, and it is easy to navigate through large amounts of information. Homonyms are placed on opposite sides of the graphic representation.
At the end of the workshop, the central terms have been defined and graphically modeled. The glossary, with all of its graphic representations, is then placed on the Intranet for use by the enterprise.
A glossary created with SemTalk acts as a knowledge foundation designed to grow dynamically in ways that support better decision making and communication within the enterprise, especially as the environment changes. The glossary is published on the Intranet. Periodic audits of the contents ensure that the glossary remains up to date and useful. Modification requests are collected centrally, and updates are made on a regular basis in collaboration with the appropriate departments. A model is only updated if the majority of general users deem the update appropriate. Responsibility for maintaining the models was given to the individuals responsible for Intranet updates.
2.6 Project Results
Two hundred critical keywords were modeled over a three-month period. Approximately 10 departmental representatives defined these keywords during several workshops that lasted between two hours and three days. Project costs were related to the time participants spent away from their regular work. The SemTalk Glossary was felt to be a critical factor in building a glossary in such a short period of time.
The results were published on the Intranet and updated periodically. SemTalk enabled users to access keywords in several different contexts. The graphical view made it easier to understand the meaning of a keyword in each context because searches identify both the keyword and its associated words.
SemTalk structured the project work in a way that enhanced communication between coworkers from different departments. In addition, the targeted revision of the documentation made it easy to identify quickly which documents needed to be updated, especially when the context of a keyword changed.
2.7 Future Perspectives
The glossary created for Credit Suisse is currently being tested. If the project is deemed successful, it will be continued in other departments throughout the enterprise.
3 Distributed Process Modeling at the Deutscher Ring Bausparkasse
The primary goal of this distributed process modeling project was to model order processing at the Deutscher Ring Bausparkasse. The project took place over several weeks and was carried out by students from the Fachhochschule Brandenburg.
The primary difference between this project and conventional process modeling projects was the use of an industry-specific Semantic Web, which allowed processes to be fine-tuned easily and terminological work to be carried out more efficiently.
3.1 Conditions and Goals
Two separate groups of four students each modeled all business processes in two different departments. After interviews with department members, the information was modeled systematically in SemTalk. Models of the existing processes were shown next to models of the “to be” processes, which reflected both the wishes of each department and the feasibility of implementing the processes.
The customer's primary objectives were to make the processes more transparent within the enterprise and to define the processes needed for the new workflow management system.
In addition, a significant project aim was to examine the process of distributed modeling itself. The project team examined how communication can be improved within modeling teams and with end users.
Experience from business modeling projects shows that a high-quality distributed modeling tool with a common repository is not always sufficient. At best, a repository can guarantee only the syntactic consistency of a model. Most modeling tools offer little assistance with the creation of a common conceptual basis for functions, processes and information. This problem becomes more important when processes span several enterprises, for example in the B2B area, where different business partners must first align their enterprise languages with each other.
3.2 SemTalk Process Modeling Methods
The most important principle behind the Internet, and hence the Semantic Web, is that information is not copied but referenced. Creating links to external pages does not alter the contents of those pages. A flexible information system that develops in this way does not have the consistency of a database, but it has the advantage of being able to grow dynamically. SemTalk does not create isolated models; it creates a network of linked models. While the emphasis of the Semantic Web is on pure knowledge representation or, as in the case of Credit Suisse, on the modeling of information classes, SemTalk process models can also be created and managed as webs. Models can be nested within one another, or they can be linked with external models such as industry-specific standards.
Semantic Web process modeling consists primarily of three steps:
1. Selection of a suitable reference library from the Internet
2. Customization of this library to fit project requirements
3. Creation of the process model using the reference model as a background
3.2.1 The Semantic Web Delivers Reference Models
Our methodology relies on Internet-based reference models that are easy to adapt to users' needs. An increasing number of organizations have developed such models.
Different XML-based languages are also in use. Two popular repositories from the EAI area are BizTalk (www.biztalk.org) and RosettaNet.
General XML notation systems can be found at www.cyc.com and, for WordNet, at www.xmlns.com.
Reference models can also be obtained by analyzing source texts, as described for the first project.
3.2.2 Process Modeling
SemTalk supports different business process modeling methods, including the representation of enterprise processes developed using PROMET, a method developed by Österle at IMG (http://prometatweb.img.com/). In the current project, with its strong focus on internal processes, the methodology of communication structure analysis (CSA) developed by Krallmann (http://www.sysedv.cs.tu-berlin.de/Homepage/SYSEDV.nsf/) was used. The students in the Deutscher Ring Bausparkasse project were already familiar with this method from their experience with the CSA-based modeling tool Bonapart.
In CSA, and therefore also in Bonapart, a process consists of activities connected by information flows, each of which is made up of information and media. Class models act as building blocks for these process models. They help to form structured and linguistically consistent processing concepts. This improves reuse and allows the models being developed to be evaluated more effectively. These class models mostly concern model elements from the modeling tool Bonapart.
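The structure described above (activities connected by information flows, each carrying an information object on a medium, with class models as the shared vocabulary) can be sketched as a small data model. The class and method names below are hypothetical and do not reflect the Bonapart or SemTalk object model; the sample activities are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InformationFlow:
    source: str       # name of the sending activity
    target: str       # name of the receiving activity
    information: str  # information class from a shared class model
    medium: str       # medium class carrying the information


@dataclass
class Process:
    name: str
    activities: list = field(default_factory=list)
    flows: list = field(default_factory=list)

    def add_flow(self, flow):
        """Add a flow after checking that both activities exist."""
        for end in (flow.source, flow.target):
            if end not in self.activities:
                raise ValueError(f"unknown activity: {end}")
        self.flows.append(flow)

    def undefined_terms(self, class_model):
        """Return information and media names missing from the class model."""
        used = {f.information for f in self.flows}
        used |= {f.medium for f in self.flows}
        return sorted(used - set(class_model))


# Illustrative process with invented activities and classes.
process = Process("Change Address",
                  activities=["Receive request", "Update customer record"])
process.add_flow(InformationFlow("Receive request", "Update customer record",
                                 information="Address change", medium="Letter"))

missing = process.undefined_terms({"Address change", "Form"})
print(missing)
```

Checking flows against a shared class model in this way is one simple means of enforcing the linguistic consistency that the class models are meant to provide.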
Class models maintain the linguistic consistency of the processing concepts. They form the basis of model reuse and offer better ways to evaluate the models being developed. With SemTalk, the class models in the Semantic Web are written in standard RDFS and contain references to other class models. The class models can be created top-down using existing materials or bottom-up during workshops. Bottom-up modeling is generally more efficient because it helps to limit the modeling depth of the class models.
Thinking first about the objects and only then about the processes themselves is an important step in the initial phases of a project. It is also critical to make sure that class libraries are consistent across several small, related models. This makes it easier to integrate the models later.
3.2.3 An Example
For a better understanding of distributed modeling with SemTalk, the simple process "address modification" (Figure 4) is presented as an example.

Figure 4: Example process “Change Address”
3.2.4 Tool Support Using SemTalk
SemTalk supports the user during modeling with wizards that monitor the modeling process and offer suggestions. Examples include tips on spelling (e.g. upper and lower case), the detection of synonyms, and the investigation of situations where hierarchical structures appear to be incorrect. A further agent is embedded in Office XP and examines the text to see whether any of its keywords are used in models.
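Two of the wizard checks mentioned above, inconsistent capitalization and suspicious hierarchies, can be sketched as follows. The checks are deliberately simple assumptions about what such a wizard might test: names that differ only in case, and terms whose chain of superordinate terms leads back to themselves. Function names and sample data are hypothetical.

```python
def capitalisation_hints(names):
    """Group names that differ only in upper/lower case."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), set()).add(name)
    return {key: sorted(variants)
            for key, variants in groups.items() if len(variants) > 1}


def hierarchy_cycles(broader):
    """Return terms whose chain of broader terms leads back to themselves.

    `broader` maps each term to its superordinate term.
    """
    cyclic = []
    for start in broader:
        node, steps = broader.get(start), 0
        # Follow the chain upwards, but never longer than the table size.
        while node is not None and node != start and steps < len(broader):
            node = broader.get(node)
            steps += 1
        if node == start:
            cyclic.append(start)
    return cyclic


# Illustrative model content.
case_hints = capitalisation_hints(["Customer", "customer", "Account"])
cycles = hierarchy_cycles({"Loan": "Product", "Product": "Loan",
                           "Mortgage": "Loan"})
print(case_hints)
print(cycles)
```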
The animated agents are created using the agent toolkit from MS Office. The agents are supported by a crawler, which searches for available models either independently or on request and creates index files for the agents. The crawler looks not only in the local file system but also on the Semantic Web for available sources of knowledge in RDFS format.
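The local part of such a crawler can be sketched with the standard library: walk a directory tree, parse every RDFS file and record which file defines which class. This is a minimal sketch under assumptions, not the SemTalk crawler: the function name, the file extensions and the index layout (lower-cased class name mapped to a list of file paths) are hypothetical, and crawling the web is left out.

```python
import os
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def crawl(root_dir, extensions=(".rdfs", ".rdf")):
    """Build an index {class name (lower case): [file paths]} from RDFS files."""
    index = {}
    for folder, _dirs, files in os.walk(root_dir):
        for fname in files:
            if not fname.lower().endswith(extensions):
                continue
            path = os.path.join(folder, fname)
            try:
                tree = ET.parse(path)
            except ET.ParseError:
                continue  # skip files that are not well-formed XML
            for cls in tree.iter(f"{{{RDFS}}}Class"):
                # Classes are identified by rdf:ID or rdf:about.
                name = cls.get(f"{{{RDF}}}ID") or cls.get(f"{{{RDF}}}about")
                if name:
                    index.setdefault(name.lower(), []).append(path)
    return index
```

Calling `crawl` on a model directory returns a dictionary that wizards or office agents could consult to find out where a term is already defined.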
4 Summary
Models created with SemTalk give context to keywords. They also provide a starting point for understanding and communicating process information. The Visio editor enables a wide range of users to use and understand models. Applying knowledge models in popular business process tools introduces shared processing concepts and offers a new navigation medium to support knowledge management. Integrating the technology into daily work processes increases the acceptance, and thus the usefulness, of the models. Most importantly, it provides a process context for more powerful, intelligent retrieval using Semantic Web concepts.