Identifying and brokering mathematical Web services: a formal language can move service discovery forward faster – Focus: grid computing
An important part of the Web service vision being promoted by the Worldwide Web Consortium (W3C) and others is that of automated service discovery, the idea being that when we need a particular kind of service we will no longer have to go out and search for it manually; our computer will do it for us.
Nowhere is this more necessary than in scientific computation. International collaboration here is already the norm and is increasingly being supported by the Grid, where specialized resources are connected by high-speed, high-bandwidth networks. To realize this vision requires mechanisms for describing what services actually do, and for reasoning about those descriptions in the presence of a user’s problem.
MONET: Mathematics on the NET
The MONET project is a two-year investigation into how service discovery can be performed for mathematical Web services, funded by the European Union under its Fifth Framework program. The project focuses on mathematical Web services for two reasons: first, mathematics underpins almost all areas of science, engineering, and increasingly, commerce. Therefore, a suite of sophisticated mathematical Web services will be useful across a broad range of fields and activities. Second, the language of mathematics is fairly well formalized already, and in principle it ought to be easier to work in this field than in some other, less well-specified areas.
According to the Worldwide Web Consortium (W3C), the key to service discovery will be provided by the semantic Web, where information is encoded using formal languages or ontologies whose meaning is well defined and unambiguous. Given a set of service descriptions and a job specification written using a suitable collection of ontologies, a software agent ought to be able to select the appropriate service for the job and generate any necessary proxies to enable the interaction between the client and the service.
This is fine in theory but making it work in practice is rather difficult. It is actually quite hard to describe what a service does, particularly when it is based on an existing piece of mature software whose semantics are not fully documented. Moreover, the software agent needs to be able to understand the relationships between the high-level specifications described by the ontologies and the low-level service interfaces that are currently likely to be written in WSDL.
The MONET consortium includes commercial companies, national research laboratories, and universities; and aims to develop mechanisms that can be freely used by scientists worldwide. Details of the technology it is developing can be downloaded from the project Web site at http://monet.nag.co.uk.
While for many areas of mathematics there are generic algorithms that can, in theory, solve many problems in that class, in practice most of the algorithms scientists use are designed to address a relatively narrow range of problems. Specialized algorithms are generally much more efficient and predictable in terms of resource usage and can yield more accurate results. This leads to an interesting question about mathematical Web services: Will they be fairly monolithic and address a whole class of problem, or will each algorithm be a separate service?
The advantage of deploying Web services that address whole classes of problems is that it makes the process of matching problem to service much easier. The software doing the matching does not need to understand the more subtle features that are used to distinguish the subclasses; this kind of specialized knowledge is part and parcel of the service. On the other hand, specialized services are more lightweight from a developer’s point of view, and reflect the reality of how mathematical software is developed and used in practice. The MONET project takes the view that you need to be able to capture arbitrarily detailed information about the applicability of a service.
The MONET Architecture
A simple example of the kind of interaction MONET aims to facilitate is shown in Figure 1.
[FIGURE 1 OMITTED]
There are three “actors” in this process. The first is a group of services and the second is a client. The third is a piece of software called the Broker, but for convenience we distinguish between several of its components that undertake distinct, well-defined tasks.
The process of the client discovering an appropriate service and then invoking it can be broken down into five steps:
1. Registration: The services register their capabilities, access policies, etc., with the broker’s service manager.
2. Inquiry: The client sends a description of the kind of service it is looking for to the broker. This description may be generic (e.g,. something like “find me a service that performs definite integration”, or specific (e.g., “find me a service to solve the problem “.
3. Analysis: The planning manager inside the broker analyzes the problem and extracts the criteria on which to select a service, which it then matches against the registry maintained by the service manager. If it finds one or more possible matches their details are returned to the client along with an indication of how closely they fit its requirements.
4. Selection: The client selects a suitable service and requests access to it via the broker’s service manager. What this entails depends on the access policies of the service and the particular service infrastructure being used; in the case of grid services, for example, where every abstract service is actually a factory, a new service instance would be generated and a handle returned to the client.
5. Connection: If access is granted, then the client initiates a connection to the service.
This is a simple scenario and MONET has been designed to work with more complex and less easily defined problems. In practice, the broker might employ other, more sophisticated planning services to help it with the matching process, and the result might not be a list of candidate services but a list of execution plans based on the composition of multiple services using a suitable choreography language such as BPEL4WS. However, in all cases the key to making the process work is a sophisticated mechanism for describing problems and services that allows for the effective matching of one to the other.
Mathematical Service Description Language (MSDL)
MSDL is the collective name for the language we use to describe problems and services. Strictly speaking, it is not itself an ontology but it is a framework in which information described using suitable ontologies can be embedded. One of the main languages we use is OpenMath, which is an XML format for describing the semantics of mathematical objects. Another is the Resource Description Framework Schema (RDFS), which is a well-known mechanism for describing the relationship between objects. The idea is to allow a certain amount of flexibility and redundancy so that somebody deploying a service will not need to do too much work to describe it.
An MSDL description comes in four parts:
1. A functional description of what the service does
2. An implementation description of how it does it
3. An annotated description of the interface it exposes
4. A collection of metadata describing the author, access policies, etc.
MSDL Functional Description
There are two main ways in which it is possible to describe the functionality exposed by a service. The first is by reference to a suitable taxonomy such as the “Guide to Available Mathematical Software (GAMS)” produced by NIST, a tree-based system where each child in the tree is a more specialized instance of its parent. This has the twin advantages of being easy to do and of providing a hook into other taxonomy-based classification systems such as UDDI. The disadvantages are that fixed taxonomies fail to capture the evolving nature of mathematical algorithms, and a particular taxonomy may not be rich enough in certain areas (for example, GAMS makes detailed distinctions between software for numerical analysis while lumping all software for symbolic computation into one category). Moreover, while it is easy enough to grow the taxonomy from the leaves, adding internal nodes disrupts the inheritance structure.
The second way to describe the functionality exposed by a service is by reference to a Mathematical Problem Library, which describes problems in terms of their inputs, outputs, preconditions (relationships between the inputs), and postconditions (relationships between the inputs and outputs).
For example, the problem of finding the minimum value of an expression that is subject to simple bounds on its parameters where both the expression and its derivatives are present might be expressed as follows.
1. F([v.sub.1], …, [v.sub.n] = [R.sup.n] [right arrow] R
2. A [subset or equal to] [R.sup.n]
3. ([D.sub.i] ([v.sub.1], …, [v.sub.n]): [R.sup.n] [right arrow] R, i = 1 … n)
1. z [member of] A
2. f [member of] R
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
1. F(z) = f
2. Ay [member of] A: F (y) < f
One important use of the Mathematical Problem Library is to provide names for the various objects that form parts of the problem (in this case F, A, Di, x, and f). This is used in both formulating a problem (i.e., one can say, etc.) and also, as we shall see later, in understanding how to construct the messages defined in the WSDL file.
There are a number of other pieces of information that can be included in the functional description. A directive is used to indicate something about the approach taken by the service to tackle a user’s problem. In the above case the directive would usually be solve, but an alternative might be prove–i.e., given particular inputs and outputs plus the preconditions prove that the post-conditions hold. It is also possible to include statements in other formalisms such as RDFS or the emerging Web Ontology Language (OWL) being developed by the W3C.
While this may seem to involve a certain amount of redundancy, the various parts of the functional description can, in practice, be highly complementary. For example, the entry in the problem description library might indicate that the problem involves the solution of a particular kind of differential equation while the taxonomy reference would add the fact that the equation is stiff.
MSDL Implementation Description
The MSDL Implementation Description provides information about the specific implementation that is independent of the particular task the service performs. This can include the specific algorithm used (by reference to an Algorithm Library), the type of hardware the service runs on, and so on.
In addition, it provides details of how the service is used. This includes the ability to control the way the algorithm works (for example, by limiting the number of iterations it can perform or request a specific accuracy for the solution), and also the abstract actions that the service supports. While in the MONET model a service described in MSDL solves only one problem, it may do so in several steps. For example, there may be an initialization phase, then an execution phase that can be repeated several times, and finally a termination phase. Each phase is regarded as a separate action supported by the service.
MSDL Interface Description
While WSDL does a good job in describing the syntactic interface exposed by a service, it does nothing to explain the semantics of ports, operations, and messages. MSDL has facilities that relate WSDL operations to actions in the implementation description, and message parts to the components of the problem description. In fact the mechanism is not WSDL-specific and could be used with other interface description schemes such as IDL.
There are many other aspects of Web services–not least the ability to negotiate terms and conditions of access, measure the quality of the actual service provided, and choreograph multiple services to solve a single problem–that are still being worked out. The partners in the project’s ultimate goal is to develop products and services based on the MONET architecture but the viability of this depends to a large extent on solutions to the other emerging issues. While we are confident that this will happen, it is not yet clear what the timescale will be.
The MONET project is currently building prototype brokers that can reason about available services using MSDL descriptions encoded in the W3C’s OWL. We are also investigating the applicability of this technology to describing services deployed in the Open Grid Service Architecture (OGSA).
* MONET: http://monet.nag.co.uk
* OpenMath: www.openmath.org
* OGSA: www.ggf.org/ogsa-wg/
* BPEL4WS:www.ibm.com/developerworks/ Web services/library/ws-bpel/
* Worldwide Web Consortium: www.w3c.org
* OWL: www.w3.org/2001/sw/WebOnt/
* GAMS: http://gams.nist.gov
* UDDI: www.uddi.org
Mike Dewar is a senior technical consultant with the Numerical Algorithms Group (NAG, www.nag.com), a worldwide organization dedicated to developing quality mathematical, statistical, and 3D visualization software components. MIKE.DEWAR@NAG.CO.UK
COPYRIGHT 2003 Sys-Con Publications, Inc.
COPYRIGHT 2003 Gale Group