An Integrated Semantic Web Service Discovery and Composition Framework

Authors: Pablo Rodriguez-Mier, Carlos Pedrinaci, Manuel Lama, and Manuel Mucientes

Abstract: Intensive research efforts have been focused on solving the automatic discovery or the automatic composition of Web services. However, research and development in both fields have for the most part remained disconnected, without considering the existing overlap between them. This has led to work replication and prevented the adequate integration of discovery and composition engines, which are key components at the core of most Service Oriented Architectures. In this paper we present a framework based on a theoretical analysis of service composition in terms of its dependency on service discovery, and a reference implementation of this framework on the basis of two pre-existing separate components, namely iServe and ComposIT. This reference implementation has been used to empirically study the impact of discovery and matchmaking on service composition, and we have provided three different configurations with varying performance. The empirical analysis demonstrates the scalability and flexibility of our proposal and provides insights on how integrated composition systems can be designed in order to achieve good performance in real scenarios, where service registries and composition frameworks are likely to be distributed and controlled by diverse organisations.

Purpose of this web document

The purpose of this document is to extend the results obtained with the Web Service Challenge 2008 datasets to the most recent edition of the challenge (Web Service Challenge 2009-2010). These results were not included in the paper mainly because the WSC'09-10 competition focuses on Quality-of-Service (QoS) optimisation, so better results can be achieved with techniques designed specifically to filter services by their QoS. This difference also means that our results are not comparable with those of other approaches, since we provide semantic compositions that optimise only the composition length and the number of services, ignoring QoS. We are currently working on extending the proposed framework with QoS support. In the meantime, we think these results are still useful for comparing the scalability of different composition frameworks.

Web Service Challenge 2009-2010 datasets

The 2009 and 2010 competitions use the same datasets; they differ only in the rules used to evaluate the results. Since we do not consider QoS and our results are not comparable with those of other approaches, this difference in evaluation rules does not affect us.

Dataset	#Services	#Concepts
WSC'09 01	572	1578
WSC'09 02	4129	12388
WSC'09 03	8138	18573
WSC'09 04	8301	18673
WSC'09 05	15211	31044

Evaluation

We tested different configurations to study their individual performance and the overall impact on composition response times. In particular, we used the following configurations:

SPARQL D/M: pure SPARQL Discovery / Matchmaking, where all interactions with the Service and Knowledge Base managers are implemented directly as SPARQL queries. This is the typical approach of discovery engines and was the original implementation of iServe.
Index. D/SPARQL+Cache M: I/O service discovery is based on an index. In addition, an intermediate cache is used at the level of the concept matcher to avoid issuing the same SPARQL queries repeatedly.
Full Indexed D/M: both service discovery and concept matchmaking rely on local indexes pre-populated at load time (and updated with writes). In this configuration, service discovery and concept matchmaking do not need to issue any SPARQL query to the backend.
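
The intermediate cache of the second configuration can be illustrated with a minimal Python sketch. This is not the iServe implementation: the class name `CachedConceptMatcher`, the `backend_query` callable and the toy taxonomy are all hypothetical, and the backend function merely stands in for a SPARQL query against the Knowledge Base.

```python
from typing import Callable, Dict, Tuple


class CachedConceptMatcher:
    """Wraps a (hypothetical) backend match check with an in-memory cache."""

    def __init__(self, backend_query: Callable[[str, str], bool]) -> None:
        self._backend_query = backend_query
        self._cache: Dict[Tuple[str, str], bool] = {}
        self.backend_calls = 0  # number of queries that actually reached the backend

    def matches(self, source: str, target: str) -> bool:
        key = (source, target)
        if key not in self._cache:
            # Cache miss: this is where a SPARQL query would be issued.
            self.backend_calls += 1
            self._cache[key] = self._backend_query(source, target)
        return self._cache[key]


# Toy taxonomy standing in for the knowledge base (assumption, not real data).
TOY_SUPERCLASSES = {
    "Sedan": {"Sedan", "Car", "Vehicle"},
    "Car": {"Car", "Vehicle"},
}


def toy_backend(source: str, target: str) -> bool:
    """Return True if `target` is `source` or one of its superclasses."""
    return target in TOY_SUPERCLASSES.get(source, {source})


matcher = CachedConceptMatcher(toy_backend)
results = [matcher.matches("Sedan", "Vehicle") for _ in range(3)]
print(results, matcher.backend_calls)
```

Repeated checks of the same concept pair reach the backend only once, which is the effect the cache is meant to have on the number of SPARQL queries.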

For each of these configurations, the table shows the forward graph generation time, including optimisations (G. time), and the total number of SPARQL queries issued (#SPARQL). The “Composition” columns show the graph size after the optimisations (G. size (opt), measured as the number of services) and the total time taken to search for the optimal service composition (Comp. time). The last column, “Sol. (serv./length)”, shows the size of the optimal solution found (number of services / composition length).
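
To give an intuition of what forward graph generation involves, the following Python sketch builds a layered graph by repeatedly selecting the services whose inputs are already available. It is a simplified illustration under stated assumptions, not the ComposIT algorithm: inputs and outputs are matched by exact concept name (no semantic matchmaking), no optimisations are applied, and the function name `forward_layers` and the toy services are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A service is modelled as (input concepts, output concepts).
Service = Tuple[FrozenSet[str], FrozenSet[str]]


def forward_layers(
    services: Dict[str, Service], available: Set[str], goal: Set[str]
) -> List[List[str]]:
    """Expand layer by layer until the goal concepts are produced or nothing new can run."""
    known = set(available)
    remaining = dict(services)
    layers: List[List[str]] = []
    while not goal <= known:
        # Services whose inputs are all covered by the concepts known so far.
        layer = sorted(name for name, (ins, _) in remaining.items() if ins <= known)
        if not layer:
            break  # goal unreachable with the given services
        for name in layer:
            _, outputs = remaining.pop(name)
            known |= outputs
        layers.append(layer)
    return layers


toy_services: Dict[str, Service] = {
    "A": (frozenset({"x"}), frozenset({"y"})),
    "B": (frozenset({"y"}), frozenset({"z"})),
    "C": (frozenset({"q"}), frozenset({"r"})),  # never invocable from {"x"}
}
layers = forward_layers(toy_services, {"x"}, {"z"})
print(layers)
```

In an integrated system, the input check in each layer is where discovery and matchmaking are invoked, which is why the choice of D/M configuration dominates the graph generation time.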

Dataset	Graph size	1) SPARQL D/M: G. time (s)	1) SPARQL D/M: #SPARQL	2) Index. D/SPARQL+Cache M: G. time (s)	2) Index. D/SPARQL+Cache M: #SPARQL	3) Full Indexed D/M: G. time (s)	3) Full Indexed D/M: #SPARQL	Composition: G. size (opt)	Composition: Comp. time (s)	Sol. (serv./length)
WSC'09/10 - D01	49	96.83	11454	28.23	5217	0.23	0	28	0.06	5 / 3
WSC'09/10 - D02	93	720.10	38614	92.71	17663	0.43	0	43	0.03	20 / 6
WSC'09/10 - D03	65	609.00	18410	67.98	12130	0.60	0	13	0.01	10 / 3
WSC'09/10 - D04	199	2830.52	120498	256.26	47212	1.16	0	118	>300	40 / 5
WSC'09/10 - D05	216	5472.00	181179	437.38	78018	4.63	0	96	0.13	30 / 19
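
The relative differences between configurations can be derived directly from the table. The short Python sketch below copies the G. time and #SPARQL values from the rows above and computes, per dataset, the speed-up in graph generation time and the fraction of SPARQL queries saved by the cache; the function and key names are illustrative only.

```python
# Rows copied from the results table:
# (dataset, G. time (s) and #SPARQL for configurations 1, 2 and 3)
ROWS = [
    ("D01", 96.83, 11454, 28.23, 5217, 0.23, 0),
    ("D02", 720.10, 38614, 92.71, 17663, 0.43, 0),
    ("D03", 609.00, 18410, 67.98, 12130, 0.60, 0),
    ("D04", 2830.52, 120498, 256.26, 47212, 1.16, 0),
    ("D05", 5472.00, 181179, 437.38, 78018, 4.63, 0),
]


def summarise(row):
    """Derive speed-ups and query savings for one dataset."""
    name, t1, q1, t2, q2, t3, _q3 = row
    return {
        "dataset": name,
        "speedup_1_to_2": t1 / t2,  # SPARQL D/M vs. index + cache
        "speedup_1_to_3": t1 / t3,  # SPARQL D/M vs. full index
        "query_reduction_1_to_2": 1 - q2 / q1,  # fraction of queries avoided
    }


SUMMARY = [summarise(row) for row in ROWS]

for s in SUMMARY:
    print(
        f"{s['dataset']}: x{s['speedup_1_to_2']:.1f} (1->2), "
        f"x{s['speedup_1_to_3']:.0f} (1->3), "
        f"{s['query_reduction_1_to_2']:.0%} fewer SPARQL queries (1->2)"
    )
```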

All datasets were solved with optimal values for composition length and number of services, showing scalability similar to that observed with the WSC'08 datasets (see graph below).