An Integrated Semantic Web Service Discovery and Composition Framework
Authors: Pablo Rodriguez-Mier, Carlos Pedrinaci, Manuel Lama, and Manuel Mucientes
Abstract—Intensive research efforts have been focused on solving the automatic discovery or the automatic composition of Web services. However, research and development on both fields has for the most part remained disconnected, without considering the existing overlap between them. This has led to work replication and prevented the adequate integration of discovery and composition engines which are key components at the core of most Service Oriented Architectures. In this paper we present a framework based on a theoretical analysis of service composition in terms of its dependency with service discovery, and a reference implementation of this framework on the basis of two pre-existing separate components, namely iServe and ComposIT. This reference implementation has been used to empirically study the impact of discovery and matchmaking on service composition, and we have provided three different configurations with varying performance. The empirical analysis proves the scalability and flexibility of our proposal and provides insights on how integrated composition systems can be designed in order to achieve good performance in real scenarios, where service registries and composition frameworks are likely to be distributed and controlled by diverse organisations.
Purpose of this web document
The purpose of this document is to extend the results obtained with the Web Service Challenge 2008 datasets to the most recent edition of the challenge (Web Service Challenge 2009-2010). These results were not included in the paper mainly because the WSC'09-10 competition focuses on Quality-of-Service (QoS) optimisation, so better results can be achieved with techniques that filter services by their QoS. This difference also makes our results not comparable with other approaches, since we provide semantic compositions that optimise only the composition length and the number of services, ignoring QoS. We are currently working on extending the proposed framework with QoS support. Meanwhile, we think these results are still useful for comparing the scalability of different composition frameworks.
Web Service Challenge 2009-2010 datasets
The 2009 and 2010 competitions used the same datasets; they differ only in the rules used to evaluate the results. Since we do not consider QoS and our results are not comparable, this difference in evaluation rules does not affect us.
Dataset | #Services | #Concepts |
---|---|---|
WSC'09 01 | 572 | 1578 |
WSC'09 02 | 4129 | 12388 |
WSC'09 03 | 8138 | 18573 |
WSC'09 04 | 8301 | 18673 |
WSC'09 05 | 15211 | 31044 |
Evaluation
We tested different configurations to study their individual performance and the overall impact on composition response times. In particular, we used the following configurations:
- SPARQL D/M: pure SPARQL Discovery / Matchmaking where all interactions with the Service and Knowledge Base managers are directly implemented as SPARQL queries. This is the typical approach of discovery engines and was the original implementation of iServe.
- Index. D/SPARQL+Cache M: I/O service discovery is based on an index, while concept matchmaking still relies on SPARQL. This configuration additionally uses an intermediate cache at the level of the concept matcher to avoid issuing repeated SPARQL queries.
- Full Indexed D/M: both service discovery and concept matchmaking rely on local indexes pre-populated at load time (and updated with writes). In this configuration, service discovery and concept matchmaking do not need to issue any SPARQL query to the backend.
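The effect of the intermediate cache in the second configuration can be illustrated with a minimal sketch. This is not iServe's code: `CountingBackend`, `CachedMatcher`, and the toy `SUPERCLASS` table are hypothetical names, and the backend call merely stands in for one SPARQL query. The sketch shows that once a matchmaking result is cached, repeating the same check no longer reaches the backend.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy ontology: concept -> direct superclass (stand-in for the knowledge base).
SUPERCLASS: Dict[str, Optional[str]] = {"Sedan": "Car", "Car": "Vehicle", "Vehicle": None}


class CountingBackend:
    """Stand-in for a SPARQL endpoint that counts the queries it receives."""

    def __init__(self) -> None:
        self.queries = 0

    def is_subsumed_by(self, concept: str, target: str) -> bool:
        self.queries += 1  # each call models one SPARQL round trip
        current: Optional[str] = concept
        while current is not None:
            if current == target:
                return True
            current = SUPERCLASS.get(current)
        return False


class CachedMatcher:
    """Concept matcher that remembers every answer it has already obtained."""

    def __init__(self, backend: CountingBackend) -> None:
        self.backend = backend
        self.cache: Dict[Tuple[str, str], bool] = {}

    def match(self, concept: str, target: str) -> bool:
        key = (concept, target)
        if key not in self.cache:  # only a cache miss reaches the backend
            self.cache[key] = self.backend.is_subsumed_by(concept, target)
        return self.cache[key]


backend = CountingBackend()
matcher = CachedMatcher(backend)
results = [matcher.match("Sedan", "Vehicle") for _ in range(1000)]
print(backend.queries)  # number of backend queries issued for 1000 identical checks
```

Under the same assumptions, the fully indexed configuration would correspond to filling such a table for all relevant concept pairs at load time, so that no query is issued while the composition graph is being built.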
For each of these configurations, the table shows the forward graph generation time, including optimizations (G. time), and the total number of SPARQL queries issued (#SPARQL). The “Composition” columns show the graph size after the optimizations (G. size (opt), measured as the number of services) and the total time taken by the search for the optimal service composition (Comp. time). The last column, “Sol. (serv./length)”, shows the size of the optimal solution found, given as number of services / composition length.
Dataset | Graph size | 1) SPARQL D/M: G. time (s) | 1) SPARQL D/M: #SPARQL | 2) Index. D/SPARQL+Cache M: G. time (s) | 2) Index. D/SPARQL+Cache M: #SPARQL | 3) Full Indexed D/M: G. time (s) | 3) Full Indexed D/M: #SPARQL | Composition: G. size (opt) | Composition: Comp. time (s) | Sol. (serv./length) |
---|---|---|---|---|---|---|---|---|---|---|
WSC'09/10 - D01 | 49 | 96.83 | 11454 | 28.23 | 5217 | 0.23 | 0 | 28 | 0.06 | 5 / 3 |
WSC'09/10 - D02 | 93 | 720.10 | 38614 | 92.71 | 17663 | 0.43 | 0 | 43 | 0.03 | 20 / 6 |
WSC'09/10 - D03 | 65 | 609.00 | 18410 | 67.98 | 12130 | 0.60 | 0 | 13 | 0.01 | 10 / 3 |
WSC'09/10 - D04 | 199 | 2830.52 | 120498 | 256.26 | 47212 | 1.16 | 0 | 118 | >300 | 40 / 5 |
WSC'09/10 - D05 | 216 | 5472.00 | 181179 | 437.38 | 78018 | 4.63 | 0 | 96 | 0.13 | 30 / 19 |
All datasets were solved with optimal values for composition length and number of services, showing scalability similar to that observed with the WSC'08 datasets (see graph below).
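To make the gap between the three configurations concrete, the graph generation times reported in the table above can be turned into speedup factors relative to the pure SPARQL configuration. The snippet below is only a convenience sketch: it reuses the G. time values from the table, and the names `g_time` and `speedups` are illustrative.

```python
# G. time (s) per dataset, copied from the results table:
# (1) SPARQL D/M, (2) Index. D/SPARQL+Cache M, (3) Full Indexed D/M
g_time = {
    "D01": (96.83, 28.23, 0.23),
    "D02": (720.10, 92.71, 0.43),
    "D03": (609.00, 67.98, 0.60),
    "D04": (2830.52, 256.26, 1.16),
    "D05": (5472.00, 437.38, 4.63),
}


def speedups(times):
    """Return how many times faster configurations 2 and 3 are than configuration 1."""
    sparql, cached, indexed = times
    return sparql / cached, sparql / indexed


for name, times in g_time.items():
    vs_cached, vs_indexed = speedups(times)
    print(f"{name}: {vs_cached:.1f}x with index + cache, {vs_indexed:.0f}x fully indexed")
```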