Esta é unha revisión antiga do documento!
ComposIT: experiments and comparison with AI planners
In previous works1) 2), ComposIT was analyzed and compared with the winners of the Web Service Challenge 2008 (WSC'08). These experiments demonstrated that ComposIT can obtain optimal solutions, outperforming the results obtained by the winners with respect the number of services and the composition length. However, the aim of the experiments presented here is to compare the performance of ComposIT with two open source state-of-the-art composition engines, OWLS-Xplan and PORSCE-II, which use classical AI planning algorithms.
The experimentation is focused on the generation of semantically valid composite services taking into account the semantic information regarding the inputs and outputs of services in absence of preconditions and effects. To carry out this comparison, we developed a benchmarking tool that automates the evaluation process of the different composition algorithms using an adapted version of the WSC'08 datasets. This page extends the details of the evaluation done.
Datasets
To validate the three composition engines, we used three collections of datasets: Exact-Matching, Semantic-Matching and WSC'08 datasets. The first two collections are based on the original WSC'08 datasets but containing only the services from the solution itself for validation purposes. Characteristics of each dataset are shown below.
Semantic Matching datasets
Download the XML Semantic-Matching datasets semantic-matching.tar.gz
Dataset | #Serv | #Serv. Sol. | #Length | Avg. In./Out. | #Init. con. | #Goal con. |
---|---|---|---|---|---|---|
Semantic-Matching 01 | 2 | 2 | 2 | 1.0 / 1.0 | 1 | 1 |
Semantic-Matching 02 | 3 | 3 | 2 | 3.0 / 1.0 | 3 | 1 |
Semantic-Matching 03 | 10 | 10 | 3 | 4.03 / 3.08 | 3 | 2 |
Semantic-Matching 04 | 5 | 5 | 3 | 4.25 / 4.68 | 4 | 1 |
Semantic-Matching 05 | 40 | 40 | 23 | 5.07 / 2.44 | 3 | 1 |
Semantic-Matching 06 | 10 | 10 | 5 | 5.47 / 3.47 | 6 | 4 |
Semantic-Matching 07 | 20 | 20 | 8 | 3.03 / 5.11 | 2 | 3 |
Semantic-Matching 08 | 35 | 35 | 14 | 3.10 / 3.37 | 9 | 4 |
Semantic-Matching 09 | 20 | 20 | 12 | 3.07 / 4.40 | 8 | 1 |
Semantic-Matching 10 | 30 | 30 | 20 | 3.25 / 5.42 | 5 | 4 |
Exact Matching datasets
Download the XML Exact-Matching datasets exact-matching.tar.gz
Dataset | #Serv | #Serv. Sol. | #Length | Avg. In./Out. | #Init. con. | #Goal con. |
---|---|---|---|---|---|---|
Exact-Matching 01 | 2 | 2 | 2 | 1.0 / 1.0 | 1 | 2 |
Exact-Matching 02 | 3 | 3 | 2 | 3.0 / 20.5 | 3 | 40 |
Exact-Matching 03 | 10 | 10 | 3 | 4.03 / 16.53 | 3 | 100 |
Exact-Matching 04 | 5 | 5 | 3 | 4.25 / 21.05 | 4 | 77 |
Exact-Matching 05 | 40 | 40 | 23 | 5.07 / 14.20 | 3 | 407 |
Exact-Matching 06 | 10 | 10 | 5 | 5.47 / 24.09 | 6 | 157 |
Exact-Matching 07 | 20 | 20 | 8 | 3.03 / 26.39 | 2 | 261 |
Exact-Matching 08 | 35 | 35 | 14 | 3.10 / 23.21 | 9 | 507 |
Exact-Matching 09 | 20 | 20 | 12 | 3.07 / 24.71 | 8 | 219 |
Exact-Matching 10 | 30 | 30 | 20 | 3.25 / 33.81 | 5 | 473 |
Web Service Challenge 2008 datasets
Download the XML WSC'08 datasets wsc08.tar.gz
Dataset | #Serv | #Serv. Sol. | #Length | Avg. In./Out. | #Init. con. | #Goal con. |
---|---|---|---|---|---|---|
WSC'08 01 | 158 | 10 | 3 | 3.53 / 5.25 | 3 | 1 |
WSC'08 02 | 558 | 5 | 3 | 3.79 / 3.92 | 4 | 1 |
WSC'08 03 | 604 | 40 | 23 | 4.07 / 6.46 | 3 | 2 |
WSC'08 04 | 1041 | 10 | 5 | 4.23 / 5.47 | 6 | 1 |
WSC'08 05 | 1090 | 20 | 8 | 3.36 / 4.26 | 2 | 1 |
WSC'08 06 | 2198 | 35 | 14 | 6.00 / 4.31 | 9 | 4 |
WSC'08 07 | 4113 | 20 | 12 | 7.37 / 7.21 | 8 | 3 |
WSC'08 08 | 8119 | 30 | 20 | 5.44 / 6.54 | 5 | 4 |
These tables show: the number of services of each dataset (column #Serv); the number of services of the optimal solution (column #Serv. Sol.); the length of the shortest solution (column #Length); the average number of inputs and outputs (column Avg. In./Out.); the number of inputs provided by the user (column #Init. con.) and the number of wanted output concepts (column #Goal con.). The goal is to find, for each dataset, the optimal composite service (minimum number of services, minimum length) that satisfies the goal concepts, using only the initial inputs provided.
Exact-Matching datasets were calculated by extending the outputs of each web service, including all superclasses of each output as an output of the service itself (semantic expansion). Thus, the average number of outputs is bigger than in the other datasets. The semantic expansion transforms a semantic matching problem into a exact matching problem, when exact and plug-in match is used to perform the semantic matchmaking. This allows us to test composition algorithms (that do not use semantic reasoners) with the WSC'08 datasets. For example, suppose that a service S1 provides the instance “my_car” which is an instance of the concept “sedan”, whereas another service S2 requires an input “car”. Given that “my_car” is a sedan “car” (plug-in match), S2 can use it as an input. This behavior can be simulated by adding superclasses of my_car to the service. Thus, service S1 will return outputs “my_car”, “car”, “sedan” and “vehicle” after the semantic expansion.
The figure below represents an example of the semantic expansion. In the first case, a semantic reasoner is required to match output “my_car” to “car”. In the second case, the service returns the output “my_car” as well as all superclasses (sedan, car, vehicle). In this case, the output “car” from S1 matches exactly the input “car” from S2.
Evaluation
The performance of each composition algorithm has been measured using a test platform developed for this purpose. We carried out three different experiments, one for each collection of datasets shown in the previous tables, to determine the quality of the solutions and the composition time. The experiments consisted of five independent executions of each algorithm over each dataset. The number of services and the length of the composition, as well as the minimum, average and standard deviation of the time taken to solve the composition problem are shown in the tables below. All the experiments were performed using a Windows XP 32-bit with VirtualBox 4.1.22 (2 GB of RAM, 1 core), on a PC-station equipped with an Intel Core 2 Quad Q9550 at 2.83GHz running Ubuntu 10.04 64-bit.
1. Exact-Matching evaluation results
Dataset | Algorithm | Solution | Execution (ms) | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
#Serv | #Length | Ex. 1 | Ex. 2 | Ex. 3 | Ex. 4 | Ex.5 | Min | Avg | SD | ||
Exact-Matching 01 | ComposIT | 2 | 2 | 238.20 | 158.04 | 153.63 | 153.61 | 145.07 | 145.07 | 169.71 | 38.57 |
Xplan | 2 | 2 | 306.65 | 348.32 | 328.53 | 359.50 | 305.37 | 305.37 | 329.67 | 24.29 | |
PORSCE-II | 2 | 2 | 3478.53 | 3717.59 | 3657.99 | 3524.25 | 3553.97 | 3478.53 | 3586.47 | 98.60 | |
Exact-Matching 02 | ComposIT | 3 | 2 | 274.59 | 209.04 | 211.42 | 196.24 | 206.73 | 196.24 | 219.60 | 31.28 |
Xplan | 3 | 3 | 416.11 | 490,68 | 391.30 | 476.10 | 422.66 | 391.30 | 439.37 | 42.17 | |
PORSCE-II | 9 | 9 | 5206.54 | 5283.59 | 5240.02 | 5482.74 | 5649.92 | 5206.54 | 5372.56 | 188.49 | |
Exact-Matching 03 | ComposIT | 10 | 3 | 422.66 | 379.86 | 369.33 | 330.27 | 365.57 | 330.27 | 373.54 | 33.19 |
Xplan | 10 | 10 | 815.73 | 642.90 | 616.80 | 595.58 | 622.55 | 595.58 | 658.71 | 89.38 | |
PORSCE-II | 22 | 22 | 16081.13 | 15721.06 | 15105.77 | 16452.69 | 14779.84 | 14779.84 | 15628.10 | 686.69 | |
Exact-Matching 04 | ComposIT | 5 | 3 | 321.52 | 316.63 | 316.42 | 354.47 | 381.93 | 316.42 | 338.19 | 29.13 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | 14 | 14 | 10889.68 | 10777.27 | 10595.46 | 10898.34 | 10767.12 | 10595.46 | 10785.57 | 122.58 | |
Exact-Matching 05 | ComposIT | 40 | 23 | 7111.35 | 6676.08 | 6669.89 | 6775.57 | 6710.47 | 6669.89 | 6788.67 | 185.20 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | - | - | - | - | - | - | - | - | - | - | |
Exact-Matching 06 | ComposIT | 10 | 5 | 463.92 | 468.82 | 603.30 | 484.27 | 505.79 | 463.92 | 505.22 | 57.20 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | 27 | 27 | 24106.72 | 24340.28 | 23880.96 | 24732.57 | 24025.89 | 23880.96 | 24217.28 | 332.65 | |
Exact-Matching 07 | ComposIT | 20 | 8 | 952.07 | 1103.62 | 951.09 | 1015.01 | 895.89 | 895.89 | 983.54 | 79.27 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | 52 | 52 | 38036.12 | 37755.37 | 35577.92 | 36377.60 | 36306.59 | 35577.92 | 36810.72 | 1043.49 | |
Exact-Matching 08 | ComposIT | 35 | 14 | 6071.00 | 6299.35 | 5859.83 | 6324.65 | 6174.44 | 5859.83 | 6145.85 | 189.58 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | - | - | - | - | - | - | - | - | - | - | |
Exact-Matching 09 | ComposIT | 20 | 12 | 1079.93 | 1065.36 | 1096.51 | 1090.66 | 1100.17 | 1065.36 | 1086.53 | 14.09 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | - | - | - | - | - | - | - | - | - | - | |
Exact-Matching 10 | ComposIT | 30 | 20 | 5236.27 | 5596.29 | 5352.37 | 5663.87 | 5263.09 | 5236.27 | 5422.38 | 195.88 |
Xplan | - | - | - | - | - | - | - | - | - | - | |
PORSCE-II | - | - | - | - | - | - | - | - | - | - |
2. Semantic-Matching evaluation results
In this experiment, we evaluate the performance of ComposIT and PORSCE-II with the semantic-matching datasets. Note that OWLS-Xplan does not support semantic matchmaking. Therefore, this algorithm has been excluded from this evaluation and also from the WSC'08 evaluation, which requires semantic matchmaking.
PORSCE-II uses different threshold values to configure the maximum semantic concept distance allowed for a valid semantic matching. The values of these thresholds are directly related to the performance of the algorithm so that the higher the thresholds, the slower the algorithm runs, but the better the quality of the solutions. Given the difficulty of selecting an appropriate value for the threshold without compromising too much the performance, we selected the optimal value for each dataset based on a study using the Semantic-matching datasets in which we analyzed the performance of PORSCE-II by using incremental values of the threshold from 1 to 19, which corresponds with the maximum depth for all ontologies.
Plug-in threshold | Time (for Semantic-Matching 01) | % Datasets solved | Semantic-Matching datasets solved | Performance |
---|---|---|---|---|
1 | 6721.69 | 10% | 01 | 100.00% |
2 | 9735.35 | 30% | 01, 02, 03 | 69.04% |
3 | 13257.96 | 40% | 01, 02, 03, 04 | 50.70% |
4 | 15139.06 | 50% | 01, 02, 03, 04, 07 | 44.40% |
5 | 20059.98 | 50% | 01, 02, 03, 04, 07 | 33.51% |
6 | 20616.51 | 60% | 01, 02, 03, 04, 06, 07 | 32.60% |
7 | 24607.99 | 60% | 01, 02, 03, 04, 06, 07 | 27.32% |
8 | 27791.03 | 60% | 01, 02, 03, 04, 06, 07 | 24.19% |
9 | 31261.37 | 60% | 01, 02, 03, 04, 06, 07 | 21.50% |
10 | 33648.79 | 60% | 01, 02, 03, 04, 06, 07 | 19.98% |
11 | 34074.82 | 60% | 01, 02, 03, 04, 06, 07 | 19.73% |
12 | 40787.78 | 60% | 01, 02, 03, 04, 06, 07 | 16.48% |
13 | 40704.43 | 60% | 01, 02, 03, 04, 06, 07 | 16.51% |
14 | 44085.17 | 60% | 01, 02, 03, 04, 06, 07 | 15.25% |
15 | 47256.00 | 60% | 01, 02, 03, 04, 06, 07 | 14.22% |
16 | 50164.25 | 60% | 01, 02, 03, 04, 06, 07 | 13.40% |
17 | 51815.99 | 60% | 01, 02, 03, 04, 06, 07 | 12.97% |
18 | 56561.49 | 60% | 01, 02, 03, 04, 06, 07 | 11.88% |
19 | 58981.02 | 60% | 01, 02, 03, 04, 06, 07 | 11.40% |
Figure below shows the percentage of semantic problems solved by PORSCE-II with different values of the threshold and the performance measured by comparing the time taken by the algorithm to solve the Semantic-Matching 01 with the different thresholds. As can be seen, the computational cost increases (performance decreases) when the threshold is incremented. Thus, using a threshold of 10, the algorithm obtains a performance close to 20\%, which means that the algorithm is about 1/0.20=5 times slower than using a threshold of 1. Based on these results, we selected the optimal thresholds N for each datasets: N=1 for Semantic-Matching 01, N=2 for Semantic-Matching 02 and 03, N=3 for Semantic-Matching 04, N=4 for Semantic-Matching 07 and N=6 for Semantic-Matching 06. Note that ComposIT does not require any special configuration for the semantic datasets as it calculates matches at unlimited depth.
Dataset | Algorithm | Solution | Execution (ms) | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
#Serv | #Length | Ex. 1 | Ex. 2 | Ex. 3 | Ex. 4 | Ex.5 | Min | Avg | SD | ||
Semantic-Matching 01 | ComposIT | 2 | 2 | 92.46 | 214.92 | 108.11 | 297.97 | 387.13 | 92.46 | 220.12 | 125.32 |
PORSCE-II (1) | 2 | 2 | 7487.74 | 6655.72 | 6537.85 | 6190.85 | 8326.61 | 6190.85 | 7039.75 | 862.66 | |
Semantic-Matching 02 | ComposIT | 3 | 2 | 272.64 | 387.46 | 119.45 | 87.49 | 240.80 | 87.49 | 221.57 | 121.35 |
PORSCE-II (2) | 3 | 3 | 14355.59 | 14784.66 | 14467.15 | 13685.42 | 14779.62 | 13685.42 | 14414.49 | 449.48 | |
Semantic-Matching 03 | ComposIT | 10 | 3 | 104.14 | 261.09 | 229.25 | 307.21 | 317.30 | 104.14 | 243.80 | 85.79 |
PORSCE-II (2) | 11 | 11 | 47524.98 | 49847.54 | 49750.68 | 45989.71 | 45359.04 | 45359.04 | 47694.39 | 2076.83 | |
Semantic-Matching 04 | ComposIT | 5 | 3 | 287.12 | 213.46 | 97.21 | 112.70 | 101.86 | 97.21 | 162.47 | 84.48 |
PORSCE-II (3) | 7 | 7 | 29187.01 | 29593.40 | 29466.08 | 28759.73 | 31130.29 | 28759.73 | 29627.30 | 898.98 | |
Semantic-Matching 05 | ComposIT | 40 | 23 | 684.00 | 694.36 | 682.87 | 272.57 | 282.76 | 272.57 | 523.31 | 224.32 |
PORSCE-II (16) | - | - | - | - | - | - | - | - | - | - | |
Semantic-Matching 06 | ComposIT | 10 | 5 | 223.73 | 409.05 | 101.80 | 96.48 | 125.48 | 96.48 | 191.31 | 132.10 |
PORSCE-II (6) | 14 | 14 | 134258.53 | 114729.94 | 112889.12 | 114402.36 | 118113.16 | 112889.12 | 118878.62 | 8806.96 | |
Semantic-Matching 07 | ComposIT | 20 | 8 | 332.64 | 402.18 | 178.90 | 132.44 | 342.98 | 132.44 | 277.83 | 115.80 |
PORSCE-II (4) | 35 | 35 | 154651.56 | 155128.96 | 153448.76 | 156402.12 | 153196.42 | 153196.42 | 154565.56 | 1305.71 | |
Semantic-Matching 08 | ComposIT | 35 | 14 | 674.44 | 742.30 | 857.37 | 338.21 | 790.32 | 338.21 | 680.53 | 202.71 |
PORSCE-II (17) | - | - | - | - | - | - | - | - | - | - | |
Semantic-Matching 09 | ComposIT | 20 | 12 | 445.83 | 295.22 | 151.83 | 180.21 | 684.96 | 151.83 | 351.61 | 219.36 |
PORSCE-II (15) | - | - | - | - | - | - | - | - | - | - | |
Semantic-Matching 10 | ComposIT | 30 | 20 | 219.66 | 214.97 | 445.83 | 535.24 | 210.08 | 210.08 | 325.16 | 154.28 |
PORSCE-II (17) | - | - | - | - | - | - | - | - | - | - |
3. WSC'08 evaluation results
Dataset | Algorithm | Solution | Execution (ms) | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
#Serv | #Length | Ex. 1 | Ex. 2 | Ex. 3 | Ex. 4 | Ex.5 | Min | Avg | SD | ||
WSC'08 01 | ComposIT | 10 | 3 | 308.45 | 223.35 | 224.23 | 217.05 | 231.03 | 217.05 | 240.82 | 38.13 |
PORSCE-II (2) | 11 | 11 | 700552.14 | 703166.16 | 706210.96 | 709269.96 | 688332.37 | 688332.37 | 701506.32 | 8056.46 | |
WSC'08 02 | ComposIT | 5 | 3 | 286.18 | 247.79 | 236.50 | 231.01 | 308.12 | 231.01 | 261.92 | 33.63 |
PORSCE-II (3) | 8 | 8 | 3633027.68 | 3635780.31 | 3494138.55 | 3518414.74 | 3501751.84 | 3494138.55 | 3556622.62 | 71551.69 | |
WSC'08 03 | ComposIT | 40 | 23 | 1448.94 | 1417.31 | 1429.48 | 1413.16 | 1457.06 | 1413.16 | 1433.19 | 19.27 |
PORSCE-II (16) | - | - | - | - | - | - | - | - | - | - | |
WSC'08 04 | ComposIT | 10 | 5 | 427.54 | 434.77 | 434.53 | 451.68 | 440.62 | 427.54 | 437.83 | 9.02 |
PORSCE-II (6) | 16 | 16 | 13594552.10 | 13746460.11 | 13260928.67 | 13153932.78 | 13152705.66 | 13152705.66 | 13381715.86 | 272606.85 | |
WSC'08 05 | ComposIT | 20 | 8 | 1052.68 | 1018.33 | 995.83 | 1019.80 | 1011.26 | 995.83 | 1019.58 | 20.80 |
PORSCE-II (4) | 45 | 45 | 9569454.60 | 9642162.17 | 9461697.18 | 9900376.49 | 9735598.82 | 9461697.18 | 9661857.85 | 166822.85 | |
WSC'08 06 | ComposIT | 42 | 7 | 3611.03 | 3587.10 | 3525.84 | 3535.71 | 3436.85 | 3436.85 | 3539.31 | 67.31 |
PORSCE-II (17) | - | - | - | - | - | - | - | - | - | - | |
WSC'08 07 | ComposIT | 20 | 12 | 4164.99 | 4043.20 | 4083.13 | 4062.16 | 4095.46 | 4043.20 | 4089.79 | 46.54 |
PORSCE-II (15) | - | - | - | - | - | - | - | - | - | - | |
WSC'08 08 | ComposIT | 30 | 20 | 5849.57 | 5751.64 | 5962.46 | 5740.37 | 5710.24 | 5710.24 | 5802.86 | 103.39 |
PORSCE-II (17) | - | - | - | - | - | - | - | - | - | - |
You can download the validation logs for each algorithm here:
Usage
We provide three different java binaries for each composition algorithm. In order to launch a test, you must have installed the Java JDK 6+. Latest java JDK is available here. Once installed, you can run the test platform in GUI mode or in console mode. To launch the GUI interface, simply type:
java -jar algorithm.jar
Where algorithm.jar is one of the available algorithms:
You can launch also a background test from the command line, with the following syntax:
java -jar algorithm.jar algorithm_name dataset_path http_server_port
Examples to launch tests on each repository with the dataset Exact-Matching 01, using the port 80 to create the http server to provide the services and redirecting the output to a file 01.txt:
ComposIT:
java -Xmx1024M -Xms512M -jar CompositAlgorithm.jar "ComposIT" "...\Datasets\Exact-matching\01" 80 > 01.txt
PORSCE-II:
java -Xmx1024M -Xms512M -jar algorithm.jar "PORSCE-II" "...\Datasets\Exact-matching\01" 80 > 01.txt
OWLS-Xplan:
java -Xmx1024M -Xms512M -jar OWLSXplanAlgorithm.jar "OWL-S Xplan 2.0" "...\Datasets\Exact-matching\01" 80 > 01.txt
Disclaimer
All the contents of this web-site are owned by the Centro de Investigación en Tecnoloxías da Información (CITIUS), University of Santiago de Compostela (USC), except the original versions of the WSC'08 datasets, OWLS-Xplan 2.0 and PORSCE-II, which are owned by their respective authors. The software available here is for private, non-commercial use and it is offered solely as a proof of concept for validation purposes.