Bootstrapping Ontologies for Web Services
Aviv Segev, Member, IEEE, and Quan Z. Sheng, Member, IEEE
Abstract—Ontologies have become the de facto modeling tool of choice, employed in many applications and prominently in the semantic web. Nevertheless, ontology construction remains a daunting task. Ontological bootstrapping, which aims at automatically generating concepts and their relations in a given domain, is a promising technique for ontology construction. Bootstrapping an ontology based on a set of predefined textual sources, such as web services, must address the problem of multiple, largely unrelated concepts. In this paper, we propose an ontology bootstrapping process for web services. We exploit the advantage that web services usually consist of both WSDL and free text descriptors. The WSDL descriptor is evaluated using two methods, namely Term Frequency/Inverse Document Frequency (TF/IDF) and web context generation. Our proposed ontology bootstrapping process integrates the results of both methods and applies a third method to validate the concepts using the service free text descriptor, thereby offering a more accurate definition of ontologies. We extensively validated our bootstrapping method using a large repository of real-world web services and verified the results against existing ontologies. The experimental results indicate high precision. Furthermore, a recall versus precision comparison of the results obtained when each method is applied separately demonstrates the advantage of our integrated bootstrapping approach.
Index Terms—Web services discovery, metadata of services interfaces, service-oriented relationship modeling.
A web service can be separated into two types of descriptions: 1) the Web Service Description Language (WSDL) describing “how” the service should be used and 2) a textual description of the web service in free text describing “what” the service does. This advantage allows bootstrapping the ontology based on WSDL and verifying the process based on the web service free text descriptor.
The ontology bootstrapping process is based on analyzing a web service using three different methods, where each method represents a different perspective of viewing the web service. As a result, the process provides a more accurate definition of the ontology and yields better results. In particular, the Term Frequency/Inverse Document Frequency (TF/IDF) method analyzes the web service from an internal point of view, i.e., which concept in the text best describes the WSDL document content. The Web Context Extraction method describes the WSDL document from an external point of view, i.e., which concept most commonly represents the answers to web search queries based on the WSDL content. Finally, the Free Text Description Verification method is used to resolve inconsistencies with the current ontology. An ontology evolution is performed when all three analysis methods agree on the identification of a new concept or a relation change between the ontology concepts. The relation between two concepts is defined using the descriptors related to both concepts.
Our approach can assist in ontology construction and reduce the maintenance effort substantially. The approach facilitates automatic building of an ontology that can assist in expanding, classifying, and retrieving relevant services, without the prior training required by previously developed approaches.
We conducted a number of experiments by analyzing 392 real-world web services from various domains. In particular, the first set of experiments compared the precision of the concepts generated by different methods.
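As an illustration of the internal (TF/IDF) view and of the rule that a concept is accepted only when all three methods agree, the following minimal Python sketch may help. It rests on several assumptions that are not taken from the paper: the function names are hypothetical, the WSDL documents are assumed to be already tokenized, the textbook weighting tf × log(N/df) stands in for whatever exact formula the authors use, the web context tokens are supplied by hand instead of being obtained from search queries, and free text verification is reduced to a simple word lookup.

```python
import math
from collections import Counter


def tf_idf_scores(documents):
    """Return one {token: weight} dict per tokenized document.

    Uses the textbook weighting tf * log(N / df); this is an assumption,
    not necessarily the exact formula used in the paper.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each token occurs.
    doc_freq = Counter(token for doc in documents for token in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return scores


def top_tokens(weights, k):
    """Return the k highest-weighted tokens; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [token for token, _ in ranked[:k]]


def accepted_concepts(tfidf_tokens, web_context_tokens, free_text):
    """Keep candidates proposed by both WSDL analyses and present in the free text.

    Simplified stand-in for the agreement rule: real verification against
    the free text descriptor is likely richer than a word lookup.
    """
    text_words = set(free_text.lower().split())
    candidates = set(tfidf_tokens) & set(web_context_tokens)
    return sorted(c for c in candidates if c.lower() in text_words)


# Toy token lists standing in for three parsed WSDL documents (made up).
wsdl_docs = [
    ["get", "domains", "by", "zip", "domains"],
    ["get", "weather", "by", "zip", "weather"],
    ["get", "stock", "quote"],
]
scores = tf_idf_scores(wsdl_docs)
internal_view = top_tokens(scores[0], 2)
external_view = ["domains", "registration"]  # hand-written, not a real web search
description = "Lists registered domains near a zip code"
print(accepted_concepts(internal_view, external_view, description))
```

In this toy run, a token such as "get" that occurs in every document receives a weight of zero, which is why frequent but uninformative WSDL vocabulary drops out of the candidate list. Intersecting the candidate sets is a deliberately conservative choice that mirrors the requirement that all three analyses agree before the ontology evolves.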
Each method supplied a list of concepts that were analyzed to evaluate how many of them are meaningful and could be related to the services. The second set of experiments compared the recall
Published by the IEEE Computer Society