Domain ontology learning from the web

Doctoral thesis by David Sanchez Ruenes

Ontology learning is defined as the set of methods used for building from scratch, enriching or adapting an existing ontology in a semi-automatic fashion using heterogeneous information sources. This data-driven procedure uses text, electronic dictionaries, linguistic ontologies and structured and semi-structured information to acquire knowledge. Recently, with the enormous growth of the information society, the web has become a valuable source of information for almost every possible domain of knowledge. This has motivated researchers to start considering the web as a valid repository for information retrieval and knowledge acquisition. However, the web suffers from problems that are not typically observed in classical information repositories: human-oriented presentation, noise, untrusted sources, high dynamicity and overwhelming size. Nevertheless, it also presents characteristics that can be interesting for knowledge acquisition: due to its huge size and heterogeneity, it has been assumed that the web approximates the real distribution of information in humankind. The present work presents a novel approach to ontology learning, introducing new methods for knowledge acquisition from the web. The adaptation of several well-known learning techniques to the web corpus and the exploitation of particular characteristics of the web environment, which together compose an automatic, unsupervised and domain-independent approach, distinguish the present proposal from previous work. With respect to the ontology building process, the following methods have been developed: i) extraction and selection of domain-related terms, organising them in a taxonomical way; ii) discovery and labelling of non-taxonomical relationships between concepts; iii) additional methods for improving the final structure, including the detection of named entities, class features, multiple inheritance and also a certain degree of semantic disambiguation. The full learning methodology has been implemented in a

 

Academic details of the doctoral thesis «Domain ontology learning from the web»

  • Thesis title:  Domain ontology learning from the web
  • Author:  David Sanchez Ruenes
  • University:  Politécnica de Catalunya
  • Thesis defence date:  14/12/2007

 

Supervision and examining committee

  • Thesis supervisor
    • Antonio Moreno Ribas
  • Examining committee
    • Committee chair: Ricardo Baeza-Yates
    • Asuncion Gomez Perez (member)
    • Patty Kostkova (member)
    • Horacio Rodríguez Hontoria (member)

 
