Database Relationship

Topics: Relational model, Database, Relational database Pages: 25 (7781 words) Published: April 22, 2013
Linking Named Entities to Any Database
Avirup Sil∗, Temple University, Philadelphia, PA
Ernest Cronin∗, St. Joseph’s University, Philadelphia, PA
Penghai Nie, St. Joseph’s University, Philadelphia, PA
Yinfei Yang, St. Joseph’s University, Philadelphia, PA
Ana-Maria Popescu, Yahoo! Labs, Sunnyvale, CA
Alexander Yates, Temple University, Philadelphia, PA

Abstract

Existing techniques for disambiguating named entities in text mostly focus on Wikipedia as a target catalog of entities. Yet for many types of entities, such as restaurants and cult movies, relational databases exist that contain far more extensive information than Wikipedia. This paper introduces a new task, called Open-Database Named-Entity Disambiguation (Open-DB NED), in which a system must be able to resolve named entities to symbols in an arbitrary database, without requiring labeled data for each new database. We introduce two techniques for Open-DB NED, one based on distant supervision and the other based on domain adaptation. In experiments on two domains, one with poor coverage by Wikipedia and the other with near-perfect coverage, our Open-DB NED strategies outperform a state-of-the-art Wikipedia NED system by over 25% in accuracy.
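To make the task in the abstract concrete, the following is a minimal sketch of the first step any such system needs: finding the database rows a mention could refer to. It is an illustration only and not the authors' method. The `restaurant` table, its rows, and the function name `candidate_referents` are all hypothetical, and exact case-insensitive string matching stands in for whatever candidate generation a real system would use.

```python
import sqlite3


def candidate_referents(conn, mention):
    """Return (table, rowid) pairs with a text attribute equal to the mention.

    Hypothetical candidate lookup for illustration; it performs no
    disambiguation and is not the model described in the paper.
    """
    needle = mention.strip().lower()
    candidates = []
    # Walk every user table so the lookup does not depend on a fixed schema.
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        query = f'SELECT rowid, * FROM "{table}" ORDER BY rowid'
        for row in conn.execute(query):
            rowid, values = row[0], row[1:]
            # Compare the mention against every string-valued attribute.
            if any(isinstance(v, str) and v.lower() == needle for v in values):
                candidates.append((table, rowid))
    return candidates


# Hypothetical toy database: two restaurants share a name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restaurant (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO restaurant VALUES (?, ?)",
    [("Blue Lantern", "Philadelphia"),
     ("Blue Lantern", "Sunnyvale"),
     ("Corner Grill", "Philadelphia")],
)

result = candidate_referents(conn, "blue lantern")
print(result)
```

The mention matches two rows, which is the ambiguity that a disambiguation model would have to resolve using the surrounding text and the other attributes of each row (here, the city).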

Named-entity disambiguation (NED) is the task of linking names mentioned in text with an established catalog of entities (Bunescu and Pasca, 2006; Ratinov et al., 2011). It is a vital first step for semantic understanding of text, such as in grounded semantic parsing (Kwiatkowski et al., 2011), as well as for information retrieval tasks like person name search (Chen and Martin, 2007; Mann and Yarowsky, 2003).

NED requires a catalog of symbols, called referents, to which named-entities will be resolved. Most NED systems today use Wikipedia as the catalog of referents, but exclusive focus on Wikipedia as a target for NED systems has significant drawbacks: despite its breadth, Wikipedia still does not contain all or even most real-world entities mentioned in text. As one example, it has poor coverage of entities that are mostly important in a small geographical region, such as hotels and restaurants, which are widely discussed on the Web. 57% of the named-entities in the Text Analysis Conference’s (TAC) 2009 entity linking task refer to an entity that does not appear in Wikipedia (McNamee et al., 2009). Wikipedia is clearly a highly valuable resource, but it should not be thought of as the only one.

Instead of relying solely on Wikipedia, we propose a novel approach to NED, which we refer to as Open-DB NED: the task is to resolve an entity to Wikipedia or to any relational database that meets mild conditions about the format of the data, described below. Leveraging structured, relational data should allow systems to achieve strong accuracy, as with domain-specific or database-specific NED techniques like Hoffart et al.’s NED system for YAGO (Hoffart et al., 2011). And because of the availability of huge numbers of databases on the Web, many for specialized domains, a successful system for this task will cover entities that a Wikipedia NED or database-specific system cannot.

We investigate two complementary learning strategies for Open-DB NED, both of which significantly relax the assumptions of traditional NED systems. The first strategy, a distant supervision approach, uses the relational information in a given database and a large corpus of unlabeled text to learn a database-specific model. The second strategy, a domain adaptation approach, assumes a single source database that has accompanying labeled data. Classifiers in this setting must learn a model that transfers from the source database to any new database, without requiring new training data for the new database. Experiments show that both strategies outperform a state-of-the-art Wikipedia NED system by wide margins without requiring any labeled...