Desambiguação de Entidades Mencionadas em Textos na Língua Portuguesa ou Espanhola (Named Entity Disambiguation over Texts Written in the Portuguese or Spanish Languages)

João Tiago Luís Santos (joaosantos.010@gmail.com), Ivo Miguel Anastácio (ivo.anastacio@ist.utl.pt), Bruno Emanuel Martins (bruno.g.martins@ist.utl.pt)


INESC-ID
This paper appears in: Revista IEEE América Latina

Publication Date: March 2015
Volume: 13, Issue: 3
ISSN: 1548-0992


Abstract:
This article addresses the problem of disambiguating named entities in text documents, linking each mention to an entry in a knowledge base such as Wikipedia. The proposed approach uses supervised learning to rank the candidate knowledge base entries for each entity mentioned in a text, and then to classify the top-ranked entry as either the correct disambiguation or not. We present results on Portuguese and Spanish texts for a wide range of models and configuration options. Our experiments attest to the effectiveness of supervised learning methods for this task, showing that out-of-the-box algorithms and relatively simple features can achieve high accuracy.
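
The two-stage procedure described in the abstract (rank the candidates, then accept or reject the top-ranked one) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the two features, the fixed weights standing in for a learned ranking model, and the fixed threshold standing in for a learned acceptance classifier are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    entry: str              # title of a knowledge base entry
    name_similarity: float  # hypothetical feature in [0, 1]
    context_overlap: float  # hypothetical feature in [0, 1]


# Hypothetical parameters. In a supervised setting these would be
# learned from annotated examples rather than fixed by hand.
RANKER_WEIGHTS = (0.6, 0.4)
ACCEPT_THRESHOLD = 0.5


def score(candidate: Candidate) -> float:
    """Linear score standing in for a learned ranking model."""
    w_name, w_context = RANKER_WEIGHTS
    return (w_name * candidate.name_similarity
            + w_context * candidate.context_overlap)


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Stage 1: order the candidate entries from best to worst score."""
    return sorted(candidates, key=score, reverse=True)


def disambiguate(candidates: List[Candidate]) -> Optional[str]:
    """Stage 2: accept the top-ranked entry, or return None when it is
    judged not to be a correct disambiguation."""
    if not candidates:
        return None
    top = rank(candidates)[0]
    return top.entry if score(top) >= ACCEPT_THRESHOLD else None


if __name__ == "__main__":
    mention_candidates = [
        Candidate("entry_b", name_similarity=0.7, context_overlap=0.1),
        Candidate("entry_a", name_similarity=1.0, context_overlap=0.8),
    ]
    print(disambiguate(mention_candidates))
```

Separating the ranking step from the accept/reject step lets a mention with no suitable knowledge base entry be left unlinked instead of being forced onto the best available candidate.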

Index Terms:
Information Extraction, Named Entity Disambiguation, Supervised Machine Learning   



