Agrupamento Automático de Textos em Português Brasileiro Utilizando Computação Evolucionária (Brazilian Portuguese Text Clustering Based on Evolutionary Computing)

Alexandre Ribeiro Afonso (rafonso.alex@gmail.com)



This paper appears in: Revista IEEE América Latina

Publication Date: July 2016
Volume: 14, Issue: 7
ISSN: 1548-0992


Abstract:
This paper describes a new system for automated text clustering, designed specifically for scientific articles written in Brazilian Portuguese. The system has two modules: the first extracts the main terms and generates an index for each text, and the second uses these indexes to group the texts by their original topics. The first module innovates by selecting compound terms instead of single terms to build the indexes; the second applies a new evolutionary clustering algorithm, with its own novel mode of operation, to those indexes. The system was tested on four different corpora. The metrics show a reasonable clustering result when compound terms are combined with the proposed evolutionary method, although the number of clusters generated is far from the original number of topics. The running time of the new clustering algorithm is high on a conventional personal computer.

Index Terms:
Automatic Indexing, Text Clustering, Evolutionary Algorithms.   
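
The two-module pipeline summarized in the abstract can be illustrated with a small Python sketch. It is not the paper's method, since the abstract gives no algorithmic details. Every name here (`compound_terms`, `jaccard`, `evolve_clusters`) is hypothetical. The sketch assumes that a "compound term" can be approximated by a frequent adjacent word pair, that indexes are compared with Jaccard similarity, and that a simple elitist loop with single-label mutation stands in for the evolutionary algorithm. The threshold `tau` is an arbitrary assumption.

```python
import random
import re
from collections import Counter


def compound_terms(text, top_k=5):
    """Index a text by its most frequent adjacent word pairs.

    Assumption: bigrams are a crude stand-in for compound terms.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    pairs = Counter(zip(words, words[1:]))
    return {" ".join(pair) for pair, _ in pairs.most_common(top_k)}


def jaccard(a, b):
    """Jaccard similarity between two term sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evolve_clusters(indexes, generations=200, pop_size=20, tau=0.2, seed=0):
    """Evolve one cluster label per text; the cluster count is not fixed.

    Fitness rewards grouping texts whose index similarity exceeds tau
    and penalizes grouping texts whose similarity falls below it.
    """
    n = len(indexes)
    if n == 0:
        return []
    rng = random.Random(seed)
    sim = [[jaccard(indexes[i], indexes[j]) for j in range(n)] for i in range(n)]

    def fitness(labels):
        return sum(
            sim[i][j] - tau
            for i in range(n)
            for j in range(i + 1, n)
            if labels[i] == labels[j]
        )

    # Each individual is a list of n labels drawn from 0..n-1.
    population = [[rng.randrange(n) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: keep the better half, then refill with mutated copies.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(n)] = rng.randrange(n)  # move one text
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)
```

A caller would build one index per text with `compound_terms` and pass the list of indexes to `evolve_clusters`. Texts that receive the same label belong to the same cluster. Because the number of distinct labels is free to vary, a sketch like this can also return a cluster count that differs from the number of original topics. Every generation re-evaluates all text pairs for every individual, which suggests why an approach of this kind can be slow on a personal computer.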



