Indexação Distribuída de Colecções Web de Larga Escala (Distributed Indexing of Large-Scale Web Collections)

Miguel Costa (mcosta@xldb.fc.ul.pt), Mário J. Silva (mjs@di.fc.ul.pt)


Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
This paper appears in: Revista IEEE América Latina

Publication Date: March 2005
Volume: 3, Issue: 1
ISSN: 1548-0992


Abstract:
Sidra is a new indexing and ranking system for large-scale Web collections. Sidra creates multiple distributed indexes, organized and partitioned by different ranking criteria, to support contextualized queries over hypertexts and their metadata. This paper presents the architecture of Sidra and the algorithms used to create its indexes. Performance measurements on Portuguese Web data show that Sidra's indexing times and scalability are comparable to those of global Web search engines.

Index Terms:
Indexing, search engines, Web.   



