A Distributed Weighted Centroid-based Indexing System

Miguel Rio, Joaquim Macedo, Vasco Freitas

Universidade do Minho
Departamento de Informática
P-4700-320 Braga, Portugal

Tel.: +351 253 604475
Fax: +351 253 604471
E-mail: {rio,macedo,vf} (at) di.uminho.pt


Abstract

This paper describes the WHERE system, an approach to a distributed indexing service for document search on the Internet based upon an architecture of centroids.

Numerical weights, computed with Information Retrieval techniques, are added to the whois++ centroids. These weights enable ranked results to be delivered to clients, which use them both to present an ordered response to the user's query and to interact more efficiently with the directory mesh.

The underlying retrieval engine is based on the vector space model. The mesh traversal algorithm employed reduces the search space in the distributed index mesh compared with that of whois++.

Keywords: Internet document searching, Common Indexing Protocol, centroid, Information Retrieval


8th Joint European Networking Conference - JENC8, Edinburgh, Scotland, May 12-15, 1997