Introduction
Searching is a crucial activity on the web ([14] Madden, 2003; [9] Fallows, 2004), and search engines (SEs) are powerful tools for locating information in this environment. However, it is impossible for any single SE to index the entire web, and even large SEs cover only a fraction of it ([3] Bharat and Broder, 1998; [12] Lawrence and Giles, 1999). Recently, [21] Spink et al. (2006) conducted a large-scale study across the four most popular web SEs (MSN, Google, Yahoo and Ask Jeeves) to measure the overlap of search results on the first results page. Their findings showed that 84.9 per cent of all results were unique to one of the four web SEs, 11.4 per cent were shared by two, 2.6 per cent were shared by three, and only 1.1 per cent were shared by all four. This small degree of overlap shows that a single SE cannot be relied on to find information on the web effectively.
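To make the overlap figures above concrete, the following minimal sketch shows one way such a distribution could be computed from first-page results. It is an illustration only, not the procedure used by Spink et al.; the function name `overlap_distribution`, the engine labels and the toy URLs are all hypothetical, and matching results by exact URL string is a simplifying assumption.

```python
from collections import Counter


def overlap_distribution(results_by_engine):
    """Return {k: percentage of distinct URLs returned by exactly k engines}.

    `results_by_engine` maps an engine label to its list of first-page URLs.
    Results are matched by exact URL string (a simplifying assumption).
    """
    # For each distinct URL, count how many engines returned it.
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):  # ignore duplicates within a single engine
            engines_per_url[url] += 1

    total = len(engines_per_url)
    if total == 0:
        return {}

    # Tally how many URLs were returned by exactly 1, 2, 3, ... engines.
    urls_per_count = Counter(engines_per_url.values())
    return {k: 100.0 * n / total for k, n in sorted(urls_per_count.items())}


# Toy data: hypothetical engines and URLs, not taken from the study.
toy_results = {
    "A": ["u1", "u2", "u3"],
    "B": ["u3", "u4"],
    "C": ["u3", "u5", "u2"],
    "D": ["u6"],
}
dist = overlap_distribution(toy_results)
for k, pct in dist.items():
    print(f"returned by exactly {k} engine(s): {pct:.1f} per cent")
```

The percentages are taken over the set of distinct results, so they sum to 100, as the four figures reported above do.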
By combining the coverage of multiple SEs through a system called a metasearch engine (MSE), a much higher percentage of the web can be searched. An MSE is a system that provides unified access to multiple existing SEs ([15] Meng et al., 2002). When an MSE receives a user query, it first passes the query (with any necessary reformatting) to the appropriate underlying SEs, and then collects and reorganises the results received from them. Because MSEs play a crucial role in finding information on the web, evaluating their performance is an important area of investigation. In this paper, a new method for evaluating the performance of MSEs is proposed. The method was also used in the study reported here to evaluate eight major MSEs (Clusty, Dogpile, Excite, Mamma, MetaCrawler, Search.com, WebCrawler and Webfetch).
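The dispatch, collect and reorganise workflow of an MSE described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any of the MSEs named here: each underlying SE is modelled as a hypothetical Python callable that handles its own query reformatting and returns a ranked list of URLs, and results are merged with a simple reciprocal-rank score chosen purely for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical interface: an engine takes a query and returns ranked URLs.
Engine = Callable[[str], List[str]]


def metasearch(query: str, engines: Dict[str, Engine], top_k: int = 10) -> List[str]:
    """Send `query` to every engine and merge the ranked lists.

    Merging uses a reciprocal-rank score (1/rank summed over engines),
    which is an illustrative assumption, not a documented MSE algorithm.
    """
    scores = defaultdict(float)
    for name, search in engines.items():
        try:
            results = search(query)  # the engine reformats the query itself
        except Exception:
            continue  # skip an engine that fails instead of aborting
        seen = set()
        for rank, url in enumerate(results, start=1):
            if url in seen:  # count each URL once per engine
                continue
            seen.add(url)
            scores[url] += 1.0 / rank

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda u: (-scores[u], u))
    return ranked[:top_k]


# Stub engines returning fixed, made-up results for demonstration.
def engine_a(query: str) -> List[str]:
    return ["x.example/1", "x.example/2", "x.example/3"]


def engine_b(query: str) -> List[str]:
    return ["x.example/2", "x.example/4"]


def engine_down(query: str) -> List[str]:
    raise RuntimeError("engine unavailable")


merged = metasearch(
    "metasearch evaluation",
    {"a": engine_a, "b": engine_b, "down": engine_down},
)
print(merged)
```

In this sketch a result returned by several engines accumulates score from each of them, so agreement between engines pushes a result up the merged list.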
Related studies
This section briefly reviews the existing literature on various aspects of web MSEs. In the last ten years, a large number of research papers on issues related to MSE design have been published. Some effective algorithms have been proposed to tackle several underlying challenges in building an efficient and effective MSE ([22] Yager and Kreinovich, 1999; [7] Dwork et al., 2001; [20] Si and Callan,...