Abstract
Rank fusion meta-search engine algorithms can be used to merge the web search results of multiple search engines. In this paper we introduce two variants of the Weighted Borda-Fuse algorithm. The first variant retrieves documents based on the popularities of the component engines; the second is based on user-defined k-toplists of the component engines. Experiments were performed with k = {50, 100, 200} toplists and AND/OR query combinations on the 'UNIB Meta Fusion' meta-search engine prototype, which employs 3 out of 5 popular search engines. Both of our algorithms outperformed other rank fusion algorithms (relevance score up to 0.76 at P@10, compared to 0.27 for Google). The pseudo-relevance automatic judgement techniques involved are Reciprocal Rank, Borda Count, and Condorcet. The optimal setting was reached for queries with the operator "AND" (degree 1) or "AND ... AND" (degree 2) with k = 200. The 'UNIB Meta Fusion' meta-search engine system was built correctly.
Keywords: Weighted Borda-Fuse, rank fusion, meta-search engine, pseudo-relevance automatic judgement, query relevance
Copyright © 2015 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction
There are many proposals for a meta-search engine (MSE). Given a query (a set of keywords), an MSE system typically retrieves web pages relevant to the query by exploiting all of its underlying search engines. It sends the query to these engines; the results obtained are then merged and ranked, and the final web documents are returned ranked by relevance. In the Helios architecture [1], the MSE system uses standard merger and ranker modules; to achieve high performance it utilizes asynchronous I/O and parallel TCP connections with the remote search engines. In the Tadpole architecture [2, 3], the rank fusion algorithms rely on a variety of parameters, such as the rank order and the number of times a URL appears in the results of each of its component search engines, to compute a weight for each collected result [2-4]. There is also the concern of user-specific needs: for example, an MSE should ideally let the user choose his favourite search engines from an available list, perform query modifications, and explore the available rank fusion techniques [2].
In general, rank fusion algorithms improve the relevance scores of the documents returned by multiple search engines. Dwork, Kumar, Naor, and Sivakumar propose the use of rank aggregation methods for MSEs, namely Borda's method, the Footrule and Scaled Footrule methods, and Markov Chain methods [5]. Lam and Leung propose a complete directed graph method, the MST algorithm [6]. Supervised rank aggregation methods such as Borda-Fuse and supervised Markov Chain based methods are investigated in [7]. The KE algorithm [8] and its variants [4] exploit the ranking of the results that an MSE receives from its component engines by considering the number of times documents appear in the component engines' lists, under the assumption that those engines are equally reliable. Another rank fusion MSE algorithm, the Count Function [9], ranks web documents by summing the ranks of a URL over its positions and dividing by the number of occurrences of that URL. Aslam and Montague introduce a voting fusion method named Borda-Fuse, an adaptation of the Borda Count election process [10]. Borda-Fuse, tested on TREC data, performed better than the best component IR system in two of the five tests [10, 11]. Borda-Fuse extended with weights is called the Weighted Borda-Fuse algorithm; it multiplies the points that a retrieval system Si assigns to a candidate URL by a system weight Wi. With well-tuned weights, Weighted Borda-Fuse has the potential to outperform CombMNZ [12].
The coverage of each search engine is limited: only about 1% of the billions of web pages are in the surface web, while the rest are in the deep web. It is therefore interesting to know how to merge different search engines and how deep we should crawl the web to retrieve documents that are still relevant. When fewer search engines report ranking scores, we can convert the local ranks into local ranking scores. The KE algorithm [13] is a score-based method that exploits the rankings of the search results of the component engines, where all engines are treated as equally reliable. Consider a document x. In KE, the local ranks (ri) of x as returned from all component engines of an MSE are summed and converted to a single weight score (Wke) using this formula:
... (1)
where Σri is the sum of the rankings from each search engine in which the document appears, n is the number of component engines in whose results the document appears, m is the total number of component engines exploited, and k is the number of toplist documents crawled from each component engine. The lower the weight, the better the ranking.
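Equation (1) is omitted from this copy of the paper. The minimal sketch below combines the quantities defined above in the form usually cited for the KE weight (sum of local ranks divided by n^m · (k/10 + 1)); the exact denominator is an assumption and should be checked against the original formula, and the function name is illustrative.

# Hedged sketch of the KE weight. The exact formula (1) is omitted in this
# copy; the denominator n**m * (k/10 + 1) follows the KE formulation commonly
# cited for QuadSearch and is an assumption.
def ke_weight(ranks, m, k):
    """ranks: local ranks r_i of a document in the engines where it appears;
    m: total number of component engines; k: toplist size per engine.
    Lower weight means a better ranking."""
    n = len(ranks)                      # engines in which the document appears
    assert 0 < n <= m
    return sum(ranks) / (n ** m * (k / 10 + 1))

# Example: a document found at ranks 8, 9 and 11 in all 3 of 3 engines, k = 50.
print(ke_weight([8, 9, 11], m=3, k=50))     # small weight -> good rank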
Patel and Shah propose to simply use the Count Function to compute the ranking of an MSE document [9]. The ranking of the document x is computed as follows:
... (2)
where Pi(x) is the local rank of document x returned from a component engine; Pi(x) in (2) is the same as ri in (1). Unlike in the KE algorithm, the documents are here ranked in descending order (the higher the weight, the better the ranking score).
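Equation (2) is likewise omitted here. Read literally from the description above and from the recap in Section 3 (sum of the local ranks of a URL divided by its number of occurrences across the component engines), a sketch of the Count Function score could look as follows; treat the exact form as an assumption.

# Hedged sketch of the Count Function score described in the text.
def count_function_score(local_ranks):
    """local_ranks: the ranks P_i(x) of document x in every engine that
    returned it. Documents are then ordered by descending score."""
    return sum(local_ranks) / len(local_ranks)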
Borda Count [10] is a voting-based data fusion method that is adapted to the meta-search engine environment in the Weighted Borda-Fuse (WBF) algorithm [14, 15]. Unlike in the KE and Count Function algorithms, in WBF the component search engines do not have to be treated as equally reliable. The votes for a web document that lies at local rank i of component search engine j are
... (3)
where wj is the weight of engine j, and maxk(rk) is the number of toplist documents crawled from component search engine k. Retrieved web documents that appear in more than one search engine receive the sum of their votes. The documents are ranked in descending order of the total votes they receive (the higher the vote, the better the ranking score) [3, 7, 10, 16].
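Equation (3) is also omitted from this copy. A minimal sketch of the weighted Borda vote is given below, assuming the usual assignment of maxk(rk) - ri + 1 points scaled by wj; the exact point assignment (including the "+ 1") is an assumption, and the function names are illustrative.

# Hedged sketch of the weighted Borda vote in (3) and of the vote summation.
def wbf_vote(rank_i, weight_j, toplist_len):
    """Vote of engine j for a document at local rank rank_i, where toplist_len
    is maxk(rk), the size of engine j's crawled toplist."""
    return weight_j * (toplist_len - rank_i + 1)

def wbf_total(votes_per_engine):
    # Documents appearing in several engines receive the sum of their votes.
    return sum(votes_per_engine)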
Key evaluation parameters for ranking strategies in an MSE include algorithmic complexity, rank aggregation time, overlap across search engines, and precision from the user's perspective [3, 17]. In situations where a search engine joins a meta-search engine on the fly [18], some rank fusion algorithms cannot be implemented, because estimating a search engine score usually requires sufficient sample data from each component engine. In our work, we use several k-toplist values for the component engines.
Research on the use of query operators has reported that only about 10% of search engine users use advanced query operators such as Boolean AND/OR, while the rest use simple queries [19]. We are interested in examining the effect of the complexity degree of a query, in particular degrees one and two (using one and two operators, respectively). According to [20], queries of complexity one outperform queries of complexity two in all cases; but is the decrease in relevance significant enough? The query experiments in [20] were performed on web documents using 3 search engines: Google, AltaVista, and the Fast search engine. It would be interesting to see whether the results are consistent when different component search engines are used.
This paper introduces two variants of the Weighted Borda-Fuse algorithm that aim to improve the query relevance of web search results in a meta-search engine environment. The rest of this paper is organized as follows: Section 2 describes our prototype meta-search engine system and the proposed Weighted Borda-Fuse variants, Section 3 reports our experimental results, and Section 4 gives some conclusions.
2. The 'UNIB Meta Fusion' Prototype and The Algorithms
To address the issues described in Section 1, we built a prototype of a user-adaptive MSE called 'UNIB Meta Fusion' that allows a user to choose his favourite search engines and the k-toplist size of retrieved web documents (k = 50, 100, 200). In this research we experiment with our two proposed variants of the WBF algorithm [14, 15], the KE algorithm [8], and the Count Function algorithm [9]. Relevance is computed using two IR metrics, precision and MRR, both measured on the 10-toplist of the MSE against the 10-toplist of a pseudo-relevance technique.
'UNIB Meta Fusion' (Figure 1) is a meta-search engine prototype that supports a choice of 3 out of 5 well-known search engines. In Figure 1, SE = {SE1, ..., SE5} is the list of component search engines: Google, Bing, Ask.com, Lycos, and Exalead, respectively. Since we intended this project for research purposes, the processes of retrieving, parsing, merging, ranking, and reporting the results of the search engines are done separately in off-line mode. The prototype shows only the list of URL results of the best rank fusion method. With 'UNIB Meta Fusion' we especially want to investigate which rank fusion method outperforms the others under different setups of retrieved toplist documents. We modify the Weighted Borda-Fuse algorithm into 2 variants, called 'Default' WBF and 'MyOwn' WBF. 'Default' sets the number of retrieved toplist documents based on the popularity of a search engine, whereas in 'MyOwn' the user is free to define the combination of toplists of the multiple component search engines.
Figure 2 shows the architecture of the 'UNIB Meta Fusion' system. The Web Interface allows a user to submit a query, specify a choice of three search engines, and a number k for how many toplist documents will be retrieved from each of the specified search engines. The Query Parser creates an appropriate format for the query and passes the information on the k-toplist URLs and the choice of search engines to Best Algorithm. Best Algorithm employs only the best-performing rank fusion algorithm, i.e., the algorithm whose retrieval has the highest relevance score against a gold standard of relevant documents generated by the Reciprocal Rank, Borda Count, or Condorcet technique, as suggested by Nuray and Can [21]. Best Algorithm then returns the merged and ranked list of documents to the Query Parser, which in turn returns sets of [URL, title, snippet] as query search results to the user.
The Offline Query Processor is the most time-consuming part of this research. Given a list of queries, sets of k-toplists of the component engines (k = 50, 100, 200), and sets of combinations of 3 out of 5 search engines, the HTTP Retrievers handle the network communications, and the Search Results Collector stores the k-toplist documents of each search engine separately. The Results Parser parses the lists into URLs, titles, and snippets. These are then combined in the Merger and Ranker and ranked by one of the Rank Fusion Algorithms: KE, 'Default' WBF, 'MyOwn' WBF, or the Count Function algorithm. We then carefully investigate and decide which algorithm works best to be employed as Best Algorithm.
'UNIB Meta Fusion' returns documents processed by the off-line query processor using 10 queries of two- and three-term length (see Table 1). These queries are extended further using the operators AND/OR. For example, the three-term query "Java applet programming" is expanded into 4 different new queries: "Java AND applet AND programming", "Java AND applet OR programming", "Java OR applet AND programming", and "Java OR applet OR programming". Two-term queries get a similar treatment. In the end, the 10 queries are extended into 30 different queries.
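As an illustration, a small sketch of this AND/OR expansion is given below (the helper name expand_query is ours, not part of the prototype); it reproduces the four combinations listed above by filling every gap between terms with each operator.

# Sketch of the AND/OR query expansion: every gap between terms is filled
# with each combination of the two Boolean operators.
from itertools import product

def expand_query(query):
    terms = query.split()
    gaps = len(terms) - 1
    expanded = []
    for ops in product(["AND", "OR"], repeat=gaps):
        parts = [terms[0]]
        for op, term in zip(ops, terms[1:]):
            parts += [op, term]
        expanded.append(" ".join(parts))
    return expanded

print(expand_query("Java applet programming"))
# ['Java AND applet AND programming', 'Java AND applet OR programming',
#  'Java OR applet AND programming', 'Java OR applet OR programming']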
2.1. The User Defined 'MyOwn' WBF Algorithm
The user-defined 'MyOwn' WBF algorithm allows a user to define his own search engine weights. Considering (3), we set the rk of each search engine to the same k-toplist value, but with the different weights suggested by the user. The weights are usually proportional to the user's trust in the relevance of the corresponding search engines. In other words, with the user-defined 'MyOwn' WBF algorithm we would like to know whether the MSE system produces a good relevance score when the engines are given different weights, for a user-defined value of k for the k-toplist of crawled documents of all component search engines.
The 'MyOwn' WBF process for meta-searching 3 of the 5 search engines is as follows:
Step 1. The user specifies k; this determines the number of documents (the k-toplist) that each component search engine will later retrieve, e.g. k = 200.
Step 2. Define the set of search engines SE = {SE1, SE2, ... , SEn} that are available for metasearching. // In our case n = 5.
Step 3. The user sets the weight wj (j = 1, ..., n) for each search engine in Step 2. // For example wj = {50, 30, 20, 25, 15}.
Step 4. The user selects three out of n search engines to be used.
Step 5. For each of the three engines from Step 4, set maxk(rk) to the value of the constant k from Step 1.
Step 6. For each document found in the k-toplist returned by each component engine chosen in Step 4:
Step 6a. Compute V(ri, j) using equation (3), where i is the rank of the document in engine j.
Step 6b. Compute the document's WBF ranking score:
...
For a document x, total_SE is the number of search engines in which x is found.
Step 7. Order the retrieved documents in descending order of their WBF ranking scores.
Step 8. Present the top 10 documents obtained in Step 7 to the user.
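A runnable sketch of these steps is given below. Since equation (3) and the Step 6b formula are omitted in this copy, the sketch assumes the Borda-style vote wj * (k - r + 1) and takes the WBF score to be the plain sum of a document's votes over the total_SE engines in which it appears; this assumption is consistent with Section 1 ("documents receive the sum of their votes") and reproduces the ordering Doc3 > Doc1 > Doc2 of Examples 1 and 2 below. All names are illustrative.

# Hedged sketch of the 'MyOwn' WBF ranking (Steps 4-8).
def myown_wbf(results, weights, k, top=10):
    """results: {engine: [url1, url2, ...]} the k-toplists of the 3 chosen
    engines (Step 4); weights: {engine: w_j} user-defined weights (Step 3);
    k: toplist size from Step 1."""
    scores = {}
    for engine, toplist in results.items():
        for rank, url in enumerate(toplist, start=1):       # local rank r_i
            vote = weights[engine] * (k - rank + 1)          # Step 6a (assumed form)
            scores[url] = scores.get(url, 0) + vote          # Step 6b (assumed sum)
    ranked = sorted(scores, key=scores.get, reverse=True)    # Step 7
    return ranked[:top]                                      # Step 8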
Example 1:
Consider a meta-search engine system built with the 'MyOwn' WBF algorithm. Let n = 5, and let the chosen search engines be SE1, SE2, and SE3, with k = 200 for all of them (this determines the k-toplist to be retrieved). Suppose the user specifies {50, 30, 20} as the weights of the component engines, respectively. Assume we have 3 documents, Doc1, Doc2, and Doc3, with the following facts:
- Doc1 is found at ranks 8, 9, and 11 in the toplists of SE1, SE2, and SE3, respectively.
- Doc2 is found at ranks 9 and 13 in the toplists of SE1 and SE3; it is not found by SE2.
- Doc3 is found at ranks 3, 5, and 4 in the toplists of SE1, SE2, and SE3, respectively.
The WBF scoring of these documents is then shown in Table 2.
Table 2 shows the ordering: Doc3 > Doc1 > Doc2.
2.2. The 'Default' WBF Algorithm
The 'Default' WBF algorithm is a special instance of 'MyOwn' WBF, differing in Steps 1 and 5 of the 'MyOwn' algorithm. In 'Default' WBF, the k values for the k-toplist of each component engine are determined by the engine's weight. With 3 component search engines and k ∈ {50, 100, 200}, for equation (3) we set maxk(rk) to 200 for the engine with the highest wj, 100 for the engine with the second-highest wj, and 50 for the third engine.
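A small sketch of this 'Default' toplist assignment is given below (the helper name is ours); the votes themselves would then be computed as in the 'MyOwn' sketch above, using each engine's own maxk(rk) in place of a single k.

# Sketch of the 'Default' WBF toplist assignment: the three chosen engines get
# maxk(rk) = 200, 100 and 50 in decreasing order of their weights.
def default_toplist_sizes(weights, sizes=(200, 100, 50)):
    """weights: {engine: w_j} for the three chosen engines."""
    ordered = sorted(weights, key=weights.get, reverse=True)
    return {engine: size for engine, size in zip(ordered, sizes)}

print(default_toplist_sizes({"SE1": 50, "SE2": 30, "SE3": 20}))
# {'SE1': 200, 'SE2': 100, 'SE3': 50}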
Example 2:
Consider a meta-search engine system built with the 'Default' WBF algorithm. Let n = 5, and let the chosen search engines be SE1, SE2, and SE3. Suppose the user specifies {50, 30, 20} as the respective weights of those engines. Assume we have 3 documents, Doc1, Doc2, and Doc3, with the following facts:
- Doc1 is found at rank 8 in the toplist of SE1, rank 9 in the toplist of SE2, and rank 11 in the toplist of SE3.
- Doc2 is found at rank 9 in the toplist of SE1, is not found by SE2, and is found at rank 13 in the toplist of SE3.
- Doc3 is found at rank 3 in the toplist of SE1, rank 5 in the toplist of SE2, and rank 4 in the toplist of SE3.
The 'Default' WBF scoring of these documents is then shown in Table 3.
From Table 3 we have Doc3 > Doc1 > Doc2 as the rank order of the MSE system. This order follows the WBF scores of the documents: the more relevant a document is, the higher WBF places it in the MSE's web search results.
3. Results and Analysis
We have described two variants for ranking in WBF meta-search. We would like to compare them with other existing rank fusion algorithms: the KE algorithm [8] and the Count Function algorithm [9]. Queries are sent to each search engine, retrieving toplists until k (k = 50, 100, 200) URLs have been crawled from each component search engine; the results are then merged by the four algorithms ('Default' WBF, 'MyOwn' WBF, KE, and Count Function). For evaluation, since the queries are multi-domain (not restricted as in the TREC datasets; multi-domain queries simulate real-world situations) and since human judgement is expensive, we evaluate our system against three different gold standards: the Reciprocal Rank (RR), Borda Count (BC), and Condorcet methods. The latter are known as "pseudo-relevance" datasets, as suggested in [21].
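As an illustration of how such a gold standard can be derived from the component lists themselves, here is a minimal sketch of the Borda Count pseudo-relevance idea: the component toplists are fused by Borda points and the top 10 fused URLs serve as the gold set. The full method of Nuray and Can [21] (e.g. bias-based selection of the systems that vote) is not reproduced, and the helper name is illustrative.

# Hedged sketch of a Borda Count pseudo-relevance gold standard.
def borda_pseudo_relevance(result_lists, depth=10):
    """result_lists: list of ranked URL lists, one per component engine."""
    points = {}
    for toplist in result_lists:
        n = len(toplist)
        for rank, url in enumerate(toplist, start=1):
            points[url] = points.get(url, 0) + (n - rank + 1)   # Borda points
    fused = sorted(points, key=points.get, reverse=True)
    return set(fused[:depth])           # top-10 used as the gold standard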
In this research, all experiments are executed on an Acer 4741 machine with an Intel Core i3 and 5 GB RAM. All prototyping processes, from retrieval, parsing, merging, and ranking to presenting the query results to the user, are implemented in Python; the webpy module helps in providing a user-friendly interface for the MSE prototype. For the evaluation of the tasks in all of our experiments, we adopted two metrics that capture relevance in different aspects [22] (a short sketch of both follows the list):
· Precision at rank n (P@n): the proportion of the top n retrieved documents that are relevant according to the gold standard, averaged over all queries.
· Mean Reciprocal Rank (MRR): measures where in the ranking the first document relevant to the gold standard is returned by the system, averaged over all queries. This measure provides insight into the ability of the system to return a relevant document at the top of the ranking.
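The sketch below shows both metrics for a single query, assuming the gold standard is the 10-toplist of a pseudo-relevance method and ranked is the MSE's ranked URL list; averaging over queries would be done outside these helpers, and the function names are ours.

# Per-query P@n and reciprocal rank against a pseudo-relevance gold set.
def precision_at_n(ranked, relevant, n=10):
    return sum(1 for url in ranked[:n] if url in relevant) / n

def reciprocal_rank(ranked, relevant):
    for rank, url in enumerate(ranked, start=1):
        if url in relevant:
            return 1.0 / rank
    return 0.0      # no relevant document returned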
Tables 4 to 6 show the results in terms of both P@n and MRR for the different k-toplists, compared against the gold standards (pseudo-relevance). From these results, we can see that our proposed methods ('Default' and 'MyOwn' WBF) achieve the best results in terms of both P@n and MRR. In general, this verifies the effectiveness of our proposed method for rank aggregation.
For both WBF variants, in all experiments in this research we set the weights to {30, 20, 15, 25, 10} for Google, Bing, Ask.com, Lycos (powered by Yahoo!), and Exalead, respectively. All relevance scores are obtained by comparing the best P@10 of each rank fusion algorithm with the best P@10 of Google. Google was chosen as the benchmark since it shows the best performance of any individual search engine. For convenience, we order the individual component engines by their weights (Table 4 to Table 6).
The data in Table 4 are those of the k-toplist with k = 50. Table 4 clearly shows that the highest relevance score (P@10) for 'MyOwn' WBF is 0.6950 against Borda Count, about two times that of Google (2.02 times); that of 'Default' WBF is 0.7300 against Borda Count, also about two times that of Google (2.12 times); and that of KE is 0.6650 against Reciprocal Rank, almost two times that of Google (1.93 times). From Table 4, the best P@10 for the Count Function is only 0.3050 against Condorcet, which is below the P@10 of Google (0.89 times). From the MRR perspective, at k = 50 the two WBF methods generally outperform the others (the WBFs do quite well in getting the first position correct).
Table 5 shows the same data, but for k = 100. It also shows an improvement in P@10: the highest relevance score is 0.7470 for 'MyOwn' WBF against the Borda Count pseudo-relevance sets (2.41 times the P@10 of Google). The highest score of 'Default' WBF is now 0.6953 against Borda Count (2.25 times); that of KE is now 0.6993 against Borda Count (2.26 times); and that of the Count Function is 0.300 against Condorcet (lower, only 0.97 times). All ratios are against the best P@10 of Google. In terms of MRR, the two WBF algorithms still perform better than the others.
Table 6 shows the results for the k-toplist with k = 200. From Table 6, the highest relevance score (P@10) against the best P@10 of Google is 0.7630 for 'MyOwn' WBF against the Borda Count pseudo-relevance sets (2.78 times that of Google); for 'Default' WBF the highest is 0.6377 against Borda Count (2.32 times); for the KE method the highest is 0.7213 against Borda Count (2.63 times); and for the Count Function the highest is 0.3003 against Condorcet (1.09 times). All ratios are against the P@10 of Google. As with the other k-toplists, at k = 200 the MRR of the WBF variants also consistently outperforms the other rank fusion methods.
In conclusion, both 'Default' WBF and the user-defined 'MyOwn' WBF produce the best results; they outperform the other algorithms, KE [13] and the Count Function [9]. The relevance of the Count Function algorithm is far below that of the WBFs. This is due to the simplicity of the algorithm, which only computes the sum of the local ranks of document x returned from each component engine divided by the total number of occurrences of x across the meta-search engine components. The Count Function algorithm considers neither the popularity of the component engines nor the number of crawled toplist documents. The KE algorithm takes into account the number of toplist documents crawled from each component engine, but ignores engine popularity.
From the experiments we also found that the best k-toplist of each component search engine (in terms of precision) is reached for k = 200 ('MyOwn' WBF, with precision 2.78 times that of Google), followed by 2.41 times for 'MyOwn' WBF with k = 100, and 2.12 times for 'Default' WBF with k = 50, all compared to Google. Therefore the best methods among the rank fusion methods considered are the Weighted Borda-Fuse algorithms. 'Default' WBF is suitable for small datasets, while 'MyOwn' WBF is suitable for bigger datasets (e.g. >100-toplist crawled from each component engine). The best gold standard is achieved by the Borda Count technique.
Table 4 to Table 6 show that in most cases the configured weights influence the result. For example, a high weight on Google will most probably make Google the best search engine. An exception is Ask.com, which always performs better than Bing. Ask.com uses the ExpertRank algorithm, which here performs better than Bing's trail-finding algorithms. ExpertRank is based on the HITS algorithm, a scheme in which every web page is assigned two scores: a hub score and an authority score. Compared to ExpertRank, Google's PageRank-based engine has more relevant top results, a higher number of relevant results, and more stable results [23]. While Google's search algorithms depend heavily on HTML text when indexing websites, Bing handles multimedia content (images, video, audio, Flash, and others) far better [24]. In all of our experiments we ignore multimedia content and focus on text (structured information); this is the rationale for why Ask.com is always better than Bing here. The Exalead search engine provides hybrid searching over typed information extracted from structured databases as well as over unstructured text [25]. This semantic search engine is not very successful in our experiments, again because we only focus on structured information.
Furthermore, considering the length of the queries, in Table 7 and Table 8 we focus on two- and three-term queries, since most queries (97%) on the World Wide Web have fewer than 6 terms [26]. In our experiments we do not use TREC queries, since their average length is longer than that of common internet queries. Table 7 shows the effect of query length. From the results we see that using two or three terms is a matter of choice, since there is no significant difference in the relevance scores (P@n and MRR). Table 7 supports the result in [20] that two-term queries always outperform three-term queries, but in our case the difference is not significant, so we leave this as a choice to the user.
To examine the effect of the complexity degree of a query, we analyzed queries of degree 1 (1 operator, two terms) and degree 2 (2 operators, three terms), using AND/OR combinations as operators. Table 8 shows the relevance scores against the Borda Count pseudo-relevance sets obtained from the 'UNIB Meta Fusion' MSE prototype. From Table 8 we suggest the use of the operator "AND" for degree 1 and the operators "AND ... AND" for degree 2, which are stable in producing relevant results.
4. Conclusions
In this paper we described two rank fusion algorithms, 'MyOwn' WBF and 'Default' WBF, as well as their implementation in 'UNIB Meta Fusion', a meta-search engine prototype. The experiments showed that our variants of the Weighted Borda-Fuse algorithm consistently outperform other MSE rank fusion methods, and that the configured weights influence the results. 'Default' WBF is best for small datasets, while 'MyOwn' WBF is best for larger datasets. The best k-toplist value for crawling the web is k = 200. Furthermore, we suggest the use of the operators "AND" and "AND ... AND" for degree 1 and degree 2 queries, respectively, to increase relevance to user needs. The experiments show no significant difference in relevance between two- and three-term queries. As a general conclusion, our system, the MSE prototype 'UNIB Meta Fusion', was built correctly.
Acknowledgment
This work is supported by the University of Bengkulu's Excellent Research Grant #557/UN30.15/LT/2014 from the Research and Community Services Office of the University of Bengkulu (LPPM UNIB), Bengkulu Province, Sumatera, Indonesia.
References
[1] Gulli A, Signorini A. Building an Open Source Meta Search Engine. World Wide Web Conference. Chiba, Japan. 2005: 1004-1005.
[2] Meng W. Metasearch Engines. In: Liu L, Ozsu MT. Editors. Encyclopedia of Database Systems. 2009 edition. New York, USA: Springer; 2009: 1730-1734.
[3] Mahabhashyam MS, Singitham P. Tadpole: A Metasearch Engine Evaluation of Meta Search Ranking Strategies. Stanford University. Report number: CS276A. 2002.
[4] Akritidis L, Katsaros D, Bozanis P. Effective Ranking Fusion Methods for Personalized Metasearch Engines. 12th Pan-Hellenic Conference on Informatics (PCI 2008). Samos Island, Greece. 2008: 39-43.
[5] Dwork C, Kumar R, Naor M, Sivakumar D. Rank Aggregation Methods for the Web. Proceedings of the 10th International World Wide Web Conference. Hong Kong. 2001: 613-622.
[6] Lam KW, Leung CH. Rank Aggregation for Meta-search Engines. Proceedings of the 13th International Conference on World Wide Web. New York. 2004: 384-385.
[7] Liu YT, Liu TY, Qin T, Ma ZM, Li H. Supervised Rank Aggregation. Proceedings of WWW 2007 Conference. Banff, Alberta, Canada. 2007: 481-489.
[8] Akritidis L, Voutsakelis G, Katsaros D, Bozanis P. QuadSearch: A Novel Metasearch Engine. Proceedings of the 11th Panhellenic Conference on Informatics (PCI 2007). Patras, Greece. 2007: 453-466.
[9] Patel B, Shah D. Ranking Algorithm for Meta Search Engine. IJAERS International Journal of Advanced Engineering Research and Studies. 2012; 2(1): 39-40.
[10] Aslam JA, Montague M. Models for Metasearch. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 01). New York. 2001: 276-284.
[11] Christensen HU, Ortiz-Arroyo D. Applying Data Fusion Methods to Passage Retrieval in QAS. In: Haindl M, Kittler J, Roli F, editors. Multiple Classifier Systems: 7th International Workshop (MCS 2007), Prague, Czech Republic, May 23-25, 2007. Lecture Notes in Computer Science, Vol. 4472. Berlin Heidelberg: Springer-Verlag; 2007: 82-92. DOI: 10.1007/978-3-540-72523-7_9.
[12] Montague M, Aslam J. Condorcet Fusion for Improved Retrieval. Proceedings of the 11th Annual ACM Conference on Information and Knowledge Management (CIKM 02). Tysons Corner, VA. 2002: 538-548.
[13] Renda ME, Straccia U. Web Metasearch: Rank vs. Score based Rank Aggregation Methods. Proceedings of the ACM Symposium on Applied Computing (SAC). Melbourne, FL. 2003: 841-846.
[14] Fagin R, Kumar R, Sivakumar D. Comparing Top-k Lists. SIAM Journal on Discrete Mathematics. 2003; 17(1): 134-160.
[15] Dorn J, Naz T. Structuring Meta-Search Research by Design Patterns. Proceedings of the International Computer Science and Technology Conference (ICSTC). San Diego, California, USA. 2008.
[16] van Erp M, Schomaker L. Variants of the Borda Count Method for Combining Ranked Classifier Hypotheses. Proceedings of the 7th International Workshop on Frontiers in Handwriting Recognition. Amsterdam. 2000: 443-452.
[17] Jadidoleslamy H. Search Result Merging and Ranking Strategies in Meta-Search Engines: A Survey. IJCSI International Journal of Computer Science. 2012; 9(4): 239-251.
[18] Wu Z, Raghavan V, Du C, Sai K, Meng W, He H, Yu C. SE-LEGO: Creating Metasearch Engine on Demand. ACM SIGIR Conference, Demo paper. Toronto, Canada. 2003: 464-464.
[19] Eastman CM, Jansen BJ. Coverage, Relevance, and Ranking: the Impact of Query Operators on Web Search Engine Results. ACM Transactions on Information Systems. 2003; 21(4): 383-411.
[20] Mohamed KA-E-F. Merging Multiple Search Results Approach for Meta-Search Engines. PhD Thesis. Pittsburgh, United States: Postgraduate School of Information Sciences, University of Pittsburgh; 2004.
[21] Nuray R, Can F. Automatic Ranking of Information Retrieval Systems using Data Fusion. Journal of Information Processing and Management. 2006; 42(3): 595-614.
[22] Sigurbjörnsson B, van Zwol R. Flickr Tag Recommendation based on Collective Knowledge. WWW 2008 Conference. Beijing, China. 2008: 327-336.
[23] Zeno G. PageRank vs ExpertRank: Analysis of Their Link Analysis Algorithms for Ranking Search Results. Thesis. Bayamon: Department of Computer Science, University of Puerto Rico; 2010.
[24] Singla A, White RW, Huang J. Studying Trailfinding Algorithms for Enhanced Web Search. Proceedings of SIGIR'10. Geneva, Switzerland. 2010: 443-450.
[25] Exalead. Exalead CloudView Semantics White Paper. Exalead. Report number: EN.140.001.0-V1.2. 2010.
[26] Jansen B, Spink A, Bateman J, Saracevic T. Real Life Information Retrieval: A Study of User Queries on the Web. ACM SIGIR Forum. 1998; 32(1): 5-17.
Diyah Puspitaningrum*1, Jeri Apriansyah Pagua2, Aan Erlansari3, Fauzi4, Rusdi Efendi5
Desi Andreswari6, I.S.W.B. Prasetya7
1,2,3,4,5,6Department of Computer Science, Faculty of Engineering, University of Bengkulu
WR Supratman Street, Kandang Limun, Bengkulu 38371, Sumatera, Indonesia
7Department of Information and Computing Sciences, Utrecht University
PO Box 80.089, 3508 TB Utrecht, The Netherlands
*Corresponding author, e-mails: [email protected], [email protected],