Wensi Xi
Publications by Wensi Xi (bibliography)
» 2005 «
Xue, Gui-Rong, Lin, Chenxi, Yang, Qiang, Xi, Wensi, Zeng, Hua-Jun, Yu, Yong and Chen, Zheng (2005): Scalable collaborative filtering using cluster-based smoothing. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 114-121. Available online
Memory-based approaches for collaborative filtering identify the similarity between two users by comparing their ratings on a set of items. In the past, the memory-based approach has been shown to suffer from two fundamental problems: data sparsity and difficulty in scalability. Alternatively, the model-based approach has been proposed to alleviate these problems, but this approach tends to limit the range of users. In this paper, we present a novel approach that combines the advantages of these two approaches by introducing a smoothing-based method. In our approach, clusters generated from the training data provide the basis for data smoothing and neighborhood selection. As a result, we provide higher accuracy as well as increased efficiency in recommendations. Empirical studies on two datasets (EachMovie and MovieLens) show that our newly proposed approach consistently outperforms other state-of-the-art collaborative filtering algorithms.
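The abstract gives no formulas, so the following is only a minimal sketch of what cluster-based smoothing could look like. It assumes the user clusters are already given and fills a missing rating with the user's own mean plus the average deviation of same-cluster users who rated the item. The function name `smooth_ratings`, the data layout, and the smoothing rule are illustrative assumptions, not the paper's exact method.

```python
from statistics import mean

def smooth_ratings(ratings, clusters):
    """Fill missing ratings using same-cluster users (illustrative rule).

    ratings:  {user: {item: rating}}, observed ratings only; every user is
              assumed to have at least one rating.
    clusters: {user: cluster_id}, assumed to be precomputed (e.g. by k-means).
    """
    user_mean = {u: mean(r.values()) for u, r in ratings.items()}

    # Group rated users by cluster id.
    members = {}
    for user in ratings:
        members.setdefault(clusters[user], []).append(user)

    all_items = {item for r in ratings.values() for item in r}
    smoothed = {}
    for user, observed in ratings.items():
        filled = dict(observed)
        for item in all_items - set(observed):
            # Deviation from their own mean, for cluster peers who rated the item.
            devs = [ratings[v][item] - user_mean[v]
                    for v in members[clusters[user]]
                    if v != user and item in ratings[v]]
            if devs:  # leave the gap if no peer rated this item
                filled[item] = user_mean[user] + mean(devs)
        smoothed[user] = filled
    return smoothed
```

Filling the matrix this way reduces sparsity before any neighbor search, and restricting the search to a user's own cluster is one way the neighborhood selection mentioned in the abstract could be made cheaper.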
Copyrights may apply
Zhang, Benyu, Li, Hua, Liu, Yi, Ji, Lei, Xi, Wensi, Fan, Weiguo, Chen, Zheng and Ma, Wei-Ying (2005): Improving web search results using affinity graph. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 504-511. Available online
In this paper, we propose a novel ranking scheme named Affinity Ranking (AR) to re-rank search results by optimizing two metrics: (1) diversity, which indicates the variance of topics in a group of documents; (2) information richness, which measures how well a single document covers its topic. Both metrics are calculated from a directed link graph named Affinity Graph (AG). AG models the structure of a group of documents based on the asymmetric content similarities between each pair of documents. Experimental results on Yahoo! Directory, ODP Data, and Newsgroup data demonstrate that our proposed ranking algorithm significantly improves the search performance. Specifically, the algorithm achieves relative improvements of 31% in diversity and 12% in information richness within the top 10 search results.
» 2004 «
Fan, Weiguo, Luo, Ming, Wang, Li, Xi, Wensi and Fox, Edward A. (2004): Tuning before feedback: combining ranking discovery and blind feedback for robust retrieval. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 138-145. Available online
Both ranking functions and user queries are very important factors affecting a search engine's performance. Prior research has looked at how to improve ad-hoc retrieval performance for existing queries by tuning the ranking function, or at how to modify and expand user queries with blind feedback under a fixed ranking scheme. However, almost no research has looked at how to combine ranking function tuning and blind feedback to improve ad-hoc retrieval performance. In this paper, we look at the performance improvement for ad-hoc retrieval from a more integrated point of view by combining the merits of both techniques. In particular, we argue that the ranking function should be tuned first, using user-provided queries, before applying the blind feedback technique. The intuition is that a highly tuned ranking places more high-quality documents at the top of the hit list and thus offers a stronger baseline for blind feedback. We verify this integrated model on a large-scale heterogeneous collection, and the experimental results show that combining ranking function tuning and blind feedback can improve search performance by almost 30% over the baseline Okapi system.
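To make the dependence on the initial ranking concrete, here is a minimal sketch of blind (pseudo-relevance) feedback. It assumes a toy term-count scorer in place of a tuned ranking function or Okapi; the names `score`, `rank`, and `blind_feedback`, and the parameters `top_k` and `n_expand`, are hypothetical and not taken from the paper.

```python
from collections import Counter

def score(query_terms, doc_terms):
    """Stand-in ranking function: total occurrences of query terms in the doc."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)

def rank(query_terms, docs):
    """Return document ids by descending score; ties are broken by id."""
    return sorted(docs, key=lambda d: (-score(query_terms, docs[d]), d))

def blind_feedback(query_terms, docs, top_k=2, n_expand=2):
    """Expand the query with the most frequent new terms from the top_k docs."""
    top = rank(query_terms, docs)[:top_k]
    pool = Counter(t for d in top for t in docs[d] if t not in query_terms)
    # Sort by count, then alphabetically, so the expansion is deterministic.
    best = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))[:n_expand]
    return list(query_terms) + [term for term, _ in best]

docs = {
    "d1": ["jaguar", "cat", "cat", "wild"],
    "d2": ["jaguar", "cat", "zoo"],
    "d3": ["car", "engine"],
}
expanded = blind_feedback(["jaguar"], docs)
```

Because the expansion terms come only from the top-ranked documents, a better initial ranking feeds better terms into the second retrieval pass, which is the intuition the abstract gives for tuning first.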
Xi, Wensi, Lind, Jesper and Brill, Eric (2004): Learning effective ranking functions for newsgroup search. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 394-401. Available online
Web communities are web virtual broadcasting spaces where people can freely discuss anything. While such communities function as discussion boards, they have even greater value as large repositories of archived information. In order to unlock the value of this resource, we need an effective means for searching archived discussion threads. Unfortunately, the techniques that have proven successful for searching document collections and the Web are not ideally suited to the task of searching archived community discussions. In this paper, we explore the problem of creating an effective ranking function to predict the most relevant messages to queries in community search. We extract a set of predictive features from the thread trees of newsgroup messages as well as features of message authors and lexical distribution within a message thread. Our final results indicate that when using linear regression with this feature set, our search system achieved a 28.5% performance improvement compared to our baseline system.
Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Yu, Yong, Ma, Wei-Ying, Xi, Wensi and Fan, Weiguo (2004): Optimizing web search using web click-through data. In: Grossman, David A., Gravano, Luis, Zhai, Chengxiang, Herzog, Otthein and Evans, David A. (eds.) Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management November 8-13, 2004, Washington, DC, USA. pp. 118-126. Available online
Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Yu, Yong, Ma, Wei-Ying, Xi, Wensi and Fox, Edward A. (2004): MRSSA: an iterative algorithm for similarity spreading over interrelated objects. In: Grossman, David A., Gravano, Luis, Zhai, Chengxiang, Herzog, Otthein and Evans, David A. (eds.) Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management November 8-13, 2004, Washington, DC, USA. pp. 240-241. Available online
Fan, Weiguo, Gordon, Michael D., Pathak, Praveen, Xi, Wensi and Fox, Edward A. (2004): Ranking Function Optimization for Effective Web Search by Genetic Programming: An Empirical Study. In: HICSS 2004. Available online
Xi, Wensi, Zhang, Benyu, Chen, Zheng, Lu, Yizhou, Yan, Shuicheng, Ma, Wei-Ying and Fox, Edward A. (2004): Link fusion: a unified link analysis framework for multi-type interrelated data objects. In: Proceedings of the 2004 International Conference on the World Wide Web 2004. pp. 319-327. Available online
Web link analysis has proven to be a significant enhancement for quality-based web search. Most existing links can be classified into two categories: intra-type links (e.g., web hyperlinks), which represent the relationship of data objects within a homogeneous data type (web pages), and inter-type links (e.g., user browsing log), which represent the relationship of data objects across different data types (users and web pages). Unfortunately, most link analysis research only considers one type of link. In this paper, we propose a unified link analysis framework, called "link fusion", which considers both the inter- and intra-type link structure among multiple-type inter-related data objects and brings order to objects in each data type at the same time. The PageRank and HITS algorithms are shown to be special cases of our unified link analysis framework. Experiments on an instantiation of the framework that makes use of the user data and web pages extracted from a proxy log show that our proposed algorithm could improve the search effectiveness over the HITS and DirectHit algorithms by 24.6% and 38.2% respectively.
Lu, Yizhou, Zhang, Benyu, Xi, Wensi, Chen, Zheng, Liu, Yi, Lyu, Michael R. and Ma, Wei-Ying (2004): The PowerRank web link analysis algorithm. In: Proceedings of the 2004 International Conference on the World Wide Web 2004. pp. 254-255. Available online
The web graph follows a power-law distribution and has a hierarchical structure, but neither the PageRank algorithm nor any of its improvements leverages these attributes. In this paper, we propose a novel link analysis algorithm, "the PowerRank algorithm", which makes use of the power-law distribution and the hierarchical structure of the web graph. The algorithm consists of two parts. In the first part, special treatment is applied to the web pages with low "importance" scores. In the second part, the global "importance" score for each web page is obtained by combining those scores together. Our experimental results show that: 1) the PowerRank algorithm computes 10%-30% faster than the PageRank algorithm; 2) the top web pages in the PowerRank algorithm remain similar to those of the PageRank algorithm.
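For reference, the baseline that the abstract compares against can be sketched as a standard PageRank power iteration. This is plain PageRank, not PowerRank: the abstract does not give enough detail to reconstruct the special treatment of low-importance pages, so none is attempted here. The function name, the damping value of 0.85, and the fixed iteration count are conventional choices assumed for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    links: {page: [pages it links to]}; pages that only appear as targets
           are treated as dangling pages (no out-links).
    Returns {page: score}, with scores summing to 1.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by dangling pages is redistributed uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank

scores = pagerank({"a": ["c"], "b": ["c"], "c": []})
```

In this small example, page `c` receives links from both other pages and therefore ends up with the highest score. Every iteration touches every link, which is the cost that a faster variant would need to reduce.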