Hua-Jun Zeng


Publications by Hua-Jun Zeng (bibliography)


» 2008 «

Hu, Jian, Fang, Lujun, Cao, Yang, Zeng, Hua-Jun, Li, Hua, Yang, Qiang and Chen, Zheng (2008): Enhancing text clustering by leveraging Wikipedia semantics. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 179-186. Available online

Most traditional text clustering methods rely on a "bag of words" (BOW) representation built from frequency statistics in a set of documents. BOW, however, ignores the important information on the semantic relationships between key terms. To overcome this problem, several methods have been proposed to enrich text representation with external resources, such as WordNet. However, many of these approaches suffer from some limitations: 1) WordNet has limited coverage and lacks effective word-sense disambiguation ability; 2) most of the text representation enrichment strategies, which append or replace document terms with their hypernyms and synonyms, are overly simple. In this paper, to overcome these deficiencies, we first propose a way to build a concept thesaurus based on the semantic relations (synonym, hypernym, and associative relation) extracted from Wikipedia. Then, we develop a unified framework to leverage these semantic relations in order to enhance the traditional content similarity measure for text clustering. The experimental results on the Reuters and OHSUMED datasets show that, with the help of the Wikipedia thesaurus, the clustering performance of our method is improved as compared to previous methods. In addition, with optimized weights for hypernym, synonym, and associative concepts that are tuned with the help of a small amount of user-provided labeled data, the clustering performance can be further improved.
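The idea of enriching a bag-of-words vector with weighted thesaurus concepts can be illustrated with a minimal sketch. It is not the paper's framework: the toy `THESAURUS`, the `WEIGHTS` values, and the helper names `bag_of_words`, `enrich`, and `cosine` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus: term -> [(related concept, relation type)].
# In the abstract these relations are extracted from Wikipedia.
THESAURUS = {
    "car": [("automobile", "synonym"), ("vehicle", "hypernym")],
    "automobile": [("car", "synonym"), ("vehicle", "hypernym")],
    "engine": [("car", "associative")],
}

# Assumed relation weights; the abstract notes such weights can be tuned
# with a small amount of labeled data.
WEIGHTS = {"synonym": 1.0, "hypernym": 0.5, "associative": 0.3}


def bag_of_words(text):
    """Plain term-frequency vector."""
    return Counter(text.lower().split())


def enrich(bow):
    """Add weighted related concepts to a copy of the term vector."""
    enriched = Counter(bow)
    for term, freq in bow.items():
        for concept, relation in THESAURUS.get(term, []):
            enriched[concept] += freq * WEIGHTS[relation]
    return enriched


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b[term] for term, value in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


d1 = bag_of_words("car repair")
d2 = bag_of_words("automobile repair")
plain = cosine(d1, d2)                       # only "repair" is shared
enriched = cosine(enrich(d1), enrich(d2))    # synonyms now overlap too
```

With this toy data the two documents share only one literal term, so the enriched similarity is higher than the plain BOW similarity.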

Copyrights may apply

» 2007 «

Han, Jie, Han, Dingyi, Lin, Chenxi, Zeng, Hua-Jun, Chen, Zheng and Yu, Yong (2007): Homepage live: automatic block tracing for web personalization. In: Proceedings of the 2007 International Conference on the World Wide Web 2007. pp. 1-10. Available online

The emergence of personalized homepage services, e.g. personalized Google Homepage and Microsoft Windows Live, has enabled Web users to select Web contents of interest and to aggregate them in a single Web page. The web contents are often predefined content blocks provided by the service providers. However, it involves intensive manual efforts to define the content blocks and maintain the information in it. In this paper, we propose a novel personalized homepage system, called "Homepage Live", to allow end users to use drag-and-drop actions to collect their favorite Web content blocks from existing Web pages and organize them in a single page. Moreover, Homepage Live automatically traces the changes of blocks with the evolvement of the container pages by measuring the tree edit distance of the selected blocks. By exploiting the immutable elements of Web pages, the tracing algorithm performance is significantly improved. The experimental results demonstrate the effectiveness and efficiency of our algorithm.

Hu, Jian, Zeng, Hua-Jun, Li, Hua, Niu, Cheng and Chen, Zheng (2007): Demographic prediction based on user's browsing behavior. In: Proceedings of the 2007 International Conference on the World Wide Web 2007. pp. 151-160. Available online

Demographic information plays an important role in personalized web applications. However, it is usually not easy to obtain this kind of personal data such as age and gender. In this paper, we make a first attempt to predict users' gender and age from their Web browsing behaviors, in which the Webpage view information is treated as a hidden variable to propagate demographic information between different users. There are three main steps in our approach: First, learning from the Webpage click-through data, Webpages are associated with users' (known) age and gender tendency through a discriminative model; Second, users' (unknown) age and gender are predicted from the demographic information of the associated Webpages through a Bayesian framework; Third, based on the fact that Webpages visited by similar users may be associated with similar demographic tendency, and users with similar demographic information would visit similar Webpages, a smoothing component is employed to overcome the data sparseness of the web click-through log. Experiments are conducted on a real web click-through log to demonstrate the effectiveness of the proposed approach. The experimental results show that the proposed algorithm can achieve up to 30.4% improvement on gender prediction and 50.3% on age prediction in terms of macro F1, compared to baseline algorithms.

» 2006 «

Sun, Jian-Tao, Wang, Xuanhui, Shen, Dou, Zeng, Hua-Jun and Chen, Zheng (2006): CWS: a comparative web search system. In: Proceedings of the 2006 International Conference on the World Wide Web 2006. pp. 467-476. Available online

In this paper, we define and study a novel search problem: Comparative Web Search (CWS). The task of CWS is to seek relevant and comparative information from the Web to help users conduct comparisons among a set of topics. A system called CWS is developed to effectively facilitate Web users' comparison needs. Given a set of queries, which represent the topics that a user wants to compare, the system is characterized by: (1) automatic retrieval and ranking of Web pages by incorporating both their relevance to the queries and the comparative contents they contain; (2) automatic clustering of the comparative contents into semantically meaningful themes; (3) extraction of representative keyphrases to summarize the commonness and differences of the comparative contents in each theme. We developed a novel interface which supports two types of view modes: a pair-view which displays the result in the page level, and a cluster-view which organizes the comparative pages into the themes and displays the extracted phrases to facilitate users' comparison. Experiment results show the CWS system is effective and efficient.

Sun, Jian-Tao, Wang, Xuanhui, Shen, Dou, Zeng, Hua-Jun and Chen, Zheng (2006): Mining clickthrough data for collaborative web search. In: Proceedings of the 2006 International Conference on the World Wide Web 2006. pp. 947-948. Available online

This paper is to investigate the group behavior patterns of search activities based on Web search history data, i.e., clickthrough data, to boost search performance. We propose a Collaborative Web Search (CWS) framework based on the probabilistic modeling of the co-occurrence relationship among the heterogeneous web objects: users, queries, and Web pages. The CWS framework consists of two steps: (1) a cube-clustering approach is put forward to estimate the semantic cluster structures of the Web objects; (2) Web search activities are conducted by leveraging the probabilistic relations among the estimated cluster structures. Experiments on a real-world clickthrough data set validate the effectiveness of our CWS approach.

» 2005 «

Xue, Gui-Rong, Lin, Chenxi, Yang, Qiang, Xi, Wensi, Zeng, Hua-Jun, Yu, Yong and Chen, Zheng (2005): Scalable collaborative filtering using cluster-based smoothing. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 114-121. Available online

Memory-based approaches for collaborative filtering identify the similarity between two users by comparing their ratings on a set of items. In the past, the memory-based approach has been shown to suffer from two fundamental problems: data sparsity and difficulty in scalability. Alternatively, the model-based approach has been proposed to alleviate these problems, but this approach tends to limit the range of users. In this paper, we present a novel approach that combines the advantages of these two approaches by introducing a smoothing-based method. In our approach, clusters generated from the training data provide the basis for data smoothing and neighborhood selection. As a result, we provide higher accuracy as well as increased efficiency in recommendations. Empirical studies on two datasets (EachMovie and MovieLens) show that our new proposed approach consistently outperforms other state-of-art collaborative filtering algorithms.
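A rough sketch of cluster-based smoothing follows. It fills a user's missing rating with the user's own mean plus the average deviation observed in that user's cluster. The toy `RATINGS`, the precomputed `CLUSTERS` assignment, and the function names are assumptions for illustration, and the neighborhood-selection step mentioned in the abstract is not shown.

```python
from statistics import mean

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "ann": {"m1": 4, "m2": 3},
    "bob": {"m1": 4, "m2": 3, "m3": 5},
    "cat": {"m1": 1, "m3": 2},
    "dan": {"m2": 2, "m3": 1},
}

# Assumed to come from a prior clustering step over the training data.
CLUSTERS = {"ann": 0, "bob": 0, "cat": 1, "dan": 1}


def user_mean(user):
    """Average of the ratings the user actually gave."""
    return mean(RATINGS[user].values())


def cluster_deviation(cluster, item):
    """Average (rating - user mean) for the item within one cluster."""
    deviations = [
        RATINGS[u][item] - user_mean(u)
        for u, c in CLUSTERS.items()
        if c == cluster and item in RATINGS[u]
    ]
    return mean(deviations) if deviations else 0.0


def smoothed_rating(user, item):
    """Return the observed rating, or a cluster-smoothed estimate."""
    if item in RATINGS[user]:
        return RATINGS[user][item]
    return user_mean(user) + cluster_deviation(CLUSTERS[user], item)


estimate = smoothed_rating("ann", "m3")  # ann never rated m3
```

Here ann's mean is 3.5 and the only cluster-mate who rated m3 (bob) scored it one point above his own mean, so the estimate is 4.5. Smoothing in this way densifies the rating matrix before any similarity computation.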

Xue, Gui-Rong, Yang, Qiang, Zeng, Hua-Jun, Yu, Yong and Chen, Zheng (2005): Exploiting the hierarchical structure for link analysis. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 186-193. Available online

Link analysis algorithms have been extensively used in Web information retrieval. However, current link analysis algorithms generally work on a flat link graph, ignoring the hierarchical structure of the Web graph. They often suffer from two problems: the sparsity of the link graph and biased ranking of newly-emerging pages. In this paper, we propose a novel ranking algorithm called Hierarchical Rank as a solution to these two problems, which considers both the hierarchical structure and the link structure of the Web. In this algorithm, Web pages are first aggregated based on their hierarchical structure at directory, host or domain level and link analysis is performed on the aggregated graph. Then, the importance of each node on the aggregated graph is distributed to the individual pages belonging to the node based on the hierarchical structure. This algorithm allows the importance of linked Web pages to be distributed in the Web page space even when the space is sparse and contains new pages. Experimental results on the .GOV collection of TREC 2003 and 2004 show that the hierarchical ranking algorithm consistently outperforms other well-known ranking algorithms, including PageRank, BlockRank and LayerRank. In addition, experimental results show that link aggregation at the host level is much better than link aggregation at either the domain or directory levels.
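The two-stage idea (rank an aggregated graph, then push importance back down to pages) can be sketched as below. This is a simplified illustration, not the paper's algorithm: it aggregates at host level only, uses a textbook PageRank, and splits each host's score evenly among its pages, which is an assumption standing in for the hierarchy-based distribution the abstract describes. The URLs and function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def pagerank(graph, damping=0.85, iterations=50):
    """Textbook PageRank; graph maps every node to a list of out-links."""
    nodes = sorted(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # Dangling nodes spread their rank over all nodes.
            targets = graph[n] or nodes
            share = damping * rank[n] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank


def hierarchical_rank(page_links):
    """Rank hosts, then split each host's score evenly over its pages."""
    pages = sorted({url for edge in page_links for url in edge})
    pages_by_host = defaultdict(list)
    for page in pages:
        pages_by_host[urlparse(page).netloc].append(page)

    # Aggregate page-level links into host-level links.
    host_graph = {host: [] for host in pages_by_host}
    for src, dst in page_links:
        src_host, dst_host = urlparse(src).netloc, urlparse(dst).netloc
        if src_host != dst_host:
            host_graph[src_host].append(dst_host)

    host_rank = pagerank(host_graph)
    return {
        page: host_rank[host] / len(host_pages)
        for host, host_pages in pages_by_host.items()
        for page in host_pages
    }


links = [
    ("http://a.example/x", "http://b.example/"),
    ("http://a.example/y", "http://b.example/"),
    ("http://c.example/", "http://b.example/new"),
]
scores = hierarchical_rank(links)
```

In this toy graph the page `http://b.example/new` has a single in-link, yet it inherits importance from its well-linked host, which is the effect the abstract attributes to aggregation for sparse graphs and new pages.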

Sun, Jian-Tao, Shen, Dou, Zeng, Hua-Jun, Yang, Qiang, Lu, Yuchang and Chen, Zheng (2005): Web-page summarization using clickthrough data. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 194-201. Available online

Most previous Web-page summarization methods treat a Web page as plain text. However, such methods fail to uncover the full knowledge associated with a Web page needed in building a high-quality summary, because many of these methods do not consider the hidden relationships in the Web. Uncovering the hidden knowledge is important in building good Web-page summarizers. In this paper, we extract the extra knowledge from the clickthrough data of a Web search engine to improve Web-page summarization. We first analyze the feasibility of utilizing the clickthrough data to enhance Web-page summarization and then propose two adapted summarization methods that take advantage of the relationships discovered from the clickthrough data. For those pages that are not covered by the clickthrough data, we design a thematic lexicon approach to generate implicit knowledge for them. Our methods are evaluated on a dataset consisting of manually annotated pages as well as a large dataset that is crawled from the Open Directory Project website. The experimental results indicate that significant improvements can be achieved through our proposed summarizer as compared to the summarizers that do not use the clickthrough data.

Chen, Mo, Sun, Jian-Tao, Zeng, Hua-Jun and Lam, Kwok-Yan (2005): A practical system of keyphrase extraction for web pages. In: Herzog, Otthein, Schek, Hans-Jörg and Fuhr, Norbert (eds.) Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management October 31 - November 5, 2005, Bremen, Germany. pp. 277-278. Available online

Sun, Jian-Tao, Zeng, Hua-Jun, Liu, Huan, Lu, Yuchang and Chen, Zheng (2005): CubeSVD: a novel approach to personalized Web search. In: Proceedings of the 2005 International Conference on the World Wide Web 2005. pp. 382-390. Available online

As competition in the Web search market increases, there is a high demand for personalized Web search to conduct retrieval incorporating Web users' information needs. This paper focuses on utilizing clickthrough data to improve Web search. Since millions of searches are conducted every day, a search engine accumulates a large volume of clickthrough data, which records who submits queries and which pages he/she clicks on. The clickthrough data is highly sparse and contains different types of objects (user, query and Web page), and the relationships among these objects are also very complicated. By performing analysis on these data, we attempt to discover Web users' interests and the patterns by which users locate information. In this paper, a novel approach CubeSVD is proposed to improve Web search. The clickthrough data is represented by a 3-order tensor, on which we perform 3-mode analysis using the higher-order singular value decomposition technique to automatically capture the latent factors that govern the relations among these multi-type objects: users, queries and Web pages. A tensor reconstructed based on the CubeSVD analysis reflects both the observed interactions among these objects and the implicit associations among them. Therefore, Web search activities can be carried out based on CubeSVD analysis. Experimental evaluations using a real-world data set collected from an MSN search engine show that CubeSVD achieves encouraging search results in comparison with some standard methods.

Su, Xue-Feng, Zeng, Hua-Jun and Chen, Zheng (2005): Finding group shilling in recommendation system. In: Proceedings of the 2005 International Conference on the World Wide Web 2005. pp. 960-961. Available online

In the age of information explosion, recommendation systems have proved effective in coping with information overload in the e-commerce area. However, unscrupulous producers shill the systems in many ways to make a profit, which makes the systems imprecise and unreliable in the long term. Among many shilling behaviors, a new form of attack, called group shilling, has appeared and does great harm to the system. Because group shilling users are well organized and hidden among normal users, it is hard to find them by traditional methods. However, these group shilling users are similar to some extent, for they all shill the target items. We propose a similarity spreading algorithm to find these group shilling users and protect recommendation systems from unfair ratings. In our algorithm, we try to find these group shilling users by propagating similarities from items to users iteratively. The experiment shows that our similarity spreading algorithm improves the precision of the system and provides the system with reliable protection.

Liu, Tie-Yan, Yang, Yiming, Wan, Hao, Zhou, Qian, Gao, Bin, Zeng, Hua-Jun, Chen, Zheng and Ma, Wei-Ying (2005): An experimental study on large-scale web categorization. In: Proceedings of the 2005 International Conference on the World Wide Web 2005. pp. 1106-1107. Available online

Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification technologies can perform well on and scale up to such large-scale applications. To understand this, we conducted the evaluation of several representative methods (Support Vector Machines, k-Nearest Neighbor and Naive Bayes) with Yahoo! taxonomies. In particular, we evaluated the effectiveness/efficiency tradeoff in classifiers with hierarchical setting compared to conventional (flat) setting, and tested popular threshold tuning strategies for their scalability and accuracy in large-scale classification problems.

» 2004 «

Zeng, Hua-Jun, He, Qi-Cai, Chen, Zheng, Ma, Wei-Ying and Ma, Jinwen (2004): Learning to cluster web search results. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 210-217. Available online

Organizing Web search results into clusters facilitates users' quick browsing through search results. Traditional clustering techniques are inadequate since they don't generate clusters with highly readable names. In this paper, we reformalize the clustering problem as a salient phrase ranking problem. Given a query and the ranked list of documents (typically a list of titles and snippets) returned by a certain Web search engine, our method first extracts and ranks salient phrases as candidate cluster names, based on a regression model learned from human labeled training data. The documents are assigned to relevant salient phrases to form candidate clusters, and the final clusters are generated by merging these candidate clusters. Experimental results verify our method's feasibility and effectiveness.

Wang, Xuanhui, Shen, Dou, Zeng, Hua-Jun, Chen, Zheng and Ma, Wei-Ying (2004): Web page clustering enhanced by summarization. In: Grossman, David A., Gravano, Luis, Zhai, Chengxiang, Herzog, Otthein and Evans, David A. (eds.) Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management November 8-13, 2004, Washington, DC, USA. pp. 242-243. Available online

Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Yu, Yong, Ma, Wei-Ying, Xi, Wensi and Fan, Weiguo (2004): Optimizing web search using web click-through data. In: Grossman, David A., Gravano, Luis, Zhai, Chengxiang, Herzog, Otthein and Evans, David A. (eds.) Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management November 8-13, 2004, Washington, DC, USA. pp. 118-126. Available online

Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Yu, Yong, Ma, Wei-Ying, Xi, Wensi and Fox, Edward A. (2004): MRSSA: an iterative algorithm for similarity spreading over interrelated objects. In: Grossman, David A., Gravano, Luis, Zhai, Chengxiang, Herzog, Otthein and Evans, David A. (eds.) Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management November 8-13, 2004, Washington, DC, USA. pp. 240-241. Available online

Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Ma, Wei-Ying and Yu, Yong (2004): Similarity spreading: a unified framework for similarity calculation of interrelated objects. In: Proceedings of the 2004 International Conference on the World Wide Web 2004. pp. 460-461. Available online

In many Web search applications, similarities between objects of one type (say, queries) can be affected by the similarities between their interrelated objects of another type (say, Web pages), and vice versa. We propose a novel framework called similarity spreading to take account of the interrelationship and improve the similarity calculation. Experiment results show that the proposed framework can significantly improve the accuracy of the similarity measurement of the objects in a search engine.
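The mutual reinforcement between query similarity and page similarity can be sketched with a SimRank-style update on a query-page click graph. This is a stand-in chosen for illustration, not necessarily the update rule of the paper; the `CLICKS` data, the `decay` factor, and the function names are assumptions, and every query is assumed to have at least one clicked page.

```python
from itertools import product

# Hypothetical click data: query -> set of clicked pages.
CLICKS = {
    "cheap flights": {"p1", "p2"},
    "airline tickets": {"p2", "p3"},
    "python tutorial": {"p4"},
}


def invert(clicks):
    """Build page -> set of queries from query -> set of pages."""
    pages = {}
    for query, clicked in clicks.items():
        for page in clicked:
            pages.setdefault(page, set()).add(query)
    return pages


def identity(keys):
    """Initial similarity: 1 for identical objects, 0 otherwise."""
    return {(a, b): float(a == b) for a, b in product(keys, repeat=2)}


def spread(neighbors, other_sim, decay):
    """Similarity of a and b is the decayed mean similarity of their neighbors."""
    sim = {}
    for a, b in product(neighbors, repeat=2):
        if a == b:
            sim[a, b] = 1.0
            continue
        pairs = list(product(neighbors[a], neighbors[b]))
        sim[a, b] = decay * sum(other_sim[x, y] for x, y in pairs) / len(pairs)
    return sim


def similarity_spreading(clicks, decay=0.8, iterations=5):
    """Alternate updates so each object type informs the other."""
    pages = invert(clicks)
    query_sim, page_sim = identity(clicks), identity(pages)
    for _ in range(iterations):
        # Both right-hand sides use the previous iteration's values.
        query_sim, page_sim = (
            spread(clicks, page_sim, decay),
            spread(pages, query_sim, decay),
        )
    return query_sim, page_sim


query_sim, page_sim = similarity_spreading(CLICKS)
```

Pages `p1` and `p3` are never clicked for the same query, but they end up with a non-zero similarity because the queries that lead to them are themselves similar.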

» 2003 «

Xue, Gui-Rong, Zeng, Hua-Jun, Chen, Zheng, Ma, Wei-Ying, Zhang, Hong-Jiang and Lu, Chao-Jun (2003): Implicit link analysis for small web search. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2003. pp. 56-63. Available online

Current Web search engines generally impose link analysis-based re-ranking on web-page retrieval. However, the same techniques, when applied directly to small web search such as intranet and site search, cannot achieve the same performance because their link structures are different from the global Web. In this paper, we propose an approach to constructing implicit links by mining users' access patterns, and then apply a modified PageRank algorithm to re-rank web-pages for small web search. Our experimental results indicate that the

» 2002 «

Xie, Xing, Zeng, Hua-Jun and Ma, Wei-Ying (2002): Enabling personalization services on the edge. In: ACM Multimedia 2002. pp. 263-266. Available online


Changes to this page (author)

19 Feb 2010: Enabled abstracts to be shown on Hua-Jun Zeng's author page.
09 Jul 2009: Author was edited
09 Jul 2009: Author was edited
09 Jul 2009: Author was edited
09 Jul 2009: Author was edited
09 Jul 2009: Author was edited
09 Jul 2009: Author was edited
17 Jun 2009: Author was edited
29 May 2009: Author was edited
29 May 2009: Author was edited
29 May 2009: Author was edited
29 May 2009: Author was edited
29 May 2009: Author was edited
08 Apr 2009: Author was edited
25 Jul 2007: Author was edited
25 Jul 2007: Author was edited
25 Jul 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was added to the bibliography

Publication statistics

Publication period: 2002-2008
Publication count: 19
Number of co-authors: 34



Productive colleagues

Hua-Jun Zeng's 3 most productive colleagues in number of publications:

Edward A. Fox: 92
Wei-Ying Ma: 85
Zheng Chen: 49


Collaboration count

Number of publications with 3 favourite co-authors:

Zheng Chen: 17
Wei-Ying Ma: 8
Yong Yu: 6

Page information

Page maintainer: The Editorial Team
How to cite/reference this page
URL: http://www.interaction-design.org/references/authors/hua-jun_zeng.html