Wei Xu
Publications by Wei Xu (bibliography)
» 2008 «
Yu, Kai, Zhu, Shenghuo, Xu, Wei and Gong, Yihong (2008): Non-greedy active learning for text categorization using convex transductive experimental design. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 635-642. Available online
In this paper we propose a non-greedy active learning method for text categorization using least-squares support vector machines (LSSVM). Our work is based on transductive experimental design (TED), an active learning formulation that effectively explores the information of unlabeled data. Despite its appealing properties, the optimization problem is NP-hard and thus, like most other active learning methods, a greedy sequential strategy that selects one data example after another was suggested to find a suboptimal solution. In this paper we formulate the problem as a continuous optimization problem and prove its convexity, meaning that a set of data examples can be selected with a guarantee of global optimality. We also develop an iterative algorithm to efficiently solve the optimization problem, which turns out to be very easy to implement. Our text categorization experiments on two text corpora empirically demonstrate that the new active learning algorithm outperforms the sequential greedy algorithm and is promising for active text categorization applications.
Copyrights may apply
» 2007 «
Zhang, Yi and Xu, Wei (2007): Fast exact maximum likelihood estimation for mixture of language models. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007. pp. 865-866. Available online
A common language modeling approach assumes the data D is generated from a mixture of several language models. The EM algorithm is usually used to find the maximum likelihood estimate of one unknown mixture component, given the mixture weights and the other language models. In this paper, we provide an efficient algorithm of O(k) complexity to find the exact solution, where k is the number of words that occur at least once in D. Another merit is that the probabilities of many words are exactly zero, which means that the mixture language model also serves as a feature selection technique.
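The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the constrained maximum-likelihood problem can be solved exactly. It assumes a two-component mixture in which the unknown component has weight `alpha` and a single known background model has weight `1 - alpha`, and it assumes every observed word has positive background probability. It uses a sort-based water-filling over the Karush-Kuhn-Tucker conditions, which costs O(k log k) rather than the O(k) reported in the abstract. The function name `exact_mle_component` and the parameters `counts`, `background`, and `alpha` are hypothetical names chosen for illustration.

```python
def exact_mle_component(counts, background, alpha):
    """Maximise sum_w counts[w] * log(alpha * theta[w] + (1 - alpha) * background[w])
    over theta >= 0 with sum(theta) == 1.

    Illustrative sketch only (not the paper's code); assumes background[w] > 0
    for every word with a positive count.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ratio = (1.0 - alpha) / alpha

    # Only words seen in the data can receive probability mass.
    words = [w for w, c in counts.items() if c > 0]
    # Words with a high count relative to the background are activated first.
    words.sort(key=lambda w: counts[w] / background[w], reverse=True)

    count_sum = 0.0  # total count of the active words
    bg_sum = 0.0     # total background mass of the active words
    active = []
    for w in words:
        new_count = count_sum + counts[w]
        new_bg = bg_sum + background[w]
        lam = new_count / (1.0 + ratio * new_bg)  # Lagrange multiplier for this prefix
        # KKT condition: w joins the active set only if its estimate stays positive.
        if counts[w] / lam - ratio * background[w] <= 0.0:
            break
        count_sum, bg_sum = new_count, new_bg
        active.append(w)

    lam = count_sum / (1.0 + ratio * bg_sum)
    theta = {w: 0.0 for w in counts}
    for w in active:
        theta[w] = counts[w] / lam - ratio * background[w]
    return theta
```

Words outside the active set receive probability exactly zero, which is the feature-selection effect the abstract mentions.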
» 2006 «
Ji, Xiang and Xu, Wei (2006): Document clustering with prior knowledge. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2006. pp. 405-412. Available online
Document clustering is an important tool for text analysis and is used in many different applications. We propose to incorporate prior knowledge of cluster membership for document cluster analysis and develop a novel semi-supervised document clustering model. The method models a set of documents with a weighted graph in which each document is represented as a vertex, and each edge connecting a pair of vertices is weighted with the similarity value of the two corresponding documents. The prior knowledge indicates pairs of documents that are known to belong to the same cluster. Then, the prior knowledge is transformed into a set of constraints. The document clustering task is accomplished by finding the best cuts of the graph under the constraints. We apply the model to the Normalized Cut method to demonstrate the idea and concept. Our experimental evaluations show that the proposed document clustering model reveals remarkable performance improvements with very limited training samples, and hence is a very effective semi-supervised classification tool.
Zhao, Ke, Hu, Gangwei, Xu, Wei and Li, Yatao (2006): Research on Models of Concept and Relation between Concepts Adapted for Semantic Disambiguation of Natural Language. In: Yao, Yiyu, Shi, Zhongzhi, Wang, Yingxu and Kinsner, Witold (eds.) Proceedings of the Fifth IEEE International Conference on Cognitive Informatics ICCI 2006 July 17-19, 2006, Beijing, China. pp. 612-616. Available online
Han, Mei, Xu, Wei and Gong, Yihong (2006): Video object segmentation by motion-based sequential feature clustering. In: Nahrstedt, Klara, Turk, Matthew, Rui, Yong, Klas, Wolfgang and Mayer-Patel, Ketan (eds.) Proceedings of the 14th ACM International Conference on Multimedia October 23-27, 2006, Santa Barbara, CA, USA. pp. 773-782. Available online
» 2005 «
Zhu, Shenghuo, Ji, Xiang, Xu, Wei and Gong, Yihong (2005): Multi-labelled classification using maximum entropy method. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 274-281. Available online
Many classification problems require classifiers to assign each single document to more than one category, which is called multi-labelled classification. The categories in such problems usually are neither conditionally independent of each other nor mutually exclusive; therefore it is not trivial to directly employ state-of-the-art classification algorithms without losing information about the relations among categories. In this paper, we explore correlations among categories with the maximum entropy method and derive a classification algorithm for multi-labelled documents. Our experiments show that this method significantly outperforms the combination of single-label approaches.
Xu, Wei, Sekar, R., Ramakrishnan, I. V. and Venkatakrishnan, V. N. (2005): An approach for realizing privacy-preserving web-based services. In: Proceedings of the 2005 International Conference on the World Wide Web 2005. pp. 1014-1015. Available online
» 2004 «
Xu, Wei and Gong, Yihong (2004): Document clustering by concept factorization. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 202-209. Available online
In this paper, we propose a new data clustering method called concept factorization that models each concept as a linear combination of the data points, and each data point as a linear combination of the concepts. With this model, the data clustering task is accomplished by computing the two sets of linear coefficients, and this computation is carried out by finding the non-negative solution that minimizes the reconstruction error of the data points. The cluster label of each data point can be easily derived from the obtained linear coefficients. This method differs from the method of clustering based on non-negative matrix factorization (NMF) \cite{Xu03} in that it can be applied to data containing negative values and can be implemented in the kernel space. Our experimental results show that the proposed data clustering method and its variations perform best among 11 algorithms and their variations that we have evaluated on both the TDT2 and Reuters-21578 corpora. In addition to its good performance, the new method also has the merit of an easy and reliable derivation of the clustering results.
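The model in this abstract can be written as minimizing ||X - X W V^T||^2 over non-negative W and V, where the columns of X are data points, X W gives the concepts, and V holds each data point's coefficients on the concepts. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the data are non-negative so that plain multiplicative updates on the Gram matrix suffice; the extensions to negative values and to kernel space mentioned in the abstract are not covered. The names `concept_factorization`, `reconstruction_error`, `n_iter`, `seed`, and `eps` are hypothetical.

```python
import numpy as np

def concept_factorization(X, n_clusters, n_iter=200, seed=0, eps=1e-12):
    """Approximate X by X @ W @ V.T with non-negative W and V.

    X has shape (n_features, n_samples) and is assumed non-negative.
    Returns W and V, each of shape (n_samples, n_clusters), and one label
    per sample. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    K = X.T @ X  # Gram matrix; non-negative because X is non-negative
    W = rng.random((n_samples, n_clusters))
    V = rng.random((n_samples, n_clusters))
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        W *= (K @ V) / (K @ W @ (V.T @ V) + eps)
        V *= (K @ W) / (V @ (W.T @ K @ W) + eps)
    # Each sample is assigned to the concept with its largest coefficient.
    labels = V.argmax(axis=1)
    return W, V, labels

def reconstruction_error(X, W, V):
    """Squared Frobenius norm of the residual X - X W V^T."""
    return np.linalg.norm(X - X @ W @ V.T) ** 2
```

Because the updates touch X only through the Gram matrix `K`, replacing `K` with a kernel matrix is one plausible route to the kernel-space variant, though that is an assumption here and not something this sketch tests.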
» 2002 «
Liu, Xin, Gong, Yihong, Xu, Wei and Zhu, Shenghuo (2002): Document clustering with cluster refinement and model selection capabilities. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 191-198. Available online
In this paper, we propose a document clustering method that strives to achieve: (1) a high accuracy of document clustering, and (2) the capability of estimating the number of clusters in the document corpus (i.e. the model selection capability). To accurately cluster the given document corpus, we employ a richer feature set to represent each document, and use the Gaussian Mixture Model (GMM) together with the Expectation-Maximization (EM) algorithm to conduct an initial document clustering. From this initial result, we identify a set of discriminative features for each cluster, and refine the initially obtained document clusters by voting on the cluster label of each document using this discriminative feature set. This self-refinement process of discriminative feature identification and cluster label voting is iteratively applied until the convergence of document clusters. On the other hand, the model selection capability is achieved by introducing randomness in the cluster initialization stage, and then discovering a value C for the number of clusters N by which running the document clustering process for a fixed number of times yields sufficiently similar results. Performance evaluations exhibit clear superiority of the proposed method with its improved document clustering and model selection accuracies. The evaluations also demonstrate how each feature as well as the cluster refinement process contribute to the document clustering accuracy.
Han, Mei, Hua, Wei, Xu, Wei and Gong, Yihong (2002): An integrated baseball digest system using maximum entropy method. In: ACM Multimedia 2002. pp. 347-350. Available online
» 1999 «
Xu, Wei, Dainoff, Marvin J. and Mark, Leonard S. (1999): Facilitate Complex Search Tasks in Hypertext by Externalizing Functional Properties of a Work Domain. In International Journal of Human-Computer Interaction, 11 (3) pp. 201-229
The premise of this study was that practical problem solving within a complex work domain (ergonomic design and integration of computer workstations) could be enhanced by a hypertext representation of that work domain. Two alternative hypertext representations were developed. The first consisted of an ecological interface design based on the means-end abstraction hierarchy (AH) approach (Vicente & Rasmussen, 1992). In this design, the goal-relevant constraints and functional relations within the domain were explicitly represented on the interface. The second hypertext interface was based on a more traditional classification hierarchy (CH) in which supraordinate categories were broken down into their components (part-whole relation). The relative effectiveness of the 2 approaches was compared using an experimental procedure in which participants solved ergonomic problems of increasing complexity. The results supported the following research hypotheses: (a) When performing a complex or problem-solving task, participants using the AH interface spent less time and experienced less navigation disorientation than those participants using the CH interface; (b) as the task complexity increased, the advantage of the AH interface over the CH interface increased as measured by search time and navigation disorientation; (c) no difference was found between the 2 interfaces for the simple task; and (d) participants using the AH interface also reported experiencing less navigation disorientation than those participants using the CH interface. This article recommends the AH interface as a more effective semantic representation of an interface for a hypertext application with a complex document in support of complex and problem-solving search tasks.
» 1997 «
Xu, Wei and Dainoff, Marvin J. (1997): Comparative Hypertext Approaches to Ergonomic Training. In: Smith, Michael J., Salvendy, Gavriel and Koubek, Richard J. (eds.) HCI International 1997 - Proceedings of the Seventh International Conference on Human-Computer Interaction - Volume 2 August 24-29, 1997, San Francisco, California, USA. pp. 145-148.
» 1995 «
Gardner, Douglas L., Mark, Leonard S., Dainoff, Marvin J. and Xu, Wei (1995): Considerations for Linking Seatpan and Backrest Angles. In International Journal of Human-Computer Interaction, 7 (2) pp. 153-165
Modern ergonomic chairs typically have several dimensions that can be adjusted independently of one another. Finding a desirable setting for any one dimension can depend on how other dimensions are set, thereby confronting users with a significant control problem. One design strategy for dealing with this problem has been to link changes in seatpan and backrest angles in some ratio, such that a one-degree change in seatpan angle is associated with a two- or three-degree change in backrest angle. However, there is no evidence to justify the choice of a particular ratio. This article presents data that address this issue. Subjects, performing either an entry or verification task, could adjust the chair to any position. Backrest and seatpan angles were plotted over time and analyzed using both graphical and statistical methods. The resulting scatter plots do not support the industry-standard 1:2 or 1:3 ratio of changes in seatpan to backrest angles. The possibility of a variable linkage is discussed; however, problems associated with such a solution raise the possibility that control issues might be best addressed through training and exploration.
» 1993 «
Xu, Wei, Dainoff, Marvin J. and Mark, Leonard S. (1993): An Ergonomic Field Study of VDT Operations in a Developing Country. In: Proceedings of the Fifth International Conference on Human-Computer Interaction - Poster Sessions: Abridged Proceedings 1993. p. 112.
Changes to this page (author)
25 Feb 2010: Enabled abstracts to be shown on Wei Xu's author page.
09 Jul 2009: Author was edited
17 Jun 2009: Author was edited
04 Jun 2009: Author was edited
30 May 2009: Author was edited
08 Apr 2009: Author was edited
12 May 2008: Author was edited
29 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
24 Jun 2007: Author was edited
28 Apr 2003: Added the author to the bibliography