Yihong Gong
Publications by Yihong Gong (bibliography)
» 2009 «
Liu, Feng, Wang, Jinjun, Zhu, Shenghuo, Gleicher, Michael and Gong, Yihong (2009): Visual-Quality Optimizing Super Resolution. In Comput. Graph. Forum, 28 (1) pp. 127-140
» 2008 «
Yu, Kai, Zhu, Shenghuo, Xu, Wei and Gong, Yihong (2008): Non-greedy active learning for text categorization using convex transductive experimental design. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 635-642. Available online
In this paper we propose a non-greedy active learning method for text categorization using least-squares support vector machines (LSSVM). Our work is based on transductive experimental design (TED), an active learning formulation that effectively explores the information of unlabeled data. Despite its appealing properties, the optimization problem is NP-hard, and thus, as with most other active learning methods, a greedy sequential strategy that selects one data example after another was suggested to find a suboptimal solution. In this paper we formulate the problem as a continuous optimization problem and prove its convexity, meaning that a set of data examples can be selected with a guarantee of global optimum. We also develop an iterative algorithm to efficiently solve the optimization problem, which turns out to be very easy to implement. Our text categorization experiments on two text corpora empirically demonstrated that the new active learning algorithm outperforms the sequential greedy algorithm, and is promising for active text categorization applications.
Copyrights may apply
Chi, Yun, Zhu, Shenghuo, Gong, Yihong and Zhang, Yi (2008): Probabilistic polyadic factorization and its application to personalized recommendation. In: Shanahan, James G., Amer-Yahia, Sihem, Manolescu, Ioana, Zhang, Yi, Evans, David A., Kolcz, Aleksander, Choi, Key-Sun and Chowdhury, Abdur (eds.) Proceedings of the 17th ACM Conference on Information and Knowledge Management - CIKM 2008 October 26-30, 2008, Napa Valley, California, USA. pp. 941-950. Available online
Wang, Dingding, Zhu, Shenghuo, Li, Tao, Chi, Yun and Gong, Yihong (2008): Integrating clustering and multi-document summarization to improve document understanding. In: Shanahan, James G., Amer-Yahia, Sihem, Manolescu, Ioana, Zhang, Yi, Evans, David A., Kolcz, Aleksander, Choi, Key-Sun and Chowdhury, Abdur (eds.) Proceedings of the 17th ACM Conference on Information and Knowledge Management - CIKM 2008 October 26-30, 2008, Napa Valley, California, USA. pp. 1435-1436. Available online
Liu, Feng, Wang, Jinjun, Zhu, Shenghuo, Gleicher, Michael and Gong, Yihong (2008): Noisy video super-resolution. In: El-Saddik, Abdulmotaleb, Vuong, Son, Griwodz, Carsten, Bimbo, Alberto Del, Candan, K. Selcuk and Jaimes, Alejandro (eds.) Proceedings of the 16th International Conference on Multimedia 2008 October 26-31, 2008, Vancouver, British Columbia, Canada. pp. 713-716. Available online
» 2007 «
Zhu, Shenghuo, Yu, Kai, Chi, Yun and Gong, Yihong (2007): Combining content and link for classification using matrix factorization. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007. pp. 487-494. Available online
The world wide web contains rich textual contents that are interconnected via complex hyperlinks. This huge database violates the assumption held by most conventional statistical methods that each web page is an independent and identically distributed sample. It is thus difficult to apply traditional mining or learning methods to web mining problems, e.g., web page classification, by exploiting both the content and the link structure. Research in this direction has recently received considerable attention but is still at an early stage. Though a few methods exploit both the link structure and the content information, some of them combine only the authority information with the content information, and the others first decompose the link structure into hub and authority features, then apply them as additional document features. Practically attractive for its great simplicity, the algorithm designed in this paper exploits both the content and linkage information by carrying out a joint factorization on both the linkage adjacency matrix and the document-term matrix, and derives a new representation for web pages in a low-dimensional factor space, without explicitly separating them as content, hub or authority factors. Further analysis can be performed based on the compact representation of web pages. In the experiments, the proposed method is compared with state-of-the-art methods and demonstrates excellent accuracy in hypertext classification on the WebKB and Cora benchmarks.
» 2006 «
Han, Mei, Xu, Wei and Gong, Yihong (2006): Video object segmentation by motion-based sequential feature clustering. In: Nahrstedt, Klara, Turk, Matthew, Rui, Yong, Klas, Wolfgang and Mayer-Patel, Ketan (eds.) Proceedings of the 14th ACM International Conference on Multimedia October 23-27, 2006, Santa Barbara, CA, USA. pp. 773-782. Available online
» 2005 «
Zhu, Shenghuo, Ji, Xiang, Xu, Wei and Gong, Yihong (2005): Multi-labelled classification using maximum entropy method. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 274-281. Available online
Many classification problems require classifiers to assign each document to more than one category, which is called multi-labelled classification. The categories in such problems are usually neither conditionally independent of each other nor mutually exclusive; therefore it is not trivial to directly employ state-of-the-art classification algorithms without losing information about the relations among categories. In this paper, we explore correlations among categories with the maximum entropy method and derive a classification algorithm for multi-labelled documents. Our experiments show that this method significantly outperforms the combination of single-label approaches.
» 2004 «
Xu, Wei and Gong, Yihong (2004): Document clustering by concept factorization. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 202-209. Available online
In this paper, we propose a new data clustering method called concept factorization that models each concept as a linear combination of the data points, and each data point as a linear combination of the concepts. With this model, the data clustering task is accomplished by computing the two sets of linear coefficients, and this computation is carried out by finding the non-negative solution that minimizes the reconstruction error of the data points. The cluster label of each data point can be easily derived from the obtained linear coefficients. This method differs from the method of clustering based on non-negative matrix factorization (NMF) [Xu03] in that it can be applied to data containing negative values and can be implemented in the kernel space. Our experimental results show that the proposed data clustering method and its variations perform best among the 11 algorithms and their variations that we have evaluated on both the TDT2 and Reuters-21578 corpora. In addition to its good performance, the new method also has the merit of an easy and reliable derivation of the clustering results.
» 2002 «
Liu, Xin, Gong, Yihong, Xu, Wei and Zhu, Shenghuo (2002): Document clustering with cluster refinement and model selection capabilities. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 191-198. Available online
In this paper, we propose a document clustering method that strives to achieve: (1) a high accuracy of document clustering, and (2) the capability of estimating the number of clusters in the document corpus (i.e. the model selection capability). To accurately cluster the given document corpus, we employ a richer feature set to represent each document, and use the Gaussian Mixture Model (GMM) together with the Expectation-Maximization (EM) algorithm to conduct an initial document clustering. From this initial result, we identify a set of discriminative features for each cluster, and refine the initially obtained document clusters by voting on the cluster label of each document using this discriminative feature set. This self-refinement process of discriminative feature identification and cluster label voting is iteratively applied until the convergence of document clusters. On the other hand, the model selection capability is achieved by introducing randomness in the cluster initialization stage, and then discovering a value C for the number of clusters N by which running the document clustering process for a fixed number of times yields sufficiently similar results. Performance evaluations exhibit clear superiority of the proposed method with its improved document clustering and model selection accuracies. The evaluations also demonstrate how each feature as well as the cluster refinement process contribute to the document clustering accuracy.
Han, Mei, Hua, Wei, Xu, Wei and Gong, Yihong (2002): An integrated baseball digest system using maximum entropy method. In: ACM Multimedia 2002 2002. pp. 347-350. Available online
» 2001 «
Gong, Yihong and Liu, Xin (2001): Generic text summarization using relevance measure and latent semantic analysis. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001. pp. 19-25. Available online
In this paper, we propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The first method uses standard IR methods to rank sentence relevances, while the second method uses the latent semantic analysis technique to identify semantically important sentences, for summary creations. Both methods strive to select sentences that are highly ranked and different from each other. This is an attempt to create a summary with a wider coverage of the document's main content and less redundancy. Performance evaluations on the two summarization methods are conducted by comparing their summarization outputs with the manual summaries generated by three independent human evaluators. The evaluations also study the influence of different VSM weighting schemes on the text summarization performances. Finally, the causes of the large disparities in the evaluators' manual summarization results are investigated, and discussions on human text summarization patterns are presented.
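The first method described in the abstract above (ranking sentences by relevance while keeping the selected sentences different from each other) can be illustrated with a minimal sketch. This is not the authors' implementation: the raw term-count weighting, the inner-product relevance score, the whitespace tokenizer, and the names tokenize and summarize_by_relevance are all assumptions made for illustration. After each pick, the chosen sentence's terms are dropped from the document vector, which is one simple way to discourage redundant selections.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def summarize_by_relevance(sentences, k):
    """Greedily pick up to k sentence indices by relevance to the remaining document.

    Hypothetical helper: the scoring and weighting choices are assumptions.
    """
    sent_vecs = [Counter(tokenize(s)) for s in sentences]
    doc_vec = Counter()
    for vec in sent_vecs:
        doc_vec.update(vec)

    chosen = []
    candidates = set(range(len(sentences)))
    while candidates and len(chosen) < k:
        # Relevance = inner product of sentence counts and remaining document counts.
        def score(i):
            return sum(c * doc_vec.get(t, 0) for t, c in sent_vecs[i].items())

        # Ties go to the earliest sentence because candidates are scanned in order.
        best = max(sorted(candidates), key=score)
        if score(best) == 0:
            break  # nothing left in the document that is not already covered
        chosen.append(best)
        candidates.remove(best)
        # Drop the chosen sentence's terms so later picks cover different content.
        for term in sent_vecs[best]:
            doc_vec.pop(term, None)
    return chosen


if __name__ == "__main__":
    doc = ["Cats chase mice.", "Cats chase mice daily.", "Dogs bark loudly."]
    print(summarize_by_relevance(doc, 2))
```

In this toy document the near-duplicate first sentence is skipped once the second has been selected, because all of its terms have already been removed from the document vector. A different weighting scheme (for example, one that normalizes by sentence length) could change which sentences are picked.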
» 1999 «
Wactlar, Howard D., Christel, Michael G., Gong, Yihong and Hauptmann, Alexander G. (1999): Lessons Learned from Building a Terabyte Digital Video Library. In IEEE Computer, 32 (2) pp. 66-73
Gong, Yihong, Proietti, Guido and LaRose, David (1999): A Robust Image Mosaicing Technique Capable of Creating Integrated Panoramas. In: IV 1999 1999. pp. 24-. Available online