Jian-Yun Nie

Publications by Jian-Yun Nie (bibliography)

» 2008 «

Cao, Guihong, Nie, Jian-Yun, Gao, Jianfeng and Robertson, Stephen (2008): Selecting good expansion terms for pseudo-relevance feedback. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 243-250. Available online

Pseudo-relevance feedback assumes that most frequent terms in the pseudo-feedback documents are useful for the retrieval. In this study, we re-examine this assumption and show that it does not hold in reality -- many expansion terms identified in traditional approaches are indeed unrelated to the query and harmful to the retrieval. We also show that good expansion terms cannot be distinguished from bad ones merely on their distributions in the feedback documents and in the whole collection. We then propose to integrate a term classification process to predict the usefulness of expansion terms. Multiple additional features can be integrated in this process. Our experiments on three TREC collections show that retrieval effectiveness can be much improved when term classification is used. In addition, we also demonstrate that good terms should be identified directly according to their possible impact on the retrieval effectiveness, i.e. using supervised learning, instead of unsupervised learning.

Copyrights may apply

Shi, Lixin, Nie, Jian-Yun and Cao, Guihong (2008): Relating dependent indexes using Dempster-Shafer theory. In: Shanahan, James G., Amer-Yahia, Sihem, Manolescu, Ioana, Zhang, Yi, Evans, David A., Kolcz, Aleksander, Choi, Key-Sun and Chowdhury, Abdur (eds.) Proceedings of the 17th ACM Conference on Information and Knowledge Management - CIKM 2008 October 26-30, 2008, Napa Valley, California, USA. pp. 429-438. Available online

» 2007 «

Bai, Jing, Nie, Jian-Yun, Cao, Guihong and Bouchard, Hugues (2007): Using query contexts in information retrieval. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007. pp. 15-22. Available online

A user query is one element that specifies an information need, but it is not the only one. Studies in the literature have found many contextual factors that strongly influence the interpretation of a query. Recent studies have tried to consider the user's interests by creating a user profile. However, a single profile for a user may not be sufficient for the variety of queries that user issues. In this study, we propose to use query-specific contexts instead of user-centric ones, including context around the query and context within the query. The former specifies the environment of a query, such as the domain of interest, while the latter refers to context words within the query, which are particularly useful for the selection of relevant term relations. In this paper, both types of context are integrated in an IR model based on language modeling. Our experiments on several TREC collections show that each of the context factors brings significant improvements in retrieval effectiveness.

Gao, Wei, Niu, Cheng, Nie, Jian-Yun, Zhou, Ming, Hu, Jian, Wong, Kam-Fai and Hon, Hsiao-Wuen (2007): Cross-lingual query suggestion using query logs of different languages. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007. pp. 463-470. Available online

Query suggestion aims to suggest relevant queries for a given query, which helps users better specify their information needs. Previously, the suggested terms were mostly in the same language as the input query. In this paper, we extend it to cross-lingual query suggestion (CLQS): for a query in one language, we suggest similar or relevant queries in other languages. This is very important to scenarios of cross-language information retrieval (CLIR) and cross-lingual keyword bidding for search engine advertisement. Instead of relying on existing query translation technologies for CLQS, we present an effective means to map the input query of one language to queries of the other language in the query log. Important monolingual and cross-lingual information, such as word translation relations and word co-occurrence statistics, is used to estimate the cross-lingual query similarity with a discriminative model. Benchmarks show that the resulting CLQS system significantly outperforms a baseline system based on dictionary-based query translation. In addition, the resulting CLQS is tested with French to English CLIR tasks on TREC collections. The results demonstrate higher effectiveness than the traditional query translation methods.

Cao, Guihong, Gao, Jianfeng, Nie, Jian-Yun and Bai, Jing (2007): Extending query translation to cross-language query expansion with Markov chain models. In: Silva, Mario J., Laender, Alberto H. F., Baeza-Yates, Ricardo A., McGuinness, Deborah L., Olstad, Bjørn, Olsen, Øystein Haug and Falcão, André O. (eds.) Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management - CIKM 2007 November 6-10, 2007, Lisbon, Portugal. pp. 351-360. Available online

» 2006 «

Gao, Jianfeng and Nie, Jian-Yun (2006): A study of statistical models for query translation: finding a good unit of translation. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2006. pp. 194-201. Available online

This paper presents a study of three statistical query translation models that use different units of translation. We begin with a review of a word-based translation model that uses co-occurrence statistics for resolving translation ambiguities. The translation selection problem is then formulated under the framework of graphical models, which lets us discuss the modeling assumptions and limitations of the co-occurrence model and motivates the search for better translation units. Then, two other models that use larger, linguistically motivated translation units (i.e., noun phrase and dependency triple) are presented. For each model, the modeling and training methods are described in detail. All query translation models are evaluated using TREC collections. Results show that larger translation units lead to more specific models that usually achieve better translation and cross-language information retrieval results.

Kadri, Youssef and Nie, Jian-Yun (2006): Improving query translation with confidence estimation for cross language information retrieval. In: Yu, Philip S., Tsotras, Vassilis J., Fox, Edward A. and Liu, Bing (eds.) Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management November 6-11, 2006, Arlington, Virginia, USA. pp. 818-819. Available online

Shi, Lixin and Nie, Jian-Yun (2006): Filtering or adapting: two strategies to exploit noisy parallel corpora for cross-language information retrieval. In: Yu, Philip S., Tsotras, Vassilis J., Fox, Edward A. and Liu, Bing (eds.) Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management November 6-11, 2006, Arlington, Virginia, USA. pp. 814-815. Available online

Cao, Guihong, Nie, Jian-Yun and Bai, Jing (2006): Constructing better document and query models with Markov chains. In: Yu, Philip S., Tsotras, Vassilis J., Fox, Edward A. and Liu, Bing (eds.) Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management November 6-11, 2006, Arlington, Virginia, USA. pp. 800-801. Available online

» 2005 «

Gao, Jianfeng, Qi, Haoliang, Xia, Xinsong and Nie, Jian-Yun (2005): Linear discriminant model for information retrieval. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 290-297. Available online

This paper presents a new discriminative model for information retrieval (IR), referred to as linear discriminant model (LDM), which provides a flexible framework to incorporate arbitrary features. LDM is different from most existing models in that it takes into account a variety of linguistic features that are derived from the component models of HMM that is widely used in language modeling approaches to IR. Therefore, LDM is a means of melding discriminative and generative models for IR. We present two algorithms of parameter learning for LDM. One is to optimize the average precision (AP) directly using an iterative procedure. The other is a perceptron-based algorithm that minimizes the number of discordant document-pairs in a rank list. The effectiveness of our approach has been evaluated on the task of ad hoc retrieval using six English and Chinese TREC test sets. Results show that (1) in most test sets, LDM significantly outperforms the state-of-the-art language modeling approaches and the classical probabilistic retrieval model; (2) it is more appropriate to train LDM using a measure of AP rather than likelihood if the IR system is graded on AP; and (3) linguistic features (e.g. phrases and dependences) are effective for IR if they are incorporated properly.

Cao, Guihong, Nie, Jian-Yun and Bai, Jing (2005): Integrating word relationships into language models. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 298-305. Available online

In this paper, we propose a novel dependency language modeling approach for information retrieval. The approach extends the existing language modeling approach by relaxing the independence assumption. Our goal is to build a language model in which various word relationships can be integrated. In this work, we integrate two types of relationship extracted from WordNet and co-occurrence relationships respectively. The integrated model has been tested on several TREC collections. The results show that our model achieves substantial and significant improvements with respect to the models without these relationships. These results clearly show the benefit of integrating word relationships into language models for IR.

Bai, Jing, Song, Dawei, Bruza, Peter, Nie, Jian-Yun and Cao, Guihong (2005): Query expansion using term relationships in language models for information retrieval. In: Herzog, Otthein, Schek, Hans-Jörg and Fuhr, Norbert (eds.) Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management October 31 - November 5, 2005, Bremen, Germany. pp. 688-695. Available online

Wang, Ming-Wen, Nie, Jian-Yun and Zeng, Xue-Qiang (2005): A latent semantic classification model. In: Herzog, Otthein, Schek, Hans-Jörg and Fuhr, Norbert (eds.) Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management October 31 - November 5, 2005, Bremen, Germany. pp. 261-262. Available online

Bai, Jing, Nie, Jian-Yun and Cao, Guihong (2005): Integrating Compound Terms in Bayesian Text Classification. In: Skowron, Andrzej, Agrawal, Rakesh, Luck, Michael, Yamaguchi, Takahira, Morizet-Mahoudeaux, Pierre, Liu, Jiming and Zhong, Ning (eds.) 2005 IEEE / WIC / ACM International Conference on Web Intelligence WI 2005 19-22 September, 2005, Compiegne, France. pp. 598-601. Available online

» 2004 «

Gao, Jianfeng, Nie, Jian-Yun, Wu, Guangyuan and Cao, Guihong (2004): Dependence language model for information retrieval. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 170-177. Available online

This paper presents a new dependence language modeling approach to information retrieval. The approach extends the basic language modeling approach based on unigram by relaxing the independence assumption. We integrate the linkage of a query as a hidden variable, which expresses the term dependencies within the query as an acyclic, planar, undirected graph. We then assume that a query is generated from a document in two stages: the linkage is generated first, and then each term is generated in turn depending on other related terms according to the linkage. We also present a smoothing method for model parameter estimation and an approach to learning the linkage of a sentence in an unsupervised manner. The new approach is compared to the classical probabilistic retrieval model and the previously proposed language models with and without taking into account term dependencies. Results show that our model achieves substantial and significant improvements on TREC collections.

Bai, Jing, Paradis, Francois and Nie, Jian-Yun (2004): Web-supported Matching and Classification of Business. In: Yao, Jingtao, Raghavan, Vijay V. and Wang, G. Y. (eds.) Proceedings of the 2nd International Workshop on Web-based Support Systems September 20, 2004, Beijing, China. pp. 28-36. Available online

» 2003 «

Nie, Jian-Yun (2003): Query expansion and query translation as logical inference. In JASIST - Journal of the American Society for Information Science and Technology, 54 (4) pp. 335-346

» 2002 «

Gao, Jianfeng, Zhou, Ming, Nie, Jian-Yun, He, Hongzhao and Chen, Weijun (2002): Resolving query translation ambiguity using a decaying co-occurrence model and syntactic dependence relations. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 183-190. Available online

Bilingual dictionaries have been commonly used for query translation in cross-language information retrieval (CLIR). However, we are faced with the problem of translation selection. Several recent studies suggested the utilization of term co-occurrences in this selection. This paper presents two extensions to improve them. First, we extend the basic co-occurrence model by adding a decaying factor that decreases the mutual information when the distance between the terms increases. Second, we incorporate a triple translation model, in which syntactic dependence relations (represented as triples) are integrated. Our evaluation on translation accuracy shows that translating triples as units is more precise than a word-by-word translation. Our CLIR experiments show that the addition of the decaying factor leads to substantial improvements of the basic co-occurrence model; and the triple translation model brings further improvements.
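
The decaying factor described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the exponential form of the decay, the `alpha` parameter, the greedy selection strategy, and the hypothetical helper names `decayed_similarity` and `select_translations`.

```python
import math


def decayed_similarity(mutual_info, distance, alpha=0.8):
    """Discount a mutual-information score by the distance between two terms.

    Assumed form: exponential decay, so adjacent terms (distance 1)
    keep their full mutual information.
    """
    return mutual_info * math.exp(-alpha * (distance - 1))


def select_translations(candidates, mi, dist, alpha=0.8):
    """Greedily pick one translation per source word.

    candidates: list of candidate-translation lists, one per source word.
    mi, dist:   dicts keyed by frozenset({t1, t2}) giving the mutual
                information and the average distance of a term pair.
    Each candidate is scored by its best decayed similarity to the
    candidates of every other source word (a simplification).
    """
    chosen = []
    for i, options in enumerate(candidates):
        def cohesion(t):
            total = 0.0
            for j, others in enumerate(candidates):
                if j == i:
                    continue
                total += max(
                    (decayed_similarity(mi.get(frozenset((t, o)), 0.0),
                                        dist.get(frozenset((t, o)), 1.0),
                                        alpha)
                     for o in others),
                    default=0.0,
                )
            return total
        chosen.append(max(options, key=cohesion))
    return chosen


# Toy data (invented): "shore" has slightly higher raw mutual information
# with "money" but co-occurs at a much larger distance than "bank".
candidates = [["bank", "shore"], ["money"]]
mi = {frozenset(("bank", "money")): 3.0, frozenset(("shore", "money")): 3.5}
dist = {frozenset(("bank", "money")): 1.0, frozenset(("shore", "money")): 4.0}
print(select_translations(candidates, mi, dist))
```

With the decay switched off (`alpha=0.0`) the toy example falls back to raw mutual information and prefers "shore", which is the kind of error a distance-sensitive score is meant to reduce.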

Cui, Hang, Wen, Ji-Rong, Nie, Jian-Yun and Ma, Wei-Ying (2002): Probabilistic query expansion using query logs. In: Proceedings of the 2002 International Conference on the World Wide Web 2002. pp. 325-332. Available online

Query expansion has long been suggested as an effective way to resolve the short query and word mismatching problems. A number of query expansion methods have been proposed in traditional information retrieval. However, these previous methods do not take into account the specific characteristics of web searching; in particular, the availability of a large amount of user interaction information recorded in web query logs. In this study, we propose a new method for query expansion based on query logs. The central idea is to extract probabilistic correlations between query terms and document terms by analyzing query logs. These correlations are then used to select high-quality expansion terms for new queries. The experimental results show that our log-based probabilistic query expansion method can greatly improve the search performance and has several advantages over other existing methods.
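
The central idea of this abstract, correlating query terms with document terms through logged clicks, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact formulation: the log format (a list of query-terms and clicked-document pairs), the maximum-likelihood estimates, the additive scoring across query terms, and the hypothetical function names `term_correlations` and `expansion_terms` are all invented for the example.

```python
from collections import Counter, defaultdict


def term_correlations(click_log, documents):
    """Estimate P(doc_term | query_term) through clicked documents.

    click_log: list of (query_terms, clicked_doc_id) pairs (assumed format).
    documents: dict mapping doc id to its list of terms.
    P(doc_term | query_term) is approximated as
        sum over D of P(D | query_term) * P(doc_term | D).
    """
    clicks = defaultdict(Counter)  # query term -> clicked doc id counts
    for query_terms, doc_id in click_log:
        for q in set(query_terms):
            clicks[q][doc_id] += 1

    corr = defaultdict(lambda: defaultdict(float))
    for q, doc_counts in clicks.items():
        total = sum(doc_counts.values())
        for doc_id, n in doc_counts.items():
            terms = documents[doc_id]
            for w, c in Counter(terms).items():
                corr[q][w] += (n / total) * (c / len(terms))
    return corr


def expansion_terms(query_terms, corr, k=3):
    """Rank candidate expansion terms by summed correlation (assumed scoring)."""
    scores = Counter()
    for q in query_terms:
        for w, p in corr.get(q, {}).items():
            if w not in query_terms:
                scores[w] += p
    return [w for w, _ in scores.most_common(k)]


# Toy log and documents (invented for illustration).
click_log = [(["jaguar"], "d1"), (["jaguar"], "d1"), (["jaguar", "speed"], "d2")]
documents = {"d1": ["car", "engine", "car", "dealer"], "d2": ["cat", "speed"]}
corr = term_correlations(click_log, documents)
print(expansion_terms(["jaguar"], corr, k=1))
```

Because each per-term distribution is built from normalized click and term frequencies, the correlations for a query term sum to one, which makes scores comparable across query terms in this sketch.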

» 2001 «

Gao, Jianfeng, Nie, Jian-Yun, Xun, Endong, Zhang, Jian, Zhou, Ming and Huang, Changning (2001): Improving query translation for cross-language information retrieval using statistical models. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001. pp. 96-104. Available online

Dictionaries have often been used for query translation in cross-language information retrieval (CLIR). However, we are faced with the problem of translation ambiguity, i.e. multiple translations are stored in a dictionary for a word. In addition, a word-by-word query translation is not precise enough. In this paper, we explore several methods to improve the previous dictionary-based query translation. First, as many as possible, noun phrases are recognized and translated as a whole by using statistical models and phrase translation patterns. Second, the best word translations are selected based on the cohesion of the translation words. Our experimental results on TREC English-Chinese CLIR collection show that these techniques result in significant improvements over the simple dictionary approaches, and achieve even better performance than a high-quality machine translation system.

Wen, Ji-Rong, Nie, Jian-Yun and Zhang, Hong-Jiang (2001): Query clustering using content words and user feedback. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001. pp. 442-443. Available online

Query clustering is crucial for automatically discovering frequently asked queries (FAQs) or most popular topics on a question-answering search engine. Due to the short length of queries, the traditional approaches based on keywords are not suitable for query clustering. This paper describes our attempt to cluster similar queries according to their contents as well as the document click information in the user logs.

Wen, Ji-Rong, Nie, Jian-Yun and Zhang, Hong-Jiang (2001): Clustering user queries of a search engine. In: Proceedings of the 2001 International Conference on the World Wide Web 2001. pp. 162-168. Available online

» 1999 «

Nie, Jian-Yun, Simard, Michel, Isabelle, Pierre and Durand, Richard (1999): Cross-Language Information Retrieval Based on Parallel Texts and Automatic Mining of Parallel Texts from the Web. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1999. pp. 74-81. Available online

» 1996 «

Nie, Jian-Yun, Brisebois, Martin and Ren, Xiaobo (1996): On Chinese Text Retrieval. In: Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1996. pp. 225-233. Available online

In previous studies, Chinese text retrieval has often been dealt with on the character basis. This approach is not suited to deal with complex queries. We suggest that Chinese text retrieval should work with words instead of characters. The crucial problem is to segment originally continuous Chinese texts into words. In this paper, we first propose a hybrid segmentation approach which unifies the commonly used approaches. The system SMART is then adapted to index the segmented Chinese texts. Finally, we suggest that Chinese text retrieval should move further to include a thesaurus in order to cope with the rich vocabulary of Chinese.

» 1993 «

Nie, Jian-Yun, Paradis, Francois and Vaucher, Jean G. (1993): Adjusting the Performance of an Information Retrieval System. In: Bhargava, Bharat K., Finin, Timothy W. and Yesha, Yelena (eds.) CIKM 93 - Proceedings of the Second International Conference on Information and Knowledge Management November 1-5, 1993, Washington, DC, USA. pp. 726-728. Available online

» 1992 «

Nie, Jian-Yun (1992): Towards a Probabilistic Modal Logic for Semantic-Based Information Retrieval. In: Proceedings of the Fifteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1992. pp. 140-151. Available online

Semantic-based approaches to Information Retrieval make a query evaluation similar to an inference process based on semantic relations. Semantic-based approaches find out hidden semantic relationships between a document and a query, but quantitative estimation of the correspondence between them is often empiric. On the other hand, probabilistic approaches usually consider only statistical relationships between terms. It is expected that improvement may be brought by integrating these two approaches. This paper demonstrates, using some particular probabilistic models which are strongly related to modal logic, that such an integration is feasible and natural. A new model is developed on the basis of an extended modal logic. It has the advantages of: (1) augmenting a semantic-based approach with a probabilistic measurement, and (2) augmenting a probabilistic approach with finer semantic relations than just statistical ones. It is shown that this model verifies most of the conditions for an absolute probability function.

Publication statistics

Publication period: 1992-2008
Publication count: 26
Number of co-authors: 37



Productive colleagues

Jian-Yun Nie's 3 most productive colleagues in number of publications:

Wei-Ying Ma: 85
Ji-Rong Wen: 24
Stephen Robertson: 16


Collaboration count

Number of publications with 3 favourite co-authors:

Guihong Cao: 8
Jianfeng Gao: 7
Jing Bai: 7

 

Page information

Page maintainer: The Editorial Team
URL: http://www.interaction-design.org/references/authors/jian-yun_nie.html