Publication statistics

Publication period: 2002-2012
Publication count: 22
Number of co-authors: 36


Number of publications with 3 favourite co-authors:

Eileen Abels:
Ferhan Ture:
Tamer Elsayed:



Productive colleagues

Jimmy Lin's 3 most productive colleagues in number of publications:

Susan Dumais:74
Iadh Ounis:59
Craig Macdonald:37


Jimmy Lin


Publications by Jimmy Lin (bibliography)

Ture, Ferhan, Lin, Jimmy and Oard, Douglas W. (2012): Looking inside the box: context-sensitive translation for cross-language information retrieval. In: Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2012. pp. 1105-1106.

Cross-language information retrieval (CLIR) today is dominated by techniques that use token-to-token mappings from bilingual dictionaries. Yet, state-of-the-art statistical translation models (e.g., using Synchronous Context-Free Grammars) are far richer, capturing multi-term phrases, term dependencies, and contextual constraints on translation choice. We present a novel CLIR framework that is able to reach inside the translation "black box" and exploit these sources of evidence. Experiments on the TREC-5/6 English-Chinese test collection show this approach to be promising.

© All rights reserved Ture et al. and/or ACM Press

McCreadie, Richard, Soboroff, Ian, Lin, Jimmy, Macdonald, Craig, Ounis, Iadh and McCullough, Dean (2012): On building a reusable Twitter corpus. In: Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2012. pp. 1113-1114.

The Twitter real-time information network is the subject of research for information retrieval tasks such as real-time search. However, so far, reproducible experimentation on Twitter data has been impeded by restrictions imposed by the Twitter terms of service. In this paper, we detail a new methodology for legally building and distributing Twitter corpora, developed through collaboration between the Text REtrieval Conference (TREC) and Twitter. In particular, we detail how the first publicly available Twitter corpus -- referred to as Tweets2011 -- was distributed via lists of tweet identifiers and specialist tweet crawling software. Furthermore, we analyse whether this distribution approach remains robust over time, as tweets in the corpus are removed either by users or Twitter itself. Tweets2011 was successfully used by 58 participating groups for the TREC 2011 Microblog track, while our results attest to the robustness of the crawling methodology over time.

© All rights reserved McCreadie et al. and/or ACM Press

Mishne, Gilad and Lin, Jimmy (2012): Twanchor text: a preliminary study of the value of tweets as anchor text. In: Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2012. pp. 1159-1160.

It is well known that anchor text plays an important role in search, providing signals that are often not present in the source document itself. This paper reports results of a preliminary investigation into the value of tweets and tweet conversations as anchor text. We show that using tweets as anchors improves significantly over using HTML anchors, and significantly increases recall of news item retrieval.

© All rights reserved Mishne and Lin and/or ACM Press

Wang, Lidan, Lin, Jimmy and Metzler, Donald (2011): A cascade ranking model for efficient ranked retrieval. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2011. pp. 105-114.

There is a fundamental tradeoff between effectiveness and efficiency when designing retrieval models for large-scale document collections. Effectiveness tends to derive from sophisticated ranking functions, such as those constructed using learning to rank, while efficiency gains tend to arise from improvements in query evaluation and caching strategies. Given their inherently disjoint nature, it is difficult to jointly optimize effectiveness and efficiency in end-to-end systems. To address this problem, we formulate and develop a novel cascade ranking model which, unlike previous approaches, can simultaneously improve both the effectiveness of the top k ranked results and retrieval efficiency. The model constructs a cascade of increasingly complex ranking functions that progressively prunes and refines the set of candidate documents to minimize retrieval latency and maximize result set quality. We present a novel boosting algorithm for learning such cascades to directly optimize the tradeoff between effectiveness and efficiency. Experimental results show that our cascades are faster and return higher quality results than comparable ranking models.

© All rights reserved Wang et al. and/or ACM Press
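The core idea in the abstract above, a cascade of increasingly complex ranking functions that prunes the candidate pool between stages, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the features, scoring functions, and keep-fractions are all made up:

```python
def cascade_rank(candidates, stages, top_k):
    """candidates: list of (doc_id, features) pairs; stages: list of
    (score_fn, keep_fraction) pairs, cheapest first."""
    pool = list(candidates)
    for score_fn, keep_fraction in stages:
        # score the surviving pool with this stage's (costlier) function
        pool.sort(key=lambda c: score_fn(c[1]), reverse=True)
        # prune: only the top fraction advances to the next stage
        pool = pool[:max(top_k, int(len(pool) * keep_fraction))]
    return [doc_id for doc_id, _ in pool[:top_k]]

# Toy usage: stage 1 uses one cheap feature; stage 2's weighted
# combination stands in for a learned, more expensive model.
docs = [("d%d" % i, {"tf": i % 5, "pr": i % 3}) for i in range(20)]
cheap = lambda f: f["tf"]
rich = lambda f: 2.0 * f["tf"] + 1.5 * f["pr"]
print(cascade_rank(docs, [(cheap, 0.5), (rich, 0.2)], top_k=3))
# ['d14', 'd4', 'd19']
```

Only the small surviving pool ever sees the expensive function, which is where the latency savings come from.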

Ture, Ferhan, Elsayed, Tamer and Lin, Jimmy (2011): No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2011. pp. 943-952.

This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this problem are of general interest for text mining in the multi-lingual context and have specific applications in statistical machine translation. Our approach takes advantage of cross-language information retrieval (CLIR) techniques to project feature vectors from one language into another, and then uses locality-sensitive hashing (LSH) to extract similar pairs. We show that effective cross-lingual pairwise similarity requires working with similarity thresholds that are much lower than in typical monolingual applications, making the problem quite challenging. We present a parallel, scalable MapReduce implementation of the sort-based sliding window algorithm, which is compared to a brute-force approach on German and English Wikipedia collections. Our central finding can be summarized as "no free lunch": there is no single optimal solution. Instead, we characterize effectiveness-efficiency tradeoffs in the solution space, which can guide the developer to locate a desirable operating point based on application- and resource-specific constraints.

© All rights reserved Ture et al. and/or ACM Press
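The two LSH ingredients named in the abstract can be illustrated as follows. This sketch is not the paper's distributed implementation: it builds random-hyperplane bit signatures and then applies a sort-based sliding window so that near-identical signatures land next to each other; the vectors, dimensionality, and window size are toy assumptions:

```python
import random

def signature(vec, hyperplanes):
    """vec: sparse {dim: weight}. One bit per random hyperplane:
    the sign of the dot product."""
    return tuple(int(sum(w * h[d] for d, w in vec.items()) >= 0)
                 for h in hyperplanes)

def sliding_window_pairs(sigs, window):
    """Sort signatures lexicographically; compare each item only against
    its `window` successors, so similar signatures tend to collide."""
    order = sorted(range(len(sigs)), key=lambda i: sigs[i])
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1: pos + 1 + window]:
            pairs.add((min(i, j), max(i, j)))
    return pairs

rng = random.Random(0)                      # seeded for reproducibility
planes = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(16)]
vecs = [{0: 1.0, 1: 1.0}, {0: 1.0, 1: 1.0},  # two identical docs
        {2: 1.0, 3: -1.0}, {3: 2.0}]
sigs = [signature(v, planes) for v in vecs]
pairs = sliding_window_pairs(sigs, window=1)
```

With identical input vectors (documents 0 and 1), the signatures match exactly and the sliding window is guaranteed to emit the pair; real near-duplicates only collide with some probability, which is why the paper must lower the similarity threshold.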

Asadi, Nima, Metzler, Donald, Elsayed, Tamer and Lin, Jimmy (2011): Pseudo test collections for learning web search ranking functions. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2011. pp. 1073-1082.

Test collections are the primary drivers of progress in information retrieval. They provide yardsticks for assessing the effectiveness of ranking functions in an automatic, rapid, and repeatable fashion and serve as training data for learning to rank models. However, manual construction of test collections tends to be slow, labor-intensive, and expensive. This paper examines the feasibility of constructing web search test collections in a completely unsupervised manner given only a large web corpus as input. Within our proposed framework, anchor text extracted from the web graph is treated as a pseudo query log from which pseudo queries are sampled. For each pseudo query, a set of relevant and non-relevant documents are selected using a variety of web-specific features, including spam and aggregated anchor text weights. The automatically mined queries and judgments form a pseudo test collection that can be used for training ranking functions. Experiments carried out on TREC web track data show that learning to rank models trained using pseudo test collections outperform an unsupervised ranking function and are statistically indistinguishable from a model trained using manual judgments, demonstrating the usefulness of our approach in extracting reasonable quality training data "for free".

© All rights reserved Asadi et al. and/or ACM Press
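The pseudo-test-collection idea can be sketched compactly. The function below is a hypothetical, heavily simplified stand-in for the paper's pipeline: each sampled anchor text becomes a pseudo query, its link target is labeled relevant, and randomly drawn other documents are labeled non-relevant (the spam and aggregated anchor-weight features the paper also uses are omitted here):

```python
import random

def pseudo_test_collection(anchor_edges, num_queries, neg_per_query, rng):
    """anchor_edges: list of (anchor_text, target_doc) pairs mined from
    the web graph; rng: a seeded random.Random for reproducibility."""
    docs = sorted({target for _, target in anchor_edges})
    collection = []
    for anchor, target in rng.sample(anchor_edges, num_queries):
        # label the link target relevant, random other docs non-relevant
        negatives = rng.sample([d for d in docs if d != target], neg_per_query)
        collection.append({"query": anchor,
                           "relevant": target,
                           "non_relevant": negatives})
    return collection

rng = random.Random(42)
edges = [("acm sigir", "doc1"), ("information retrieval", "doc2"),
         ("learning to rank", "doc3"), ("trec web track", "doc4")]
coll = pseudo_test_collection(edges, num_queries=2, neg_per_query=2, rng=rng)
```

The resulting (query, relevant, non-relevant) triples are exactly the shape of training data a learning-to-rank model consumes, which is the sense in which the collection comes "for free".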

Asadi, Nima, Metzler, Donald and Lin, Jimmy (2011): Cross-corpus relevance projection. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2011. pp. 1163-1164.

Wagner, Earl J. and Lin, Jimmy (2011): In-depth accounts and passing mentions in the news: connecting readers to the context of a news event. In: Proceedings of the 2011 iConference. pp. 790-791.

Software that models how types of news events unfold can extract information about specific events and explain them to a news reader. This support is useful when the background provided by an article is insufficient, provided other news coverage exists from which the event's history can be extracted. For extended sequences of related events, it is reasonable to expect that articles published after the sequence concludes include less background coverage of it. Focusing on two stereotypical types of event sequences, kidnappings and corporate acquisitions, we distinguish articles providing in-depth coverage (multiple sentences mentioning the same event sequence) from articles making only a passing mention in a single sentence. We find that, after an event sequence concludes, passing mentions become more common and the mean number of mentions per article drops significantly.

© All rights reserved Wagner and Lin and/or ACM Press

Wang, Lidan, Lin, Jimmy and Metzler, Donald (2010): Learning to efficiently rank. In: Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2010. pp. 138-145.

It has been shown that learning to rank approaches are capable of learning highly effective ranking functions. However, these approaches have mostly ignored the important issue of efficiency. Given that both efficiency and effectiveness are important for real search engines, models that are optimized for effectiveness may not meet the strict efficiency requirements necessary to deploy in a production environment. In this work, we present a unified framework for jointly optimizing effectiveness and efficiency. We propose new metrics that capture the tradeoff between these two competing forces and devise a strategy for automatically learning models that directly optimize the tradeoff metrics. Experiments indicate that models learned in this way provide a good balance between retrieval effectiveness and efficiency. With specific loss functions, learned models converge to familiar existing ones, which demonstrates the generality of our framework. Finally, we show that our approach naturally leads to a reduction in the variance of query execution times, which is important for query load balancing and user satisfaction.

© All rights reserved Wang et al. and/or their publisher

Murray, G. Craig, Lin, Jimmy, Wilbur, John and Lu, Zhiyong (2009): Users' adjustments to unsuccessful queries in biomedical search. In: Proceedings of the 2009 Joint International Conference on Digital Libraries (JCDL 2009). pp. 433-434.

Biomedical researchers depend on online databases and digital libraries for up-to-date information. We introduce a pilot project aimed at characterizing adjustments made to biomedical queries that improve search results. Specifically, we focus on queries submitted to PubMed, a large, sophisticated search engine that provides Web access to abstracts of articles in over 5,200 biomedical journals. On average, 2 million users search PubMed each day, and nearly 20% of them will encounter a query that returns zero results. In some cases there really is no document or abstract that will satisfy a particular query. However, in analyzing one month of queries submitted to PubMed, we find that, more often than not, queries that retrieved no results would retrieve something relevant if they were constructed differently. This paper describes a new effort to identify the characteristics of queries that produce zero results, and the changes that users most often apply in constructing new, "corrected" queries. Zero-result queries afford an opportunity to examine changes made to queries that we know did not return relevant data, because they did not return any data. Investigating the changes users make under these circumstances can yield insight into their search processes.

© All rights reserved Murray et al. and/or their publisher

Lin, Jimmy (2009): Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce. In: Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2009. pp. 155-162.

This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of "more like this" queries in the life sciences domain. Three MapReduce algorithms are introduced: one based on brute force, a second where the problem is treated as large-scale ad hoc retrieval, and a third based on the Cartesian product of postings lists. Each algorithm supports one or more approximations that trade effectiveness for efficiency, the characteristics of which are studied experimentally. Results show that the brute force algorithm is the most efficient of the three when exact similarity is desired. However, the other two algorithms support approximations that yield large efficiency gains without significant loss of effectiveness.

© All rights reserved Lin and/or his/her publisher
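The third algorithm in the abstract, based on the Cartesian product of postings lists, can be sketched on a single machine (the real system distributes the postings and the summation across MapReduce workers; the data below is toy):

```python
from collections import defaultdict
from itertools import combinations

def pairwise_inner_products(doc_vectors):
    """doc_vectors: {doc_id: {term: weight}}. Computes the inner product
    for every pair of docs sharing at least one term, by emitting a
    partial product from each term's postings list and summing the
    partials per doc pair (the 'reduce' step in the MapReduce setting)."""
    postings = defaultdict(list)          # term -> [(doc_id, weight), ...]
    for doc, vec in doc_vectors.items():
        for term, w in vec.items():
            postings[term].append((doc, w))
    sims = defaultdict(float)
    for plist in postings.values():
        # Cartesian product of a postings list with itself (distinct pairs)
        for (d1, w1), (d2, w2) in combinations(sorted(plist), 2):
            sims[(d1, d2)] += w1 * w2     # partial product for this term
    return dict(sims)

docs = {"a": {"x": 1.0, "y": 2.0}, "b": {"y": 3.0, "z": 1.0}, "c": {"z": 4.0}}
print(pairwise_inner_products(docs))
# {('a', 'b'): 6.0, ('b', 'c'): 4.0}
```

Document pairs with no terms in common are simply never emitted, which is what makes the postings-based formulation cheaper than comparing all pairs.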

Lin, Jimmy, Wu, Philip and Abels, Eileen (2009): Toward automatic facet analysis and need negotiation: Lessons from mediated search. In ACM Transactions on Information Systems, 27 (1) p. 6.

This work explores the hypothesis that interactions between a trained human search intermediary and an information seeker can inform the design of interactive IR systems. We discuss results from a controlled Wizard-of-Oz case study, set in the context of the TREC 2005 HARD track evaluation, in which a trained intermediary executed an integrated search and interaction strategy based on conceptual facet analysis and informed by need negotiation techniques common in reference interviews. Having a human "in the loop" yielded large improvements over fully automated systems as measured by standard ranked-retrieval metrics, demonstrating the value of mediated search. We present a detailed analysis of the intermediary's actions to gain a deeper understanding of what worked and why. One contribution is a taxonomy of clarification types informed both by empirical results and existing theories in library and information science. We discuss how these findings can guide the development of future systems. Overall, this work illustrates how studying human information-seeking processes can lead to better information retrieval applications.

© All rights reserved Lin et al. and/or ACM Press

Klavans, Judith L., Sheffield, Carolyn, Lin, Jimmy and Sidhu, Tandeep (2008): Computational linguistics for metadata building. In: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2008). p. 427.

In this paper, we describe a downloadable text-mining tool for enhancing subject access to image collections in digital libraries.

© All rights reserved Klavans et al. and/or ACM Press

Lin, Jimmy and Smucker, Mark D. (2008): How do users find things with PubMed?: towards automatic utility evaluation with user simulations. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 19-26.

In the context of document retrieval in the biomedical domain, this paper explores the complex relationship between the quality of initial query results and the overall utility of an interactive retrieval system. We demonstrate that a content-similarity browsing tool can compensate for poor retrieval results, and that the relationship between retrieval performance and overall utility is non-linear. Arguments are advanced with user simulations, which characterize the relevance of documents that a user might encounter with different browsing strategies. With broader implications to IR, this work provides a case study of how user simulations can be exploited as a formative tool for automatic utility evaluation. Simulation-based studies provide researchers with an additional evaluation tool to complement interactive and Cranfield-style experiments.

© All rights reserved Lin and Smucker and/or ACM Press

Lin, Jimmy and Zhang, Pengyi (2007): Deconstructing nuggets: the stability and reliability of complex question answering evaluation. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2007. pp. 327-334.

A methodology based on "information nuggets" has recently emerged as the de facto standard by which answers to complex questions are evaluated. After several implementations in the TREC question answering tracks, the community has gained a better understanding of its many characteristics. This paper focuses on one particular aspect of the evaluation: the human assignment of nuggets to answer strings, which serves as the basis of the F-score computation. As a byproduct of the TREC 2006 ciQA task, identical answer strings were independently evaluated twice, which allowed us to assess the consistency of human judgments. Based on these results, we explored simulations of assessor behavior that provide a method to quantify scoring variations. Understanding these variations in turn lets researchers be more confident in their comparisons of systems.

© All rights reserved Lin and Zhang and/or ACM Press

Lin, Jimmy (2007): An exploration of the principles underlying redundancy-based factoid question answering. In ACM Transactions on Information Systems, 25 (2) p. 6.

The so-called "redundancy-based" approach to question answering represents a successful strategy for mining answers to factoid questions such as "Who shot Abraham Lincoln?" from the World Wide Web. Through contrastive and ablation experiments with Aranea, a system that has performed well in several TREC QA evaluations, this work examines the underlying assumptions and principles behind redundancy-based techniques. Specifically, we develop two theses: that stable characteristics of data redundancy allow factoid systems to rely on external "black box" components, and that despite embodying a data-driven approach, redundancy-based methods encode a substantial amount of knowledge in the form of heuristics. Overall, this work attempts to address the broader question of "what really matters" and to provide guidance for future researchers.

© All rights reserved Lin and/or ACM Press

Lin, Jimmy (2005): Evaluation of resources for question answering evaluation. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 392-399.

Controlled and reproducible laboratory experiments, enabled by reusable test collections, represent a well-established methodology in modern information retrieval research. In order to confidently draw conclusions about the performance of different retrieval methods using test collections, their reliability and trustworthiness must first be established. Although such studies have been performed for ad hoc test collections, currently available resources for evaluating question answering systems have not been similarly analyzed. This study evaluates the quality of answer patterns and lists of relevant documents currently employed in automatic question answering evaluation, and concludes that they are not suitable for post-hoc experimentation. These resources, created from runs submitted by TREC QA track participants, do not produce fair and reliable assessments of systems that did not participate in the original evaluations. Potential solutions for addressing this evaluation gap and their shortcomings are discussed.

© All rights reserved Lin and/or ACM Press

Lin, Jimmy and Murray, G. Craig (2005): Assessing the term independence assumption in blind relevance feedback. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2005. pp. 635-636.

When applying blind relevance feedback for ad hoc document retrieval, is it possible to identify, a priori, the set of query terms that will most improve retrieval performance? Can this complex problem be reduced into the simpler one of making independent decisions about the performance effects of each query term? Our experiments suggest that, for the selection of terms for blind relevance feedback, the term independence assumption may be empirically justified.

© All rights reserved Lin and Murray and/or ACM Press

Karger, David R., Katz, Boris, Lin, Jimmy and Quan, Dennis (2003): Sticky notes for the semantic web. In: Johnson, Lewis and Andre, Elisabeth (eds.) International Conference on Intelligent User Interfaces 2003 January 12-15, 2003, Miami, Florida, USA. pp. 254-256.

Computer-based annotation is increasing in popularity as a mechanism for revising documents and sharing comments over the Internet. One reason behind this surge is that viewpoints, summaries, and notes written by others are often helpful to readers. In particular, these types of annotations can help users locate or recall relevant documents. We believe that this model can be applied to the problem of retrieval on the Semantic Web. In this paper, we propose a generalized annotation environment that supports richer forms of description such as natural language. We discuss how RDF can be used to model annotations and the connections between annotations and the documents they describe. Furthermore, we explore the idea of a question answering interface that allows retrieval based both on the text of the annotations and the annotations' associated metadata. Finally, we speculate on how these features could be pervasively integrated into an information management environment, making Semantic Web annotation a first-class player in terms of document management and retrieval.

© All rights reserved Karger et al. and/or ACM Press

Tellex, Stefanie, Katz, Boris, Lin, Jimmy, Fernandes, Aaron and Marton, Gregory (2003): Quantitative evaluation of passage retrieval algorithms for question answering. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2003. pp. 41-47.

Passage retrieval is an important component common to many question answering systems. Because most evaluations of question answering systems focus on end-to-end performance, comparison of common components becomes difficult. To address this shortcoming, we present a quantitative evaluation of various passage retrieval algorithms for question answering, implemented in a framework called Pauchok. We present three important findings: Boolean querying schemes perform well in the question answering task; the performance differences between various passage retrieval algorithms vary with the choice of document retriever, which suggests significant interactions between document retrieval and passage retrieval; and the best algorithms in our evaluation employ density-based measures for scoring query terms. Our results reveal future directions for passage retrieval and question answering.

© All rights reserved Tellex et al. and/or ACM Press
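Density-based scoring, which the abstract identifies as the ingredient behind the best algorithms, can be illustrated with a toy scorer. The formula below is a hypothetical stand-in, not Pauchok's actual measure: it rewards passages that match many distinct query terms within a short span:

```python
def density_score(passage_tokens, query_terms):
    """Score a passage by the number of distinct query terms matched and
    how tightly the matches cluster: (distinct matches)^2 / match span."""
    positions = [i for i, tok in enumerate(passage_tokens) if tok in query_terms]
    if not positions:
        return 0.0
    distinct = len({passage_tokens[i] for i in positions})
    span = positions[-1] - positions[0] + 1   # distance between first/last match
    return distinct * distinct / span

query = {"lincoln", "shot"}
tight = "lincoln was shot by booth".split()
loose = "lincoln lived in a log cabin and was never shot there".split()
assert density_score(tight, query) > density_score(loose, query)
```

Both passages match both query terms, but the tightly clustered passage scores higher, which is the behavior a density-based measure is designed to produce.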

Lin, Jimmy, Quan, Dennis, Sinha, Vineet, Bakshi, Karun, Huynh, David, Katz, Boris and Karger, David R. (2003): What Makes a Good Answer? The Role of Context in Question Answering. In: Proceedings of IFIP INTERACT03: Human-Computer Interaction 2003, Zurich, Switzerland. p. 25.

Dumais, Susan, Banko, Michele, Brill, Eric, Lin, Jimmy and Ng, Andrew (2002): Web question answering: is more always better? In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 291-298.

This paper describes a question answering system that is designed to capitalize on the tremendous amount of data that is now available online. Most question answering systems use a wide variety of linguistic resources. We focus instead on the redundancy available in large corpora as an important resource. We use this redundancy to simplify the query rewrites that we need to use, and to support answer mining from returned snippets. Our system performs quite well given the simplicity of the techniques being utilized. Experimental results show that question answering accuracy can be greatly improved by analyzing more and more matching passages. Simple passage ranking and n-gram extraction techniques work well in our system, making it efficient to use with many backend retrieval engines.

© All rights reserved Dumais et al. and/or ACM Press
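The redundancy-based pipeline described above (simple passage analysis plus n-gram extraction) can be sketched as a voting scheme. The snippets, stopword list, and naive whitespace tokenization below are toy assumptions, not the system's actual components:

```python
from collections import Counter

STOP = frozenset({"the", "a", "was", "by", "in", "and"})

def mine_answers(snippets, max_n=3):
    """Vote for every n-gram across retrieved snippets; redundancy (the
    same candidate recurring in many snippets) is the ranking signal."""
    votes = Counter()
    for snip in snippets:
        tokens = snip.lower().split()     # naive tokenization for the sketch
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOP or gram[-1] in STOP:
                    continue              # drop candidates bounded by stopwords
                votes[" ".join(gram)] += 1
    return votes

snips = ["Lincoln was shot by John Wilkes Booth",
         "John Wilkes Booth assassinated President Lincoln",
         "the assassin John Wilkes Booth fled the theater"]
votes = mine_answers(snips)
```

Because the correct answer recurs across independently retrieved snippets, its n-grams accumulate votes without any deep linguistic analysis; a full system would also weight longer n-grams and filter by expected answer type.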

