Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval


 
Time and place: 1991
Conf. description:
SIGIR is the major international forum for the presentation of new research results and the demonstration of new systems and techniques in the field of information retrieval.

References from this conference (1991)

The following articles are from "Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval":

Articles

p. 1

Belkin, Nicholas J. (1991): B. C. Brookes: In Memoriam. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. p. 1. Available online

p. 114-122

Frei, H. P. and Wyle, M. F. (1991): Retrieval Algorithm Effectiveness in a Wide Area Network Information Filter. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 114-122. Available online

We present an application of the usefulness performance measure in a WAN-based SDI system. Components of two basic indexing and retrieval algorithms are compared experimentally. The components we investigate include indexing token type (words versus N-grams), the amount of word reduction used in indexing, and the use of an indirect similarity component in retrieval. The theoretical basis and implementation of the basic algorithms and variations are discussed. Results indicate that words perform better than N-grams, that S-stemming is better than full-stemming, and that indirect similarity provides an improvement to the cosine measure. Performance improvements are, however, small.
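
Illustration (not taken from the paper): a minimal Python sketch of two ideas named in the abstract above, indexing by words versus character N-grams and scoring with the cosine measure. The function names, the whitespace tokenizer, the default N of 3, and the raw-count weighting are all assumptions made for this example; the paper's actual algorithms, stemming, and indirect similarity component are not reproduced.

```python
import math
from collections import Counter


def word_tokens(text):
    """Index tokens as lowercased, whitespace-separated words (assumed tokenizer)."""
    return text.lower().split()


def char_ngrams(text, n=3):
    """Index tokens as overlapping character N-grams (n=3 is an assumed default)."""
    s = " ".join(text.lower().split())  # normalize runs of whitespace
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def cosine(tokens_a, tokens_b):
    """Cosine of the angle between two raw token-count vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = "network information filter"
    doc = "an information filter for wide area networks"
    print(cosine(word_tokens(query), word_tokens(doc)))
    print(cosine(char_ngrams(query), char_ngrams(doc)))
```

With word tokens, "network" and "networks" do not match at all, while the N-gram representation still shares most of their trigrams; this is the kind of trade-off between token types that the experiments compare.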

Copyrights may apply

p. 123-132

Sutcliffe, Richard F. E. (1991): Distributed Representations in a Text Based Information Retrieval System: A New Way of Using the Vector Space Model. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 123-132. Available online

In this paper we discuss how the Vector Space model of Information Retrieval can be used in a new way by combining connectionist ideas about distributed representations with the concept of propositional structure (semantic case structure) derived from mainstream Natural Language Understanding research. We show how distributed representations may be used to capture both amorphous concept representations and propositional structures and we discuss a prototype Information Retrieval system, PELICAN, which has been constructed in order to experiment with these ideas.

p. 134-141

Korfhage, Robert R. (1991): To See, or Not to See -- Is That the Query? In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 134-141. Available online

Traditional information retrieval systems, in the guise of presenting the most relevant information to the searcher, really put blinders on him. They present certain information to the searcher, but strongly inhibit him from seeing other information, or even knowing of its existence. In this paper we present an argument for a new retrieval paradigm, one that focuses on the organized display of all documents, rather than on the linear display of just the "best."

p. 14-20

Tague, Jean, Salminen, Airi and McClellan, Charles (1991): Complete Formal Model for Information Retrieval Systems. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 14-20. Available online

p. 142-151

Fowler, Richard H., Fowler, Wendy A. L. and Wilson, Bradley A. (1991): Integrating Query, Thesaurus, and Documents through a Common Visual Representation. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 142-151. Available online

Document retrieval is a highly interactive process dealing with large amounts of information. Visual representations can both provide a means for managing the complexity of large information structures and support an interface style well suited to interactive manipulation. The system we have designed utilizes visually displayed graphic structures and a direct manipulation interface style to supply an integrated environment for retrieval. A common visually displayed network structure is used for query, document content, and term relations. A query can be modified through direct manipulation of its visual form by incorporating terms from any other information structure the system displays. An associative thesaurus of terms and an interdocument network provide information about a document collection that can complement other retrieval aids. Visualization of these large data structures makes use of fisheye views and overview diagrams to help overcome some of the difficulties of orientation and navigation in large information structures.

p. 152-161

Tissen, Anne (1991): A Case-Based Architecture for a Dialogue Manager for Information-Seeking Processes. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 152-161. Available online

In this paper, we propose a case-based architecture for a dialogue manager. The dialogue manager is one of the main components of the cognitive layer of an interface system for information-seeking processes. Information-seeking is a highly exploratory and navigational process and therefore needs elaborate interaction functionality. In our approach, this functionality will be provided by the dialogue manager operating on a set of case-based dialogue plans. In a case-based planning system a new plan will be generated by retrieving the plan which is most appropriate to the user's goals and adapting it dynamically during the ongoing dialogue. We propose a case-based architecture for two reasons. First, operating on old solutions provides a coherent framework which prevents the user from being 'lost in hyperspace'. Second, it allows flexible adaptations: domain-dependent ones, using perspectives on domain objects, and domain-independent ones, which change the sequence of dialogue steps.

p. 163-172

Anick, Peter G., Flynn, Rex A. and Hanssen, David R. (1991): Addressing the Requirements of a Dynamic Corporate Textual Information Base. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 163-172. Available online

AI-STARS is a lexicon-assisted full-text Information Retrieval system, designed for use in a dynamic corporate environment. In this paper, we explore how the requirements of such an environment have influenced many key aspects of the design and implementation of the AI-STARS system. We promote the use of "views" to create logical partitions in large, heterogeneous databases, and argue that storing not only article instances, but also class definitions, stored queries, display templates and linguistic data in a single object repository has consequences that can be exploited for schema and lexicon evolution, security and subject filtering, information navigation, and data distribution.

p. 173-182

Jarvelin, Kalervo and Niemi, Timo (1991): Data Conversion, Aggregation and Deduction for Advanced Retrieval from Heterogeneous Fact Databases. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 173-182. Available online

Modern distributed fact databases are heterogeneous and autonomous. Their heterogeneity is due to many reasons, including varying data models, data structures, attribute naming conventions, units of measurement or naming of data values, composition of data as attributes, technical representation of data, abstraction levels of data, etc. Database autonomy means that the database users have hardly any means for reducing such heterogeneity. Present information retrieval (IR) systems either provide no support for overcoming such heterogeneity or their support is insufficient and difficult to utilize. In this paper we offer integrated and powerful data conversion, aggregation and deduction techniques for advanced IR in such environments. These techniques allow the users to overcome data inconsistency due to units of measurement or naming of data values, composition of data as attributes, abstraction levels of data, and difficulties related to deductive use of hierarchically classified data. In complex situations, all these inconsistencies appear together. Therefore we also show how these techniques are integrated into a powerful query language which has been implemented in Prolog in a workstation environment.

p. 183-190

Celentano, A., Fugini, M. G. and Pozzi, S. (1991): Querying Office Systems about Document Roles. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 183-190. Available online

This paper describes the architecture of a document retrieval system integrating classical IR features with knowledge about the procedural and application context where documents are used. The paper focuses on the query language that allows the user to pose queries involving the analysis of both the semantic network where procedures, office agents, and events of the office context are represented as elements accessing, modifying, filing, manipulating documents, and the document contents, i.e. their text. The coupling of the query system with a browser tool is also discussed. The system relies on a knowledge representation model for documents and document roles developed in previous phases of the research.

p. 192-201

Kwok, K. L. (1991): Query Modification and Expansion in a Network with Adaptive Architecture. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 192-201. Available online

This paper shows how a network view of probabilistic information indexing and retrieval with components may implement query expansion and modification (based on user relevance feedback) by growing new edges and adapting weights between queries and terms of relevant documents. Experimental results with two collections and partial feedback confirm that the process can lead to much improved performance. Learning from irrelevant documents however was not effective.

p. 202-210

Wilkinson, Ross and Hingston, Philip (1991): Using the Cosine Measure in a Neural Network for Document Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 202-210. Available online

The task of document retrieval systems is to match one natural language query against a large number of natural language documents. Neural networks are known to be good pattern matchers. This paper reports our investigations in implementing a document retrieval system based on a neural network model. It shows that many of the standard strategies of information retrieval are applicable in a neural network model.

p. 21-30

Salton, Gerard and Buckley, Chris (1991): Automatic Text Structuring and Retrieval -- Experiments in Automatic Encyclopedia Searching. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 21-30. Available online

Many conventional approaches to text analysis and information retrieval prove ineffective when large text collections must be processed in heterogeneous subject areas. An alternative text manipulation system is outlined that is useful for the retrieval of large heterogeneous texts, and for the recognition of content similarities between text excerpts, based on flexible text matching procedures carried out in several contexts of different scope. The methods are illustrated by search experiments performed with the 29-volume Funk and Wagnalls encyclopedia.

p. 211-218

Yao, Y. Y. and Wong, S. K. M. (1991): Preference Structure, Inference and Set-Oriented Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 211-218. Available online

In this paper, a framework for modeling information retrieval is introduced by combining the salient features of many inference-based and set-oriented retrieval models. The degrees of relevance of different subsets of documents are inferred from the user preference judgments on subsets of index terms. In order to demonstrate the usefulness of the proposed framework, the Boolean and the binary vector space models are analyzed. This analysis reveals the structures implicitly used in these models.

p. 220-229

Danzig, Peter B., Ahn, Jongsuk, Noll, John and Obraczka, Katia (1991): Distributed Indexing: A Scalable Mechanism for Distributed Information Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 220-229. Available online

Despite blossoming computer network bandwidths and the emergence of hypertext and CD-ROM databases, little progress has been made towards uniting the world's library-style bibliographic databases. While a few advanced distributed retrieval systems can broadcast a query to hundreds of participating databases, experience shows that local users almost always clog library retrieval systems. Hence broadcast remote queries will clog nearly every system. The premise of this work is that broadcast-based systems do not scale to world-wide systems. This project describes an indexing scheme that will permit thorough yet efficient searches of millions of retrieval systems. Our architecture will work with an arbitrary number of indexing companies and information providers, and, in the market place, could provide economic incentive for cooperation between database and indexing services. We call our scheme distributed indexing, and believe it will help researchers disseminate and locate both published and prepublication material. We are building and plan to distribute a research prototype for the Internet that demonstrates these ideas. Our prototype will index technical reports and public domain software from dozens of computer science departments around the country.

p. 230-239

Frieder, Ophir and Siegelmann, Hava Tova (1991): On the Allocation of Documents in Multiprocessor Information Retrieval Systems. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 230-239. Available online

Information retrieval is the selection of documents that are potentially relevant to a user's information need. Given the vast volume of data stored in modern information retrieval systems, searching the document database requires vast computational resources. To meet these computational demands, various researchers have developed parallel information retrieval systems. As efficient exploitation of parallelism demands fast access to the documents, data organization and placement significantly affect the total processing time. We describe and evaluate a data placement strategy for distributed memory, distributed I/O multicomputers. Initially, a formal description of the Multiprocessor Document Allocation Problem (MDAP) and a proof that MDAP is NP-complete are presented. A document allocation algorithm for MDAP based on Genetic Algorithms is developed. This algorithm assumes that the documents are clustered using any one of the many clustering algorithms. We define a cost function for the derived allocation and evaluate the performance of our algorithm using this function. As part of the experimental analysis, the effects of varying the number of documents and their distribution across the clusters, as well as the exploitation of various architectural interconnection topologies, are studied. We also experiment with several parameters common to Genetic Algorithms, e.g., the probability of mutation and the population size.

p. 241-250

Zhang, Yong, Raghavan, Vijay V. and Deogun, Jitender S. (1991): An Object-Oriented Modeling of the History of Optimal Retrievals. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 241-250. Available online

Learning techniques are used in IR to exploit user feedback in order that the system can improve its performance with respect to particular queries. This process involves the construction of an optimal query that best separates the documents known to be relevant from those that are not. Since obtaining relevance judgments and constructing an optimal query involve a great deal of effort, in this paper, we develop a framework for organizing the history of optimal retrievals. The framework involves the identification of a hierarchy of document classes such that the concepts corresponding to higher level classes are more general than those of the lower level classes. The ways in which such a hierarchy may be used to retrieve answers to new queries are outlined. This approach has the advantage that the query specification is concept-based, whereas the retrieval mechanism is numerically-oriented involving optimal query vectors. It is shown that the construction of a hierarchy of optimal queries can correspond to an object-oriented modeling of IR objects. Furthermore, the resulting model can be easily implemented using a relational DBMS.

p. 251-260

Henninger, Scott (1991): Retrieving Software Objects in an Example-Based Programming Environment. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 251-260. Available online

Example-based programming is a form of software reuse in which existing code examples are modified to meet current task needs. Example-based programming systems that have enough examples to be useful present the problem of finding relevant examples. A prototype system named CodeFinder, which explores issues of retrieving software objects relevant to the design task, is presented. CodeFinder supports human-computer dialogue by providing the means to incrementally construct a query and by providing associative cues that are compatible with human memory retrieval principles.

p. 262-269

Lin, Xia, Soergel, Dagobert and Marchionini, Gary (1991): A Self-Organizing Semantic Map for Information Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 262-269. Available online

A neural network's unsupervised learning algorithm, Kohonen's feature map, is applied to constructing a self-organizing semantic map for information retrieval. The semantic map visualizes semantic relationships between input documents, and has properties of economic representation of data with their interrelationships. The potentials of the semantic map include using the map as a retrieval interface for an online bibliographic system. A prototype system that demonstrates this potential is described.

p. 270-279

Wendlandt, Edgar B. and Driscoll, James R. (1991): Incorporating a Semantic Analysis into a Document Retrieval Strategy. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 270-279. Available online

Current information retrieval systems focus on the use of keywords to respond to user queries. We propose the additional use of surface level knowledge in order to improve the accuracy of information retrieval. Our approach is based on the database concept of semantic modeling (particularly entities and relationships among entities). We extend the concept of query-document similarity by recognizing basic entity properties (attributes) which appear in text. We also extend query-document similarity using the linguistic concept of thematic roles. Thematic roles allow us to recognize relationship properties which appear in text. We include several examples to illustrate our approach. Test results which support our approach are reported. The test results concern searching documents and using their contents to perform the intelligent task of answering a question.

p. 280-289

Swanson, Don R. (1991): Complementary Structures in Disjoint Science Literatures. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 280-289. Available online

p. 291-304

Motzkin, D. (1991): An Efficient Directory System for Document Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 291-304. Available online

This paper introduces a file directory structure which provides an efficient access path for document retrieval. The directory structure is based on the multi-B-tree structure. This directory structure is compatible with current automatic retrieval and query processing techniques. Weights that are assigned to index terms can be included in the directory with the terms at no additional cost. In addition, it provides for indexing a secondary attribute within a primary attribute with no additional cost. Updates are achieved with a high degree of efficiency as well. It is shown that this structure achieves a better overall performance than inverted files, standard B-trees, and other directory structures.

p. 3-12

Cleverdon, Cyril W. (1991): The Significance of the Cranfield Tests on Index Languages. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 3-12. Available online

p. 305-314

Rabitti, F. and Savino, P. (1991): Image Query Processing Based on Multi-Level Signatures. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 305-314. Available online

This paper describes the processing of queries, expressing conditions on the content of images, in large image databases. The query language assumes that a semantic interpretation of the image content is available (i.e. an image symbolic interpretation), as a result of an image analysis process. The image query language addresses important aspects of the image interpretations resulting from image analysis, by defining partial conditions on the composition of the complex objects, requirements on their degree of recognition, and requirements on their position in the image interpretation. Particular emphasis is placed on the definition of suitable content-based access structures to make query processing more efficient. An approach based on multi-level signatures is adopted. The query is pre-processed on the signatures to filter out most of the images not satisfying the query. Finally, an evaluation of the efficiency and precision of the signature technique is given.

p. 316-325

Agosti, Maristella, Colotti, Roberto and Gradenigo, Girolamo (1991): A Two-Level Hypertext Retrieval Model for Legal Data. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 316-325. Available online

This paper introduces an associative information retrieval model based on the two-level architecture proposed in [Agosti et al, 1989a] and [Agosti et al, 1990], and an experimental prototype developed in order to validate the model in a personal computing environment. In the first part of the paper, related work and motivations are presented. In the second part, the model, entitled EXPLICIT, is introduced. EXPLICIT is based on a two-level architecture which holds the two main parts of the informative resource managed by an information retrieval tool: the collection of documents and the indexing term structure. The term structure is managed as a schema of concepts which can be used by the final user as a frame of reference in the query formulation process. The model supports the concurrent use of different schemas of concepts to satisfy information needs of different categories of users. In the third part of the paper, the main characteristics of the experimental prototype, named HyperLaw, are presented.

p. 32-45

Croft, W. Bruce, Turtle, Howard and Lewis, David D. (1991): The Use of Phrases and Structured Queries in Information Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 32-45. Available online

Both phrases and Boolean queries have a long history in information retrieval, particularly in commercial systems. In previous work, Boolean queries have been used as a source of phrases for a statistical retrieval model. This work, like the majority of research on phrases, resulted in little improvement in retrieval effectiveness. In this paper, we describe an approach where phrases identified in natural language queries are used to build structured queries for a probabilistic retrieval model. Our results show that using phrases in this way can improve performance, and that phrases that are automatically extracted from a natural language query perform nearly as well as manually selected phrases.

p. 326-335

Lelu, Alain (1991): Automatic Generation of "Hyper-Paths" in Information Retrieval Systems: A Stochastic and an Incremental Algorithm. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 326-335. Available online

A hypertext procedure for browsing through documentary databases is proposed, based upon a global synthetic mapping in addition to a set of local scanning axes. A method is developed for automatic generation of these relevant axes: local component analysis. It consists in tracking the local maxima of a "partial inertia" landscape. First, a "neural" algorithm converging after several passes on the data is presented. Then a deterministic one-pass algorithm is deduced, allowing dynamic data-flow analysis.

p. 337-346

Rau, Lisa F. and Jacobs, Paul S. (1991): Creating Segmented Databases from Free Text for Text Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 337-346. Available online

Indexing text for accurate retrieval is a difficult and important problem. On-line information services generally depend on "keyword" indices rather than other methods of retrieval, because of the practical features of keywords for storage, dissemination, and browsing as well as for retrieval. However, these methods of indexing have two major drawbacks: First, they must be laboriously assigned by human indexers. Second, they are inaccurate, because of mistakes made by these indexers as well as the difficulties users have in choosing keywords for their queries, and the ambiguity a keyword may have. Current natural language text processing (NLP) methods help to overcome these problems. Such methods can provide automatic indexing and keyword assignment capabilities that are at least as accurate as human indexers in many applications. In addition, NLP systems can increase the information contained in keyword fields by separating keywords into segments, or distinct fields that capture certain discriminating content or relations among keywords. This paper reports on a system that uses natural language text processing to derive keywords from free text news stories, separate these keywords into segments, and automatically build a segmented database. The system is used as part of a commercial news "clipping" and retrieval product. Preliminary results show improved accuracy, as well as reduced cost, resulting from these automated techniques.

p. 347-355

Mauldin, Michael L. (1991): Retrieval Performance in FERRET: A Conceptual Information Retrieval System. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 347-355. Available online

FERRET is a full text, conceptual information retrieval system that uses a partial understanding of its texts to provide greater precision and recall performance than keyword search techniques. It uses a machine-readable dictionary to augment its lexical knowledge and a variant of genetic learning to extend its script database. Comparison of FERRET's retrieval performance on a collection of 1065 astronomy texts using 22 sample user queries with a standard boolean keyword query system showed that precision increased from 35 to 48 percent, and recall more than doubled, from 19.4 to 52.4 percent. This paper describes the FERRET system's architecture, parsing and matching abilities, and focuses on the use of the Webster's Seventh dictionary to increase the system's lexical coverage.
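
Illustration (not taken from the paper): the precision and recall figures quoted above follow the standard set-based definitions, which can be sketched in Python as below. The function name and the document identifiers in the usage lines are hypothetical and have nothing to do with FERRET's data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    Both are reported as 0.0 when their denominator is empty.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document ids for a single query.
    p, r = precision_recall(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
    print(f"precision={p:.2f} recall={r:.2f}")
```

Figures such as those in the abstract are typically averages of these per-query values over a query set; how the paper averaged them is not stated here.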

p. 356-358

Salton, Gerard, Lesk, Michael E., Harman, Donna, Williamson, Robert E., Fox, Edward A. and Buckley, Chris (1991): The Smart Project in Automatic Document Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 356-358. Available online

The Smart project in automatic text retrieval was started in 1961. It is the oldest, continuously running research project in information retrieval. The panel members are all major contributors to the Smart system work. The discussion covers aspects of the Smart system design and examines the past and future significance of some of the research conducted in the Smart environment.

p. 46-56

Fuhr, Norbert and Pfeifer, Ulrich (1991): Combining Model-Oriented and Description-Oriented Approaches for Probabilistic Indexing. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 46-56. Available online

We distinguish model-oriented and description-oriented approaches in probabilistic information retrieval. The former refer to certain representations of documents and queries and use additional independence assumptions, whereas the latter map documents and queries onto feature vectors which form the input to certain classification procedures or regression methods. Description-oriented approaches are more flexible with respect to the underlying representations, but the definition of the feature vector is a heuristic step. In this paper, we combine a probabilistic model for the Darmstadt Indexing Approach with logistic regression. Here the probabilistic model forms a guideline for the definition of the feature vector. Experiments with the purely theoretical approach and with several heuristic variations show that heuristic assumptions may yield significant improvements.

p. 57-61

Cooper, William S. (1991): Some Inconsistencies and Misnomers in Probabilistic Information Retrieval. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 57-61. Available online

The probabilistic theory of information retrieval involves the construction of mathematical models based on statistical assumptions of various sorts. One of the hazards inherent in this kind of theory construction is that the assumptions laid down may be inconsistent with the data to which they are applied. Another hazard is that the stated assumptions may not be the real assumptions on which the derived modelling equations or resulting experiments are actually based. Both kinds of error have been made repeatedly in research on probabilistic information retrieval. One consequence of these lapses is that the statistical character of certain probabilistic IR models, including the so-called 'binary independence' model, has been seriously misapprehended.

Copyrights may apply

p. 63-71

Bookstein, Abraham and Klein, Shmuel T. (1991): Generative Models for Bitmap Sets with Compression Applications. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 63-71. Available online

In large IR systems, information about word occurrence may be stored as a bit matrix, with rows corresponding to different words and columns to documents. Such a matrix is generally very large and very sparse. New methods for compressing such matrices are presented, which exploit possible correlation between rows and between columns. The methods are based on partitioning the matrix into small blocks and predicting the 1-bit distribution within a block by means of various bit generation models. Each block is then encoded using Huffman or arithmetic coding. Preliminary experimental results indicate improvements over previous methods.
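The partitioning step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the bit generation models and the Huffman or arithmetic coding stage are not specified in the abstract and are omitted. The function names, the block size, and the toy matrix are all assumptions made for illustration. The sketch only splits a word-by-document bit matrix into small blocks and reports the fraction of 1-bits in each, which is the kind of quantity a predictive model would try to estimate.

```python
from typing import List

BitMatrix = List[List[int]]


def partition_into_blocks(matrix: BitMatrix, block_rows: int, block_cols: int) -> List[BitMatrix]:
    """Split a bit matrix into blocks, visited in row-major order.

    Blocks on the right and bottom edges may be smaller than the requested size.
    """
    blocks = []
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for r in range(0, n_rows, block_rows):
        for c in range(0, n_cols, block_cols):
            blocks.append([row[c:c + block_cols] for row in matrix[r:r + block_rows]])
    return blocks


def one_bit_density(block: BitMatrix) -> float:
    """Return the fraction of cells in the block that are set to 1."""
    total = sum(len(row) for row in block)
    ones = sum(sum(row) for row in block)
    return ones / total if total else 0.0


# Hypothetical toy data: rows are words, columns are documents.
occurrence = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]

blocks = partition_into_blocks(occurrence, 2, 2)
densities = [one_bit_density(b) for b in blocks]
print(densities)
```

In this toy matrix the 1-bits cluster in one block, which is the kind of unevenness that makes per-block modelling potentially worthwhile on a sparse matrix.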

Copyrights may apply

p. 72-81

Aalbersberg, IJsbrand Jan (1991): Posting Compression in Dynamic Retrieval Environments. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 72-81. Available online

This paper describes a posting compression technique to be used in dynamic full-text document retrieval environments. The compression technique being presented is applicable in main-memory document retrieval systems, and consists of two parts. First there is the efficient use of auxiliary tables, and second there is the application of the well-known rank-frequency law of Zipf. It is shown that on the basis of this law term weights can be approximated, and thus that their explicit storage can be avoided.
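As a rough illustration of the rank-frequency law this abstract relies on, the sketch below uses the simplest form of Zipf's law, in which a term's frequency is approximately the top-ranked term's frequency divided by the term's rank. How the paper derives term weights from the law, and how the auxiliary tables are organized, is not stated in the abstract, so none of that is shown. The function name and the term counts are hypothetical.

```python
def zipf_frequency(rank: int, top_frequency: float) -> float:
    """Estimate a term's frequency from its rank, assuming frequency is proportional to 1 / rank."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return top_frequency / rank


# Hypothetical observed counts, used only to compare against the estimate.
observed = {"the": 1000, "of": 480, "retrieval": 350, "index": 240}

# Rank terms by descending observed frequency (rank 1 is the most frequent term).
ranked = sorted(observed, key=observed.get, reverse=True)
top = observed[ranked[0]]

# Only the rank and the top frequency are needed to produce an estimate.
estimates = {term: zipf_frequency(i + 1, top) for i, term in enumerate(ranked)}

for term in ranked:
    print(term, observed[term], round(estimates[term], 1))
```

The point of the sketch is that a value recoverable from rank alone does not have to be stored per posting, which matches the abstract's claim that explicit storage of term weights can be avoided.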

Copyrights may apply

p. 82-91

Luo, Chengjie and Yu, Clement (1991): A Hybrid Bilevel Image Decode Algorithm for Group 4 FAX. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 82-91. Available online

The modified READ code is a two-dimensional coding scheme standardized by CCITT to compress black and white pictures. Existing decompression algorithms process the compressed data bit-by-bit. In this paper, we propose a hybrid decompression algorithm which processes most of the compressed data byte-by-byte. The remaining data is processed bit-by-bit. It is known statistically that the former situation, where byte-by-byte processing occurs, happens much more often than the latter situation, where bit-by-bit processing takes place. Thus, decompression will be sped up by the proposed algorithm.

Copyrights may apply

p. 93-112

Lesk, Michael E. (1991): The CORE Electronic Chemistry Library. In: Proceedings of the Fourteenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 1991. pp. 93-112. Available online

A major online file of chemical journal literature complete with graphics is being developed to test the usability of fully electronic access to documents. The test file will include ten years of the American Chemical Society's online journals, supplemented with the graphics from the paper publication, and the indexing of the articles from Chemical Abstracts. Our goals are (1) to assess the effectiveness and acceptability of electronic access to primary journals as compared with paper, and (2) to identify the most desirable functions of the user interface to an electronic system of journals, including in particular a comparison of page image display with ASCII display interfaces. This paper describes the chemical journal data, the interfaces for searching and reading it, and the experiments being done.

Copyrights may apply




Changes to this page (conference)

24 Jun 2007: Conference Proceedings was edited
24 Jun 2007: Conference Proceedings was added to the bibliography

Page information

Page maintainer: The Editorial Team
How to cite/reference this page
URL: http://www.interaction-design.org/references/conferences/proceedings_of_the_fourteenth_annual_international_acm_sigir_conference_on_research_and_development_in_information_retrieval.html