Aya Soffer
Publications by Aya Soffer (bibliography)
» 2009 «
Amitay, Einat, Carmel, David, Har'El, Nadav, Ofek-Koifman, Shila, Soffer, Aya, Yogev, Sivan and Golbandi, Nadav (2009): Social search and discovery using a unified approach. In: Proceedings of the 2009 International Conference on the World Wide Web 2009. pp. 1211-1212. Available online
We explore new ways of improving a search engine using data from Web 2.0 applications such as blogs and social bookmarks. This data contains entities such as documents, people and tags, and relationships between them. We propose a simple yet effective method, based on faceted search, that treats all entities in a unified manner: returning all of them (documents, people and tags) on every search, and allowing all of them to be used as search terms. We describe an implementation of such a social search engine on the intranet of a large enterprise, and present large-scale experiments which verify the validity of our approach.
Copyrights may apply
» 2005 «
Mishne, Gilad, Carmel, David, Hoory, Ron, Roytman, Alexey and Soffer, Aya (2005): Automatic analysis of call-center conversations. In: Herzog, Otthein, Schek, Hans-Jörg and Fuhr, Norbert (eds.) Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management October 31 - November 5, 2005, Bremen, Germany. pp. 453-459. Available online
» 2004 «
Amitay, Einat, Carmel, David, Lempel, Ronny and Soffer, Aya (2004): Scaling IR-system evaluation using term relevance sets. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 10-17. Available online
This paper describes an evaluation method based on Term Relevance Sets (Trels) that measures an IR system's quality by examining the content of the retrieved results rather than by looking for pre-specified relevant pages. Trels consist of a list of terms believed to be relevant for a particular query as well as a list of irrelevant terms. The proposed method does not involve any document relevance judgments, and as such is not adversely affected by changes to the underlying collection. Therefore, it can better scale to very large, dynamic collections such as the Web. Moreover, this method can evaluate a system's effectiveness on an updatable "live" collection, or on collections derived from different data sources. Our experiments show that the proposed method is very highly correlated with official TREC measures.
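The abstract does not give the scoring formula, so the following is only a minimal sketch of the idea of judging results by their content. It assumes a hypothetical rule (on-topic term hits minus off-topic term hits, averaged over the retrieved documents); the function names and the sample term lists are illustrative and are not taken from the paper.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def trels_score(results, on_topic, off_topic):
    """Average over retrieved documents of (on-topic hits - off-topic hits).

    Assumed scoring rule for illustration only; not the paper's formula.
    """
    if not results:
        return 0.0
    total = 0
    for doc in results:
        words = tokenize(doc)
        total += len(words & on_topic) - len(words & off_topic)
    return total / len(results)

# Hypothetical term relevance set for the query "jaguar" (the animal).
on_topic = {"jaguar", "cat", "prey", "habitat"}
off_topic = {"car", "dealer", "engine"}

system_a = ["The jaguar is a big cat that stalks its prey.",
            "Jaguar habitat is shrinking."]
system_b = ["Visit your local Jaguar car dealer.",
            "Jaguar engine specs."]

print(trels_score(system_a, on_topic, off_topic))
print(trels_score(system_b, on_topic, off_topic))
```

Because the score depends only on the text of whatever is retrieved, the same term lists can be reused when the underlying collection changes, which is the property the abstract emphasizes.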
Amitay, Einat, Har'El, Nadav, Sivan, Ron and Soffer, Aya (2004): Web-a-where: geotagging web content. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2004. pp. 273-280. Available online
We describe Web-a-Where, a system for associating geography with Web pages. Web-a-Where locates mentions of places and determines the place each name refers to. In addition, it assigns to each page a geographic focus -- a locality that the page discusses as a whole. The tagging process is simple and fast, aimed to be applied to large collections of Web pages and to facilitate a variety of location-based applications and data analyses. Geotagging involves arbitrating two types of ambiguities: geo/non-geo and geo/geo. A geo/non-geo ambiguity occurs when a place name also has a non-geographic meaning, such as a person name (e.g., Berlin) or a common word (Turkey). Geo/geo ambiguity arises when distinct places have the same name, as in London, England vs. London, Ontario. An implementation of the tagger within the framework of the WebFountain data mining system is described, and evaluated on several corpora of real Web pages. Precision of up to 82% on individual geotags is achieved. We also evaluate the relative contribution of various heuristics the tagger employs, and evaluate the focus-finding algorithm using a corpus pretagged with localities, showing that as many as 91% of the foci reported are correct up to the country level.
Amitay, Einat, Carmel, David, Herscovici, Michael, Lempel, Ronny and Soffer, Aya (2004): Trend detection through temporal link analysis. In JASIST - Journal of the American Society for Information Science and Technology, 55 (14) pp. 1270-1281
» 2003 «
Amitay, Einat, Carmel, David, Darlow, Adam, Lempel, Ronny and Soffer, Aya (2003): The connectivity sonar: detecting site functionality by structural patterns. In: Proceedings of the Fourteenth ACM Conference on Hypertext 2003. pp. 38-47. Available online
Web sites today serve many different functions, such as corporate sites, search engines, e-stores, and so forth. As sites are created for different purposes, their structure and connectivity characteristics vary. However, this research argues that sites of similar role exhibit similar structural patterns, as the functionality of a site naturally induces a typical hyperlinked structure and typical connectivity patterns to and from the rest of the Web. Thus, the functionality of Web sites is reflected in a set of structural and connectivity-based features that form a typical signature. In this paper, we automatically categorize sites into eight distinct functional classes, and highlight several search-engine related applications that could make immediate use of such technology. We purposely limit our categorization algorithms by tapping connectivity and structural data alone, making no use of any content analysis whatsoever. When applying two classification algorithms to a set of 202 sites of the eight defined functional categories, the algorithms correctly classified between 54.5% and 59% of the sites. On some categories, the
Carmel, David, Maarek, Yoelle S., Mandelbrod, Matan, Mass, Yosi and Soffer, Aya (2003): Searching XML documents via XML fragments. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2003. pp. 151-158. Available online
Most of the work on XML query and search has stemmed from the publishing and database communities, mostly for the needs of business applications. Recently, the Information Retrieval community began investigating the XML search issue to answer information discovery needs. Following this trend, we present here an approach where information needs can be expressed in an approximate manner as pieces of XML documents or "XML fragments" of the same nature as the documents that are being searched. We present an extension of the vector space model for searching XML collections via XML fragments and ranking results by relevance. We describe how we have extended a full-text search engine to comply with this model. The value of the proposed method is demonstrated by the relatively high precision of our system, which was among the top performers in the recent INEX workshop. Our results indicate that certain queries are more appropriate than others for the extended vector space model. Specifically, queries with relatively specific contexts but vague information needs are best situated to reap the benefit of this model. Finally, our results show that one method may not fit all types of queries and that it could be worthwhile to use different solutions for different applications.
Amitay, Einat, Nelken, Rani, Niblack, Wayne, Sivan, Ron and Soffer, Aya (2003): Multi-resolution disambiguation of term occurrences. In: Proceedings of the 2003 ACM CIKM International Conference on Information and Knowledge Management November 2-8, 2003, New Orleans, Louisiana, USA. pp. 255-262. Available online
Broder, Andrei Z., Carmel, David, Herscovici, Michael, Soffer, Aya and Zien, Jason Y. (2003): Efficient query evaluation using a two-level retrieval process. In: Proceedings of the 2003 ACM CIKM International Conference on Information and Knowledge Management November 2-8, 2003, New Orleans, Louisiana, USA. pp. 426-434. Available online
» 2002 «
Lempel, Ronny and Soffer, Aya (2002): PicASHOW: pictorial authority search by hyperlinks on the web. In ACM Transactions on Information Systems, 20 (1) pp. 1-24
We describe PicASHOW, a fully automated WWW image retrieval system that is based on several link-structure analyzing algorithms. Our basic premise is that a page p displays (or links to) an image when the author of p considers the image to be of value to the viewers of the page. We thus extend some well known link-based WWW page retrieval schemes to the context of image retrieval. PicASHOW's analysis of the link structure enables it to retrieve relevant images even when those are stored in files with meaningless names. The same analysis also allows it to identify image containers and image hubs. We define these as Web pages that are rich in relevant images, or from which many images are readily accessible. PicASHOW requires no image analysis whatsoever and no creation of taxonomies for preclassification of the Web's images. It can be implemented by standard WWW search engines with reasonable overhead, in terms of both computations and storage, and with no change to user query formats. It can thus be used to easily add image retrieving capabilities to standard search engines. Our results demonstrate that PicASHOW, while relying almost exclusively on link analysis, compares well with dedicated WWW image retrieval systems. We conclude that link analysis, a proven effective technique for Web page search, can improve the performance of Web image retrieval, as well as extend its definition to include the retrieval of image hubs and containers.
Aridor, Yariv, Carmel, David, Maarek, Yoelle S., Soffer, Aya and Lempel, Ronny (2002): Knowledge encapsulation for focused search from pervasive devices. In ACM Transactions on Information Systems, 20 (1) pp. 25-46
Mobile knowledge seekers often need access to information on the Web during a meeting or on the road, while away from their desktop. A common practice today is to use pervasive devices such as Personal Digital Assistants or mobile phones. However, these devices have inherent constraints (e.g., slow communication, form factor) which often make information discovery tasks impractical. In this paper, we present a new focused-search approach specifically oriented for the mode of work and the constraints dictated by pervasive devices. It combines focused search within specific topics with encapsulation of topic-specific information in a persistent repository. One key characteristic of these persistent repositories is that their footprint is small enough to fit on local devices, and yet they are rich enough to support many information discovery tasks in disconnected mode. More specifically, we suggest a representation for topic-specific information based on "knowledge-agent bases" that comprise all the information necessary to access information about a topic (under the form of key concepts and key Web pages) and assist in the full search process from query formulation assistance to result scanning on the device itself. The key contribution of our work is the coupling of focused search with encapsulated knowledge representation making information discovery from pervasive devices practical as well as efficient. We describe our model in detail and demonstrate its aspects through sample scenarios.
Carmel, David, Farchi, Eitan, Petruschka, Yael and Soffer, Aya (2002): Automatic query refinement using lexical affinities with maximal information gain. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 283-290. Available online
This work describes an automatic query refinement technique, which focuses on improving precision of the top ranked documents. The terms used for refinement are lexical affinities (LAs), pairs of closely related words which contain exactly one of the original query terms. Adding these terms to the query is equivalent to re-ranking search results, thus, precision is improved while recall is preserved. We describe a novel method that selects the most "informative" LAs for refinement, namely, those LAs that best separate relevant documents from irrelevant documents in the set of results. The information gain of candidate LAs is determined using unsupervised estimation that is based on the scoring function of the search engine. This method is thus fully automatic and its quality depends on the quality of the scoring function. Experiments we conducted with TREC data clearly show a significant improvement in the precision of the top ranked documents.
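To make "LAs that best separate relevant documents from irrelevant documents" concrete, here is a minimal sketch of an information-gain computation over a result set. It assumes each result already carries an estimated probability of relevance (the abstract says this estimate is derived from the search engine's scoring function, but does not say how, so the inputs below are hypothetical), and it uses the standard binary-entropy definition of information gain rather than any formula confirmed by the paper.

```python
import math

def entropy(p):
    """Binary entropy in bits of a relevance probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(rel_probs, contains_la):
    """Entropy reduction from splitting results by presence of a candidate LA.

    rel_probs[i]   : assumed estimate that result i is relevant (0..1)
    contains_la[i] : True if result i contains the lexical affinity
    """
    n = len(rel_probs)
    if n == 0:
        return 0.0
    base = entropy(sum(rel_probs) / n)
    with_la = [p for p, c in zip(rel_probs, contains_la) if c]
    without_la = [p for p, c in zip(rel_probs, contains_la) if not c]
    conditional = 0.0
    for part in (with_la, without_la):
        if part:
            conditional += len(part) / n * entropy(sum(part) / len(part))
    return base - conditional

# Hypothetical relevance estimates for four results.
rel_probs = [1.0, 1.0, 0.0, 0.0]
separating_la = [True, True, False, False]   # appears only in relevant results
useless_la = [True, False, True, False]      # appears equally in both groups

print(information_gain(rel_probs, separating_la))
print(information_gain(rel_probs, useless_la))
```

Under these assumptions, a candidate LA that occurs only in the likely-relevant results gets the maximal gain, while one spread evenly across both groups gets none, so ranking candidates by this value picks the most "informative" ones.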
Baeza-Yates, Ricardo A., Carmel, David, Maarek, Yoelle S. and Soffer, Aya (2002): Preface. In JASIST - Journal of the American Society for Information Science and Technology, 53 (6) pp. 413-414
Cohen, Doron, Herscovici, Michael, Petruschka, Yael, Maarek, Yoelle S. and Soffer, Aya (2002): Personalized pocket directories for mobile devices. In: Proceedings of the 2002 International Conference on the World Wide Web 2002. pp. 627-638. Available online
In spite of the increase in the availability of mobile devices in the last few years, Web information is not yet as accessible from PDAs or WAP phones as it is from the desktop. In this paper, we propose a solution for supporting one of the most popular information discovery mechanisms, namely Web directory navigation, from mobile devices. Our proposed solution consists of caching enough information on the device itself in order to conduct most of the navigation actions locally (with subsecond response time) while intermittently communicating with the server to receive updates and additional data requested by the user. The cached information is captured in a "directory capsule". The directory capsule represents only the portion of the directory that is of interest to the user in a given context and is sufficiently rich and consistent to support the information needs of the user in disconnected mode. We define a novel subscription model specifically geared for Web directories and for the special needs of PDAs. This subscription model enables users to specify the parts of the directory that are of interest to them as well as the preferred granularity. We describe a mechanism for keeping the directory capsule in sync over time with the Web directory and user subscription requests. Finally, we present the Pocket Directory Browser for Palm powered computers that we have developed. The pocket directory can be used to define, view and manipulate the capsules that are stored on the Palm. We provide several usage examples of our system on the Open Directory Project, one of the largest and most popular Web directories.
» 2001 «
Carmel, David, Cohen, Doron, Fagin, Ronald, Farchi, Eitan, Herscovici, Michael, Maarek, Yoelle S. and Soffer, Aya (2001): Static index pruning for information retrieval systems. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001. pp. 43-50. Available online
We introduce static index pruning methods that significantly reduce the index size in information retrieval systems. We investigate uniform and term-based methods that each remove selected entries from the index and yet have only a minor effect on retrieval results. In uniform pruning, there is a fixed cutoff threshold, and all index entries whose contribution to relevance scores is bounded above by a given threshold are removed from the index. In term-based pruning, the cutoff threshold is determined for each term, and thus may vary from term to term. We give experimental evidence that for each level of compression, term-based pruning outperforms uniform pruning, under various measures of precision. We present theoretical and experimental evidence that under our term-based pruning scheme, it is possible to prune the index greatly and still get retrieval results that are almost as good as those based on the full index.
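The difference between the two pruning schemes can be sketched on a toy index. This is an illustration under stated assumptions, not the authors' implementation: the index is modeled as a dictionary mapping each term to per-document score contributions, and the per-term cutoff is assumed to be `epsilon` times the term's k-th highest score (the abstract only says the threshold "is determined for each term"). The names `prune_uniform`, `prune_term_based`, `k` and `epsilon` are hypothetical.

```python
def prune_uniform(index, threshold):
    """Remove every posting whose score is at or below one global threshold."""
    return {
        term: {doc: s for doc, s in postings.items() if s > threshold}
        for term, postings in index.items()
    }

def prune_term_based(index, k, epsilon):
    """Remove postings at or below epsilon times the term's k-th best score.

    The per-term cutoff rule is an assumption made for this sketch.
    """
    pruned = {}
    for term, postings in index.items():
        ranked = sorted(postings.values(), reverse=True)
        # Terms with fewer than k postings get a cutoff of 0.0.
        cutoff = epsilon * ranked[k - 1] if len(ranked) >= k else 0.0
        pruned[term] = {doc: s for doc, s in postings.items() if s > cutoff}
    return pruned

# Toy index: term -> {document id: contribution to the relevance score}.
index = {
    "retrieval": {"d1": 0.9, "d2": 0.5, "d3": 0.1},
    "the": {"d1": 0.05, "d2": 0.04, "d3": 0.03},
}

print(prune_uniform(index, threshold=0.1))
print(prune_term_based(index, k=2, epsilon=0.9))
```

In this toy example the single global threshold removes every posting of the low-scoring term, whereas the per-term cutoff keeps each term's strongest postings, which illustrates why the threshold "may vary from term to term".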
Lempel, Ronny and Soffer, Aya (2001): PicASHOW: pictorial authority search by hyperlinks on the Web. In: Proceedings of the 2001 International Conference on the World Wide Web 2001. pp. 438-448. Available online
Aridor, Yariv, Carmel, David, Maarek, Yoelle S., Soffer, Aya and Lempel, Ronny (2001): Knowledge encapsulation for focused search from pervasive devices. In: Proceedings of the 2001 International Conference on the World Wide Web 2001. pp. 754-764. Available online