Number of co-authors: 26
Number of publications with 3 favourite co-authors: Nivio Ziviani: 3; Berthier A. Ribeiro..: 3; Leonardo Rocha: 2
Wagner Meira's 3 most productive colleagues in number of publications: Berthier A. Ribeir..: 33; Nivio Ziviani: 23; Edleno Silva de Mo..: 13
Publications by Wagner Meira (bibliography)
Silva, Ismael Santana, Gomide, Janaína, Veloso, Adriano, Meira, Wagner and Ferreira, Renato (2011): Effective sentiment stream analysis with self-augmenting training and demand-driven projection. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2011. pp. 475-484.
How do we analyze sentiments over a set of opinionated Twitter messages? This issue has been widely studied in recent years, with a prominent approach being based on the application of classification techniques. Basically, messages are classified according to the implicit attitude of the writer with respect to a query term. A major concern, however, is that Twitter (and other media channels) follows the data stream model, and thus the classifier must operate with limited resources, including labeled data for training classification models. This imposes serious challenges for current classification techniques, since they need to be constantly fed with fresh training messages, in order to track sentiment drift and to provide up-to-date sentiment analysis. We propose solutions to this problem. The heart of our approach is a training augmentation procedure which takes as input a small training seed, and then it automatically incorporates new relevant messages to the training data. Classification models are produced on-the-fly using association rules, which are kept up-to-date in an incremental fashion, so that at any given time the model properly reflects the sentiments in the event being analyzed. In order to track sentiment drift, training messages are projected on a demand driven basis, according to the content of the message being classified. Projecting the training data offers a series of advantages, including the ability to quickly detect trending information emerging in the stream. We performed the analysis of major events in 2010, and we show that the prediction performance remains about the same, or even increases, as the stream passes and new training messages are acquired. This result holds for different languages, even in cases where sentiment distribution changes over time, or in cases where the initial training seed is rather small. 
We derive lower bounds for prediction performance, and we show that our approach is extremely effective under diverse learning scenarios, providing gains that range from 7% to 58%.
© All rights reserved Silva et al. and/or ACM Press
Salles, Thiago, Rocha, Leonardo, Pappa, Gisele L., Mourão, Fernando, Meira, Wagner and Gonçalves, Marcos (2010): Temporally-aware algorithms for document classification. In: Proceedings of the 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2010. pp. 307-314.
Automatic Document Classification (ADC) is still one of the major information retrieval problems. It usually employs a supervised learning strategy, where we first build a classification model using pre-classified documents and then use this model to classify unseen documents. The majority of supervised algorithms consider that all documents provide equally important information. However, in practice, a document may be considered more or less important for building the classification model according to several factors, such as its timeliness, the venue in which it was published, and its authors, among others. In this paper, we are particularly concerned with the impact that temporal effects may have on ADC and how to minimize such impact. In order to deal with these effects, we introduce a temporal weighting function (TWF) and propose a methodology to determine it for document collections. We applied the proposed methodology to ACM-DL and Medline and found that the TWF of both follows a lognormal distribution. We then extend three ADC algorithms (namely kNN, Rocchio and Naïve Bayes) to incorporate the TWF. Experiments showed that the temporally-aware classifiers achieved significant gains, outperforming (or at least matching) state-of-the-art algorithms.
© All rights reserved Salles et al. and/or their publisher
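The idea of folding a temporal weighting function into kNN can be illustrated with a minimal sketch. It assumes the TWF is a lognormal-shaped function of the temporal distance (in years) between the training document and the document being classified. The parameters mu and sigma, the +1 shift, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def lognormal_twf(distance, mu=0.0, sigma=1.0):
    """Lognormal density at |distance| + 1 (the shift avoids log(0)).

    mu and sigma are illustrative values, not fitted to any collection.
    """
    x = abs(distance) + 1.0
    exponent = -((math.log(x) - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (x * sigma * math.sqrt(2 * math.pi))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def temporal_knn(train, query_vec, query_year, k=3):
    """Classify with kNN, scaling each neighbour's vote by the TWF.

    train is a list of (term_vector, year, label) tuples.
    """
    neighbours = sorted(
        ((cosine(query_vec, vec), year, label) for vec, year, label in train),
        key=lambda item: item[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for similarity, year, label in neighbours:
        votes[label] += similarity * lognormal_twf(query_year - year)
    return max(votes, key=votes.get)


train = [
    ({"db": 1.0}, 1990, "A"),
    ({"db": 1.0}, 1990, "A"),
    ({"db": 1.0}, 2010, "B"),
]
predicted = temporal_knn(train, {"db": 1.0}, 2010)
```

In this toy data an unweighted majority vote would pick "A", while the temporally weighted vote favours the single recent neighbour labelled "B".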
Pereira, Adriano, Duarte, Diego, Meira, Wagner, Almeida, Virgilio and Góes, Paulo (2009): Analyzing seller practices in a Brazilian marketplace. In: Proceedings of the 2009 International Conference on the World Wide Web 2009. pp. 1031-1040.
E-commerce is growing at an exponential rate. In the last decade, there has been an explosion of online commercial activity enabled by the World Wide Web (WWW). These days, many consumers are less attracted to online auctions, preferring to buy merchandise quickly using fixed-price negotiations. Sales at Amazon.com, the leader in online sales of fixed-price goods, rose 37% in the first quarter of 2008. At eBay, where auctions make up 58% of the site's sales,
© All rights reserved Pereira et al. and/or ACM Press
Veloso, Adriano A., Almeida, Humberto M., Goncalves, Marcos A. and Meira, Wagner (2008): Learning to rank at query-time using association rules. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2008. pp. 267-274.
Some applications have to present their results in the form of ranked lists. This is the case of many information retrieval applications, in which documents must be sorted according to their relevance to a given query. This has drawn the interest of the information retrieval community to methods that automatically learn effective ranking functions. In this paper we propose a novel method which uncovers patterns (or rules) in the training data associating features of the document with its relevance to the query, and then uses the discovered rules to rank documents. To address typical problems that are inherent to the utilization of association rules (such as missing rules and rule explosion), the proposed method generates rules on a demand-driven basis, at query-time. The result is an extremely fast and effective ranking method. We conducted a systematic evaluation of the proposed method using the LETOR benchmark collections. We show that generating rules on a demand-driven basis can boost
© All rights reserved Veloso et al. and/or ACM Press
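Demand-driven rule generation at query time can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the authors' algorithm: features are treated as discrete labels (numeric LETOR features would need discretization first), rules are limited to small antecedents, and a document's score is the confidence-weighted average of the relevance levels predicted by the rules. The function name, parameters and feature names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def demand_driven_score(training, doc_features, max_rule_size=2, min_support=1):
    """Estimate relevance for one document from rules generated on demand.

    training is a list of (feature_set, relevance_level) pairs.
    """
    doc_features = set(doc_features)
    # Projection: keep only the features shared with the document being
    # ranked, so no rule is generated that could not apply to it.
    projected = [
        (set(features) & doc_features, relevance)
        for features, relevance in training
    ]
    weighted_sum = 0.0
    total_confidence = 0.0
    for size in range(1, max_rule_size + 1):
        for combo in combinations(sorted(doc_features), size):
            antecedent = set(combo)
            matches = [rel for feats, rel in projected if antecedent <= feats]
            if len(matches) < min_support:
                continue
            counts = defaultdict(int)
            for rel in matches:
                counts[rel] += 1
            # One rule "antecedent -> level" per observed relevance level.
            for level, count in counts.items():
                confidence = count / len(matches)
                weighted_sum += level * confidence
                total_confidence += confidence
    return weighted_sum / total_confidence if total_confidence else 0.0


training = [
    ({"title_match", "high_pagerank"}, 2),
    ({"title_match"}, 1),
    ({"low_bm25"}, 0),
]
score_a = demand_driven_score(training, {"title_match", "high_pagerank"})
score_b = demand_driven_score(training, {"low_bm25"})
```

Because rules are built only from the projected training data, the number of candidate rules depends on the features of the document at hand rather than on the whole feature space, which is the intuition behind avoiding rule explosion.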
Tuler, Elisa, Prates, Raquel O., Almir, Fernando, Rocha, Leonardo and Meira, Wagner (2006): Caracterizando desafios de interação com sistemas de mineração de regras de associação [Characterizing interaction challenges with association rule mining systems]. In: Proceedings of the 2006 Brazilian Symposium on Human Factors in Computing Systems 2006. pp. 40-49.
Data mining focuses on extracting useful information from large volumes of data, and thus has been the center of great attention in recent years. Among the many techniques available for data mining, identifying association rules is one of the most popular. The novel aspects of association rule mining systems bring new challenges to the HCI field. In this article, we identify these challenges, analyze them based on the theory of action, and characterize them within the semiotic engineering theoretical framework. Thus, we provide designers with an explanation of aspects to be considered during the use and design of such systems. This theoretically based explanation contributes to a deeper understanding of the issues involved in interacting with association rule mining systems, allowing for better informed decisions during the design process. It also motivates future empirical investigations.
© All rights reserved Tuler et al. and/or Sociedade Brasileira de Computação
Possas, Bruno, Ziviani, Nivio, Meira, Wagner and Ribeiro-Neto, Berthier A. (2005): Set-based vector model: An efficient approach for correlation-based ranking. In: ACM Transactions on Information Systems, 23 (4) pp. 397-429.
This work presents a new approach for ranking documents in the vector space model. The novelty lies in two fronts. First, patterns of term co-occurrence are taken into account and are processed efficiently. Second, term weights are generated using a data mining technique called association rules. This leads to a new ranking mechanism called the set-based vector model. The components of our model are no longer index terms but index termsets, where a termset is a set of index terms. Termsets capture the intuition that semantically related terms appear close to each other in a document. They can be efficiently obtained by limiting the computation to small passages of text. Once termsets have been computed, the ranking is calculated as a function of the termset frequency in the document and its scarcity in the document collection. Experimental results show that the set-based vector model improves average precision for all collections and query types evaluated, while keeping computational costs small. For the 2-gigabyte TREC-8 collection, the set-based vector model leads to a gain in average precision figures of 14.7% and 16.4% for disjunctive and conjunctive queries, respectively, with respect to the standard vector space model. These gains increase to 24.9% and 30.0%, respectively, when proximity information is taken into account. Query processing times are larger but, on average, still comparable to those obtained with the standard vector model (increases in processing time varied from 30% to 300%). Our results suggest that the set-based vector model provides a correlation-based ranking formula that is effective with general collections and computationally practical.
© All rights reserved Possas et al. and/or ACM Press
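Ranking by termsets can be illustrated with a small sketch. It assumes a tf-idf-like weight in which a termset's frequency in a document is the count of its rarest member term and its scarcity is a logarithmic inverse document frequency. These formulas and all names are illustrative assumptions, not the weighting defined in the paper, and proximity information is ignored.

```python
import math
from collections import Counter
from itertools import combinations


def query_termsets(query_terms):
    """Yield every non-empty subset of the query terms as a frozenset."""
    terms = sorted(set(query_terms))
    for size in range(1, len(terms) + 1):
        for combo in combinations(terms, size):
            yield frozenset(combo)


def termset_frequency(termset, counts):
    """Assumption: a termset occurs as often as its rarest member term."""
    return min(counts.get(term, 0) for term in termset)


def rank(docs, query_terms):
    """Return document ids sorted by descending termset-based score.

    docs maps a document id to its list of tokens.
    """
    counts = {doc_id: Counter(tokens) for doc_id, tokens in docs.items()}
    n_docs = len(docs)
    scores = dict.fromkeys(docs, 0.0)
    for termset in query_termsets(query_terms):
        freqs = {d: termset_frequency(termset, c) for d, c in counts.items()}
        doc_freq = sum(1 for f in freqs.values() if f > 0)
        if doc_freq == 0:
            continue
        # Scarcity: termsets found in fewer documents weigh more.
        scarcity = math.log(1 + n_docs / doc_freq)
        for doc_id, freq in freqs.items():
            if freq > 0:
                scores[doc_id] += (1 + math.log(freq)) * scarcity
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "d1": ["set", "based", "model"],
    "d2": ["set", "theory"],
    "d3": ["vector", "space"],
}
ranking = rank(docs, ["set", "based"])
```

Here "d1" ranks first because it matches the two-term termset as well as both single terms, which is how co-occurrence contributes beyond independent term weights.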
Possas, Bruno, Ziviani, Nivio, Meira, Wagner and Ribeiro-Neto, Berthier A. (2002): Set-based model: a new approach for information retrieval. In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2002. pp. 230-237.
The objective of this paper is to present a new technique for computing term weights for index terms, which leads to a new ranking mechanism, referred to as the set-based model. The components in our model are no longer terms, but termsets. The novelty is that we compute term weights using a data mining technique called association rules, which is time efficient and yet yields nice improvements in retrieval effectiveness. The set-based model function for computing the similarity between a document and a query considers the termset frequency in the document and its scarcity in the document collection. Experimental results show that our model improves the average precision of the answer set for all three collections evaluated. For the TREC-3 collection, our set-based model led to a gain, relative to the standard vector space model, of 37% in average precision curves and of 57% in average precision for the top 10 documents. Like the vector space model, the set-based model has time complexity that is linear in the number of documents in the collection.
© All rights reserved Possas et al. and/or ACM Press
Saraiva, Patricia Correia, Moura, Edleno Silva de, Ziviani, Nivio, Meira, Wagner, Fonseca, Rodrigo and Ribeiro-Neto, Berthier A. (2001): Rank-preserving two-level caching for scalable search engines. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2001. pp. 51-58.
Changes to this page (author)
04 Apr 2012: Modified
04 Apr 2012: Modified
03 Nov 2010: Modified
09 Jul 2009: Modified
08 Apr 2009: Modified
24 Jun 2007: Modified
24 Jun 2007: Modified
23 Jun 2007: Added
Page maintainer: The Editorial Team