Joseph Hellerstein


Publications by Joseph Hellerstein (bibliography)

Abouzied, Azza, Hellerstein, Joseph and Silberschatz, Avi (2012): DataPlay: interactive tweaking and example-driven correction of graphical database queries. In: Proceedings of the 2012 ACM Symposium on User Interface Software and Technology 2012. pp. 207-218. http://dx.doi.org/10.1145/2380116.2380144

Writing complex queries in SQL is a challenge for users. Prior work has developed several techniques to ease query specification but none of these techniques are applicable to a particularly difficult class of queries: quantified queries. Our hypothesis is that users prefer to specify quantified queries interactively by trial-and-error. We identify two impediments to this form of interactive trial-and-error query specification in SQL: (i) changing quantifiers often requires global syntactical query restructuring, and (ii) the absence of non-answers from SQL's results makes verifying query correctness difficult. We remedy these issues with DataPlay, a query tool with an underlying graphical query language, a unique data model and a graphical interface. DataPlay provides two interaction features that support trial-and-error query specification. First, DataPlay allows users to directly manipulate a graphical query by changing quantifiers and modifying dependencies between constraints. Users receive real-time feedback in the form of updated answers and non-answers. Second, DataPlay can auto-correct a user's query, based on user feedback about which tuples to keep or drop from the answers and non-answers. We evaluated the effectiveness of each interaction feature with a user study and we found that direct query manipulation is more effective than auto-correction for simple queries but auto-correction is more effective than direct query manipulation for more complex queries.

© All rights reserved Abouzied et al. and/or ACM Press

Willett, Wesley, Heer, Jeffrey, Hellerstein, Joseph and Agrawala, Maneesh (2011): CommentSpace: structured support for collaborative visual analysis. In: Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems 2011. pp. 3131-3140. http://dx.doi.org/10.1145/1978942.1979407

Collaborative visual analysis tools can enhance sensemaking by facilitating social interpretation and parallelization of effort. These systems enable distributed exploration and evidence gathering, allowing many users to pool their effort as they discuss and analyze the data. We explore how adding lightweight tag and link structure to comments can aid this analysis process. We present CommentSpace, a collaborative system in which analysts comment on visualizations and websites and then use tags and links to organize findings and identify others' contributions. In a pair of studies comparing CommentSpace to a system without support for tags and links, we find that a small, fixed vocabulary of tags (question, hypothesis, to-do) and links (evidence-for, evidence-against) helps analysts more consistently and accurately classify evidence and establish common ground. We also find that managing and incentivizing participation is important for analysts to progress from exploratory analysis to deeper analytical tasks. Finally, we demonstrate that tags and links can help teams complete evidence gathering and synthesis tasks and that organizing comments using tags and links improves analytic results.

© All rights reserved Willett et al. and/or their publisher

Kandel, Sean, Paepcke, Andreas, Hellerstein, Joseph and Heer, Jeffrey (2011): Wrangler: interactive visual specification of data transformation scripts. In: Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems 2011. pp. 3363-3372. http://dx.doi.org/10.1145/1978942.1979444

Though data analysis tools continue to improve, analysts still expend an inordinate amount of time and effort manipulating data and assessing data quality issues. Such "data wrangling" regularly involves reformatting data values or layout, correcting erroneous or missing values, and integrating multiple data sources. These transforms are often difficult to specify and difficult to reuse across analysis tasks, teams, and tools. In response, we introduce Wrangler, an interactive system for creating data transformations. Wrangler combines direct manipulation of visualized data with automatic inference of relevant transforms, enabling analysts to iteratively explore the space of applicable operations and preview their effects. Wrangler leverages semantic data types (e.g., geographic locations, dates, classification codes) to aid validation and type conversion. Interactive histories support review, refinement, and annotation of transformation scripts. User study results show that Wrangler significantly reduces specification time and promotes the use of robust, auditable transforms instead of manual editing.

© All rights reserved Kandel et al. and/or their publisher

