Publication statistics

Pub. period: 1993-2012
Pub. count: 67
Number of co-authors: 80



Co-authors

Number of publications with Shumin Zhai's 3 most frequent co-authors:

Barton A. Smith: 8
Johnny Accot: 7
Paul Milgram: 7

 

 

Productive colleagues

Shumin Zhai's 3 most productive colleagues, by number of publications:

Bill Buxton: 78
Andy Cockburn: 68
Michel Beaudouin-Lafon: 53
 
 
 




Shumin Zhai

Ph.D

Picture of Shumin Zhai.
Has also published under the names:
"S. Zhai" and "Shu Min Zhai"

Personal Homepage:
http://www.shuminzhai.com/

Current place of employment:
Google Inc.

Shumin Zhai is a Human-Computer Interaction research scientist at Google Inc. Prior to joining Google Research, he worked at the IBM Almaden Research Center from 1996 to 2011. He is interested in both foundational issues of user interfaces and practical product and service innovations. He originated and led the ShapeWriter project that pioneered the touch-screen gesture keyboard paradigm. He has published over 100 research papers and received 30 patents. He is active in the academic community and is currently the Editor-in-Chief of ACM Transactions on Computer-Human Interaction. He has been a visiting professor and has lectured at universities in the US, Europe and China. He is a Fellow of the ACM and a Member of the CHI Academy.


Publications by Shumin Zhai (bibliography)

2012

Bi, Xiaojun, Smith, Barton A. and Zhai, Shumin (2012): Multilingual Touchscreen Keyboard Design and Optimization. In Eminds International Journal of Human Computer Interaction, 27 (4) pp. 352-382.

A keyboard design, once adopted, tends to have a long-lasting and worldwide impact on daily user experience. There is a substantial body of research on touch-screen stylus keyboard optimization. Most of it has focused on English only. Applying rigorous mathematical optimization methods and addressing diacritic character design issues, this article expands this body of work to French, Spanish, German, and Chinese. More important and counter to the intuition that optimization by nature is necessarily specific to each language, this article demonstrates that it is possible to find common layouts that are highly optimized across multiple languages for stylus (or single finger) typing. We first obtained a layout that is highly optimized for both English and French input. We then obtained a layout that is optimized for English, French, Spanish, German, and Chinese pinyin simultaneously, reducing its stylus travel distance to about half of QWERTY's for all of the five languages. In comparison to QWERTY's 3.31, 3.51, 3.7, 3.26, and 3.85 keys of movement for English, French, Spanish, German, and Chinese, respectively, the optimized multilingual layout has an average travel distance of 1.88, 1.86, 1.91, 1.77, and 1.68 keys, correspondingly. Applying Fitts's law with parameters validated by a word tapping experiment, we show that these multilingual keyboards also significantly reduce text input time for multiple languages over the standard QWERTY for experienced users. In comparison to layouts individually optimized for each language, which are also obtained in this article, simultaneously optimizing for multiple languages caused only a minor performance degradation for each language. This surprising result could help to reduce the burden of multilingual users having to switch and learn new layouts for different languages. In addition, we also present and analyze multiple ways of incorporating diacritic characters on multilingual keyboards. Taken together, the present work provides a quantitative foundation for the understanding and designing of multilingual touch-screen keyboards.

© All rights reserved Bi et al. and/or Universidad de Oviedo
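
The travel-distance figures quoted above (e.g., 1.88 keys for the optimized layout versus 3.31 for QWERTY in English) are frequency-weighted averages of key-to-key distances over a language's digraph statistics. A minimal sketch of that computation, using a made-up three-key layout and invented digraph frequencies purely for illustration:

```python
import math

# Hypothetical key centers (x, y) in key-width units for a 3-key toy layout;
# not any of the paper's layouts.
layout = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 1.0)}

# Invented digraph frequencies for illustration; the paper's optimization uses
# large language-specific corpora for each of the five languages.
digraph_freq = {("a", "b"): 0.5, ("b", "c"): 0.3, ("a", "c"): 0.2}

def avg_travel_distance(layout, digraph_freq):
    """Frequency-weighted mean key-to-key distance, in key widths."""
    total = 0.0
    for (k1, k2), freq in digraph_freq.items():
        x1, y1 = layout[k1]
        x2, y2 = layout[k2]
        total += freq * math.hypot(x2 - x1, y2 - y1)
    return total / sum(digraph_freq.values())

print(round(avg_travel_distance(layout, digraph_freq), 3))  # toy value, about 1.37
```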

 

Azenkot, Shiri and Zhai, Shumin (2012): Touch behavior with different postures on soft smartphone keyboards. In: Proceedings of the 14th Conference on Human-computer interaction with mobile devices and services 2012. pp. 251-260.

Text entry on smartphones is far slower and more error-prone than on traditional desktop keyboards, despite sophisticated detection and auto-correct algorithms. To strengthen the empirical and modeling foundation of smartphone text input improvements, we explore touch behavior on soft QWERTY keyboards when used with two thumbs, an index finger, and one thumb. We collected text entry data from 32 participants in a lab study and describe touch accuracy and precision for different keys. We found that distinct patterns exist for input among the three hand postures, suggesting that keyboards should adapt to different postures. We also discovered that participants' touch precision was relatively high given typical key dimensions, but there were pronounced and consistent touch offsets that can be leveraged by keyboard algorithms to correct errors. We identify patterns in our empirical findings and discuss implications for design and improvements of soft keyboards.

© All rights reserved Azenkot and Zhai and/or ACM Press
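
One way the reported "pronounced and consistent touch offsets" could be leveraged, sketched here rather than taken from the authors' algorithm: shift each key's expected touch location by its empirically measured mean offset before deciding which key a tap belongs to. The key geometry and offset values below are invented for illustration.

```python
# Hypothetical key centers (x, y) in mm and per-key mean touch offsets (dx, dy)
# estimated from calibration data; all numbers are illustrative only.
KEY_CENTERS = {"q": (5.0, 10.0), "w": (15.0, 10.0), "e": (25.0, 10.0)}
MEAN_OFFSETS = {"q": (1.2, -0.8), "w": (0.9, -1.0), "e": (0.7, -1.1)}

def corrected_key(touch_x, touch_y):
    """Pick the key whose offset-shifted center is nearest to the touch point."""
    best_key, best_d2 = None, float("inf")
    for key, (cx, cy) in KEY_CENTERS.items():
        dx, dy = MEAN_OFFSETS[key]
        ex, ey = cx + dx, cy + dy          # where touches for this key tend to land
        d2 = (touch_x - ex) ** 2 + (touch_y - ey) ** 2
        if d2 < best_d2:
            best_key, best_d2 = key, d2
    return best_key

print(corrected_key(16.1, 9.2))  # -> "w" with these illustrative numbers
```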

 

Bi, Xiaojun, Chelba, Ciprian, Ouyang, Tom, Partridge, Kurt and Zhai, Shumin (2012): Bimanual gesture keyboard. In: Proceedings of the 2012 ACM Symposium on User Interface Software and Technology 2012. pp. 137-146.

Gesture keyboards represent an increasingly popular way to input text on mobile devices today. However, current gesture keyboards are exclusively unimanual. To take advantage of the capability of modern multi-touch screens, we created a novel bimanual gesture text entry system, extending the gesture keyboard paradigm from one finger to multiple fingers. To address the complexity of recognizing bimanual gesture, we designed and implemented two related interaction methods, finger-release and space-required, both based on a new multi-stroke gesture recognition algorithm. A formal experiment showed that bimanual gesture behaviors were easy to learn. They improved comfort and reduced the physical demand relative to unimanual gestures on tablets. The results indicated that these new gesture keyboards were valuable complements to unimanual gesture and regular typing keyboards.

© All rights reserved Bi et al. and/or ACM Press

2011

Weaver, Kimberly A., Yang, Huahai, Zhai, Shumin and Pierce, Jeff (2011): Understanding information preview in mobile email processing. In: Proceedings of 13th Conference on Human-computer interaction with mobile devices and services 2011. pp. 303-312.

Browsing a collection of information on a mobile device is a common task, yet it can be difficult due to the small size of mobile displays. A common trade-off offered by many current mobile interfaces is to allow users to switch between an overview and detailed views of particular items. An open question is how much preview of each item to include in the overview. Using a mobile email processing task, we attempted to answer that question. We investigated participants' email processing behaviors under differing preview conditions in a semi-controlled, naturalistic study. We collected log data of participants' actual behaviors as well as their subjective impressions of different conditions. Our results suggest that a moderate level of two to three lines of preview should be the default. The overall benefit of a moderate amount of preview was supported by both positive subjective ratings and fewer transitions between the overview and individual items.

© All rights reserved Weaver et al. and/or ACM Press

 

Bao, Patti, Pierce, Jeffrey, Whittaker, Stephen and Zhai, Shumin (2011): Smart phone use by non-mobile business users. In: Proceedings of 13th Conference on Human-computer interaction with mobile devices and services 2011. pp. 445-454.

The rapid increase in smart phone capabilities has introduced new opportunities for mobile information access and computing. However, smart phone use may still be constrained by both device affordances and work environments. To understand how current business users employ smart phones and to identify opportunities for improving business smart phone use, we conducted two studies of actual and perceived performance of standard work tasks. Our studies involved 243 smart phone users from a large corporation. We intentionally chose users who primarily work with desktops and laptops, as these "non-mobile" users represent the largest population of business users. Our results go beyond the general intuition that smart phones are better for consuming than producing information: we provide concrete measurements that show how fast reading is on phones and how much slower and more effortful text entry is on phones than on computers. We also demonstrate that security mechanisms are a significant barrier to wider business smart phone use. We offer design suggestions to overcome these barriers.

© All rights reserved Bao et al. and/or ACM Press

2010

Wang, Jingtao, Zhai, Shumin and Canny, John (2010): SHRIMP: solving collision and out of vocabulary problems in mobile predictive input with motion gesture. In: Proceedings of ACM CHI 2010 Conference on Human Factors in Computing Systems 2010. pp. 15-24.

Dictionary-based disambiguation (DBD) is a very popular solution for text entry on mobile phone keypads but suffers from two problems: 1. the resolution of encoding collision (two or more words sharing the same numeric key sequence) and 2. entering out-of-vocabulary (OOV) words. In this paper, we present SHRIMP, a system and method that addresses these two problems by integrating DBD with camera based motion sensing that enables the user to express preference through a tilting or movement gesture. SHRIMP (Small Handheld Rapid Input with Motion and Prediction) runs on camera phones equipped with a standard 12-key keypad. SHRIMP maintains the speed advantage of DBD driven predictive text input while enabling the user to overcome DBD collision and OOV problems seamlessly without even a mode switch. An initial empirical study demonstrates that SHRIMP can be learned very quickly, performs immediately faster than MultiTap and handles OOV words more efficiently than DBD.

© All rights reserved Wang et al. and/or their publisher
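
The "encoding collision" problem that SHRIMP targets is easy to make concrete: on a 12-key keypad every word maps to a digit sequence, and distinct words that share a sequence collide. A minimal sketch with a toy dictionary (not SHRIMP itself, which resolves such collisions with motion gestures):

```python
from collections import defaultdict

# Standard phone keypad letter groups.
KEYPAD = {2: "abc", 3: "def", 4: "ghi", 5: "jkl",
          6: "mno", 7: "pqrs", 8: "tuv", 9: "wxyz"}
LETTER_TO_DIGIT = {ch: str(d) for d, letters in KEYPAD.items() for ch in letters}

def encode(word):
    """Map a word to its numeric key sequence."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

def collisions(dictionary):
    """Group words by key sequence; groups of size > 1 are collisions."""
    groups = defaultdict(list)
    for word in dictionary:
        groups[encode(word)].append(word)
    return {seq: words for seq, words in groups.items() if len(words) > 1}

# Toy dictionary for illustration only.
print(collisions(["home", "good", "gone", "hood", "hone", "cat"]))
# "home", "good", "gone", "hood" and "hone" all encode to "4663" -> collision
```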

 

Bi, Xiaojun, Smith, Barton A. and Zhai, Shumin (2010): Quasi-qwerty soft keyboard optimization. In: Proceedings of ACM CHI 2010 Conference on Human Factors in Computing Systems 2010. pp. 283-286.

It has been well understood that optimized soft keyboard layouts improve motor movement efficiency over the standard Qwerty layouts, but have the drawback of long initial visual search time for novice users. To ease the initial searching time on optimized soft keyboards, we explored "Quasi-Qwerty optimization" so that the resulting layouts are close to Qwerty. Our results show that a middle ground between the optimized but new, and the familiar (Qwerty) but inefficient does exist. We show that by allowing letters to move at most one step (key) away from their original positions on Qwerty in an optimization process, one can achieve about half of what free optimization could gain in movement efficiency. An experiment shows that due to users' familiarity with Qwerty, a layout with quasi Qwerty optimization could significantly reduce novice user's visual search time to a level between those of Qwerty and a freely optimized layout. The results in this work provide designers with a new quantitative understanding of the soft keyboard design space.

© All rights reserved Bi et al. and/or their publisher

 

Yin, Jibin, Ren, Xiangshi and Zhai, Shumin (2010): Pen pressure control in trajectory-based interaction. In Behaviour and Information Technology, 29 (2) pp. 137-148.

This study presents a series of three experiments that evaluate human capabilities and limitations in using pen-tip pressure as an additional channel of control information in carrying out trajectory tasks such as drawing, writing and gesturing on computer screen. The first experiment measured the natural range of force used in regular drawing and writing tasks. The second experiment tested human performance of maintaining pen-tip pressure at different levels with and without a visual display of the pen pressure. The third experiment, using the steering law paradigm, studied path steering performance as a function of the steering law index of difficulty, steering path type (linear and circular) and pressure precision tolerance interval. The main conclusions of our investigation are as follows. The natural range of pressure used in drawing and writing is concentrated in the 0.82 N (SI force unit Newton (N) is used in this article) to 3.16 N region. The resting force of the pen tip on the screen is between 0.78 N and 1.58 N. Pressure near or below the resting force is markedly more difficult to control. Visual feedback improves pressure-modulated trajectory tasks. Up to six layers of pressure can be controlled in steering tasks, but the error rate changed from 4.9% for one layer of pressure to 35% for six layers. The steering law holds for pressure steering tasks, which enables systematic prediction of successful steering time for a given path's length, width and pressure precision criterion. The steering time can also be modelled as a logarithmic function of pressure control precision ratio s. Taken together, the current work provides a systematic body of empirical knowledge as basis for future research and design of digital pen applications.

© All rights reserved Yin et al. and/or Taylor and Francis

 

Apitz, Georg, Guimbretiere, Francois and Zhai, Shumin (2010): Foundations for designing and evaluating user interfaces based on the crossing paradigm. In ACM Transactions on Computer-Human Interaction, 17 (2) p. 9.

Traditional graphical user interfaces have been designed with the desktop mouse in mind, a device well characterized by Fitts' law. Yet in recent years, hand-held devices and tablet personal computers using a pen (or fingers) as the primary mean of interaction have become more and more popular. These new interaction modalities have pushed the traditional focus on pointing to its limit. In this paper we explore whether a different paradigm -- goal crossing-based on pen strokes -- may substitute or complement pointing as another fundamental interaction method. First we describe a study in which we establish that goal crossing is dependent on an index of difficulty analogous to Fitts' law, and that in some settings, goal crossing completion time is shorter or comparable to pointing performance under the same index of difficulty. We then demonstrate the expressiveness of the crossing-based interaction paradigm by implementing CrossY, an application which only uses crossing for selecting commands. CrossY demonstrates that crossing-based interactions can be more expressive than the standard point and click approach. We also show how crossing-based interactions encourage the fluid composition of commands. Finally after observing that users' performance could be influenced by the general direction of travel, we report on the results of a study characterizing this effect. These latter results led us to propose a general guideline for dialog box interaction. Together, these results provide the foundation for the design of effective crossing-based interactions.

© All rights reserved Apitz et al. and/or ACM Press

2009

Lee, Seungyon and Zhai, Shumin (2009): The performance of touch screen soft buttons. In: Proceedings of ACM CHI 2009 Conference on Human Factors in Computing Systems 2009. pp. 309-318.

The introduction of a new generation of attractive touch screen-based devices raises many basic usability questions whose answers may influence future design and market direction. With a set of current mobile devices, we conducted three experiments focusing on one of the most basic interaction actions on touch screens: the operation of soft buttons. Issues investigated in this set of experiments include: a comparison of soft button and hard button performance; the impact of audio and vibrotactile feedback; the impact of different types of touch sensors on use, behavior, and performance; a quantitative comparison of finger and stylus operation; and an assessment of the impact of soft button sizes below the traditional 22 mm recommendation as well as below finger width.

© All rights reserved Lee and Zhai and/or ACM Press

 

Appert, Caroline and Zhai, Shumin (2009): Using strokes as command shortcuts: cognitive benefits and toolkit support. In: Proceedings of ACM CHI 2009 Conference on Human Factors in Computing Systems 2009. pp. 2289-2298.

This paper investigates using stroke gestures as shortcuts to menu selection. We first experimentally measured the performance and ease of learning of stroke shortcuts in comparison to keyboard shortcuts when there is no mnemonic link between the shortcut and the command. While both types of shortcuts had the same level of performance with enough practice, stroke shortcuts had substantial cognitive advantages in learning and recall. With the same amount of practice, users could successfully recall more shortcuts and make fewer errors with stroke shortcuts than with keyboard shortcuts. The second half of the paper focuses on UI development support and articulates guidelines for toolkits to implement stroke shortcuts in a wide range of software applications. We illustrate how to apply these guidelines by introducing the Stroke Shortcuts Toolkit (SST) which is a library for adding stroke shortcuts to Java Swing applications with just a few lines of code.

© All rights reserved Appert and Zhai and/or ACM Press

 

Zhai, Shumin, Kristensson, Per Ola, Gong, Pengjun, Greiner, Michael, Peng, Shilei Allen, Liu, Liang Mico and Dunnigan, Anthony (2009): Shapewriter on the iPhone: from the laboratory to the real world. In: Proceedings of ACM CHI 2009 Conference on Human Factors in Computing Systems 2009. pp. 2667-2670.

We present our experience in bringing ShapeWriter, a novel HCI research product, from the laboratory to real world users through iPhone's App Store.

© All rights reserved Zhai et al. and/or ACM Press

2008

Zhai, Shumin and Kristensson, Per-Ola (2008): Interlaced QWERTY: accommodating ease of visual search and input flexibility in shape writing. In: Proceedings of ACM CHI 2008 Conference on Human Factors in Computing Systems April 5-10, 2008. pp. 593-596.

Shape writing is an input technology for touch-screen mobile phones and pen-tablets. To shape write text, the user spells out word patterns by sliding a finger or stylus over a graphical keyboard. The user's trace is then recognized by a pattern recognizer. In this paper we analyze and evaluate various keyboard layouts, including alphabetic, optimized (ATOMIK), QWERTY, and interlaced QWERTY for shape writing. The goodness of a layout for shape writing has two aspects. For users' initial ease of use the letters should be easy to visually locate. For long term use, however, the layout should maximize the imprecision tolerance and writing flexibility for all words. We present empirical studies for the former and mathematical analyses for the latter. Our results led to a new layout, interlaced QWERTY, which offers excellent separation of word shapes, while still maintaining a low visual search time. Many of the findings in our study also apply to traditional soft keyboards tapped with a stylus or one finger.

© All rights reserved Zhai and Kristensson and/or ACM Press

 

Kristensson, Per Ola and Zhai, Shumin (2008): Improving word-recognizers using an interactive lexicon with active and passive words. In: Proceedings of the 2008 International Conference on Intelligent User Interfaces 2008. pp. 353-356.

The words a user is likely to write comprise the user's active vocabulary. This vocabulary is considerably smaller than the passive vocabulary of words a user reads. We explore an interactive adaptive lexicon method that separates a large lexicon into active and passive sets, and gradually expands and adapts the active set to reflect the user's active vocabulary. The adaptation is achieved through lightweight interaction as a by product of actual use. The effectiveness of the technique is demonstrated through a computational experiment and a user study.

© All rights reserved Kristensson and Zhai and/or ACM Press

 

Zhai, Shumin (2008): On the ease and efficiency of human-computer interfaces. In: Räihä, Kari-Jouko and Duchowski, Andrew T. (eds.) ETRA 2008 - Proceedings of the Eye Tracking Research and Application Symposium March 26-28, 2008, Savannah, Georgia, USA. pp. 9-10.

2007

Kristensson, Per-Ola and Zhai, Shumin (2007): Command strokes with and without preview: using pen gestures on keyboard for command selection. In: Proceedings of ACM CHI 2007 Conference on Human Factors in Computing Systems 2007. pp. 1137-1146.

This paper presents a new command selection method that provides an alternative to pull-down menus in pen-based mobile interfaces. Its primary advantage is the ability for users to directly select commands from a very large set without the need to traverse menu hierarchies. The proposed method maps the character strings representing the commands onto continuous pen-traces on a stylus keyboard. The user enters a command by stroking part of its character string. We call this method "command strokes." We present the results of three experiments assessing the usefulness of the technique. The first experiment shows that command strokes are 1.6 times faster than the de-facto standard pull-down menus and that users find command strokes more fun to use. The second and third experiments investigate the effect of displaying a visual preview of the currently recognized command while the user is still articulating the command stroke. These experiments show that visual preview does not slow users down and leads to significantly lower error rates and shorter gestures when users enter new unpracticed commands.

© All rights reserved Kristensson and Zhai and/or ACM Press

 

Cao, Xiang and Zhai, Shumin (2007): Modeling human performance of pen stroke gestures. In: Proceedings of ACM CHI 2007 Conference on Human Factors in Computing Systems 2007. pp. 1495-1504.

This paper presents a quantitative human performance model of making single-stroke pen gestures within certain error constraints in terms of production time. Computed from the properties of Curves, Line segments, and Corners (CLC) in a gesture stroke, the model may serve as a foundation for the design and evaluation of existing and future gesture-based user interfaces at the basic motor control efficiency level, similar to the role of previous "laws of action" played to pointing, crossing or steering-based user interfaces. We report and discuss our experimental results on establishing and validating the CLC model, together with other basic empirical findings in stroke gesture production.

© All rights reserved Cao and Zhai and/or ACM Press
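
The CLC idea, predicting gesture production time from the Curves, Line segments, and Corners that compose a stroke, can be sketched as a sum of per-element terms. The functional forms and coefficients below are placeholders meant only to show the structure; the paper reports the actual fitted model.

```python
import math

# Placeholder per-element time functions; the real CLC model fits specific
# functional forms and coefficients to experimental data (see the paper).
def t_line(length_cm, m=0.1, n=0.5):
    """Hypothetical power-law production time (s) for a straight segment."""
    return m * length_cm ** n

def t_curve(arc_len_cm, radius_cm, k=0.15):
    """Hypothetical production time (s) for a circular-arc element."""
    return k * arc_len_cm / math.sqrt(radius_cm)

def t_corner():
    """Corners treated as a small fixed cost in this sketch."""
    return 0.02

def clc_time(elements):
    """Sum per-element times for a stroke described as (kind, params) tuples."""
    dispatch = {"line": t_line, "curve": t_curve, "corner": t_corner}
    return sum(dispatch[kind](*params) for kind, params in elements)

# A stroke made of a 3 cm line, a corner, then a 2 cm arc of radius 1 cm.
stroke = [("line", (3.0,)), ("corner", ()), ("curve", (2.0, 1.0))]
print(round(clc_time(stroke), 3))
```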

 

Cockburn, Andy, Kristensson, Per-Ola, Alexander, Jason and Zhai, Shumin (2007): Hard lessons: effort-inducing interfaces benefit spatial learning. In: Proceedings of ACM CHI 2007 Conference on Human Factors in Computing Systems 2007. pp. 1571-1580.

Interface designers normally strive for a design that minimises the user's effort. However, when the design's objective is to train users to interact with interfaces that are highly dependent on spatial properties (e.g. keypad layout or gesture shapes) we contend that designers should consider explicitly increasing the mental effort of interaction. To test the hypothesis that effort aids spatial memory, we designed a "frost-brushing" interface that forces the user to mentally retrieve spatial information, or to physically brush away the frost to obtain visual guidance. We report results from two experiments using virtual keypad interfaces -- the first concerns spatial location learning of buttons on the keypad, and the second concerns both location and trajectory learning of gesture shape. The results support our hypothesis, showing that the frost-brushing design improved spatial learning. The participants' subjective responses emphasised the connections between effort, engagement, boredom, frustration, and enjoyment, suggesting that effort requires careful parameterisation to maximise its effectiveness.

© All rights reserved Cockburn et al. and/or ACM Press

2006

Yin, Min and Zhai, Shumin (2006): The benefits of augmenting telephone voice menu navigation with visual browsing and search. In: Proceedings of ACM CHI 2006 Conference on Human Factors in Computing Systems 2006. pp. 319-328.

Automatic interactive voice response (IVR) based telephone routing has long been recognized as a frustrating interaction experience. This paper presents a series of experiments examining the benefits of augmenting telephone voice menus with coordinated visual displays and keyword search. The first experiment qualitatively studied callers' experience of having a visual menu on a screen in synchronization with the telephone voice menu tree navigation. The second experiment quantitatively measured callers' performance in time and accuracy with and without visual display augmentation. The third experiment tested keyword search in comparison to visual browsing of telephone menu trees. Study participants uniformly and enthusiastically liked the visual augmentation of voice menus. On average with visual augmentation callers could navigate phone trees 36% faster with 75% fewer errors, and made choices ahead of the voice menu over 60% of the time. Search vs. browsing had similar navigation performance but offered different and complementary user experiences. Overall our studies conclude that telephone voice menu navigation can be significantly improved with a visual channel augmentation, resulting in both business cost reduction and user experience satisfaction.

© All rights reserved Yin and Zhai and/or ACM Press

 

Wang, Jingtao, Zhai, Shumin and Canny, John (2006): Camera phone based motion sensing: interaction techniques, applications and performance study. In: Proceedings of the ACM Symposium on User Interface Software and Technology 2006. pp. 101-110.

This paper presents TinyMotion, a pure software approach for detecting a mobile phone user's hand movement in real time by analyzing image sequences captured by the built-in camera. We present the design and implementation of TinyMotion and several interactive applications based on TinyMotion. Through both an informal evaluation and a formal 17-participant user study, we found that 1. TinyMotion can detect camera movement reliably under most background and illumination conditions. 2. Target acquisition tasks based on TinyMotion follow Fitts' law and Fitts law parameters can be used for TinyMotion based pointing performance measurement. 3. The users can use Vision TiltText, a TinyMotion enabled input method, to enter sentences faster than MultiTap with a few minutes of practicing. 4. Using camera phone as a handwriting capture device and performing large vocabulary, multilingual real time handwriting recognition on the cell phone are feasible. 5. TinyMotion based gaming is enjoyable and immediately available for the current generation camera phones. We also report user experiences and problems with TinyMotion based interaction as resources for future design and development of mobile interfaces.

© All rights reserved Wang et al. and/or ACM Press
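
TinyMotion's "pure software approach" amounts to estimating frame-to-frame displacement from consecutive camera images. A rough illustration of the general idea using sum-of-absolute-differences block matching; this is not TinyMotion's actual implementation, whose details are in the paper.

```python
import numpy as np

def estimate_motion(prev, curr, search=4):
    """Estimate the (dx, dy) displacement of curr relative to prev by block
    matching: compare a central block of prev against shifted windows of curr
    and keep the shift with the smallest mean absolute difference."""
    h, w = prev.shape
    block = prev[search:h - search, search:w - search].astype(int)
    best, best_shift = float("inf"), (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            window = curr[search + dy:h - search + dy,
                          search + dx:w - search + dx].astype(int)
            mad = np.abs(block - window).mean()
            if mad < best:
                best, best_shift = mad, (dx, dy)
    return best_shift

# Synthetic check: the second "frame" is the first panned right by 2 pixels.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, (32, 32))
frame2 = np.roll(frame1, 2, axis=1)
print(estimate_motion(frame1, frame2))  # -> (2, 0)
```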

2005

Yin, Min and Zhai, Shumin (2005): Dial and see: tackling the voice menu navigation problem with cross-device user experience integration. In: Proceedings of the 2005 ACM Symposium on User Interface Software and Technology 2005. pp. 187-190.

IVR (interactive voice response) menu navigation has long been recognized as a frustrating interaction experience. We propose an IM-based system that sends a coordinated visual IVR menu to the caller's computer screen. The visual menu is updated in real time in response to the caller's actions. With this automatically opened supplementary channel, callers can take advantage of different modalities over different devices and interact with the IVR system with the ease of graphical menu selection. Our approach of utilizing existing network infrastructure to pinpoint the caller's virtual location and coordinating multiple devices and multiple channels based on users' ID registration can also be more generally applied to create integrated user experiences across a group of devices.

© All rights reserved Yin and Zhai and/or ACM Press

 

Zhai, Shumin, Kristensson, Per-Ola and Smith, Barton A. (2005): In search of effective text input interfaces for off the desktop computing. In Interacting with Computers, 17 (3) pp. 229-250.

It is generally recognized that today's frontier of HCI research lies beyond the traditional desktop computers whose GUI interfaces were built on the foundation of display -- pointing device -- full keyboard. Many interface challenges arise without such a physical UI foundation. Text writing -- ranging from entering URLs and search queries, filling forms, typing commands, to taking notes and writing emails and chat messages -- is one of the hard problems awaiting solutions in off-desktop computing. This paper summarizes and synthesizes a research program on this topic at the IBM Almaden Research Center. It analyzes various dimensions that constitute a good text input interface; briefly reviews related literature; discusses the evaluation methodology issues of text input; presents the major ideas and results of two systems, ATOMIK and SHARK; and points out current and future directions in the area from our current vantage point.

© All rights reserved Zhai et al. and/or Elsevier Science

 

Qvarfordt, Pernilla and Zhai, Shumin (2005): Conversing with the user based on eye-gaze patterns. In: Proceedings of ACM CHI 2005 Conference on Human Factors in Computing Systems 2005. pp. 221-230.

Motivated by and grounded in observations of eye-gaze patterns in human-human dialogue, this study explores using eye-gaze patterns in managing human-computer dialogue. We developed an interactive system, iTourist, for city trip planning, which encapsulated knowledge of eye-gaze patterns gained from studies of human-human collaboration systems. User study results show that it was possible to sense users' interest based on eye-gaze patterns and manage computer information output accordingly. Study participants could successfully plan their trip with iTourist and positively rated their experience of using it. We demonstrate that eye-gaze could play an important role in managing future multimodal human-computer dialogues.

© All rights reserved Qvarfordt and Zhai and/or ACM Press

 

Qvarfordt, P., Beymer, D. and Zhai, Shumin (2005): RealTourist - A Study of Augmenting Human-Human and Human-Computer Dialogue with Eye-Gaze Overlay. In: Proceedings of IFIP INTERACT05: Human-Computer Interaction 2005. pp. 767-780.

We developed and studied an experimental system, RealTourist, which lets a user plan a conference trip with the help of a remote tourist consultant who could view the tourist's eye-gaze superimposed onto a shared map. Data collected from the experiment were analyzed in conjunction with a literature review on speech and eye-gaze patterns. This inspective, exploratory research identified various functions of gaze-overlay on shared spatial material including: accurate and direct display of partner's eye-gaze, implicit deictic referencing, interest detection, common focus and topic switching, increased redundancy and ambiguity reduction, and an increase of assurance, confidence, and understanding. This study serves two purposes. The first is to identify patterns that can serve as a basis for designing multimodal human-computer dialogue systems with eye-gaze locus as a contributing channel. The second is to investigate how computer-mediated communication can be supported by the display of the partner's eye-gaze.

© All rights reserved Qvarfordt et al. and/or Springer Verlag

2004

Ingmarsson, Magnus, Dinka, David and Zhai, Shumin (2004): TNT: a numeric keypad based text input method. In: Dykstra-Erickson, Elizabeth and Tscheligi, Manfred (eds.) Proceedings of ACM CHI 2004 Conference on Human Factors in Computing Systems April 24-29, 2004, Vienna, Austria. pp. 639-646.

With the evolving functionality in television-based (TV-based) information and entertainment appliances, there is an increased need to enable users to input text through remote control devices. We present a novel text input method, The Numpad Typer (TNT), for interactive TV, multimedia home terminals or other similar applications. Embodied in a TV remote control and guided by a visual map on the TV screen, TNT was designed for consistent spatial Stimuli-Response (S-R) compatibility and consistency of use. Five users tested TNT in ten 45-minute sessions. This initial investigation showed that users on average could type 9.3 and 17.7 correct words per minute with TNT in the slowest and the fastest sessions, respectively. The study also showed that the users found the TNT method easy to grasp and fun to use. Subjectively the participants felt they mastered the method rather quickly in comparison to their actual speed improvement.

© All rights reserved Ingmarsson et al. and/or ACM Press

 

Lee, Paul Ung-Joon and Zhai, Shumin (2004): Top-down learning strategies: can they facilitate stylus keyboard learning?. In International Journal of Human-Computer Studies, 60 (5) pp. 585-598.

Learning a new stylus keyboard layout is time-consuming yet potentially rewarding, as optimized virtual keyboards can substantially increase performance for expert users. This paper explores whether the learning curve can be accelerated using top-down learning strategies. In an experiment, one group of participants learned a stylus keyboard layout with top-down methods, such as visuo-spatial grouping of letters and mnemonic techniques, to build familiarity with a stylus keyboard. The other (control) group learned the keyboard by typing sentences. The top-down learning group liked the stylus keyboard better and perceived it to be more effective than the control group. They also had better memory recall performance. Typing performance after the top-down learning process was faster than the initial performance of the control group, but not different from the performance of the control group after they had spent an equivalent amount of time typing. Therefore, top-down learning strategies improved the explicit recall as expected, but the improved memory of the keyboard did not result in quicker typing speeds. These results suggest that quicker acquisition of declarative knowledge does not improve the acquisition speed of procedural knowledge, even during the initial cognitive stage of the virtual keyboard learning. They also suggest that top-down learning strategies can motivate users to learn a new keyboard more than repetitive rehearsal, without any loss in typing performance.

© All rights reserved Lee and Zhai and/or Academic Press

 

Kristensson, Per-Ola and Zhai, Shumin (2004): SHARK2: a large vocabulary shorthand writing system for pen-based computers. In: Proceedings of the 2004 ACM Symposium on User Interface Software and Technology 2004. pp. 43-52.

Zhai and Kristensson (2003) presented a method of speed-writing for pen-based computing which utilizes gesturing on a stylus keyboard for familiar words and tapping for others. In SHARK2, we eliminated the necessity to alternate between the two modes of writing, allowing any word in a large vocabulary (e.g. 10,000-20,000 words) to be entered as a shorthand gesture. This new paradigm supports a gradual and seamless transition from visually guided tracing to recall-based gesturing. Based on the use characteristics and human performance observations, we designed and implemented the architecture, algorithms and interfaces of a high-capacity multi-channel pen-gesture recognition system. The system's key components and performance are also reported.

© All rights reserved Kristensson and Zhai and/or ACM Press
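
At the heart of recognizing a shorthand word gesture is comparing the user's trace against each candidate word's ideal trace, the polyline through that word's key centers. A heavily simplified sketch follows (resample both traces to a fixed number of points and average the point-to-point distances); SHARK2 itself combines several recognition channels and a large lexicon, none of which are reproduced here, and the key positions and trace below are invented.

```python
import math

# Hypothetical key-center coordinates (x, y) for a few keys; illustration only.
KEYS = {"t": (4.5, 0.0), "h": (5.5, 1.0), "e": (2.5, 0.0),
        "a": (0.5, 1.0), "n": (5.5, 2.0)}

def resample(points, n=32):
    """Resample a polyline to n points spaced evenly along its arc length."""
    dists = [0.0]
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x2 - x1, y2 - y1))
    total = dists[-1] or 1.0
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while j < len(dists) - 2 and dists[j + 1] < target:
            j += 1
        seg = dists[j + 1] - dists[j] or 1.0
        t = (target - dists[j]) / seg
        (x1, y1), (x2, y2) = points[j], points[j + 1]
        out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out

def shape_distance(trace, word):
    """Mean distance between the resampled trace and the word's ideal
    template, i.e. the polyline through its key centers."""
    template = [KEYS[ch] for ch in word]
    a, b = resample(trace), resample(template)
    return sum(math.hypot(ax - bx, ay - by)
               for (ax, ay), (bx, by) in zip(a, b)) / len(a)

# A noisy trace roughly following t -> h -> e, scored against two candidates.
trace = [(4.4, 0.1), (5.4, 0.9), (4.0, 0.6), (2.6, 0.1)]
for word in ("the", "than"):
    print(word, round(shape_distance(trace, word), 2))  # "the" scores lower
```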

 

Zhai, Shumin (2004): Characterizing computer input with Fitts' law parameters -- the information and non-information aspects of pointing. In International Journal of Human-Computer Studies, 61 (6) pp. 791-809.

Throughput (TP), also known as index of performance or bandwidth in Fitts' law tasks, has been a fundamental metric in quantifying input system performance. The operational definition of TP is varied in the literature. In part thanks to the common interpretations of International Standard ISO 9241-9, the "Ergonomic requirements for office work with visual display terminals -- Part 9: Requirements for non-keyboard input devices", the measurements of throughput have increasingly converged onto the average ratio of index of difficulty (ID) and trial completion time (MT), i.e. TP=ID/MT. In lieu of the complete Fitts' law regression results that can only be represented by both slope (b) and intercept (a) (or MT=a+b ID), TP has been used as the sole performance characteristic of input devices, which is problematic. We show that TP defined as ID/MT is an ill-defined concept that may change its value with the set of ID values used for the same input device and cannot be generalized beyond specific experimental target distances and sizes. The greater the absolute value of a is, the more variable TP (=ID/MT) is. ID/MT only equals a constant 1/b when a=0. We suggest that future studies should use the complete Fitts' law regression characterized by (a, b) parameters to characterize an input system. a reflects the non-informational aspect and b the informational aspect of input performance. For convenience, 1/b can be named as throughput which, unlike ID/MT, is conceptually a true constant.

© All rights reserved Zhai and/or Academic Press
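
The paper's central point, that TP defined as ID/MT is not a device constant whenever the Fitts' law intercept a is nonzero, can be checked numerically in a few lines with made-up parameters:

```python
# Suppose a device follows Fitts' law MT = a + b * ID (seconds, bits).
# With a nonzero intercept a, the ratio ID/MT depends on which IDs are tested.
a, b = 0.3, 0.15  # illustrative parameters, not measurements of any real device

for ID in (1, 2, 4, 6, 8):
    MT = a + b * ID
    print(f"ID={ID} bits  MT={MT:.2f} s  ID/MT={ID / MT:.2f} bits/s  1/b={1 / b:.2f}")
# ID/MT climbs from about 2.2 bits/s at ID=1 to about 5.3 bits/s at ID=8,
# approaching the constant 1/b (about 6.7) only as a's contribution vanishes.
```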

 

Zhai, Shumin, Kong, Jing and Ren, Xiangshi (2004): Speed-accuracy tradeoff in Fitts' law tasks -- on the equivalency of actual and nominal pointing precision. In International Journal of Human-Computer Studies, 61 (6) pp. 823-856.

Pointing tasks in human-computer interaction obey certain speed-accuracy tradeoff rules. In general, the more accurate the task to be accomplished, the longer it takes and vice versa. Fitts' law models the speed-accuracy tradeoff effect in pointing as imposed by the task parameters, through Fitts' index of difficulty (I_d) based on the ratio of the nominal movement distance and the size of the target. Operating with different speed or accuracy biases, performers may utilize more or less area than the target specifies, introducing another subjective layer of speed-accuracy tradeoff relative to the task specification. A conventional approach to overcome the impact of the subjective layer of speed-accuracy tradeoff is to use the a posteriori "effective" pointing precision W_e in lieu of the nominal target width W. Such an approach has lacked a theoretical or empirical foundation. This study investigates the nature and the relationship of the two layers of speed-accuracy tradeoff by systematically controlling both I_d and the index of target utilization I_u in a set of four experiments. Their results show that the impacts of the two layers of speed-accuracy tradeoff are not fundamentally equivalent. The use of W_e could indeed compensate for the difference in target utilization, but not completely. More logical Fitts' law parameter estimates can be obtained by the W_e adjustment, although its use also lowers the correlation between pointing time and the index of difficulty. The study also shows the complex interaction effect between I_d and I_u, suggesting that a simple and complete model accommodating both layers of speed-accuracy tradeoff may not exist.

© All rights reserved Zhai et al. and/or Academic Press
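
The a posteriori effective precision W_e referred to above is conventionally computed from the spread of observed endpoints (W_e = 4.133 times the endpoint standard deviation, the usual 4%-error normalization) and substituted for the nominal W in the index of difficulty. A minimal sketch with fabricated endpoint data:

```python
import math
import statistics

def effective_width(endpoints):
    """W_e = 4.133 * SD of endpoint coordinates along the movement axis
    (the conventional 96%-hits normalization used with Fitts' law)."""
    return 4.133 * statistics.stdev(endpoints)

def index_of_difficulty(distance, width):
    """Shannon formulation: ID = log2(D / W + 1)."""
    return math.log2(distance / width + 1)

# Fabricated endpoint x-coordinates (mm) for a nominal 10 mm target centered
# at x = 100 mm, movement amplitude 80 mm; illustration only.
endpoints = [98.1, 101.4, 99.6, 102.3, 97.5, 100.9, 99.0, 101.8]
W_e = effective_width(endpoints)
print(f"nominal ID   = {index_of_difficulty(80, 10):.2f} bits")
print(f"effective ID = {index_of_difficulty(80, W_e):.2f} bits (W_e = {W_e:.1f} mm)")
```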

 

Zhai, Shumin, Accot, Johnny and Woltjer, Rogier (2004): Human Action Laws in Electronic Virtual Worlds - An Empirical Study of Path Steering Performance in VR. In Presence: Teleoperators and Virtual Environments, 13 (2) pp. 113-127.

 

Guiard, Yves, Beaudouin-Lafon, Michel, Bastin, Julien, Pasveer, Dennis and Zhai, Shumin (2004): View size and pointing difficulty in multi-scale navigation. In: Costabile, Maria Francesca (ed.) AVI 2004 - Proceedings of the working conference on Advanced visual interfaces May 25-28, 2004, Gallipoli, Italy. pp. 117-124.

2003

Zhai, Shumin and Kristensson, Per-Ola (2003): Shorthand writing on stylus keyboard. In: Cockton, Gilbert and Korhonen, Panu (eds.) Proceedings of the ACM CHI 2003 Human Factors in Computing Systems Conference April 5-10, 2003, Ft. Lauderdale, Florida, USA. pp. 97-104.

 

Albinsson, Par-Anders and Zhai, Shumin (2003): High precision touch screen interaction. In: Cockton, Gilbert and Korhonen, Panu (eds.) Proceedings of the ACM CHI 2003 Human Factors in Computing Systems Conference April 5-10, 2003, Ft. Lauderdale, Florida, USA. pp. 105-112.

 

Zhai, Shumin, Conversy, Stephane, Beaudouin-Lafon, Michel and Guiard, Yves (2003): Human on-line response to target expansion. In: Cockton, Gilbert and Korhonen, Panu (eds.) Proceedings of the ACM CHI 2003 Human Factors in Computing Systems Conference April 5-10, 2003, Ft. Lauderdale, Florida, USA. pp. 177-184.

 

Accot, Johnny and Zhai, Shumin (2003): Refining Fitts' law models for bivariate pointing. In: Cockton, Gilbert and Korhonen, Panu (eds.) Proceedings of the ACM CHI 2003 Human Factors in Computing Systems Conference April 5-10, 2003, Ft. Lauderdale, Florida, USA. pp. 193-200.

 

Sallnas, Eva-Lotta and Zhai, Shumin (2003): Collaboration Meets Fitts' Law: Passing Virtual Objects with and without Haptic Force Feedback. In: Proceedings of IFIP INTERACT03: Human-Computer Interaction 2003, Zurich, Switzerland. p. 97.

 

Ren, Xiangshi, Tamura, Kinya, Kong, Jing and Zhai, Shumin (2003): Candidate Display Styles in Japanese Input. In: Proceedings of IFIP INTERACT03: Human-Computer Interaction 2003, Zurich, Switzerland. p. 868.

 

Zhai, Shumin (2003): What's in the eyes for attentive input. In Communications of the ACM, 46 (3) pp. 34-39.

 

Zhai, Shumin and Woltjer, Rogier (2003): Human Movement Performance in Relation to Path Constraint - The Law of Steering in Locomotion. In: IEEE Virtual Reality Conference 2003 VR 2003 22-26 March, 2003, Los Angeles, CA, USA. pp. 149-.

2002

Zhai, Shumin, Sue, Alison and Accot, Johnny (2002): Movement model, hits distribution and learning in virtual keyboarding. In: Terveen, Loren (ed.) Proceedings of the ACM CHI 2002 Conference on Human Factors in Computing Systems Conference April 20-25, 2002, Minneapolis, Minnesota. pp. 17-24.

 

Accot, Johnny and Zhai, Shumin (2002): More than dotting the i's -- foundations for crossing-based interfaces. In: Terveen, Loren (ed.) Proceedings of the ACM CHI 2002 Conference on Human Factors in Computing Systems Conference April 20-25, 2002, Minneapolis, Minnesota. pp. 73-80.

 

Zhai, Shumin, Hunter, Michael and Smith, Barton A. (2002): Performance Optimization of Virtual Keyboards. In Human-Computer Interaction, 17 (2) pp. 229-269.

Text entry has been a bottleneck of nontraditional computing devices. One of the promising methods is the virtual keyboard for touch screens. Correcting previous estimates on virtual keyboard efficiency in the literature, we estimated the potential performance of the existing Qwerty, FITALY, and OPTI designs of virtual keyboards to be in the neighborhood of 28, 36, and 38 words per minute (wpm), respectively. This article presents 2 quantitative design techniques to search for virtual keyboard layouts. The first technique simulated the dynamics of a keyboard with digraph springs between keys, which produced a Hooke keyboard with 41.6 wpm movement efficiency. The second technique used a Metropolis random walk algorithm guided by a "Fitts-digraph energy" objective function that quantifies the movement efficiency of a virtual keyboard. This method produced various Metropolis keyboards with different shapes and structures with approximately 42.5 wpm movement efficiency, which was 50% higher than Qwerty and 10% higher than OPTI. With a small reduction (41.16 wpm) of movement efficiency, we introduced 2 more design objectives that produced the ATOMIK layout. One was alphabetical tuning that placed the keys with a tendency from A to Z so a novice user could more easily locate the keys. The other was word connectivity enhancement so the most frequent words were easier to find, remember, and type.

© All rights reserved Zhai et al. and/or Taylor and Francis
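
The "Metropolis random walk algorithm guided by a Fitts-digraph energy objective function" can be sketched as follows: repeatedly propose a swap of two keys, evaluate the frequency-weighted Fitts time of the resulting layout, and accept the swap unconditionally if it lowers the energy or with a Boltzmann probability otherwise. Everything below (toy alphabet, key grid, digraph table, Fitts coefficients, temperature) is a stand-in for the paper's actual corpus and parameters.

```python
import math
import random

LETTERS = "abcdefgh"                                      # toy alphabet
POSITIONS = [(x, y) for y in range(2) for x in range(4)]  # 2 x 4 key grid

# Invented digraph frequencies; the real optimization uses English corpus data.
random.seed(1)
DIGRAPHS = {(a, b): random.random() for a in LETTERS for b in LETTERS if a != b}

FITTS_A, FITTS_B = 0.083, 0.127   # illustrative Fitts coefficients (s, s/bit)
KEY_WIDTH = 1.0

def fitts_digraph_energy(layout):
    """Frequency-weighted mean Fitts movement time over all digraphs."""
    pos = dict(zip(layout, POSITIONS))
    total = weight = 0.0
    for (c1, c2), f in DIGRAPHS.items():
        (x1, y1), (x2, y2) = pos[c1], pos[c2]
        d = math.hypot(x2 - x1, y2 - y1)
        total += f * (FITTS_A + FITTS_B * math.log2(d / KEY_WIDTH + 1))
        weight += f
    return total / weight

def metropolis(steps=20000, temperature=0.002):
    layout = list(LETTERS)
    energy = fitts_digraph_energy(layout)
    for _ in range(steps):
        i, j = random.sample(range(len(layout)), 2)
        layout[i], layout[j] = layout[j], layout[i]        # propose a key swap
        new_energy = fitts_digraph_energy(layout)
        accept = (new_energy < energy
                  or random.random() < math.exp((energy - new_energy) / temperature))
        if accept:
            energy = new_energy
        else:
            layout[i], layout[j] = layout[j], layout[i]    # undo the swap
    return layout, energy

best_layout, best_energy = metropolis()
print("".join(best_layout), round(best_energy, 4))
```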

2001

Accot, Johnny and Zhai, Shumin (2001): Scale Effects in Steering Law Tasks. In: Beaudouin-Lafon, Michel and Jacob, Robert J. K. (eds.) Proceedings of the ACM CHI 2001 Human Factors in Computing Systems Conference March 31 - April 5, 2001, Seattle, Washington, USA. pp. 1-8.

Interaction tasks on a computer screen can technically be scaled to a much larger or much smaller sized input control area by adjusting the input device's control gain or the control-display (C-D) ratio. However, human performance as a function of movement scale is not a well concluded topic. This study introduces a new task paradigm to study the scale effect in the framework of the steering law. The results confirmed a U-shaped performance-scale function and rejected straight-line or no-effect hypotheses in the literature. We found a significant scale effect in path steering performance, although its impact was less than that of the steering law's index of difficulty. We analyzed the scale effects in two plausible causes: movement joints shift and motor precision limitation. The theoretical implications of the scale effects to the validity of the steering law, and the practical implications of input device size and zooming functions are discussed in the paper.

© All rights reserved Accot and Zhai and/or ACM Press

 

Wang, Jingtao, Zhai, Shumin and Su, Hui (2001): Chinese Input with Keyboard and Eye-Tracking: An Anatomical Study. In: Beaudouin-Lafon, Michel and Jacob, Robert J. K. (eds.) Proceedings of the ACM CHI 2001 Human Factors in Computing Systems Conference March 31 - April 5, 2001, Seattle, Washington, USA. pp. 349-356.

Chinese input presents unique challenges to the field of human-computer interaction. This study provides an anatomical analysis of today's standard Chinese input process, which is based on pinyin, a phonetic spelling system in Roman characters. Through a combination of human performance modeling and experimentation, our study decomposed the Chinese input process into sub-tasks and found that choice reaction time and numeric keying, two components resulting from the large number of homophones in Chinese, were the major usability bottlenecks. Choice reaction alone took 36% of the total input time in our experiment. Numeric keying for multiple-candidate selection tends to take the user's attention away from the computer screen. We designed and implemented the EASE (Eye Assisted Selection and Entry) system to help maintain a complete touch-typing experience without diverting visual attention, using a single key (spacebar) and implicit eye-tracking to replace the numeric keystrokes. Our experiment showed that such a system could indeed work, even with today's imperfect eye-tracking technology.

© All rights reserved Wang et al. and/or ACM Press

 

Smith, B. A. and Zhai, Shumin (2001): Optimised Virtual Keyboards with and without Alphabetical Ordering - A Novice User Study. In: Proceedings of IFIP INTERACT01: Human-Computer Interaction 2001, Tokyo, Japan. pp. 92-99.

 

Fels, D. I., Waalen, J. K., Zhai, Shumin and Weiss, P. (2001): Telepresence under Exceptional Circumstances: Enriching the Connection to School for Sick Children. In: Proceedings of IFIP INTERACT01: Human-Computer Interaction 2001, Tokyo, Japan. pp. 617-624.

 

Matlock, T., Campbell, Christopher S., Maglio, Paul P., Zhai, Shumin and Smith, B. (2001): Designing Feedback for an Attentive Office. In: Proceedings of IFIP INTERACT01: Human-Computer Interaction 2001, Tokyo, Japan. pp. 721-722.

2000

Zhai, Shumin, Hunter, Michael and Smith, Barton A. (2000): The Metropolis Keyboard -- An Exploration of Quantitative Techniques for Virtual Keyboard Design. In: Ackerman, Mark S. and Edwards, Keith (eds.) Proceedings of the 13th annual ACM symposium on User interface software and technology November 06 - 08, 2000, San Diego, California, United States. pp. 119-128.

 

Maglio, Paul P., Matlock, Teenie, Campbell, Christopher S., Zhai, Shumin and Smith, Barton A. (2000): Gaze and Speech in Attentive User Interfaces. In: Tan, Tieniu, Shi, Yuanchun and Gao, Wen (eds.) Advances in Multimodal Interfaces - ICMI 2000 - Third International Conference October 14-16, 2000, Beijing, China. pp. 1-7.

 

Smith, Barton A., Ho, Janet, Ark, Wendy S. and Zhai, Shumin (2000): Hand eye coordination patterns in target selection. In: Duchowski, Andrew T. (ed.) ETRA 2000 - Proceedings of the Eye Tracking Research and Application Symposium November 6-8, 2000, Palm Beach Gardens, Florida, USA. pp. 117-122.

1999

Zhai, Shumin, Morimoto, Carlos and Ihde, Steven (1999): Manual and Gaze Input Cascaded (MAGIC) Pointing. In: Altom, Mark W. and Williams, Marian G. (eds.) Proceedings of the ACM CHI 99 Human Factors in Computing Systems Conference May 15-20, 1999, Pittsburgh, Pennsylvania. pp. 246-253.

This work explores a new direction in utilizing eye gaze for computer input. Gaze tracking has long been considered as an alternative or potentially superior pointing method for computer input. We believe that many fundamental limitations exist with traditional gaze pointing. In particular, it is unnatural to overload a perceptual channel such as vision with a motor control task. We therefore propose an alternative approach, dubbed MAGIC (Manual And Gaze Input Cascaded) pointing. With such an approach, pointing appears to the user to be a manual task, used for fine manipulation and selection. However, a large portion of the cursor movement is eliminated by warping the cursor to the eye gaze area, which encompasses the target. Two specific MAGIC pointing techniques, one conservative and one liberal, were designed, analyzed, and implemented with an eye tracker we developed. They were then tested in a pilot study. This early-stage exploration showed that the MAGIC pointing techniques might offer many advantages, including reduced physical effort and fatigue as compared to traditional manual pointing, greater accuracy and naturalness than traditional gaze pointing, and possibly faster speed than manual pointing. The pros and cons of the two techniques are discussed in light of both performance data and subjective reports.

© All rights reserved Zhai et al. and/or ACM Press
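
One plausible reading of the two MAGIC variants, sketched as a small decision rule under assumed data structures and an assumed threshold (this is not the authors' implementation): the liberal design warps the cursor to every new gaze fixation area, while the conservative design waits until manual movement has begun.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Point:
    x: float
    y: float

def dist(p: Point, q: Point) -> float:
    return ((p.x - q.x) ** 2 + (p.y - q.y) ** 2) ** 0.5

@dataclass
class MagicState:
    cursor: Point
    last_fixation: Optional[Point] = None

WARP_THRESHOLD = 120.0  # px; assumed value, warp only when gaze is far from the cursor

def magic_update(state: MagicState, fixation: Point, hand_moving: bool,
                 liberal: bool = True) -> MagicState:
    """Warp the cursor to the vicinity of the gaze fixation; fine positioning
    and selection stay manual. The liberal variant warps on every new fixation
    area; the conservative variant additionally waits for manual movement."""
    new_area = (state.last_fixation is None
                or dist(fixation, state.last_fixation) > WARP_THRESHOLD)
    if new_area and dist(fixation, state.cursor) > WARP_THRESHOLD:
        if liberal or hand_moving:
            state.cursor = Point(fixation.x, fixation.y)  # warp near the target
    state.last_fixation = fixation
    return state

state = MagicState(cursor=Point(0, 0))
state = magic_update(state, Point(800, 400), hand_moving=False, liberal=True)
print(state.cursor)  # cursor warped toward the gaze area
```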

 

Accot, Johnny and Zhai, Shumin (1999): Performance Evaluation of Input Devices in Trajectory-Based Tasks: An Application of The Steering Law. In: Altom, Mark W. and Williams, Marian G. (eds.) Proceedings of the ACM CHI 99 Human Factors in Computing Systems Conference May 15-20, 1999, Pittsburgh, Pennsylvania. pp. 466-472.

Choosing input devices for interactive systems that best suit user's needs remains a challenge, especially considering the increasing number of devices available. The choice often has to be made through empirical evaluations. The most frequently used evaluation task hitherto is target acquisition, a task that can be accurately modeled by Fitts' law. However, today's use of computer input devices has gone beyond target acquisition alone. In particular, we often need to perform trajectory-based tasks, such as drawing, writing, and navigation. This paper illustrates how a recently discovered model, the steering law, can be applied as an evaluation paradigm complementary to Fitts' law. We tested five commonly used computer input devices in two steering tasks, one linear and one circular. Results showed that subjects' performance with the five devices could be generally classified into three groups in the following order: 1. the tablet and the mouse, 2. the trackpoint, 3. the touchpad and the trackball. The steering law proved to hold for all five devices with greater than 0.98 correlation. The ability to generalize the experimental results and the limitations of the steering law are also discussed.

© All rights reserved Accot and Zhai and/or ACM Press

 

Zhai, Shumin, Kandogan, Eser, Smith, Barton A. and Selker, Ted (1999): In Search of the 'Magic Carpet': Design and Experimentation of a Bimanual 3D Navigation Interface. In Journal of Visual Languages and Computing, 10 (1) pp. 3-17.

1998

Zhai, Shumin and Milgram, Paul (1998): Quantifying Coordination in Multiple DOF Movement and its Application to Evaluating 6 DOF Input Devices. In: Karat, Clare-Marie, Lund, Arnold, Coutaz, Joëlle and Karat, John (eds.) Proceedings of the ACM CHI 98 Human Factors in Computing Systems Conference April 18-23, 1998, Los Angeles, California. pp. 320-327.

Study of computer input devices has primarily focused on trial completion time and target acquisition errors. To deepen our understanding of input devices, particularly those with high degrees of freedom (DOF), this paper explores device influence on the user's ability to coordinate controlled movements in a 3D interface. After reviewing various existing methods, a new measure of quantifying coordination in multiple degrees of freedom, based on movement efficiency, is proposed and applied to the evaluation of two 6 DOF devices: a free-moving position-control device and a desk-top rate-controlled hand controller. Results showed that while the users of the free moving device had shorter completion time than the users of an elastic rate controller, their movement trajectories were less coordinated. These new findings should better inform system designers on development and selection of input devices. Issues such as mental rotation and isomorphism vs. tools operation as means of computer input are also discussed.

© All rights reserved Zhai and Milgram and/or ACM Press

 

Ark, Wendy, Dryer, D. Christopher, Selker, Ted and Zhai, Shumin (1998): Representation Matters: The Effect of 3D Objects and a Spatial Metaphor in a Graphical User Interface. In: Johnson, Hilary, Nigay, Laurence and Roast, C. R. (eds.) Proceedings of the Thirteenth Conference of the British Computer Society Human Computer Interaction Specialist Group - People and Computers XIII August 1-4, 1998, Sheffield, UK. pp. 209-219.

As computer graphical user interfaces (GUIs) are loaded with increasingly greater numbers of objects, researchers in HCI are forced to look for the next step in constructing user interface. In this paper, we examine the effects of employing more 'natural' representations in GUIs. In particular, we experimentally assess the impact of object form (2D iconic versus 3D realistic) and layout (regular versus ecological) have on target acquisition time. Results indicate that both form and layout significantly affect performance; subjects located targets more quickly when using interfaces with 3D objects and ecological layouts than they do with 2D objects and regular layouts. An interface with an ecological layout, realistic objects, or both may be an improvement over traditional interfaces.

© All rights reserved Ark et al. and/or Springer Verlag

 

Leganchuk, Andrea, Zhai, Shumin and Buxton, Bill (1998): Manual and Cognitive Benefits of Two-Handed Input: An Experimental Study. In ACM Transactions on Computer-Human Interaction, 5 (4) pp. 326-359.

One of the recent trends in computer input is to utilize users' natural bimanual motor skills. This article further explores the potential benefits of such two-handed input. We have observed that bimanual manipulation may bring two types of advantages to human-computer interaction: manual and cognitive. Manual benefits come from increased time-motion efficiency, due to twice as many degrees of freedom being simultaneously available to the user. Cognitive benefits arise from reducing the load of mentally composing and visualizing the task at the unnaturally low level imposed by traditional unimanual techniques. Area sweeping was selected as our experimental task. It is representative of what one encounters, for example, when sweeping out the bounding box surrounding a set of objects in a graphics program. Such tasks cannot be modeled by Fitts' law alone and have not been previously studied in the literature. In our experiments, two bimanual techniques were compared with the conventional one-handed GUI approach. Both bimanual techniques employed the two-handed "stretchy" technique first demonstrated by Krueger in 1983. We also incorporated the "Toolglass" technique introduced by Bier et al. in 1993. Overall, the bimanual techniques resulted in significantly faster performance than the status quo one-handed technique, and these benefits increased with the difficulty of mentally visualizing the task, supporting our bimanual cognitive advantage hypothesis. There was no significant difference between the two bimanual techniques. This study makes two types of contributions to the literature. First, practically, we studied yet another class of transaction where significant benefits can be realized by applying bimanual techniques, and we have done so using easily available commercial hardware. Second, the results add to our understanding of why bimanual interaction techniques have an advantage over unimanual techniques. A literature review on two-handed computer input and some of the most relevant bimanual human motor control studies is also included.

© All rights reserved Leganchuk et al. and/or ACM Press

1997
 

Accot, Johnny and Zhai, Shumin (1997): Beyond Fitts' Law: Models for Trajectory-Based HCI Tasks. In: Pemberton, Steven (ed.) Proceedings of the ACM CHI 97 Human Factors in Computing Systems Conference March 22-27, 1997, Atlanta, Georgia. pp. 295-302.

Trajectory-based interactions, such as navigating through nested menus, drawing curves, and moving in 3D worlds, are becoming common tasks in modern computer interfaces. Users' performance in these tasks cannot be successfully modeled with Fitts' law as it has been applied to pointing tasks. We therefore explore the possible existence of robust regularities in trajectory-based tasks. We used "steering through tunnels" as our experimental paradigm to represent such tasks, and found that a simple "steering law" indeed exists. The paper presents the motivation, analysis, a series of four experiments, and applications of the steering law.

© All rights reserved Accot and Zhai and/or ACM Press
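
For reference, the steering law arrived at in this work models the time T to steer through a path C of varying width W(s) with an integral index of difficulty; for a straight tunnel of length A and constant width W the index becomes linear in A/W, in contrast with the logarithmic index of Fitts' law for pointing. In generic notation (a and b are empirically fitted constants, not specific values from the paper):

    T = a + b \int_C \frac{ds}{W(s)}               (steering along a path C of width W(s))
    T = a + b \cdot \frac{A}{W}                    (straight tunnel of length A and width W)
    T = a + b \log_2\!\left(\frac{A}{W} + 1\right) (Fitts' law for pointing, for comparison)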

 

Salem, Chris and Zhai, Shumin (1997): An Isometric Tongue Pointing Device. In: Pemberton, Steven (ed.) Proceedings of the ACM CHI 97 Human Factors in Computing Systems Conference March 22-27, 1997, Atlanta, Georgia. pp. 538-539.

To provide an alternative means of computer input, we designed an isometric, tongue-operated device: the Tonguepoint. The design rationale and a preliminary experiment are presented in this technical note. Results show that, after 30 minutes of practice and adjustment, subjects could use the Tonguepoint at a performance level that was only 5-50% slower than finger-operated isometric pointing. Further improvements are expected.

© All rights reserved Salem and Zhai and/or ACM Press

 Cited in the following chapter:

Fitts's Law: [/encyclopedia/fitts_law.html]


 
1996
 

Zhai, Shumin, Milgram, Paul and Buxton, Bill (1996): The Influence of Muscle Groups on Performance of Multiple Degree-of-Freedom Input. In: Tauber, Michael J., Bellotti, Victoria, Jeffries, Robin, Mackinlay, Jock D. and Nielsen, Jakob (eds.) Proceedings of the ACM CHI 96 Human Factors in Computing Systems Conference April 14-18, 1996, Vancouver, Canada. pp. 308-315.

The literature has long suggested that the design of computer input devices should make use of the fine, smaller muscle groups and joints in the fingers, since they are richly represented in the human motor and sensory cortex and they have higher information processing bandwidth than other body parts. This hypothesis, however, has not been conclusively verified with empirical research. The present work studied such a hypothesis in the context of designing 6 degree-of-freedom (DOF) input devices. The work attempts to address both a practical need -- designing efficient 6 DOF input devices -- and the theoretical issue of muscle group differences in input control. Two alternative 6 DOF input devices, one including and the other excluding the fingers from the 6 DOF manipulation, were designed and tested in a 3D object docking experiment. Users' task completion times were significantly shorter with the device that utilised the fingers. The results of this study strongly suggest that the shape and size of future input device designs should constitute affordances that invite finger participation in input control.

© All rights reserved Zhai et al. and/or ACM Press

 

Zhai, Shumin, Buxton, Bill and Milgram, Paul (1996): The Partial-Occlusion Effect: Utilizing Semitransparency in 3D Human-Computer Interaction. In ACM Transactions on Computer-Human Interaction, 3 (3) pp. 254-284.

This study investigates human performance when using semitransparent tools in interactive 3D computer graphics environments. The article briefly reviews techniques for presenting depth information and examples of applying semitransparency in computer interface design. We hypothesize that when the user moves a semitransparent surface in a 3D environment, the "partial-occlusion" effect introduced through semitransparency acts as an effective cue in target localization -- an essential component in many 3D interaction tasks. This hypothesis was tested in an experiment in which subjects were asked to capture dynamic targets (virtual fish) with two versions of a 3D box cursor, one with and one without semitransparent surfaces. Results showed that the partial-occlusion effect through semitransparency significantly improved users' performance in terms of trial completion time, error rate, and error magnitude in both monoscopic and stereoscopic displays. Subjective evaluations supported the conclusions drawn from performance measures. The experimental results and their implications are discussed, with emphasis on the relative, discrete nature of the partial-occlusion effect and on interactions between different depth cues. The article concludes with proposals of a few future research issues and applications of semitransparency in human-computer interaction.

© All rights reserved Zhai et al. and/or ACM Press

1994
 

Zhai, Shumin, Buxton, Bill and Milgram, Paul (1994): The "Silk Cursor": Investigating Transparency for 3D Target Acquisition. In: Adelson, Beth, Dumais, Susan and Olson, Judith S. (eds.) Proceedings of the ACM CHI 94 Human Factors in Computing Systems Conference April 24-28, 1994, Boston, Massachusetts. pp. 459-464.

This study investigates dynamic 3D target acquisition. The focus is on the relative effect of specific perceptual cues. A novel technique is introduced and we report on an experiment that evaluates its effectiveness. There are two aspects to the new technique. First, in contrast to normal practice, the tracking symbol is a volume rather than a point. Second, the surface of this volume is semi-transparent, thereby affording occlusion cues during target acquisition. The experiment shows that the volume/occlusion cues were effective in both monocular and stereoscopic conditions. For some tasks where stereoscopic presentation is unavailable or infeasible, the new technique offers an effective alternative.

© All rights reserved Zhai et al. and/or ACM Press

 

Zhai, Shumin and Milgram, Paul (1994): Asymmetrical Spatial Accuracy in 3D Tracking. In: Proceedings of the Human Factors and Ergonomics Society 38th Annual Meeting 1994. pp. 245-249.

This paper reports on the asymmetrical spatial accuracy of human subjects when tracking an object that moves randomly with 6 degrees of freedom (DOF) in a 3D environment. It was found that, for translational errors, RMS deviations in the depth (Z) direction were 40% higher than those in the horizontal (X) direction, for an experimental display that provided binocular disparity (stereopsis), perspective, and partial-occlusion cues. In general, translational tracking errors in the vertical (Y) direction were greater than those in the X direction and smaller than those in the Z direction. In early stages of practice, vertical errors were similar to those in the Z direction, but as learning progressed, errors in the X and Y dimensions converged. These findings were consistent across two types of controllers and different tracking paths in the 3D environment. It would appear that horizontal movement takes higher attentional resource priority than vertical movement in such a tracking task.

© All rights reserved Zhai and Milgram and/or Human Factors Society
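
The RMS deviations referred to above are per-axis root-mean-square tracking errors. Below is a minimal sketch of that computation under generic assumptions (uniformly sampled cursor and target positions); it is an illustration, not the authors' analysis code.

import numpy as np

def per_axis_rms_error(cursor, target):
    """cursor, target: (N, 3) arrays of sampled positions over a trial.
    Returns the RMS tracking error separately for the X, Y, and Z axes."""
    err = np.asarray(cursor, dtype=float) - np.asarray(target, dtype=float)
    return np.sqrt(np.mean(err ** 2, axis=0))

# Hypothetical samples, purely for illustration.
cursor = [[0.0, 0.1, 0.3], [1.0, 0.9, 1.4], [2.1, 2.0, 2.5]]
target = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
print(per_axis_rms_error(cursor, target))  # here the Z (depth) error is largest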

1993
 

Drascic, David, Grodski, Julius J., Milgram, Paul, Ruffo, Ken, Wong, Peter and Zhai, Shumin (1993): ARGOS: A Display System for Augmenting Reality. In: Ashlund, Stacey, Mullet, Kevin, Henderson, Austin, Hollnagel, Erik and White, Ted (eds.) Proceedings of the ACM CHI 93 Human Factors in Computing Systems Conference April 24-29, 1993, Amsterdam, The Netherlands. p. 521.

This video describes the development of the ARGOS (Augmented Reality through Graphic Overlays on Stereovideo) system, as a tool for enhancing human-telerobot interaction, and as a more general tool with applications in a variety of areas, including image enhancement, simulation, sensor fusion, and virtual reality.

© All rights reserved Drascic et al. and/or ACM Press

 

Waterworth, John A., Chignell, Mark and Zhai, Shumin (1993): From Icons to Interface Models: Designing Hypermedia from the Bottom Up. In International Journal of Man-Machine Studies, 39 (3) pp. 453-472.

We describe a method to derive design models for hypermedia interfaces from the bottom up. Firstly, we compile a list of hypermedia interface features which we classify according to the category of functions they fulfill. We then describe an experiment in which candidate designs for low-level interface features were designed and tested for recognizability. In the experiment, icons for each of 61 hypermedia concepts were generated and then judged. Finally, we outline and illustrate a model induction phase in which low-level features are combined into an overall interface model, via "micro-models" that take account of the types of icons that worked best for each class of interface feature. We suggest that, at least for hypermedia systems, a bottom-up approach to interface design based on the functions of low-level features is preferable to the dominant, top-down approach based around one or more metaphors.

© All rights reserved Waterworth et al. and/or Academic Press

 

Fitzmaurice, George W., Zhai, Shumin and Chignell, Mark (1993): Virtual Reality for Palmtop Computers. In ACM Transactions on Information Systems, 11 (3) pp. 197-218.

We are exploring how virtual reality theories can be applied to palmtop computers. In our prototype, called the Chameleon, a small 4-inch hand-held monitor acts as a palmtop computer with the capabilities of a Silicon Graphics workstation. A 6D input device and a response button are attached to the small monitor to detect user gestures and input selections for issuing commands. An experiment was conducted to evaluate our design and to see how well depth could be perceived on the small screen compared to a large 21-inch screen, and the extent to which movement of the small display (in a palmtop virtual reality condition) could improve depth perception. Results show that, with very little training, perception of depth in the palmtop virtual reality condition is about as good as corresponding depth perception on a large (but static) display. Variations to the initial design are also discussed, along with issues to be explored in future research. Our research suggests that palmtop virtual reality may support effective navigation and search and retrieval in rich and portable information spaces.

© All rights reserved Fitzmaurice et al. and/or ACM Press

 

Zhai, Shumin (1993): Investigation of Feel for 6DOF Inputs: Isometric and Elastic Rate Control for Manipulation in 3D Environments. In: Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting 1993. pp. 323-327.

An increasing need exists for both a theoretical basis and practical human factors guidelines for designing and selecting high degree-of-freedom (DOF) computer input devices for 3D interactive environments such as telerobotic and virtual reality systems. This study evaluates elastic versus isometric rate control devices in a 3D object positioning task. An experiment was conducted with a stereoscopic virtual reality system. The results showed that the elastic rate controller facilitated faster task completion times in the first of the four phases of the experiment. The results are discussed in light of the psychomotor literature. While the richer proprioceptive feedback afforded by an elastic controller is necessary for achieving superior performance in the early stages of learning, subjects performed equally well with the isometric controller in later learning stages. The study provides evidence to support a theory of skill shift from closed-loop to open-loop behaviour as learning progresses.

© All rights reserved Zhai and/or Human Factors Society
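
Several of the entries above contrast position control with isometric or elastic rate control. The difference lies in the transfer function: a position-control device maps its displacement directly onto object position, whereas a rate-control device maps applied force (isometric) or displacement against a spring (elastic) onto a velocity that is integrated over time. A minimal sketch with placeholder gains and names (not parameters from any of the papers):

def position_control(device_displacement, gain=1.0):
    """Zero-order (position) control: object position follows device displacement."""
    return gain * device_displacement

def rate_control_step(position, device_force, dt, gain=1.0):
    """First-order (rate) control: the applied force sets a velocity;
    this performs one Euler integration step of length dt."""
    velocity = gain * device_force
    return position + velocity * dt

# Holding a constant force on a rate controller produces steady movement:
pos = 0.0
for _ in range(10):                  # 10 steps of 0.1 s at a constant force of 2.0
    pos = rate_control_step(pos, device_force=2.0, dt=0.1)
print(pos)                           # -> approximately 2.0 (arbitrary units)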

 

Zhai, Shumin and Milgram, Paul (1993): Human Performance Evaluation of Manipulation Schemes in Virtual Environments. In: VR 1993. pp. 155-161.

 Cited in the following chapter:

3D User Interfaces: [/encyclopedia/3d_user_interfaces.html]


 
 
 

 
 
 
Date created: Not available
Date last modified: Not available

Page Information

Page maintainer: The Editorial Team
URL: http://www.interaction-design.org/references/authors/shumin_zhai.html