This study examined whether eye and head responses can be used to evaluate attention cue effectiveness. The subjects' tasks were to complete a centrally-located tracking task while periodically responding to cues to identify targets at four peripheral locations. Five directional cues were evaluated: visual symbol, coded sound, speech cue, three dimensional (3-D) sound and 3-D speech (the 3-D cues appeared to emanate from the peripheral locations). The results showed significant performance differences in eye and head reaction time, as well as peripheral target task completion time, as a function of cue modality. Since these relatively nonobtrusive measures were as sensitive to cue modality as the peripheral task completion time, these results suggest that eye and head reaction time can be used in evaluations addressing the effectiveness of attention cues.
Stewart, John E. and Lofaro, Ronald J. (1990): A Secondary Analysis Comparing Subjective Workload Assessments with U.S. Army Aircrew Training Manual Ratings of Pilot Performance. In: Woods, D. and Roth, E. (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 104-108.
Performing a front-end analysis (FEA) is a required part of user training, documentation, and interface development for nuclear-related systems (NUREG CR/0737, DOE Order 5480.6). A traditional FEA for a large existing system can be extremely expensive and require many months to complete. Performing a complete front-end analysis on an emerging or changing system can be even more costly. In some cases, this may be almost impossible because the data required for the later steps in the analysis simply do not exist or are constantly changing. The use of rapid prototyping is less time consuming than the traditional top-down (or sequential front-to-back) approaches used to develop the training, documentation, and human-computer interfaces for new or evolving systems. In addition, it provides for enhanced communication between the analysts, designers, and end users of the system. This paper details the approach taken and the lessons learned for one example of rapid prototyping as applied to a nuclear facility upgrade.
The effects on performance from the use of icons and alphanumerics in pilot-vehicle interfaces were investigated in an experiment. Varying numbers of single status display indicators were presented in both iconic and alphanumeric formats in fixed and random display positions across three levels of difficulty. Subjects' ability to maintain a tracking task while concurrently searching and selecting appropriate display indicators was tested. Results indicated that for all numbers of indicators presented, icons produced faster search and selection reaction times. Significant interactions were also found for format type and difficulty level. Questionnaire assessment revealed that subjects preferred the iconic to the alphanumeric formats. Implications for the design of aircraft interfaces and further research suggestions are discussed.
Gould's (1987) iterative design principle was applied to the design and development of a large, complex interface. Specific challenges we faced in implementing his recommended design approach included the sheer volume of panels in the interface, communication across the design team, excess baggage stemming from the previous interface, management of design changes, and translation into multiple languages. Our methods of facing those challenges are documented, and the lessons we learned in the process are detailed.
Developing a workstation for the next generation Air Traffic Control system (AAS) represents a significant design challenge. Not only are a large number of potentially conflicting requirements identified for this workstation, but several unique features of those requirements exacerbate the potential problems. For example, a large (20 X 20 inch) CRT is the primary visual display. This must move to both an inline and wraparound console configuration. The system must accommodate a large range of user sizes and be acceptable to approximately 16,000 air traffic controllers. A team of controllers has participated in the iterative design effort through reviews, demonstrations and hands-on evaluation. The key feature of all design activities is the narrowing of alternatives as the design approaches production release. This paper addresses this process and suggests how this process may be managed to ensure a satisfactory outcome.
Verbal protocols have been used for many years in different research contexts, but there still is no clear consensus about the validity of the technique and methods for maximizing validity in an applied setting; how to standardize the collection and analysis of protocols; and last but certainly not least, whether the resulting data is worth the effort. This panel discussion is a companion to a symposium at this conference which presents empirical studies and human factors applications of verbal protocol techniques. The panel will focus in more depth on issues raised in that earlier session, with the goal of providing guidance for practical applications of the technique.
Traditional methods of evaluating icon comprehension and discriminability have relied on a sequence of multiple screening tests to measure various aspects of icon meaning, image content, and the user's perception of the icon. The most frequently used procedures have been the icon appropriateness test to determine the best conceptual design from a group of icon candidates, followed by the icon matching test to ensure that individual icons are not confused with others in a set. This paper describes an automated paired-comparison test procedure that provides reliable measures of both icon appropriateness and icon discriminability using the same test method with a single metric. The procedure was validated in two experiments involving the design and evaluation of two different mouse-pointer icons. In Experiment 1, the procedure was used as an icon screening test to determine the most appropriate and meaningful icon that best represented each concept from two different sets of proposed icon variants. In Experiment 2, the same procedure was then used to confirm the discriminability of the final icon selections, and to verify the accuracy of results from the initial appropriateness test.
Relative and absolute judgments of alternatives are compared. Relative judgments, following Saaty's procedure, require that each possible pair of conditions be compared. The subject indicates which member of the pair is preferred, then gives the magnitude of the preference on a 1-9 scale. The scores are entered into a matrix, and eigenvectors are calculated for each subject in each condition. These eigenvectors are then evaluated in a conventional subjects x conditions analysis of variance. Two experiments are reported which show that relative rating using eigenvectors is a more sensitive rating instrument than absolute rating. Experiment 1 compared discomfort glare for three simulated streetlight luminances. Experiment 2 evaluated the likability of various fonts when used on transparencies with two sizes of fonts (subtending 0.57 or 0.72 degrees), two styles (bold and regular) and three types (executive, roman and sans serif). The relative rating method is a "more sensitive instrument," but it has two disadvantages. First, it requires evaluation of all possible pairs of conditions by the same subject, so the experiment itself may take longer. Second, the program to calculate the eigenvectors is not presently available in a standard statistical package such as SAS or SPSS.
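The eigenvector step in the abstract above can be illustrated with a minimal sketch. It assumes the usual form of Saaty's procedure: a reciprocal matrix is built from the 1-9 preference strengths, and its principal eigenvector, normalized to sum to 1, gives the weight of each condition. The function names, the power-iteration solver, and the sample judgments are illustrative assumptions and are not taken from the paper.

```python
def reciprocal_matrix(judgments, n):
    """Build an n x n pairwise-comparison matrix.

    judgments maps (i, j) -> strength on the 1-9 scale, meaning
    condition i is preferred to condition j by that factor.
    The mirrored cell (j, i) receives the reciprocal.
    """
    m = [[1.0] * n for _ in range(n)]
    for (i, j), strength in judgments.items():
        m[i][j] = float(strength)
        m[j][i] = 1.0 / strength
    return m


def priority_vector(matrix, iterations=100, tol=1e-12):
    """Approximate the principal eigenvector by power iteration.

    Power iteration is an assumption made for this sketch; any
    eigenvector routine would serve. The result is scaled to sum to 1.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, w)) < tol:
            return nxt
        w = nxt
    return w


# Hypothetical judgments from one subject over three conditions:
# condition 0 is preferred 2:1 over condition 1 and 6:1 over condition 2,
# and condition 1 is preferred 3:1 over condition 2.
judgments = {(0, 1): 2, (0, 2): 6, (1, 2): 3}
weights = priority_vector(reciprocal_matrix(judgments, 3))
print(weights)  # approximately [0.6, 0.3, 0.1]
```

One such weight vector would be computed per subject, and the weights would then serve as the scores entered into the subjects x conditions analysis of variance.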
This paper proposes a technique to apply the protocol analysis method to usability testing. The "trouble analysis" method is offered for data analysis tasks. These techniques involve a procedure to improve time efficiency and convenience. As a criterion for data analysis, a "trouble model", which consists of 22 trouble categories, is also offered. Most of the problems in a user interface can be identified by extracting troubles from the verbal protocols using the model. The trouble analysis technique does not take into consideration analyses of human cognitive/thought processes, which usually require expert knowledge and a lot of time; quick and easy detection of problems is considered first. The contributions of the "trouble analysis" technique to usability testing were empirically verified through 9 tests conducted on different kinds of products. Limitations on the extent of the evaluation and user behavior during trouble situations are also discussed.
In a series of studies we address the two questions of: 1) Do verbalizations reflect concurrent thought, and 2) Does concurrent thinking aloud differ from normal thinking? The design of experimental tests was based upon Ericsson and Simon's model of thinking aloud, incorporating variation of how information is represented in short term memory. Eye-movement recordings were used as a source of additional data, allowing us to go beyond a mere analysis of solution time and accuracy. Comparing verbalizations and eye-movement data, we arrived at a positive answer to the first of our initial questions. The second question was approached on several levels, always involving a comparison of 'think-aloud groups' with silent controls. We found no differences with respect to accuracy, but longer solution times in think-aloud groups. In a final experiment, the influence of thinking aloud on concurrent task performance could be narrowed down to an effect which only persists through the early stages of familiarization with tasks. We conclude that concurrent verbalization is a viable tool in the study of cognitive processes.
A traditional concurrent verbal protocol method was compared to a heavily cued retrospective verbal protocol in which users were presented with a video tape of their performance to help them recall their thoughts after task completion. The two methods of protocol were employed in a comparison of two different size monitors. Subjects were required to complete 12 tasks which varied in the number of windows required simultaneously on the monitor. The subjects' performance, as measured by steps to completion, task completion time, and errors committed, was compared across monitors and protocol methods. Subjective data were also collected in the form of task difficulty ratings, as well as a global measure of user satisfaction. Verbal data were compared to assess any information differences due to the methods of collection or the monitor sizes. No performance or subjective differences were found between the two protocol methods. The kinds of information gathered were quite different for the two methods, with concurrent protocol subjects giving procedural information and retrospective protocol subjects giving explanations and design statements. Performance data, as well as subjective data, indicated that on tasks that require that one or two windows be present simultaneously, there were no differences between the two monitor sizes. As the number of simultaneous windows increased, however, the large monitor's advantages became apparent. Tasks which require that four windows be present simultaneously were judged to be easier and required fewer steps on the large monitor than on the small monitor.
The analysis of subjective verbal protocols can provide valuable information additional to that obtained from traditional objective data sources. The most frequently used type of protocol analysis is of the "think-aloud" report where operators verbalize as they perform a task of interest. However, while this concurrent method has been usefully applied to high-level cognitive tasks that are accomplished over extended periods, it is generally considered to be less appropriate for short-duration tasks where the emphasis is on speed of performance. This study reports on the application of a new protocol method to a speeded task based on a procedure where the computer "plays back" the experimental trials and shows the subject's response. The verbal response of the subject was recorded during the playback, augmented by prompts from the experimenter. Several aiming tasks requiring rapid movements to a target were examined using this method. The data obtained from the protocol analysis were a valuable adjunct to the actual performance results, and demonstrated that the new method appears to be a satisfactory procedure for obtaining protocols for rapidly performed tasks. Where movements involved both hands, the verbal protocols supported a divided attention hypothesis for performance over a competing motor-program hypothesis. The reports implied that the movement characteristics were under conscious control requiring division of attention.
SHAPA is an interactive verbal protocol analysis tool based on Ericsson and Simon's (1984) recommendations for verbal protocol analysis (Sanderson, James, and Seidler, 1989). It provides a "shell" for carrying out protocol analysis. This paper shows how SHAPA has been used in three different domains: (1) the control of a complex continuous process, (2) a task where subjects give navigational information to active or passive listeners, and (3) the control of a simple city transport system. These examples show how SHAPA can help researchers collect data about the frequencies with which certain categories of verbalization occur and determine the patterns into which they fall.
Thinking-aloud protocols traditionally have been used by academic researchers as a qualitative data collection method. This method is currently gaining acceptance in industry usability testing. The Usability Group at Microsoft has adopted the thinking-aloud protocol as a primary method for obtaining data from users. We have found the method valuable not only because it is valid for gathering qualitative data, but also because it is responsive to the constraints we face and the organizational culture we work within. The issue of validity has been discussed in detail by researchers such as Deffner & Rhenius and Ericsson & Simon. Our case study further pursues the validity of thinking-aloud protocols and also discusses how this method allows the researcher to work within industry constraints and incorporate changes into the product within a small time frame. Finally, our case study demonstrates how thinking-aloud protocols fit in well with Microsoft corporate culture where understandable and persuasive results are needed. This case study will have particular relevance for usability practitioners in industry.
Cognitive methods of task analysis have been used for training development. Although quite promising, these methods are generally time consuming and labor-intensive, and require considerable expertise. This has precluded their full use in field training situations. Economical, practical and user-friendly methods are needed which can be integrated easily with current approaches. This symposium paper discusses the potential of cognitive task analysis as well as the practicality problem. Of particular concern is how cognitive methods can receive widespread application among training practitioners -- how to transition theory and research in cognitive task analysis into mainstream training development programs.
As part of a review and evaluation of the Federal Aviation Administration's air traffic control (ATC) training program, we tested whether cognitive task analysis techniques could help identify the knowledge, skills, and strategies used by proficient controllers, at a level appropriate for deriving instructional objectives. Our approach involved modifying commonly-used methods (e.g., interviews and think-aloud protocols) for use in a real-time, real-world task domain. Expert controllers were videotaped performing realistic ATC scenarios. We then elicited "play-by-play" analysis of the scenario from other expert controllers and retrospective protocols from the subjects. Other techniques were used to obtain convergent data on controllers' knowledge representation and organization. The methodology was successful in describing several cognitive components of ATC expertise that had previously defied explication at a level of detail appropriate for instruction. We discuss briefly training implications and other ways in which we have used the data.
We describe an approach to cognitive task analysis that utilizes two mutually reinforcing analyses. One analysis focuses on building a description of the cognitive demands imposed by the world that any intelligent agent would have to deal with (a model of the cognitive environment). The second analysis, conducted in parallel, is an empirical investigation of how practitioners, both experts and less skilled individuals, respond to the task demands (a performance model). We then discuss how a cognitive simulation can support a cognitive task analysis.
"Knowledge engineering" refers to the process of getting rules out of the heads of experts and into expert systems. A broader field includes a variety of "low technology" applications. If we think of knowledge as a valued resource, analogous to petroleum, this suggests four aspects of knowledge engineering: (a) locating sources of expertise in organizations; (b) assaying the cost/benefits of engineering the expertise; (c) acquiring the knowledge; and (d) codifying the knowledge. In this paper we discuss knowledge engineering strategies and applications beyond expert systems.
The failure to detect a telephone ringer signal can prove frustrating or even hazardous in certain situations, especially for older individuals who rely heavily on telephone access. This study was conducted to investigate the detectability of telephone ringer signals with individuals having elevated hearing levels. Specifically, the study investigated the detectability of three acoustically different telephone ringer signals under two masking noise conditions (quiet and 65 dBA pink noise) for two subject age groups: 20-30 years of age and over 70 years of age. Common residential telephone ringers were sampled, with three acoustically different ringers selected for study. To determine hearing ability, pure tone audiograms were administered to all subjects. Subjects' threshold levels for each ringer were then determined. Significant differences were found between the two age groups, both across telephone ringers and across noise conditions. For the older group, an advantage was found for the ringer signal which contained prominent low-to-mid range frequency components. In addition, the threshold level in noise of one ringer (a high frequency "beeper" type ringer) proved to be approximately equal to the naturally occurring decibel level of that ringer. Thus, the beeper ringer in moderate level noise (65 dBA) was effectively inaudible. The results suggest that certain electronic ringers which are currently in vogue may be unsuitable for use by the elderly or by any individual with significant high-frequency hearing loss.
Seamster, Thomas L., Eike, David R. and Ames, Troy J. (1990): Knowledge Acquisition and Representation for the Systems Test and Operations Language (STOL) Intelligent Tutoring System (ITS). In: Woods, D. and Roth, E. (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 1323-1327.
This presentation concentrates on knowledge acquisition and its application to the development of an expert module and a user interface for an Intelligent Tutoring System (ITS). The Systems Test and Operations Language (STOL) ITS is being developed to assist NASA control center personnel in learning a command and control language as it is used in mission operations rooms. The objective of the tutor is to impart knowledge and skills that will permit the trainee to solve command and control problems in the same way that the STOL expert solves those problems. The STOL ITS will achieve this objective by representing the solution space in such a way that the trainee can visualize the intermediate steps, and by having the expert module production rules parallel the STOL expert's knowledge structures. This approach has resulted in a knowledge acquisition process that places great emphasis on both the domain expert's knowledge structures and solution steps. Concept sorting tasks combined with scaling analysis techniques are being used for organizing and analyzing domain concepts. These techniques have been used to identify the critical STOL commands, the related concepts, and significant problems that will direct the design of the tutor's user interface as well as the production rules of the expert module.
This research investigates the efficacy of the Automated Performance Test System (APTS), a battery of tests measuring basic psychomotor, cognitive, and spatial abilities, to predict complex psychomotor performance on two part-task tank gunnery simulators, TOPGUN and the Videodisk Gunnery Simulator (VIGS). It was hypothesized from past research that the Manikin, Simultaneous Pattern Comparison, and Four-Choice Reaction Time subtests of the APTS would be predictive of TOPGUN and VIGS performances. Additional research goals were to examine the stabilities and reliabilities of APTS, TOPGUN, and VIGS. Forty male undergraduate students were tested on the APTS; afterward, they completed either TOPGUN (N = 20) or VIGS (N = 20) training. Results indicated that Code Substitution, Manikin, and Pattern Comparison were predictive of tank gunnery simulator performance at the p = .01 level. It is concluded that 1) these results need to be replicated, due to the complexity of the analyses conducted, 2) the APTS was found to be very stable and reliable, but TOPGUN and VIGS measures were unreliable, and 3) the unreliable simulator measures limited the surrogate potential of the APTS.
The purpose of this paper is to describe an abbreviated Instructional Systems Development (ISD) approach that was adopted to support identification of training requirements for two system management positions in the Theater Air Command and Control Simulation Facility (TACCSF). TACCSF is a large scale, man-in-the-loop air defense simulation facility located on Kirtland AFB, New Mexico. A tailored ISD approach was used to support the evaluation of existing training-related documentation and materials. A 41 X 64 cell Training Resources Matrix was generated. Training requirements were arrayed vertically and training resources were listed horizontally in the matrix. The matrix was used to help define and develop preliminary training requirements, resources, and training plans. Results of the analyses were useful, and would be improved by including more detailed information in the data base.
This paper describes the task analysis procedures and data obtained to support development of a part-task trainer for a CBI military training system (as well as various training aids and recommendations), and the trainer design and evaluation. This was part of a two-year R&D program which was unique in that the trainer was designed based upon data derived from an integrated task analysis methodology which incorporated both cognitive and behavioral methods. Because the task to be trained, electronic warfare, is an area with a complex conceptual base and heavy decision-making components, the task analysis was primarily cognitive. The task analysis provided information about expert versus novice mental models, and effective heuristics and algorithms for problem solving.
During the last decade, many commentators have preached the need to improve the quality, productivity, and training of the American workforce. Many of the observations relative to our future competitiveness have been unsettling, if not alarming. Accordingly, prescriptions for change have included such things as doubling and tripling training budgets. Most case studies in the popular literature talk about the management and improvement of manufacturing enterprises. In contrast, this paper deals with the challenges facing knowledge workers and, specifically, those of the Nuclear Reactor Research and Technology Department (NRRT) of EG&G Idaho, Inc. where a full-blown training needs analysis was conducted. The findings, in brief, indicate that a majority of the knowledge workers, especially the scientists and engineers, had at least some training needs that were not being adequately met. Lack of emphasis, scarcity of time, and limited availability of relevant technical training were the primary reasons. The employees also noted that they believed the experts needed to provide the training were their colleagues already working within the Department and that on-the-job training and other individualized training methods would be most effective. The employees expressed a strong interest in the company instituting a career development program. Based on this analysis, the NRRT has taken several steps to better meet the training needs. These findings, which would appear to be typical for knowledge workers nation-wide, underscore the vital importance of leaders seriously assessing and responding to the needs of a vital national resource -- the knowledge workers.
Air-intercept control is a complex Navy combat task which requires a radar operator to advise a pilot of optimum headings for avoiding, meeting, or intercepting other aircraft. The goal of the research reported here was to explore procedures for teaching some critical elements of air-intercept control with a pc-based version of the training simulation that is normally used for instruction. Principles derived from attention theory, and in particular Multiple Resource Theory (Wickens, 1984), were used to guide the development of two part-training strategies. In one strategy, specific skill-based elements of the task were taught in isolation and were then recombined into the whole task. In another strategy, an abstract procedural task was added by isolating those features that contributed to the spatial and temporal coherence of the whole task. A transfer of training design was used to test these two part-training conditions. The procedural-based version of the task emerged as a training strategy that could help students develop resistance to potentially disruptive effects of making the task more difficult. The results are viewed as supporting an approach to training that attempts to alleviate resource overload so that learning may proceed with maximum efficiency while, at the same time, allowing critical task elements related to time-sharing skills to be practiced.
This study tests the effectiveness of a training strategy to improve situational awareness skills. The training approach suggested by this study is to expose subjects initially to only those cues relevant to the task. When other extraneous cues are added, these subjects should be better at extracting those familiar cues relevant to the task than subjects who are first exposed to both relevant and irrelevant cues. Subjects trained with only the relevant
This research investigated the development of automatic processing with alphanumeric materials that are representative of those processed by operators of some complex information systems. According to automatic processing theory, consistently mapped (CM) components of complex skills can be automatized with extensive practice, such that they are performed rapidly and accurately with minimal effort. Experiment 1 compared the effects of 3200 training trials in a memory search paradigm with alphanumeric materials under CM and variably mapped (VM) conditions. Dissimilar target and distractor sets were used. The results were consistent with the development of automatic processing in the CM condition. Experiment 2 examined the effect of similar target and distractor sets on CM performance. The results of Experiment 2 indicate that target/distractor similarity significantly affected CM performance. Such similarity therefore represents an important factor to be considered in the design of training programs to support the development of automatic processing with complex alphanumeric materials.
The methods which people use to reason about everyday events and the strategies they employ have received much attention throughout the years. One aspect of this history is the debate about whether learning rules or examples most facilitates transfer of knowledge to a different domain. This research attempted to answer this question through two experiments. The first experiment concentrated on defining the dimensions along which subjects perceived problems which embodied statistical heuristics. The results identified a contextual dimension along which subjects classified the problems. The second experiment was conducted to determine if the contextual dimension or the problem domain dimension could best account for transfer of training. The results indicated that the training transferred to all novel problems; however, training did not transfer to a different set of problems presented to the subjects as a phone survey. Explanations for this lack of transfer are discussed.
Seamster, Thomas L. and Glass, Richard H. (1990): Human Factors Considerations in the Design of Displays and Switches for a Flight Simulator's Onboard Instructor/Operator Station (IOS). In: Woods, D. and Roth, E. (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 1400-1404.
One reason that intelligent tutoring systems (ITSs) are rarely found outside of the research lab has to do with the guidelines available to developers of these systems. First, some of these guidelines are stated as general, abstract goals such as "adapt to the student." What ITS developers need, however, are specific strategies and techniques which can be implemented in an ITS to accomplish those goals. Second, not all of the guidelines have an empirical basis. One solution to both of these problems is to study human tutors. This paper demonstrates this approach, and discusses an empirical study of human tutors which was conducted to address these issues. Specifically, it discusses 1) the knowledge acquisition method which we designed to capture the appropriate empirical data, 2) how we used this method to study human tutors and students in the medical problem-solving domain of immunohematology (blood banking), 3) several guidelines which appeared to drive the tutors' behavior (e.g., "limit the number of interrupts to the student"), and 4) specific tutoring strategies which can be incorporated into an ITS to make its behavior follow these guidelines.
Hypertext is increasingly being used in training and education to provide an alternative (non-linear) presentation format for both verbal and graphic information. While hypertext provides a very flexible format for structuring information, this flexibility itself can lead to information that is structured in a vague or inconsistent form. We suggest that hypertext system designers perform "knowledge engineering" just as AI system designers do. This includes acquiring some body of knowledge in a systematic fashion, determining the global structure to be imposed on the information, and then using explicit algorithms to structure the information into hypertext form. This paper contains two separate but complementary components that address the latter two processes. First, we describe certain features and issues specific to instructional applications of hypertext, and provide some suggestions for structuring the system to accommodate those features. We then present an algorithm for engineering hypertext information, the Cluster Coherence Algorithm. While the algorithm was developed specifically for instructional applications, it is also relevant and applicable to other types of hypertext and hypermedia systems.
Hahn, Heidi Ann, Ashworth, Jr. Robert L., Phelps, Ruth H. and Byers, James C. (1990): Performance, Throughput, and Cost of In-Home Training for the Army Reserve: Using Asynchronous Computer Conferencing as an Alternative to Resident Training. In: Woods, D. and Roth, E. (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 1417-1421.
Asynchronous Computer Conferencing (ACC) was investigated as an alternative to resident training for the Army Reserve Component (RC). Specifically, the goals were to (1) evaluate the performance and throughput of ACC as compared with traditional Resident School instruction and (2) determine the cost-effectiveness of developing and implementing ACC. Fourteen RC students took a module of the Army Engineer Officer Advanced Course (EOAC) via ACC. Course topics included Army doctrine, technical engineering subjects, leadership, and presentation skills. Resident content was adapted for presentation via ACC. The programs of instruction for ACC and the equivalent resident course were identical; only the media used for presentation were changed. Performance on tests, homework, and practical exercises; self-assessments of learning; throughput; and cost data were the measures of interest. Comparison data were collected on RC students taking the course in residence. Results indicated that there were no performance differences between the two groups. Students taking the course via ACC perceived greater learning benefit than did students taking the course in residence. Resident throughput was superior to ACC throughput, both in terms of numbers of students completing and time to complete the course. In spite of this fact, however, ACC was more cost-effective than resident training.
Industrial training manuals must often convey sophisticated information to an audience with less than proficient literacy. This paper presents an overview of a hypertext-based system that can compensate for reader deficiencies, serving as an instructional tool for basic literacy skills, as well as means to making job-related information available to training populations with below average reading ability.
Physiological, subjective and mission effectiveness measures were evaluated to test their relative sensitivity and diagnosticity to pilot workload in a part-mission simulation. Two different radar displays were evaluated in an air-to-air simulated scenario using an advanced horizontal situation format display vs the conventional radar display. Data were recorded during the ingress and engagement portions of the mission. The engagement segments were associated with higher subjective workload estimates, smaller cardiac IBIs, fewer eye blinks and shorter duration eye blinks. The new display was associated with shorter duration eye blinks than the current generation display. None of the other measures were associated with statistically significant changes due to display type.
Three experiments were conducted in which positive and negative contrast on visual display terminals were directly compared. Operator tasks included visual search and reading, with accuracy and timeliness of response measured. In all cases where significant differences exist, better performance was obtained with negative contrast (dark characters or symbols on a lighter background). The increases in performance range from a low of 2.0 percent to a high of 31.6 percent. Based on the above results, we believe that there are significant advantages in visual task performance obtained from the selection of negative contrast displays. Current standards that require negative contrast appear to be justified, while future revisions of ANSI/HFS 100-1988 and other standards should seriously consider incorporating negative contrast as a recommendation or requirement.
This project explored the practical importance of ambient color as a concern for maintaining human visual accommodation. Correct accommodation and regression toward resting point accommodation were considered in broadband red, broadband green and white environments. The involvement of voluntary control of accommodation was manipulated by requiring extended performance on a difficult visual task across four light levels. Declining light levels and increased time-on-task were found to degrade the accuracy of accommodation, while manipulation of ambient color produced differences attributable to chromatic aberration. Differential abilities associated with red, green or white conditions were not apparent, as no statistically significant interactions were evidenced. Results of these and other related findings generally suggest that, allowing for chromatic aberration of the lens, the human eye maintains visual accommodation equally well across varied color conditions. Maintenance of correct accommodation and regression to the resting point of accommodation do not appear to be influenced by ambient color.
Visual mechanisms involved in target detection, recognition, and tracking were examined. Relationships were analyzed in the context of simulated combat, focusing on the short range air defense weapon operator. Objectives were to identify visual ability interrelationships, predictors of performance, and interactions with target characteristics, directional cuing and experience. Good predictors included visual acuity, contrast sensitivity, resting focus, near focal point, and blur interpretation. Many of these abilities interacted with the independent variables, producing differential effects on performance. Visual abilities logically grouped into three principal components. Active accommodation predicted target detection and identification; passive accommodation predicted detection and acquisition; and image interpretation predicted acquisition, identification and tracking. Results supported the three visual subsystems theory, based on neurophysiological evidence of pathways in the brain corresponding to specific visual functions.
Currently an estimated 2.8 million people aged 65 years or older need some type of assistance in carrying out everyday activities. Therefore, there exists a need to identify strategies which enhance the functional independence of older adults. There are a number of computer and communication technologies which can be used to provide support. For the potential of these technologies to be realized, they must be easy to use, easily available and accepted by older adults. The goal of this research project was to evaluate the feasibility of having older people use computers to perform tasks in their own home environment and to identify design parameters which facilitate their interaction with these systems. The study involved installing a customized e-mail system in the homes of 38 elderly women. Additional features were added over the course of the project. Data collected included: frequency of use, number and type of messages sent, communications patterns, time distribution of messages and frequency of features used. Overall the results of the study indicate that older adults are willing and able to use computers in their own homes if the system is simple, features are added in an incremental fashion and they are provided with a supportive environment.
The American National Standard for Human Factors Engineering of Visual Display Workstations specifies that character height-to-width ratios be within the range of 1:0.7 to 1:0.9. The empirical literature, however, fails to provide unequivocal support for that requirement. In designing CRT displays there is a complex interaction among several parameters, including character aspect ratio and character height. The present study compared a font with a character aspect ratio within the range allowed by ANSI/HFS 100-1988 to a font with a character aspect ratio outside that range. Using three different visually-intensive tasks, no real performance differences between the two fonts were observed. The study demonstrated that meeting individual design specifications, such as those provided in ANSI/HFS 100-1988, does not necessarily produce the most legible character set. It is argued that a performance-based compliance procedure may allow more flexibility in the design of visual display workstations.
Two exocentric azimuth judgement experiments with a perspective display were conducted with 16 subjects. Previous work has shown these judgements to exhibit a bias possibly due to misinterpretation of the viewing parameters used to generate the display. Though geometric compensations may be used to correct for the bias, an alternate technique selected in the following 2 experiments was the introduction of symbolic enhancements in the form of compass roses. It is suggested that a compass rose with 30 degree divisions results in overall optimal azimuth estimation accuracy when accuracy and decision time are both considered. The data also suggest that the added radial lines on the compass roses may interact with normalization processes that influence the judgement errors.
This study investigated the effect of providing visual enhancements to a three-dimensional (3D) perspective display on the observer's ability to judge the azimuth and elevation which separated two computer-generated images. The 3D perspective scenes were modeled after displays presented previously by McGreevy and Ellis (1986) but with several visual enhancements designed to assist users in performance of the experiment tasks. The visual enhancements included: (1) the capability to rotate the perspective scenes in near real-time and (2) the presentation of solid shaded objects in the computer-generated scenes. The results provide information on the magnitude of the errors which occur when observers are required to make directional judgements using perspective displays and on the effectiveness of several visual enhancements on the accuracy of directional judgements using a 3D perspective display.
This research concerns the use of orientation information by periscope operators. As periscopes in submarines are integrated with image processing systems and graphics workstations, the spatial information concerning the direction of the view of the periscope becomes more difficult to obtain. Consequently, the orientation of the periscope must be graphically represented on the workstation display. The major objective of this research was to determine the best way of displaying this type of information. The research was also concerned with the use of mental rotation by subjects to process the information, as well as the mediating effects of sex differences and spatial ability on performance in this type of task. The experiment tested two display types, outside-in and inside-out, each at two levels of complexity. The subjects were instructed to answer questions concerning the compass headings of submarines and periscopes, and the position of a periscope relative to a submarine. Results showed that the outside-in orientation would be the most preferable type of display representation, with the simple format producing the best performance. The results also revealed that mental rotation-type curves were evident for some combinations of question and display-type and that there were no differences between men and women in this task.
The similarities and contrasts between scientific visualization, and the tasks imposed on the pilot and air traffic controller are highlighted. Relevant principles for 3 dimensional display design for both of these applications are stated, and an experiment is described which contrasts four graphical formats across a number of tasks involving the interpretation of a hypothetical set of scientific data. The tasks vary in the degree to which focused attention vs. integration is involved. The graphical formats were either 2 or 3D renderings and either did or did not contain contours to emphasize objectness. The results revealed that emergent features, created either by objectness or 3 dimensionality, facilitated integration performance. However, 3 dimensionality generally slowed performance on all tasks.
Object displays are receiving increasing interest due to their potential contribution to display designs and to the understanding of basic visual attention mechanisms. The aim of the present research is to develop a more in-depth understanding of the attention mechanisms involved in object perception. Multidimensional information was presented in the form of an object that was defined by its color, form, and size. Subjects' ability to divide and focus attention on a specific dimension of the object was examined as a function of (a) the number of irrelevant varying dimensions, (b) the uncertainty of the relevant dimension, and (c) the number of objects that the subjects simultaneously attend to. Two possible mechanisms by which processing resources can be allocated among the dimensions of an object were explored. Three task conditions with various degrees of irrelevant information were presented in a single or dual object display. The task was to identify one of the dimensions as quickly and as accurately as possible. As predicted by Kahneman and Treisman's object file model, results show that all dimensions of the object appeared to be processed. This was evidenced by the influence of the irrelevant size variation on color and form identification. However, the data suggest that although all dimensions were processed, they were not processed without cost. Attention appeared to be divided among the dimensions. As the number of dimensions increased, the amount of attention available for each of the dimensions would be reduced. Further, only a small difference between the single and dual object case was detected. The small difference attests to a relative ease in selective attention between relevant and irrelevant objects.
In the present study we examined the degree to which contour and color could be used to minimize focused attention costs. Twelve subjects performed a task in which they were instructed to respond to a centrally located stimulus and ignore flanking items. The flankers could be either compatible or incompatible with the response of the target. Additionally, the flankers could be embedded in the same object as the target or embedded in different objects. When the target and flankers were embedded in the same object, performance was poorer when the target was surrounded by response incompatible items than when it was surrounded by compatible items. However, the response compatibility effect was eliminated when the target and flankers were embedded in different objects. The results are interpreted within a Hybrid Space/Object-based model of visual attention.
The Proximity Compatibility Hypothesis (PCH) proposes that in designing displays, we should try to match the proximity (unity or similarity) of a display's components to the level of mental integration required of information represented by those components. Thus, for tasks demanding integration of information from several channels, we should display the task-relevant information in a perceptually unitary and homogeneous fashion. For tasks requiring independent processing of multiple information sources, unity and homogeneity should be minimized. The present study tested these predictions using thirteen bivariate graphs that varied in terms of the unity and homogeneity of dimensional pairings. All thirteen graphs were used to perform four tasks, with a different group of fifteen subjects performing each task. These tasks included two integration tasks, an independent processing task, and a task that combined both integration and independent processing demands. As predicted by the PCH, subjects performed the more integrative tasks better when using graphs that contained homogeneous elements combined into a single object. When less integrative tasks were performed, multi-object displays were associated with superior performance. However, the PCH failed to predict an interaction between the effects of object integration and homogeneity for the two integration tasks. While homogeneous object displays were used efficiently for both tasks, the benefits of heterogeneous object displays were specific to the task requiring logical rather than computational integration.
This study examines the effects of stress on the processing of displayed information from two types of object displays, when dimensions were formed by the color and size of a bar, by the height and width of a rectangle, and from a separated two bargraph display. Subjects either integrated information across the two dimensions of each display or focused attention on each dimension, in a simulated airborne decision task. In Experiment 1 (14 subjects), stress was imposed via three levels of workload of a concurrent visual search task. In Experiment 2 (14 subjects), it was imposed by 88 dB helicopter noise. Results indicated that information integration was best supported by the rectangle display at higher levels of workload. Both the color bar and the bargraph display were associated with poor performance on the integration task, but were superior on the focused attention task. Hence, an emergent feature of the rectangle (its area), rather than objectness per se, was the critical element supporting information integration and disrupting focused attention. The imposition of noise enhanced the subjective feeling of stress. Noise did not influence performance on the decision task, but differentially affected the resources necessary to extract that information. Noise reduced the resource demands of both object displays and increased the resource demands of the separate bargraph display.
The present experiment investigates the processing demands associated with two tracking strategies: double-impulse and continuous. Twelve subjects performed a Sternberg memory search task concurrently with a compensatory tracking task using either strategy. Central processing demands of both tasks were manipulated as well as the response demands of the Sternberg task. The two tasks showed little resource competition for central processing resources. Response load resulted in resource competition, but did not show any strategy differences. Results are discussed with regard to the importance of understanding strategy differences for workload analysis.
Two experiments were conducted to determine whether grouping of icons on complex graphic displays reduces information processing loads, as measured by the Subjective Workload Assessment Technique and error rates. In Experiment 1, between 2 and 25 symbols were presented on a computer display. Participants were asked to chunk symbols under class labels and store these labels in short-term memory. Two different display formatting variables were tested: spatial proximity grouping of icons was manipulated across three levels, while temporal grouping was manipulated across two levels. Results suggest that display grouping helps operators organize, encode, and store information into task relevant chunks and, in turn, reduces subjective workload and error rates. Experiment 2 was similar to Experiment 1, except that participants were required to remember individual icon names (i.e., participants were asked to remember as many as 25 item names). Results suggest that for chunk formation, storage, and parsing tasks, display grouping may reduce subjective workload, but not error rates.
Decision aiding systems are becoming an important part of command and control. Selecting the best type of decision aiding information remains an important design decision. The research reported in this paper seeks to determine whether a decision aid in an aircraft identification task should provide a recommendation for action or status information about the identity of the aircraft. Thirty-two subjects were equally divided into four groups: a control group where no decision aiding information was provided; a group who received only status information; a third group who received only recommendation information; and a fourth group who received both status and recommendation information. Results indicated that, in general, providing decision aiding information reduced the time required to identify the aircraft. Differences among the three types of decision aiding information occurred under those conditions when the decision aid was incorrect. When the decision aid provided inaccurate information, the group receiving only status information was least affected by the decision aid and was best able to correctly identify the aircraft. Recommendations for selecting the type of decision aiding information are discussed.
Measurements were made of subjects' head movements as they found and memorized the position of targets located around them. Four factors were manipulated: the size of the field-of-view (FOV) with which they could view the targets, the number of targets, the background against which the targets were presented (blank or terrain), and the search instructions (slow or fast). The targets and terrain were viewed on a binocular helmet-mounted display. The dependent variables included measures of the amount of head displacement and head velocity. In the slow search trials, small FOVs produced significantly more head displacement and lower head velocities than did the large FOVs. In the fast search trials, head velocity increased with increasing FOV. The results are interpreted in terms of the disruptive effects of small FOVs on the efficient use of coordinated head and eye movements to acquire spatial information.
In aviation, effective execution of some flight maneuvers, such as rescue operations at sea, requires that pilots form a veridical perception of their position and motion with respect to the environment. Previous research has shown that human observers can determine their own motion or spatial orientation from displays simulating observer motion through a rigid three-dimensional environment (Stoffregen, 1985; Andersen & Dyre, 1987; Dyre & Andersen, 1988; Andersen & Dyre, 1989); however, the sensitivity of spatial orientation to noise in the visual field has not been examined. The present study examined the sensitivity of spatial orientation to noise in the global optic flow field. Displays simulating observer motion along the line of sight through a volume of randomly positioned points were observed monocularly through a circular window that limited the field of view to 30 degrees. The velocity of each display varied according to a function that was the sum of four sine functions of prime frequencies (between 0.15 and 1.0 Hz). Noise was produced by randomly shifting the phase lag of the three-dimensional motion function for each individual point within the display. Two levels of lag were examined: no lag and 10 second lag. Change in posture was used as an objective measure of spatial orientation and was recorded by a Kistler force platform. When no lag was present, increased postural sway was found to occur at all the frequencies of motion simulated in the display. However, for a lag of 10 seconds subjects exhibited no increase in postural sway at the display frequencies. These results suggest that if global optic flow patterns are obscured by noise then the information important for determining spatial orientation is greatly reduced. The importance of these results for flight maneuvers will be discussed.
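The motion function described in the abstract above can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration: the four frequency values (the abstract gives only the 0.15 to 1.0 Hz range), the unit amplitude, and the reading of "lag" as a per-point time shift drawn uniformly between zero and the maximum lag. The names `velocity`, `point_lags`, and `point_velocities` are hypothetical and do not come from the study.

```python
import math
import random

# Hypothetical frequencies in Hz, chosen only to fall inside the 0.15-1.0 Hz
# range and so that none is an integer multiple of another.
FREQS_HZ = (0.15, 0.35, 0.55, 0.95)


def velocity(t, freqs=FREQS_HZ, amplitude=1.0):
    """Sum-of-sines velocity at time t (seconds); unit amplitude is assumed."""
    return sum(amplitude * math.sin(2.0 * math.pi * f * t) for f in freqs)


def point_lags(n_points, max_lag_s, rng):
    """Draw one time lag per point, uniformly in [0, max_lag_s] (an assumption)."""
    return [rng.uniform(0.0, max_lag_s) for _ in range(n_points)]


def point_velocities(t, lags):
    """Velocity of each point at time t, with each point following the
    common motion function delayed by its own lag."""
    return [velocity(t - lag) for lag in lags]


if __name__ == "__main__":
    rng = random.Random(0)
    coherent = point_velocities(1.0, point_lags(5, 0.0, rng))
    noisy = point_velocities(1.0, point_lags(5, 10.0, rng))
    print("no lag:   ", [round(v, 3) for v in coherent])
    print("10 s lag: ", [round(v, 3) for v in noisy])
```

With a maximum lag of zero every point follows the same function, so the flow field is coherent; with a 10 second maximum the points fall out of step, which is one way to picture how the global flow pattern could be obscured.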
The purpose of this study was to evaluate the effectiveness of three visual depth cues, and combinations of these cues, in a dynamic air situation display. The study was conducted to help determine how best to display aircraft location to a pilot. Three different depth cues (stereo 3-D, aerial perspective, and familiar object size), were investigated. Additionally, two levels of display density (13 or 25 aircraft) were evaluated. The results of the study indicated that the number of depth cues, which ranged from zero to three, affected the subject's ability to determine aircraft location. Display density also affected performance. However, the particular type of depth cue did not have a differential effect. In other words, it makes a difference if one or two depth cues are displayed, but not the particular cues used.
Speech communication among crew members in military vehicles suffers from several sources which interfere with speech intelligibility. The effects of intelligibility were studied in the SIMNET Training facility at Ft. Benning, GA. Twelve Bradley-qualified, three-man crews were tested on a series of navigation and gunnery exercises. A repeated measures design was used to test
Text-to-speech systems are currently used in a variety of telephone applications for remote access to information. While this form of synthetic speech may be cost-effective relative to digitized speech, the impoverished quality of the speech signal may adversely affect its comprehensibility in telephone applications. The primary objective of the present research was to investigate the amount of familiarization needed to achieve an asymptotic level of comprehension performance with high-quality synthetic speech presented in the telephone environment. Sixty-four male and female native English speakers listened to digitized natural and digitized synthetic sentences that contained relatively high and low-predictable components. Subjects provided truth-value judgments for which accuracy, response time and response certainty were measured. Results indicate that a high-predictable introductory message of approximately three relatively short sentences may improve comprehension performance of high-quality synthetic speech in some telephone applications.
The experiment evaluates several alternatives for the design of user dialogues of a telephone system which integrates advanced features to accelerate access to telephone services and also Spoken Speed Dial, Call Answering and Call Delivery. In the experiment, subjects placed phone calls and relayed pieces of information. The dialogue they then encountered presented them with experimental variations of dialogue styles and provided an opportunity to use the Spoken Speed Dial feature. Subjects made 16 call attempts in the first phase of the experiment, and their preferences for dialogue features were recorded between trial blocks and at the end of the experiment. In a second phase, subjects were the recipients of Call Delivery of various types. The results show clear preference for verbal prompts, the usability of Spoken Speed Dial, conflicting attitudes towards the extra step of recording the recipient's name in Call Answering, and a preferred mode of Call Delivery.
Telephone information systems using synthetic speech displays have become a common form of communication between a computer and a remote user. The purpose of this study was to examine five variables associated with the design of such telephone information systems: the rate of synthetic speech, the time allowed for user input, the structure of the menu hierarchy, the availability of a diagram of the menu structure, and the amount of augmented feedback provided as the user traversed the menus. Each subject completed 16 searches through the auditory database using the telephone keypad. After each search, the subject transcribed a message presented by synthetic speech. The search task was affected by all variables except feedback. The accuracy of transcription was affected only by the rate of the synthetic speech. Implications for the design of telephone information systems are discussed.
Converging experimental tasks were used to address the development of usable interactive icon sets for telecommunications network applications. Naming and matching addressed the individual informativeness value of an icon, with naming reflecting natural context response biases and familiarity contributions more than matching. Naming also allowed intrusions to be identified early. Preference ratings simulated user behavior with iconic menus, and provided discriminability data that could help to select icons where naming and matching revealed only failures. Issues resolved and revealed during this work are discussed.
This study looked at the way users enter alphabetic information on a standard, 12 key telephone. Twenty subjects entered names on the telephone keypad using the one keystroke per letter method. Subjects were not given instructions on how to enter the characters Q and Z, or punctuation, which do not appear on the keypad. Data were collected on the keys chosen for these special characters, and for keypress errors and name entry times. The results do not indicate a clearly preferred entry method for Q, Z and hyphen; however, apostrophes were likely to be skipped (not entered) by the subjects.
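As a small illustration of the one keystroke per letter method, the sketch below maps a name to digit keys and reports the characters that have no key. The layout used is the classic North American one without Q and Z, which is an assumption; the abstract does not list the layout. The names `KEYPAD` and `encode_name` are hypothetical.

```python
# Classic letter layout with no Q or Z (assumed; not specified in the abstract).
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY",
}

# Invert the layout so each letter points at its digit key.
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}


def encode_name(name):
    """Return (digits, unmapped): one digit per mapped letter, plus the
    characters that have no key and are left to the user's choice."""
    digits = []
    unmapped = []
    for ch in name.upper():
        key = LETTER_TO_KEY.get(ch)
        if key is None:
            unmapped.append(ch)
        else:
            digits.append(key)
    return "".join(digits), unmapped


if __name__ == "__main__":
    print(encode_name("Smith"))
    print(encode_name("O'Quinn"))
```

The `unmapped` list corresponds to the special characters (Q, Z, hyphen, apostrophe) for which the study recorded which keys, if any, subjects chose.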
With the growing use of the telephone as an input device, human factors designers need more human performance data on how quickly and accurately users can learn and execute alternative data-entry input strategies, as well as indications of what strategies users prefer. This study assesses five different strategies for entering alphabetic codes from a telephone keypad.
The purpose of this paper is to present an evaluation of the icon-based interface employed in Words Strategy, an augmentative communication system used by speech impaired and nonspeaking individuals. Words Strategy is a software system implemented in the Prentke Romich Touchtalker, a special purpose computer which allows individuals to enter entire sentences with very few keystrokes, and which provides synthetic speech output of those sentences. The system has been criticized because of the long training period required for mastery, and because its use of multi-meaning icons might impose a severe memory load on the user (Light et al., 1988). The two studies presented derived a learning curve for Words Strategy and investigated the relearning of the system six months after initial training. The Touchtalker keyboard layout was also evaluated. Results indicated that the assignment of multiple meanings to an icon did not create a problem; in fact, it enhanced performance. In addition, relearning of the icon associations occurred rapidly, generally within one trial. The implications of the data for redesign of the Touchtalker keyboard are discussed.
Maintaining secure radio communications in the armed forces places an especially heavy cognitive load on all involved personnel. For example, in the Army each individual must memorize on a daily basis at least three new five character codes. The current Army code consists of the sequence letter-digit-letter followed by the sequence digit-digit (LDL-DD). Research on paired associate learning suggests that recall could be improved by using either a letter only stimulus pair (LLL-LL) or a letter-digit stimulus pair (LLL-DD). An experiment was run to test this hypothesis. Recall for the experimental letter-digit code was over twice as good as recall for the current code.
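The three code formats named above can be made concrete with a short sketch that generates and checks codes from a pattern string. This is only an illustration of the notation (L for a letter, D for a digit); uniform random sampling and the helper names `make_code` and `matches` are assumptions, not the procedure used in the experiment.

```python
import random
import string


def make_code(pattern, rng):
    """Build a code from a pattern such as 'LDL-DD': 'L' becomes a random
    uppercase letter, 'D' a random digit, anything else is copied as-is."""
    out = []
    for symbol in pattern:
        if symbol == "L":
            out.append(rng.choice(string.ascii_uppercase))
        elif symbol == "D":
            out.append(rng.choice(string.digits))
        else:
            out.append(symbol)
    return "".join(out)


def matches(code, pattern):
    """Check that a code has the letter/digit structure of the pattern."""
    if len(code) != len(pattern):
        return False
    for ch, symbol in zip(code, pattern):
        if symbol == "L" and ch not in string.ascii_uppercase:
            return False
        if symbol == "D" and ch not in string.digits:
            return False
        if symbol not in "LD" and ch != symbol:
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(0)
    for pattern in ("LDL-DD", "LLL-LL", "LLL-DD"):
        print(pattern, make_code(pattern, rng))
```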
Four different formats of an existing mining equipment repair manual were prepared for a comparative human performance test: (1) original hard copy text; (2) improved hard copy text with enhanced readability and indexing features; (3) computerized (hypertext) version of original text; and (4) hypertext version of improved text with interactive help features. Students in a diesel mechanics class (n=55) then were tested for proficiency in accessing and understanding information presented in these different formats. The results indicate that: (1) although the users accessed the information less quickly using the computer compared with hard copy, they positively endorsed computerized hypertext presentation of maintenance information; and (2) enhanced text readability and indexing improve access to and understanding of maintenance information, but this improvement was not subjectively appreciated by the users relative to other manuals they had used. This test indicates that change to computerized maintenance manuals should be made cautiously, and that more research is needed to measure different hypertext design and training factors.
The primary goal of this research was to examine the relative effectiveness of traditional versus computer-based training techniques. Additional goals were to assess how presentation modality and dynamic versus static presentation of material affects learning. Four training techniques were evaluated: paper instructions, computer simulations with on-screen text instructions, computer simulations with auditory instructions, and computer simulations with on-screen text and auditory instructions. Sixty subjects performed four tasks using a computer-based on-screen simulation of a display telephone. Before executing each task, subjects in each of the four treatment groups received a brief training session. Dependent measures consisted of time to complete the tasks, error rate, and subjective measures of how well the various training techniques were liked. An analysis of variance indicated that computer simulations with auditory instructions and simulations with combined modality instructions resulted in task performance times that were significantly less than those obtained following paper instructions. Tasks performed following computer-based training had a significantly lower error rate than did tasks performed without instructions. No significant differences were found among the training techniques for subjective measures of how well the training techniques were liked.
The design of the user interface for emergency communications systems is critical in providing timely and appropriate response to emergency requests. With the creation of a plan to implement a national emergency number (9-1-1), more communities are choosing to centralize all emergency service communications with a single center called a public safety answering point (PSAP). This paper describes some of the issues faced in building a PSAP attendant workstation user interface for an Enhanced 9-1-1 emergency communications system. Designing a user interface for emergency communications requires a thorough understanding of the circumstances in which PSAP attendants operate. Researching the work environment of PSAP attendants has served to identify several human factors issues that became design goals for the user interface. Conducting iterative usability testing with PSAP attendants ensured the design of a usable system.
The human-computer interface of a computer-aided instruction (CAI) system can affect the learning of knowledge and skills. This study investigated the relative effectiveness of various mental models (i.e., metaphor, surrogate, and network) of a CAI system on the acquisition of intellectual skills and verbal information. Before the learning session started, subjects from each mental model group were given instructions about a representation of the system. Immediately after completion of the learning session, subjects were tested for their problem-solving performance. Time spent to solve each problem along with its accuracy was recorded. Results showed that there was an effect of mental models on the acquisition of intellectual skills. In terms of response speed and accuracy in problems requiring high complexity reasoning tasks, the network model was the most effective of the three. The metaphor model, however, was best for problems requiring low complexity reasoning tasks. Mental models showed no effect on the speed of recall of verbal information. The metaphor model, however, was the best in terms of recall accuracy. These results suggested that CAI systems would require different human-computer interfaces depending on types of content knowledge and task requirements.
As an input technique, handwriting recognition offers benefits in ease of use, but poses special problems for the user when a recognition error occurs. When a recognition error occurs, the user is often surprised since the misrecognized character often looks acceptable to him/her. In contrast, when a typing error occurs with a keyboard interface, the user immediately understands what has happened. The purpose of this study was: 1. to gain insight into what people think when a recognition error occurs, and 2. to discover whether a simple monochrome display of a user's handwriting prototypes would provide information which could be used to improve recognition accuracy. Such a display might serve as a point of reference for understanding and avoiding recognition errors. The results of the study suggested that a display of handwriting prototypes can be used by people to improve recognition accuracy. The study also found that in a large percentage of instances, people do not have any insight into the cause of a recognition error. Some possible causes for this predicament and some possible remedies are discussed in the paper.
Accurate models of operator decision making have been advocated by a number of researchers as a fundamental component of system design (Glenn, 1989; Norman, 1986; Rasmussen, 1985). Such models can be used to strengthen the design integrity of decision support systems in which tasks are allocated between human and computer. In this investigation, a cognitive modeling technique based on the GOMS model of Card, Moran, and Newell (1983) was used to analyze the composition and structure of decision-making strategies in a multidimensional diagnostic task. Two general strategies emerged along with the finding that certain strategies predominated according to visual display format. The methodology offers a promising approach to the analysis of verbal and retrospective protocol data solicited in conjunction with complex decision-making tasks.
The GOMS model (Card, Moran, and Newell, 1983) was used to develop the content of a help system from the goals, operators, methods, and selection rules needed to perform HyperCard authoring tasks. Three groups of 12 novice HyperCard users performed 28 authoring tasks using either the GOMS help system, an original help system developed by Apple Computer, or no help at all (a control group). In the two help groups, users were provided the most complete help method and did not have to search for the help information. The results indicated that both help systems significantly decreased the time spent performing the authoring tasks when compared to the control group. Although a 23% decrease in execution time for GOMS users compared to original users was not significant, variance ratios confirmed that GOMS users, as a group, were more consistent when compared to original and control users. Also, GOMS users spent significantly less time per help display, translating the help methods into execution performance 78% more efficiently than original users. This result probably was due to the procedurally explicit and consistent help methods specified by the GOMS model.
This panel will explore the varied uses of prototyping in the user interface design process. We expect to show that there is no single thing called "user interface prototyping" and that the differences are, in many ways, greater than the similarities. Panelists have been chosen to represent a wide cross section of user interface design tasks. Collectively, members of the panel have experience in prototyping hardware and software, computer programs and telecommunications services, residential, business, and engineering applications, at various levels of fidelity, and in all parts of the design process. We expect to show how these factors all influence the way prototypes are used and that the designer must be careful in choosing the most appropriate prototyping methodology for his or her needs. Each panelist will begin by characterizing the portion of the design process that he or she will be talking about. This represents a major division in the way prototypes are used, both in the way that they are built and in the type of information sought by the designer. Prototypes used early in the design process (requirements analysis) tend to be of lower fidelity and are used to test preferences for design alternatives, while those used later in the design process (system specification) tend towards higher fidelity and are used to test usability. Each panelist will point out the strengths and weaknesses of his or her prototyping methodology. Each panelist will address the following points:
* Appropriate uses of prototyping methodology (early vs. late in design process)
* Characteristics of prototypes (platform, level of fidelity, etc.)
* Information gathered from the prototypes (evaluate design preferences, measure performance, etc.)
* Relative costs of the method (time to build, flexibility, etc.)
Numerous computer input devices have been designed and evaluated in the last decade. In most evaluations, simple pointing and tracking tasks were used that do not adequately represent today's computer tasks. The following research evaluated four input devices with respect to usability and preference issues. The UnMouse, the Turbo mouse, and the Felix mouse were compared with the Apple Macintosh Mouse on four different types of task: tracking (point-and-click), desktop manipulation (e.g., point, click, and drag), word processing, and graphics generation. Users expressed preferences for the devices in terms of lower-arm fatigue, precision of control, and comfort of movement. Results indicate that the Macintosh Mouse and the Felix device were quicker and preferred over the other devices.
This paper discusses important usability issues that impact the future development of graphical user interfaces for UNIX. UNIX provides a user with the capability to combine basic commands using input/output redirection to create new commands to perform more complex tasks. The new graphical interfaces do not directly aid composing commands. However, it takes more than five years of experience to begin to be able to fluently compose new, complex commands. This paper describes a methodology which focuses attention on the problems that must be solved in order for these core features of UNIX to be accessible to individuals with one to five years of experience.
This paper describes a visually-oriented, iterative methodology for the design of human-computer interfaces. It focuses on the implementation of an interactive electronic information kiosk, the "CHI '89 InfoBooth." Throughout the system's design, the interdisciplinary project team concentrated on using visual materials to simulate the user's experience, rather than on writing text specifications. The paper discusses the role played by visual design in three phases of the system's development. It first describes how the use of "visual placeholders" -- sketchy drawings conveying interface ideas -- facilitated early design explorations. Next, it shows how "storytelling prototypes" were used to refine ideas before rigorous programming was undertaken. Finally, it describes how problems uncovered during informal user testing of functional prototypes were corrected by seemingly small changes to the interface's appearance. Specific visual examples are provided throughout.
Three studies from a text search usability program are reported. A logging study revealed the most frequent kinds of searches carried out by users. A laboratory study compared three user interface design alternatives accommodating these searches. A column layout with the logical operators AND, OR, and NOT as column headings, and which assumed parentheses by the spatial positioning of the search term, was significantly faster than the other two designs, and was also preferred by a majority of the subjects. An interview study indicated that users need a verification step prior to search submittal, and a thesaurus linked to the search term fields. There is a discussion of the limitation of each screen design and an evaluation of the methodologies used.
There are a variety of techniques that can evaluate rapid prototype design alternatives for human-machine interfaces. These techniques can be used singly, or in combination. Empirical techniques require that the analyst obtain data from respondents who exercise a rapid prototype. Empirical techniques include questionnaires, observation and unobtrusive methods of data collection, and retention tests. Analytic techniques do not require data collection, but require that the analyst have a formal description of the human-machine interface and use methods for evaluating these descriptions. Analytic techniques include structured walkthroughs, behavioral models of human-machine interaction, Operator Sequence Diagrams (OSD), and Link Analysis. Empirical and analytical techniques can lead to systems that are more likely to meet users' needs, especially when the analyst employs several techniques simultaneously.
Recent attention has been focused on making user interface design less costly and more easily incorporated into the product development life cycle. This paper reports an experiment conducted to determine the minimum number of subjects required for a usability test. It replicates work done by Jakob Nielsen and extends it by incorporating problem importance into the curves relating the number of subjects used in an evaluation to the number of usability problems revealed. The basic findings are that (1) 80% of the usability problems are detected with 4 to 5 subjects and (2) additional subjects are less and less likely to reveal new information. Moreover, the correlation between expert judgments of problem importance and likelihood of discovery is significant, suggesting that the most disruptive usability problems are found with the first few subjects. Ramifications for the practice of human factors are discussed as they relate to the type of usability test cycle the practitioner is employing, and the goals of the usability test.
One of the new tools in human factors today is usability testing. More and more human factors professionals are conducting these tests to get accurate feedback from typical users to improve the usability, overall quality, and sales of their products. American Institutes for Research has been doing usability testing for five years now and has discussed testing with the directors of many labs. We have a body of knowledge and experience from which other professionals can benefit. In particular, we will be discussing data logging software and the functional requirements for it. In this paper we will describe the requirements for data logging software to log data, edit the data log, back up data, and analyze data. Due to the scarcity of commercially available data logging packages (we know of only one at the present time), we found it necessary to write our own software for use in our usability lab, and we know others are doing the same. Based on our experience of writing and using this software, we will describe the important functional requirements for data logging software.
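The diminishing-returns pattern reported above can be illustrated with a simple cumulative discovery curve. The following is a minimal sketch, not the model fitted in the paper: it assumes every problem is detected independently by each subject with the same probability p, and the value p = 0.3 is a hypothetical choice that happens to put the 80% mark between 4 and 5 subjects. The function names are illustrative only.

```python
def proportion_found(p: float, n: int) -> float:
    """Expected share of problems found by n subjects.

    Assumes each subject independently detects any given problem with
    probability p (an illustrative simplification, not the paper's data).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def subjects_needed(p: float, target: float) -> int:
    """Smallest number of subjects whose expected coverage reaches target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while proportion_found(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical detection probability; each extra subject adds less.
    for n in range(1, 9):
        print(n, round(proportion_found(0.3, n), 3))
```

Under this assumed p, four subjects are expected to find about 76% of the problems and five about 83%, so each additional subject contributes a smaller gain than the one before.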
The process of designing a customer activated terminal (CAT) is described. A CAT is a self-service computer system that enables people to order food or merchandise, request information, complete banking transactions, etc. The specific application that this paper considers is a quick service restaurant lunch menu. Designers of CATs must assume that many users of such systems have no prior computer experience. One of the goals of this paper is to identify some specific interface design principles that seem to be appropriate for other CAT applications. A second goal is to illustrate how an iterative design process that focuses on user, task, and environmental characteristics can result in a successful product. The paper describes a four phase iterative development approach: data collection, initial design, testing and redesign, and implementation. Activities in each phase emphasize understanding user, task, and environmental characteristics. Several examples of the interface design at various stages of development are presented, and reasons for why design features were altered are discussed. The paper concludes by articulating several principles that apply to the design of CATs.
A number of human factors data sources provide guidelines and recommendations for the system design process. Much of this information is available to the human factors engineer in design handbooks, textbooks, and periodicals. This paper will discuss the feasibility of incorporating human factors design data into intelligent, knowledge based systems referred to as design associates. Results of recent efforts to implement two types of design associates are also discussed.
Case-Based Reasoning (CBR) is a methodology for employing imprecise data and uncertain information in the development of solutions to fuzzy real world problems. It is seen as an alternative to rule-based systems, which may fail under these conditions. Under the sponsorship of DARPA, we have developed a generic CBR shell. The system was evaluated in the domain of NACA airfoils. A subject matter expert was asked to select airfoils (cases) which were similar to target airfoils. He then defined attributes and weights by which he had judged this similarity. These parameters were then used by PROSPER in a retrieval of airfoils similar to the same targets. From a case base of 98 airfoils, PROSPER retrieved 9 out of 17 selected by the expert. After modifying the similarity algorithms, PROSPER retrieved 11 out of 17.
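The retrieval step described above, in which expert-defined attributes and weights drive the selection of similar cases, can be sketched as a weighted nearest-neighbor search. This is a generic illustration and not PROSPER's actual similarity algorithm, which the abstract does not specify. The attribute names, the per-attribute ranges used for normalization, and all function names are hypothetical.

```python
from typing import Dict, List

Case = Dict[str, float]


def weighted_similarity(target: Case, case: Case,
                        weights: Dict[str, float],
                        ranges: Dict[str, float]) -> float:
    """Weighted average of per-attribute closeness, in [0, 1].

    Each attribute difference is scaled by an assumed value range so that
    attributes measured in different units are comparable.
    """
    total_weight = sum(weights.values())
    score = 0.0
    for attr, weight in weights.items():
        diff = abs(target[attr] - case[attr]) / ranges[attr]
        score += weight * (1.0 - min(diff, 1.0))
    return score / total_weight


def retrieve(target: Case, case_base: Dict[str, Case],
             weights: Dict[str, float], ranges: Dict[str, float],
             k: int) -> List[str]:
    """Return the names of the k cases most similar to the target."""
    ranked = sorted(
        case_base.items(),
        key=lambda item: weighted_similarity(target, item[1], weights, ranges),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


def hit_count(retrieved: List[str], expert_selected: List[str]) -> int:
    """Number of retrieved cases that the expert also selected."""
    return len(set(retrieved) & set(expert_selected))
```

A figure such as "9 out of 17" corresponds to `hit_count` applied to the retrieved set and the expert's selections; changing the weights or the similarity function changes that overlap.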
The Explosive Ordnance Disposal/Automated Information Retrieval & Expert System (EOD/AIRES) is a combination of state-of-the-art computer hardware and software, developed by the U.S. Army Electronics Technology & Devices Laboratory (ETDL) to assist EOD teams in identifying and rendering safe unexploded ordnance. This study examines the identification time and error rate of using the EOD/AIRES versus using the current method, the TM-60 series manuals. Identification time was significantly reduced using the EOD/AIRES for less-experienced U.S. Army EOD soldiers. Error rates decreased with the EOD/AIRES, but not by statistically significant amounts. The qualitative results clearly show a preference for the EOD/AIRES.
Shute, Steven J. and Smith, Philip J. (1990): Knowledge Acquisition Techniques: A Case Study in the Development of a Knowledge-Based System for Document Retrieval. In: D., Woods, and E., Roth, (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 320-324.
A case study is presented describing and illustrating the use of a number of knowledge acquisition techniques for the development of a knowledge-based system. The system developed is a computerized intermediary to assist in searches of bibliographic databases. Particular emphasis is placed on a discussion of the use of a conceptual model to guide the design and analysis of an empirical study of the expertise used by human search intermediaries. The framework provided by this model made it possible to conduct a rigorous analysis of discourses that were recorded as human intermediaries assisted information seekers.
This paper explores some of the implications of comparing interface design with engineering design, arguing that such a comparison has typically been more misleading than fruitful. The problem, however, lies not in the comparison but in the fact that the model of engineering design most often used is an idealized one which is not representative of the actual history of the process of engineering design. When engineering design is viewed from a more realistic perspective it can be seen that design failures are both inevitable and instructive. Similarly, it can be seen that interface design, human factors, and psychology in general are very often informed more by analysis of failure than by success.
Quantitative measures of consistency are formulated for human-computer interactive tasks. Two different kinds of consistency are considered: cognitive consistency and display layout consistency. Cognitive consistency is formulated by constructing the methods used for a task and the steps needed to perform the methods. A quantitative value for cognitive consistency is determined by analyzing the number of changes which would have to be made to change one method to another method. Display layout consistency is formulated by examining display parameters between two or more layouts. An experiment was performed to test the predictions of the quantitative analyses of consistency. Cognitively inconsistent tasks and inconsistent display layouts had a slightly detrimental effect on the speed of performance during an initial session. When the subjects had to return to the task several days after originally learning the task, performance on the cognitively inconsistent tasks was slower than on inconsistent display layout tasks. This latter result indicates that users will not necessarily have difficulty when learning inconsistent interactive methods, but the problem will occur once the methods are learned and the user must switch between programs using inconsistent methods of interaction.
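One way to make "the number of changes which would have to be made to change one method to another" concrete is an edit distance over step sequences. The sketch below uses the standard Levenshtein distance as a stand-in; the abstract does not say which counting scheme the authors used, so this is an assumption, and the step names in the usage example are invented.

```python
from typing import Sequence


def method_distance(method_a: Sequence[str], method_b: Sequence[str]) -> int:
    """Minimum number of step insertions, deletions, or substitutions
    needed to turn method_a into method_b (Levenshtein distance)."""
    # previous[j] is the distance between the steps of method_a processed
    # so far and the first j steps of method_b.
    previous = list(range(len(method_b) + 1))
    for i, step_a in enumerate(method_a, start=1):
        current = [i]
        for j, step_b in enumerate(method_b, start=1):
            substitution_cost = 0 if step_a == step_b else 1
            current.append(min(
                previous[j] + 1,                      # delete step_a
                current[j - 1] + 1,                   # insert step_b
                previous[j - 1] + substitution_cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Invented methods: two ways of relocating text in an editor.
    move_by_cut = ["select", "cut", "position cursor", "paste"]
    move_by_copy = ["select", "copy", "position cursor", "paste"]
    print(method_distance(move_by_cut, move_by_copy))
```

A distance of zero would mean the two methods are identical step for step; larger values indicate lower consistency under this assumed measure.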
System adaptation is necessary as organizations and individuals evolve. There are various ways in which systems can be made to adapt. By identifying the dimensions and degrees of adaptation and selecting those feasible to implement, it is possible to incorporate useful adaptive features in the systems of today.
The Multi-Oriented Structured Task analysis (MOST) methodology attempts to be most things to most of its users most of the time by balancing the needs of both system users and system designers for flexibility and adaptivity. The MOST methodology structures a task analysis and integrates it with other more formal specification methodologies including software engineering methodologies, human-computer interaction methodologies, and explicit user models. MOST stores these specifications in a knowledge base of four major interlinked foci for the information (users, tasks, data, and tools) and an optional focus (constraints) that can be linked to any of the major foci. The linkages in a MOST knowledge base facilitate the flexible structuring and restructuring of records. These linkages can model alternative designs and/or paths by which a system can adapt its interface while maintaining functional consistency. Various design heuristics (both software engineering and human factors) can be applied to an analysis recorded in a MOST knowledge base to assist in its transformation into a suitable design. The MOST methodology is designed to cooperate with and to assist the designer rather than to force the user to serve the methodology.
The context of human-computer interaction consists of those objects referred to by the users in the computer system and the users themselves. A contextual knowledge base (embedded within a computer system) can be used to facilitate and control the adaptation of the computer system based on information about users stored within this context. A general Context Management System design has been proposed and a prototype of this design implemented to meet the needs of the adaptation of systems to individuals and groups of users.
The objective of this investigation was to experimentally evaluate possible relationships among personality types, selected psychometric factors, and categories of cognitive activity, with the intent of developing user behavioral models for interface design. Twenty subjects (10 novice and 10 experienced) participated in an interactive scheduling task with two levels of task complexity. The task involved navigation through ten action alternatives, with each alternative being represented by a screen, to allocate resources. The subjects were administered Myers-Briggs Type Indicator (MBTI) tests and a battery of psychometric tests. Cognitive time, total number of menu selections, total number of assignments, and the distribution of cognitive time into intelligence, design, and choice activities were the performance measures. Variables derived from measurements of personality traits and psychometric factors were evaluated as predictive measures of performance. The personality trait for sensing/knowing was significant in predicting overall performance, as were psychometric factors for induction, integrative processing, and spatial scanning. The personality trait of extrovert/introvert was found to be significant in predicting the distribution of screen use times, as were derived factors for locus of control, memory ability, and personality. These results can form the basis for examining the usefulness of personality types and psychometric factors as variables in models of user characteristics.
An experiment was conducted to evaluate menuing and scrolling as alternative information access techniques when a touch-sensitive input device was used to interact with the system. A hierarchical menu structure and three scrolling methods, line-by-line, half-screen, and full-screen, were tested. Level of goal word familiarity (familiar and unfamiliar) and window display size (12 or 24 lines displayed on the screen) were also examined. The task consisted of using a touch tablet to locate a target goal word with one of the four access methods. Members of a single set of 64 words, 32 familiar and 32 unfamiliar, served as goal words in all conditions. Performance data (total time to complete the task) were collected from 48 subjects. Access method and window size were between-subject variables. Each subject received both word familiarity levels. Results of an analysis of variance on mean total task time (MTIME) revealed a significant access method by word familiarity interaction. Separate analyses of variance were conducted on MTIME for familiar and unfamiliar goal word sets. When the goal word was familiar, menuing was fastest, followed by line-by-line, full-screen, and half-screen scrolling. For unfamiliar goal words, line-by-line scrolling was fastest, followed by full-screen, half-screen, and menuing. The effect of window size was not significant. The findings of this study suggest that the operator's familiarity with the information being searched is important when deciding upon an access method.
This study examined the effects of adding 3-D stereoscopic altitude information to a standard aircraft tactical situation display. In an experiment, displays presenting varying numbers of hostile aircraft were presented to subjects in either a 2-D or 3-D format via a computer-driven real-time simulation system. Tests indicate that, for all levels of number of threats, the 2-D displays resulted in faster response times and fewer errors in locating particular classes of targets. Questionnaire and interview results, howev