Crash avoidance technology is an area of ongoing research in Europe and in the US. The increasing interest in crash avoidance is shared by the National Highway Traffic Safety Administration and by car manufacturers. There are three main types of vehicle crashes: (1) single vehicle, (2) vehicles crashing at an intersection, and (3) vehicles driving in the same direction when the "following" car crashes into the "lead" vehicle. This last type of crash is called "rear-end" or more descriptively "front-to-rear-end." This symposium focuses upon this last type of crash because technological advances in radar and/or laser technology could assist drivers in avoiding a crash into a lead vehicle by warning the driver of an impending danger ahead. The technological challenge for avoidance of single vehicle or intersection crashes is more complex and far from a technological solution. While the technological development of collision warning systems is ongoing, little research is available in the area of human factors. We need to know more about the potential of collision warning systems in avoiding crashes, how to design the human-machine interface in terms of warning timing and types of display, and the effects of a warning system on driver performance.
The potential value of a front-to-rear-end collision warning system based on factors of driver behavior, visual perception and brake reaction time is examined in this paper. Twenty-four percent of all motor vehicle crashes involving two or more vehicles are front-to-rear-end collisions. Several driver performance factors are common to these collisions. The literature indicates that drivers use the relative size and the visual angle of the vehicle ahead when making judgments regarding depth. In addition, drivers often have difficulty gauging velocity differences and depth cues between themselves and the vehicle they are following. Finally, drivers often follow at distances that are closer than brake-reaction time permits for accident avoidance. It is apparent that the comfort level of close following behavior increases over time due to the rarity of consequences. Experience also teaches drivers that the vehicle in front does not suddenly slow down very often. On the basis of these driver behavior and human performance issues, a front-to-rear-end collision warning system that provides headway/following distance and velocity change information is considered. Based on the driver performance issues, display design recommendations are outlined. The value of such a device may be demonstrated by the added driver safety and situation awareness provided. The long-term goal would ultimately be the reduction of one of the most frequent types of automobile crashes.
Two visual factors in the avoidance of front-to-rear-end collisions are (a) judging time to collision so as to control braking optimally on a moment-to-moment basis, and/or (b) judging one's heading relative to the lead car so as to steer appropriately. It is known that time to contact equals {Theta}/(d{Theta}/dt) and it is also known that the eye is sensitive to {Theta} and, separately, (d{Theta}/dt) ({Theta} is the angular size and (d{Theta}/dt) is the rate of increase of angular size). But whether the eye is sensitive to the ratio {Theta}/(d{Theta}/dt) and, if so, whether drivers use this information are further questions. We report here that the human visual system does contain neurons sensitive to the ratio {Theta}/(d{Theta}/dt) rather independently of {Theta} and (d{Theta}/dt). It is important that the driver looks directly at the lead vehicle: sensitivity to (d{Theta}/dt) falls off steeply in peripheral view. But, over a wide range, sensitivity to (d{Theta}/dt) is independent of contrast. In addition to the classical disparity-driven system for binocular depth perception, there is a separate binocular system for motion in depth. Precise judgements (0.2 deg) of heading are supported by this stereomotion system, but on the other hand about 20% of the population have stereomotion "blind spots" (i.e. field defects). Monocularly available information can also support precise judgements of heading, and field defects seem to be rare. Field studies on flight simulators and telemetry-tracked jet aircraft showed that laboratory measures of sensitivity to (d{Theta}/dt) and to the rate of expansion of the optical flow field predicted intersubject differences in performance on flying tasks that were closely related to the rear-end collision situation.
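The relation quoted above, time to contact = {Theta}/(d{Theta}/dt), can be illustrated with a minimal sketch. The function names, the finite-difference estimate of d{Theta}/dt, and the example geometry (car width, distances, sampling interval) are assumptions made for illustration and are not taken from the paper.

```python
import math


def angular_size(width_m, distance_m):
    """Angular size (radians) subtended by an object of given width at a given distance."""
    return 2.0 * math.atan(width_m / (2.0 * distance_m))


def time_to_contact(theta_prev, theta_now, dt_s):
    """Estimate time to contact as theta / (d theta / dt).

    theta_prev, theta_now: angular size (radians) at two successive samples.
    dt_s: time between the two samples, in seconds.
    Returns math.inf when the angular size is not increasing (no approach).
    """
    rate = (theta_now - theta_prev) / dt_s  # finite-difference estimate of d(theta)/dt
    if rate <= 0:
        return math.inf
    return theta_now / rate


# Hypothetical example: a 1.8 m wide lead car, gap closing from 31 m to 30 m
# in 0.1 s (closing speed 10 m/s, so the true time to contact at 30 m is 3 s).
tau = time_to_contact(angular_size(1.8, 31.0), angular_size(1.8, 30.0), 0.1)
print(f"estimated time to contact: {tau:.1f} s")  # close to the true 3 s
```

Note that the estimate uses only the angular size and its rate of change; neither the distance nor the closing speed has to be known, which is what makes the ratio a plausible visual cue.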
Warning signal effectiveness issues associated with the design of a front-to-rear-end collision warning system are discussed. Potential negative effects are that warnings may occur rarely, startling the driver and adding to cognitive load and stress, or alternatively, warnings may occur frequently and be ignored by the driver. To minimize negative effects, four design concepts are considered: (a) a graded sequence of warnings, from mild to severe, (b) a parallel change in modality, from visual to auditory, (c) individualization of warnings, and (d) a headway -- distance to lead car -- display.
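Design concepts (a) to (c) above can be sketched as a simple threshold table. This is only an illustration: the time-to-collision thresholds, the stage names, and the `sensitivity` parameter used to individualize the warnings are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    level: str      # severity of the warning
    modality: str   # how the warning is presented


# Hypothetical stages, ordered from most to least urgent.
# Each entry: (time-to-collision threshold in seconds, alert to issue).
STAGES = [
    (2.0, Alert("severe", "auditory")),
    (4.0, Alert("moderate", "visual+auditory")),
    (6.0, Alert("mild", "visual")),
]


def select_alert(ttc_s: float, sensitivity: float = 1.0) -> Optional[Alert]:
    """Return the most urgent alert whose threshold is reached, or None.

    sensitivity scales all thresholds for an individual driver
    (values above 1.0 warn earlier, values below 1.0 warn later).
    """
    for threshold_s, alert in STAGES:
        if ttc_s <= threshold_s * sensitivity:
            return alert
    return None


print(select_alert(5.0))   # mild, visual
print(select_alert(1.5))   # severe, auditory
print(select_alert(10.0))  # None: no warning needed
```

Keeping the stages in a table makes the graded sequence and the parallel change in modality explicit, and a single scaling factor is one simple way to individualize the warnings.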
The degree of caution that people are willing to take for a given product is largely determined by their perceptions of the risk associated with that product. Research suggests that risk perceptions are determined by the objective likelihood or probability of encountering potential hazards (Slovic, Fischhoff, and Lichtenstein, 1979). However, there is also research suggesting that objective likelihood plays little or no role in determining risk perceptions. Rather, risk is determined by the subjective dimension of the hazard or in other words, the severity of injury (Wogalter, Desaulniers and Brelsford, 1986, 1987). The present research examined aspects of these two studies in an attempt to reconcile the observed differences. Subjects evaluated either the Wogalter et al. (1986, 1987) products or the Slovic et al. (1979) items on eight rating questions. Results demonstrated that severity of injury was the foremost predictor of perceived risk for the Wogalter products, but that likelihood of injury was primarily responsible for ratings of risk for the Slovic items. The two lists differed substantially on all the dimensions evaluated, suggesting that the content of the lists is responsible for the contrary findings. In a second study, subjects rated another set of generic consumer products. These ratings showed a pattern of results similar to the Wogalter products. Overall, this research: (a) explains the basis for conflicting results in the risk perception literature, and (b) demonstrates that severity of injury, and not likelihood of injury, is the primary determinant of people's perceptions of risk for common consumer products.
The evolution of automated and semi-automated systems is rendering continuous regulation relatively obsolete, leaving periodic "management" interventions as the main way in which operators exercise control. Consequently, the human is now more frequently required to respond in uncertain, unusual, or "emergency" conditions. Such circumstances connote high stress environments. Consequently, the research reported here investigates expertise at decision making under stress. The source of stress is ubiquitous in occurrence, namely time pressure. We present a process model that explains and predicts the decision behavior of skilled operators as they manage risk under time stress. The model identifies three components of decision making, (1) attention, (2) assessment, and (3) intervention. Attention (1) scans widely among information displays and focuses action narrowly upon one of three procedures for (2) assessing the attended information. Separate procedures assess ({alpha}) the risks posed by the environment, ({beta}) risks generated by interacting with the environment, and ({gamma}) uncertainty about those risks. The uniquely appropriate intervention (3) is selected by a small set of rules that match heuristically the assessments of risk and uncertainty to a short list of alternative actions. The model is validated with respect to the operation of skilled operators in the domain of currency exchange. In comparing performance versus simulation data, the model identifies the one procedure that resists automation -- the assessment of risks posed by the environment. This assessment involves causal arguments that often rely upon extensive domain knowledge. In contrast, attention to displays, heuristic matching, and the procedures for assessing uncertainty and the risk of interaction can be delegated to an automated decision support system. 
This result has clear implications for the design of systems to support skilled decision making under emergency conditions: decision support systems for dynamic environments like currency trading must notify the operator of the occurrence of system parameters that require assessments of environmental risk and incorporate these assessments into automated procedures that recommend appropriate interventions.
Subjective fatigue of 11 C-141 pilots serving in the United States Air Force Military Airlift Command (MAC) during the Desert Storm campaign was assessed in a 30-day field study. Subjective fatigue measures were obtained from pilots at the beginning and end of each duty day using the Profile of Mood States (POMS) fatigue dimension. Also, a 7-point fatigue rating was recorded every 4 hours. The two fatigue measures were each evaluated with respect to (1) 48-hr cumulative flight time, (2) 48-hr cumulative sleep time and (3) 30-day cumulative flight time. The data indicate that at least 15 hours of sleep per 48-hr time period is needed to avoid pilot fatigue. Recent flight time was also found to be related to subjective fatigue, but this relationship seems rooted in loss of sleep during long flights. Cumulative 30-day flight time, which is the measure currently used to regulate flight hours, was not related to increases in subjective fatigue.
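The 48-hr cumulative sleep measure can be computed from a sleep log with a rolling window. The sketch below assumes a log of (start, end) sleep periods in hours on a common clock; the function names and the example log are hypothetical, and only the 15-hour and 48-hour figures come from the abstract.

```python
def sleep_in_window(sleep_periods, now_h, window_h=48.0):
    """Total hours slept during the window_h hours that end at now_h.

    sleep_periods: iterable of (start_h, end_h) pairs, in hours on a common clock.
    Periods that only partly overlap the window are counted in part.
    """
    window_start = now_h - window_h
    total = 0.0
    for start_h, end_h in sleep_periods:
        overlap = min(end_h, now_h) - max(start_h, window_start)
        if overlap > 0:
            total += overlap
    return total


def below_sleep_threshold(sleep_periods, now_h, min_sleep_h=15.0):
    """True when cumulative sleep over the last 48 hours is under min_sleep_h."""
    return sleep_in_window(sleep_periods, now_h) < min_sleep_h


# Hypothetical log: 7 h, 6 h and 3 h sleep periods over about two and a half days.
log = [(0.0, 7.0), (24.0, 30.0), (50.0, 53.0)]
print(sleep_in_window(log, now_h=60.0))        # 9.0 (only the last two periods fall in the window)
print(below_sleep_threshold(log, now_h=60.0))  # True
```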
This paper deals with the question to what extent various factors, suggested in the literature, can be identified as contributory to the occurrence of accidents with consumer products. Data have been gathered in an on-site investigation of accidents. Contributory factors taken into consideration in the method of data collection include characteristics of the use actions, the product, the situation and the user. The explorative study revealed that the relevance of the various contributory factors is limited. This would imply that the development of general guidelines for the anticipation of accidents in the design of everyday products is seriously hampered.
The efficacy of two warning-related factors to produce cautionary behavior in a chemistry laboratory task was examined. Experiment 1 compared the effects of a posted-sign warning and a within-instruction warning on behavioral compliance. The results showed that a warning embedded in a set of task instructions produced significantly greater compliance (the wearing of protective gear) than a similar, larger warning posted as a sign nearby. Experiment 2 reexamined the effect of location and also examined the influence of the presence versus absence of pictorials. The results of Experiment 2 confirmed the location effect of Experiment 1. No influence of pictorials was noted, although there was a nonsignificant increase in compliance when pictorials were added to the within-instruction warning. The results indicate that warning placement is important for eliciting behavioral compliance to safety messages. Explanations such as differences in field of view and perceived relevance are discussed.
Safety standards for consumer products can offer an important contribution to accident reduction. This paper describes how effective testing methods for safety requirements, which are part of a safety standard, may be developed. In order to be effective, testing methods for the evaluation of products on aspects of safety must be valid, reproducible, and efficient. Various forms of testing methods are discussed with regard to their effectiveness. The development of testing methods for trapping hazards, which can be applied to all kinds of products, is described in a case study. For this purpose, a number of accident scenarios have been drawn from data on trapping hazards. In each scenario a testing method is described, and criteria based on human factors data are added. Accident scenarios have proved to be an extremely useful link between accident data and the simulation of performance on which a testing method can be based. Because human factors data is usually gathered for other purposes, implementation of this data in safety standards must be done with great caution. We recommend the use of man-models in testing methods for optimal results. Furthermore, we recommend the verification of test conditions and criteria by user trials or panel tests.
The circumstances of all work-related fatalities occurring in Australia over a three year period were analysed to determine how they differed between occupational groups. Correspondence analysis was used to examine the relationship between the sequence of events immediately preceding the accident, the involvement of unsafe work practices and type of work being performed. There were clear differences in the causes of deaths at work between occupational groups which provide information about the most likely targets for accident prevention.
Smillie, Robert J., Snyder, Harry L., Gunning, David, Inaba, Kay and Booher, Harold R. (1992): What is More Important in Information Design -- the Hardware and Software Used to Process and Present the Information, or the Principles Used to Determine the Content and Format of the Information? In: Proceedings of the Human Factors Society 36th Annual Meeting 1992. p. 1044.
Proposition: Information Design is nothing more than an interface issue, i.e., the human user and the presentation medium. Research on the following topics is sufficient to design and develop legible, comprehensible, interactive, adaptable electronic display systems: eye movement, visual performance, audition, document design, information processing, data base design/organization, visual angle, hypermedia techniques, color phenomena, and electronic presentation display technology. After controlling for training, the differences in human performance (reading, understanding, etc.) using such display systems are more a function of the psychophysical factors (spatial, temporal, and chromatic) than information design factors (data organization, graphical representation, and simple English). Therefore, consistent and quantifiable improvement can only be obtained through improvements in image quality that correlate with the psychophysical factors.
A new software program, Ergonomics Analysis and Problem Solving, can be used to assist in the performance of consistent ergonomic analyses. The program works by taking the operator through a series of simple steps to compile a comprehensive job-task breakdown, perform a consistent ergonomic risk factor identification, and prepare a list of effective interventions. The operator will enter specific information about the job being studied. This specific information will be integrated into several interim reports as well as a final report.
Star Cruiser is a complex laboratory task that was designed to study decision making processes. It is intended to provide a rich perceptual environment in which to study the perceptual decision heuristics utilized by operators in similar tasks (Shively & Kirlik, 1991, Kirlik, Markert & Shively, 1990). In addition, a great deal of flexibility is offered by its script-style control. Researchers interested in such areas as workload, situational awareness, and skill development may also find it useful. It is presently being utilized in laboratories at NASA-Ames and Georgia Tech, where it was jointly developed, but the software is now available for distribution to other interested laboratories.
According to aviators and researchers, the optimal environment for aircrew coordination training (ACT) is in flight simulators (Prince, Oser, Salas, & Shrestha, 1992). However, most flight simulators are designed solely for cockpit personnel, and additional personnel in multi-crewed aircraft (e.g., crew chiefs, flight attendants) are typically excluded from this vital phase of training. This demonstration presents a PC-based Crew Chief Station specifically designed to enable the inclusion of military helicopter crew chiefs into simulator-based ACT scenarios.
Cyberspace is the environment created during the experience of virtual reality. Therefore, to assert that there is nothing new in cyberspace alludes to there being nothing new about virtual reality. Is this assertion correct? Is virtual reality an exciting development in human-computer interaction, or is it simply another example of effective simulation? Does current media interest herald a major advance in information technology, or will virtual reality go the way of artificial intelligence, cold fusion and junk bonds? Is virtual reality the best thing since sliced bread, or is it last week's buns in a new wrapper? There are experts who support both views. The best-thing-since-sliced-bread protagonists point to potential applications in training, communications, entertainment and human-computer interaction. They use terms like "intuitive", "circumambience", and "presence." The opposition use terms like "so what?", "when?", and "right!". Are the proponents harbingers or visionaries? Are the opponents sceptics or Luddites? Predicting the impact of technology is notoriously difficult. Hindsight allows us, for example, to express pitiful disdain towards the engineer who saw no future for the telephone, or the clerk who could not be convinced of the benefits of the photocopier. Experts are no better, or no worse, at predicting than the rest of us. The value of experts is in their ability to fit current ideas and events into the context of past events, and to do so in a coherent and engaging manner.
Plain Old Telephone Service (POTS) is undergoing many changes. The phones themselves may no longer be "plain"; the service is no longer limited to just connecting two voices. This forum concentrates on the design, evaluation, and standardization of telephone-based interfaces. There are four central topics: (1) the current design issues with treating the telephone as an interface for users of many services; (2) the concerns of national and international standards bodies for phone-based interfaces; (3) the human factors issues surrounding the incorporation of speech synthesis and speech recognition into the telephone network; and (4) innovative design responses to current limitations. The overall trend is from POTS to pixels.
During this demonstration, the user interface for one model of a Photo CD Player will be presented. This product, currently available to consumers, plays both Photo CD discs and Audio CDs. The Photo CD technology as it relates to consumers will be introduced as well as Human Factors design goals and intended usage. Demonstration of the product will be used to show how well the user interface met these design goals. User interface evaluation techniques and the design direction for future player models will be discussed.
The ability of groups of individuals to work together as a team is quickly becoming a prerequisite in the modern workplace. Surprisingly, however, this increased demand for effective teams has not been accompanied by improved technology for the study of teamwork. One complicating factor in the study of teams is the level of fidelity required to perform useful research on team processes and performance. These issues have previously been assumed to require high fidelity, full-mission simulators. However, it has recently been suggested that inexpensive low fidelity simulations might be sufficient for this purpose (Driskell & Salas, in press). Therefore, the present demonstration presents an application for low fidelity simulation that appears to be useful as a tool for aircrew coordination research.
Integrated Decision/Engineering Aid (IDEA) incorporates a standard process and a set of automated tools to support the application of the DoD's Human/System Integration (HSI) program, the Army's Manpower and Personnel Integration (MANPRINT) initiative and Human Factors Engineering (HFE) throughout the materiel development process. IDEA provides the HSI/HFE analyst with guidelines, data, and tools to integrate HSI/HFE into the acquisition of: (a) non-developmental items (NDIs), (b) product improvements, and (c) new system developments, focusing on the activities and products at each phase of the materiel acquisition process. The purpose of the session is to demonstrate how IDEA is utilized in the definition of the HSI/MANPRINT requirements in support of the system development/acquisition process. Specifically, participants will have an opportunity to view the presenter operate the various components of IDEA. Emphasis will be placed on review of the overall architecture, arrangement and organization of the modules, and the recent additions/modifications. A handout identifying the objective(s) and product of each analysis, as well as the data requirements, will be made available to participants. From the demonstration and the handout, participants will become familiar with the scope, capabilities and limitations of IDEA.
Honeywell has developed a unique, motor driven, two degree of freedom hand controller that offers high levels of performance and ease of programming variables of importance for controller responsiveness and user acceptance. The simple design leads to relatively low cost and high reliability when compared with other designs. Independent motors lead to improved performance for a given motor size and ease of adding redundant motors.
Intelligent Vehicle-Highway Systems (IVHS) is a major U.S. Department of Transportation initiative to improve the safety and efficiency of our nation's highways. IVHS includes five related components: Advanced Traffic Management Systems (ATMS); Advanced Traveler Information Systems (ATIS); Commercial Vehicle Operations (CVO); Advanced Vehicle Control Systems (AVCS); and Advanced Public Transportation Systems (APTS). Although the Federal Highway Administration has initially chosen to address each of these components separately, a number of issues are shared by all components. One critical common element deals with the capabilities of the humans in the system. Appropriate guidelines that consider the needs and capabilities of operators, maintainers, and users will be critical for efficient functioning of each system. Efforts are underway to define and resolve critical human factors issues related to IVHS components. This symposium addresses four of the five IVHS components, namely those that are more highway related. For each of these components, presenters will define the key engineering characteristics, hypothetical scenarios that focus on human-system interfaces, and examples of human factors issues that must be considered in the design of IVHS systems.
The Intelligent Vehicle-Highway System (IVHS) is an important and broad ranging Department of Transportation program to reduce congestion and increase safety on the nation's highway system. The Automated Highway System (AHS) represents the full realization of one IVHS subsystem, Automated Vehicle Control Systems. Efforts are underway to define and resolve critical human factors questions related to the AHS. As part of the process, human factors issues will be identified through development of hypothetical AHS scenarios. This requires that a generic AHS scenario be presented and the affiliated human factors issues identified.
Advanced Traffic Management Systems (ATMS) are those components of Intelligent Vehicle Highway Systems (IVHS) that integrate traffic detection, communication, and control functions to be responsive to dynamic traffic conditions and increase the efficiency of existing traffic networks. ATMS provide the management foundation that will enable and integrate other IVHS components such as Commercial Vehicle Operations, Advanced Traveler Information Systems, Advanced Vehicle Control Systems, and Advanced Public Transportation Systems. This paper defines Advanced Traffic Management Systems. It also describes the functions that may take place within an ATMS-class Traffic Management Center (TMC), a scenario that a future TMC operator may encounter, and some of the human factors issues that must be addressed in the design of an ATMS-class TMC.
The nation's motoring public is increasingly burdened by recalcitrant transportation problems, many of them directly attributable to increasing traffic congestion. In response, the U.S. Department of Transportation is actively moving on several fronts to address this problem. One of the more promising approaches to relieving congestion is through the design and implementation of new technology in the Intelligent Vehicle/Highway System (IVHS). IVHS is composed of five elements: Advanced Traffic Management Systems (ATMS), Advanced Traveler Information Systems (ATIS), Commercial Vehicle Operations (CVO), Advanced Vehicle Control Systems (AVCS), and Advanced Public Transportation Systems (APTS). This paper will discuss human factors issues associated with ATIS.
The Commercial Vehicle Operations (CVO) segment of the IVHS program is targeted at users of interstate trucks, local delivery vans, buses, taxis, and emergency vehicles. Specifically, the goals of the CVO program are to improve (a) the efficiency and effectiveness of traffic management and regulatory administration by government; (b) the efficiency and effectiveness of fleet management; (c) safety for operators of commercial vehicles and others affected by them; and (d) driver performance. Although a number of technologies have been developed to support these goals, the human factors aspects of these systems have not been examined.
An approach is presented for evaluating the mission effectiveness of combat units. The formal evaluation scheme is unusual in that it works with complex tradeoffs and deals flexibly with the changing conditions of combat. The scheme is centered on an index measure in which many indicators are related through a nonlinear mathematical model. The model mimics the pattern by which an expert evaluator judges whole performance.
CRT displays aboard U.S. Navy ships use a standardized monochrome Naval Tactical Data System (NTDS) symbol set to represent properties of symbols such as platform type (e.g. Aircraft Carrier, Combat Air), environment (e.g. air, surface, subsurface), and identification (e.g. hostile, friendly). A color symbol set has been proposed in NATO Standardization Agreement 4420, Display Symbology and Colours for NATO Maritime Units (1990). The U.S. Navy is currently considering ratification of this Standardization Agreement (STANAG). Empirical comparisons of operator performance using the NTDS symbology versus those using the color-filled NATO STANAG symbology were conducted. Two additional experimental symbologies were also created. The first, called NTDS Equated, is a color version of the NTDS symbol set, and the second experimental symbol set, called NATO Outline, is a color outline version of the color filled NATO STANAG symbol set. Test subjects were asked to find (hook) specific symbols during a tactically relevant scenario. Time to the first correct hook and percentage of correct hooks were subjected to analyses of variance (ANOVA). Experimental results revealed that the NATO STANAG symbol set outperformed all other symbol sets in terms of symbol recognition time, and outperformed the NTDS Standard symbol set for symbol recognition accuracy as well. The results indicated that tactical information can be transferred more quickly and accurately to watch standers through effective use of symbol coding. Test subjects familiar with the NTDS symbology expressed a preference for the color symbol sets in opinion surveys administered after the experiment. General conclusions resulting from comparisons across symbol sets were that color fill was more effective than color outline, and that operator performance gains were achieved as a result of color coding and greater information content on the symbol. 
This paper presents the human performance assessment that was conducted, the results, and the implications of the findings for ratification of NATO STANAG 4420.
This paper describes the process involved in the development of a reaction time test bench for the Computer Aided Systems Human Engineering (CASHE) program, which is based on a strategy for converting human factors information into simulation software, using a test bench metaphor. The metaphor takes its strength from the familiarity systems designers have with test benches and breadboarding facilities currently at their disposal. The purpose of this paper is to provide a description of this software development activity, illustrate the procedure we followed, specify the decision points we encountered, and relate our lessons learned. Our goal was to convey functional specification information to the software developers in a parsimonious, unambiguous, structured manner to facilitate the development of both the software and the user interface, while complying with hardware system constraints. Development of the Reaction Time (RT) Test Benches involved the following tasks: collect and digest the Engineering Data Compendium entries; analyze the variables; determine the scope of the relevant variables to be tested; select the test bench phenomena to be demonstrated; and develop each of the deliverables. These deliverables included the variable range tables, initial variable settings, the control flow and storyboard graphics. We believe that this task is typical of the input human factors specialists can provide to designers in a variety of contexts and hence generalizes beyond this specific application.
A proposed military standard was developed that establishes and defines the methodology to incorporate the use of rapid prototyping of operator/maintainer machine interface design into the system design process. The specification includes the work to be accomplished by the contractor or subcontractor in prototyping interfaces and evaluating those interfaces under operationally equivalent conditions. The tasks outlined provide the basis for defining, integrating and validating operator interface prototyping methodologies, as well as the software interface. Two data item descriptions were also developed: Rapid Prototyping Program Plan and Rapid Prototyping Design Approach Document. The Rapid Prototyping Program Plan is the single document which describes the contractor's entire system prototype engineering program, identifies its elements, and explains how the elements will be managed. The Rapid Prototyping Design Approach Document provides a source of data with which to evaluate the extent to which the rapid prototyping design meets system engineering requirements and other human engineering criteria. This paper presents the main topics from the proposed standard.
This paper identifies a significant emerging problem in the definition and development of future ATC system enhancements. Metaprototyping (prototyping at the system level) in the Integration and Interaction Laboratory (I-Lab) will allow the FAA to initiate a new process for RE&D of the future NAS that will support the active involvement of system operators and users. Initial simulation studies have begun and will continue to address interaction among planned, future ATC automation programs in support of the people within the NAS. Preliminary results from these studies and the process for the conduct of metaprototyping will be described. The key difference between the I-Lab effort and previous prototyping is the integration of concepts, models and system elements within the context of the future system as opposed to prototyping of individual components or system elements. The I-Lab is a tool for systems engineering and research activities to envision the future and guide development to achieve that vision. At each step the resulting information will be used to support the FAA's efforts to establish and maintain a future system definition and relevant research and development programs to achieve enhanced efficiency for the end-users, and NAS resources and personnel. In addition, the process by which system prototyping is effected will be established including the roles of RE&D organizations, outside researchers, operational personnel, and system users.
A research and development program is underway to produce an innovative design support system for crew station designers. Known as the Performance Visualization Subsystem of the Computer Aided Systems Human Engineering Program (CASHE: PVS), this design tool will have data visualization and prototyping capabilities that will enable designers to "go beyond" the human perception and performance information available in the PVS database. Interactive software modules (called test benches) are being developed to allow designers to explore behavioral phenomena under different stimulus and response conditions. The objective of this paper is to describe a method we have used to translate the information in the PVS database into test bench specifications for software development. The basic approach in test bench design is: 1) to rely on standardized tasks and conditions where possible and 2) to provide designers with pedagogical illustrations of perceptual and performance effects. The procedures used in developing test bench specifications included identifying good candidates for test bench simulations, prioritizing the set of proposed test benches according to selection criteria developed by the design team, and recruiting subject matter experts to generate test bench specifications that will be used by the software engineers to implement the test bench code. The result of this effort will be a commercially available software product that will help crew station designers more effectively understand and apply human factors principles in the design process.
Opportunities for fatigue related accidents are greatest when extended duty cycles must be maintained. A means to plan for the influence of fatigue would be useful to best utilize crew resources. Equations were derived to predict performance degradations associated with fatigued cognitive abilities. During a 30-hour sleep deprivation study, nine male subjects were required to perform a 45-minute cognitive performance battery every 120 minutes. Plasma melatonin levels also were obtained. Cognitive performance measures sensitive to fatigue were determined and used to derive composite response time and accuracy scores. The equations that best described the composite scores included a linear component (hours awake weighting) and a circadian component (melatonin weighting). The respective prediction equations accounted for 33% of the variance in response time performance (p < .0001) and 18% of the variance in accuracy performance (p < .0005). Tests on the beta weights indicated that accuracy predictions were more enhanced by the circadian component than were those for response time. This work represents a mathematical description of fatigued performance that is sensitive to circadian cycles and requires minimal input data. The results might be used to recommend the best crew rest times and when additional crew should be employed as individual performance falls below critical thresholds during sustained operations.
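The abstract above describes prediction equations with a linear component (hours awake) and a circadian component (melatonin level). A minimal sketch of how such a two-predictor equation could be fitted by ordinary least squares follows. The abstract does not give the authors' estimation procedure, coefficients, or data, so the function names, the intercept term, and the use of plain least squares are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation is applied to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fatigue_model(
    hours_awake: Sequence[float],
    melatonin: Sequence[float],
    score: Sequence[float],
) -> Tuple[List[float], float]:
    """Fit score = b0 + b1 * hours_awake + b2 * melatonin (hypothetical model form).

    Returns ([b0, b1, b2], r_squared), where r_squared is the share of
    variance in the composite score accounted for by the fitted equation.
    """
    rows = [[1.0, h, mel] for h, mel in zip(hours_awake, melatonin)]
    k = 3
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, score)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    mean_y = sum(score) / len(score)
    ss_res = sum((y - p) ** 2 for y, p in zip(score, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in score)
    return coefs, 1.0 - ss_res / ss_tot
```

Calling `fit_fatigue_model` once with composite response times and once with composite accuracy scores would yield two separate equations, mirroring the two prediction equations the abstract reports.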
Much has been written about the value of prototyping during the requirements analysis phase of system acquisition. This paper focuses on the classification of prototypes as they impact the user interface. Using a style guide to manage the user interface during the prototyping process is also examined. A distinction between different categories of prototypes can be drawn in terms of the amount of functionality provided to the user. Consequently, three distinct types of prototypes can be differentiated: conceptual, detailed design and operational prototypes. The conceptual prototype presents the user with the least amount of functionality and is often undertaken to derive functional requirements and to exhibit a proposed solution to a problem. The detailed design prototype exhibits more functionality and is often used by human factors personnel to collect detailed user feedback and performance data to make specific tradeoff decisions and to derive a more detailed design of the user interface. An operational prototype is a complete system that has been fully tested by selected end users, but is not sold as a commercial product. Once the category of prototype is established, a style guide is helpful in managing the development of the user interface. Ensuring consistency within a prototype is the principal role of a user interface style guide. Recent experience in the creation of a style guide for a prototyping effort for the US Air Force has led to a number of suggestions. These suggestions are discussed and future efforts in the development of user interface style guides are indicated.
Assessment of heavy vehicle driver workload can benefit from earlier research on pilot workload. Four workload issues are particularly salient: performance, methodology, measurement, and conceptual problems. Since an airplane is not a heavy ground vehicle, and since there are many differences between pilots and truck drivers, aviation workload tools cannot be applied to ground vehicles without some caveats. A summary of ten lessons learned from aviation workload is given.
Based on airline pilot surveys, industry committees and workshops conducted on advanced technology "glass cockpit" airplanes, concerns have been raised about the application and long-term effects of automation technologies. It has been pointed out that purely technology-driven designs have resulted in unintended and unforeseen negative consequences. In order to counter this trend it has been proposed to shift the focus from technology-centered designs to what has become known as human-centered design. There are three primary objectives within a human-centered design philosophy: (1) the design should enhance the user's abilities, (2) the design should help overcome user limitations, and (3) the design should foster user acceptance. This paper discusses the human-centered design objectives within the context of commercial transport airplane developments. Representative examples of a human-centered design are presented.
Human error has been shown to cause 65-80% of maritime casualties; this figure is similar to that of other industries. However, there has been little systematic research and development work in the area of maritime human factors. This paper presents a model of five technical domains that comprise a useful framework for conceptualizing human factors in the maritime industries. The multi-modal research potential in the areas of fatigue and the cognitive impacts of automation are discussed.
The U.S. Army Human Engineering Laboratory (USAHEL) approach to HSI in the early stages of materiel development is to base the process for accomplishing HSI/MANPRINT on the HFE front-end analysis methodology as described in MIL-H-46855B. The USAHEL, under contract with Carlow International Incorporated, is developing an HSI standardized and formalized process tied to the events, activities, products and milestones for all phases of the WSAP and incorporating a set of automated tools and information systems to support the application of the HSI process. The system, including the process, associated tools and information resources, has been designated the HSI Integrated Decision/Engineering Aid or IDEA. A major element of IDEA, then, is the standardized and formalized HSI process tied to the events, activities, products and milestones of each phase of the WSAP as directed in DoD 5000.1, 5000.2, and 5000.2M, and incorporating a set of automated tools to support the application of the HSI process. The HSI process architecture contained in IDEA is an interactive graphic which has the following characteristics: a) it is integrated with the WSAP activities, products and requirements for each WSAP phase; b) it defines and describes HSI activities, events, inputs/outputs, products and methods for each WSAP phase, and provides guidelines on the application of the activities and methods and on the contents and format of the products; c) it incorporates the tools required to apply the HSI methods and to accomplish the HSI activities; d) it is focused on personnel readiness and effectiveness requirements; e) it addresses the development of a new system, a non-development item (NDI), or product improvement; and f) it provides a formal mechanism for getting HSI issues and concerns addressed early in system acquisition.
Downsizing the Department of Defense (DoD) means accomplishing more with fewer people. Enlightened design that considers all requirement and interaction issues simultaneously is the key to productivity. In the past, human issues have been difficult to quantify or depict during the systems engineering process. Recently, there has been an explosion of affordable HSI technologies. Despite the new DoD directives that require HSI analyses throughout acquisition, it is difficult to identify the most appropriate technology for HSI analyses. Defense acquisition managers, contractors, and the HSI research and development (R&D) community need a database of information about HSI tools, databases, and test facilities. They need this database to identify technology available in each of the Liveware domains of Manpower, Personnel, Training (MPT), Safety, Health Hazard Prevention, and Human Factors Engineering (HFE) and to fully integrate human considerations into the acquisition process. However, no comprehensive catalog of HSI technology exists. Under the sponsorship of the Office of the Assistant Secretary of Defense (Force Management and Personnel) HSI office and North Atlantic Treaty Organization (NATO) Research Study Group.21 (RSG.21), TPDC and CSERIAC are surveying the HSI community for a comprehensive database of HSI technologies, an ambitious effort requiring the help of all HSI technology developers, owners, and users. This paper reviews previous HSI-related technology studies. It supports the thesis that a comprehensive survey and database are needed to improve prioritization of HSI technology R&D; aid in HSI technology identification and use; and take full advantage of the new acquisition climate. It also describes the survey and the database, which is now being populated, and highlights the need for HSI community participation.
A study of criticality safety, commissioned by the U.S. Department of Energy, was conducted by Scientech Inc. at the Rocky Flats nuclear weapons facility. The study concluded that human performance is the driving factor in the risk of an inadvertent criticality incident at the Rocky Flats Plant (RFP). A study of the infractions which have occurred at this facility bears this point out. A human factors team was established to identify a means of reducing human error in everyday operations. The team determined that the posted instructions near each work area are key to operators having a clear understanding of operating requirements. An evaluation of the posted instructions revealed that they were very complex, required operators to monitor multiple parameters, and resulted in the operators' attention being divided between operational tasks and the task of monitoring nuclear safety parameters. Alternative graphics, textual, and graphics and textual formats combined with color coding were developed to improve comprehensibility, understandability, controllability, and usability in the Job Performance Aids (JPAs). Results of field tests of the different formats provide a clear indication that operators prefer short, concise textual statements summarizing important information over both other formats. Although operators indicated interest in the graphics formats, the magnitude of change in presentation techniques and the generalizability of the icons argued against their immediate use. Issues in the development of candidate JPAs and other usability requirements are discussed.
Although technological innovations have changed the role of operators from active participants to supervisors of semi-automatic processes, an understanding of the cognitive demands of supervisory control has not kept pace. In particular, little is known about when, and how well, operators might intervene and switch control from automatic to manual. This research addresses this issue by monitoring the information use and control actions of operators of a simulated semi-automatic pasteurization plant. The results of this experiment show that individual differences in operators' monitoring patterns during the normal operation of the plant correspond to differences in their ability to mitigate the effects of faults. Specifically, an operator who controls the plant well during both normal and fault conditions tends to observe the plant frequently, integrating control actions with other control actions, and does not fixate on narrow sub-systems of the plant. On the other hand, an operator who performs poorly when exposed to faults tends to observe the plant less often, fails to integrate control actions, and fixates attention on a narrow subset of plant variables. Although all operators interacted with the plant using the same interface and automation, the large individual differences in the operators' monitoring patterns and the associated differences in performance suggest that individuals' attitudes, motivation, and training may play a critical role in the successful implementation of automation.
In the first phase of this two-year project, workplace factors contributing to hand, arm and back injuries among employees at a large central public library were identified and prioritized. A central research committee was established consisting of Library Management, Union representatives and an Ergonomist. The next phase involved formation of four sub-committees to procure and prototype new equipment, develop new methods of working, evaluate the new equipment and methods, and make recommendations to the central research committee. The Ergonomist facilitated the process by helping committees remain systematic and objective in their approach and evaluations. In some cases more detailed analyses were conducted using computerized lifting models and electromyographic (EMG) analysis of muscle activity. Efforts resulted in recommendations for the current library facility and conceptual design guidelines for architects planning the new Central Public Library.
Often, manufacturing systems operate under the common perception that safety projects are generally detrimental to successful manufacturing operations. On the contrary, emphasizing safety on all levels of a corporation can not only create an overall more positive attitude in the workplace while reducing workers' compensation and insurance costs, but it can also improve the manufacturing capabilities of an organization. Within this paper, a model is presented which can show that safety-related ventures can be lucrative in measurable manufacturing terms. The model shows that accepting safety-related projects and approaching safety standards more positively will actually improve manufacturing strategy components such as quality, productivity, utilization, and reliability. It also includes redefining the mission statement of the corporation, developing a manufacturing strategy emphasizing safety, and implementing safety measures in all aspects of the corporation. Such evaluations in measurable terms will result in greater percentages of safety projects being accepted because relevant information will be communicated to managers in terms they can easily understand. Results indicate that there is a correlation between implementing safety projects and the improvement of manufacturing capabilities. Therefore, one can infer that safety, like quality, cost, and time, is a strategic tool to be used for improving manufacturing, and not merely a tangential issue.
A major problem in environmental restoration and waste management is the disposition of used fuel assemblies from the many light water reactors in the United States, which present a radiation hazard to those whose job is to dispose of them, with a similar threat to the general environment associated with long-term storage in fuel repositories around the country. Actinides resident in the fuel pins as a result of their use in reactor cores constitute a significant component of this hazard. Recently, the Department of Energy has initiated an Actinide Recycle Program to study the feasibility of using pyrochemical (molten salt) processes to recover actinides from the spent fuel assemblies of commercial reactors. This project concerns the application of robotics technology to the operation and maintenance functions of a plant whose objective is to recover actinides from spent fuel assemblies, and to dispose of the resulting hardware and chemical components from this process. Such a procedure involves a number of safety and human factors issues. The purpose of the project is to explore the use of robotics and artificial intelligence to facilitate accomplishment of the program goals while maintaining the safety of the humans doing the work and the integrity of the environment. This project will result in a graphic simulation on a Silicon Graphics workstation as a proof of principle demonstration of the feasibility of using robotics along with an intelligent operator interface. A major component of the operator-system interface is a hybrid artificial intelligence system developed at Oak Ridge National Laboratory, which combines artificial neural networks and an expert system into a hybrid, self-improving computer-based system interface.
A major factor in determining the success of any manned long duration space mission will be how well the human body can endure the microgravity environment. Data collected from long duration space missions conducted by both the United States and the former Soviet Union have shown that almost every system in the human body is adversely affected by microgravity. These adverse effects, taken individually or in concert, can have operational implications for a long duration space mission. Data collected to date indicate that significant human factors complications could arise due to the deconditioning of the musculoskeletal, cardiovascular and hematological systems that occurs in a microgravity environment, resulting in decrements in overall astronaut performance. This paper examines some of these deconditioning effects, their immediate operational implications and possible countermeasures.
The objective of this symposium is to discuss research dealing with the role of complexity in functions and tasks commonly allocated to the operators/users of systems. Systems of greater capability are being designed and the complexity of operation is increasing. Conceptual/experimental approaches to complexity are reviewed under the categories of procedural, cognitive, and conceptual complexity. Specific projects are reviewed on the effects of complexity on memory for procedures, information extraction for displays, and flight operations in the glass cockpit.
The objective of this paper is to review research dealing with the role of complexity in functions and tasks commonly allocated to the operators/users. This topic is complex and not well-structured. We have reviewed principal approaches to provide better structure for the psychological domain of complexity. The research reviewed is partitioned into three categories: procedural, cognitive, and conceptual complexity. What we were after in the review was to find quantifiable attributes of complexity in cognitive tasks and skills and how to use these attributes to manage complexity during system design.
This paper describes the relative effects of task complexity on the retention of a skill over prolonged periods of nonuse. The paper focuses on the decay of skills and knowledge of the 20,000 reservists called up for active duty during Operation Desert Storm. Reservists were tested upon reentry to determine the extent of skill decay since their release from active duty, which had occurred up to one year earlier. These data were analyzed with multiple regression and analysis of variance techniques. The major findings were: (a) procedural skills and knowledge about Army jobs decayed mostly within six months, but psychomotor skills (weapons qualifications) did not begin to decay until ten months; and (b) previous skill qualification score was the best predictor of skill decay, followed by aptitude score.
The ANETS model for representing and measuring the degree of the cognitive complexity of visual displays is described in general. Moreover, the results from several studies are described briefly that address the model's predictive power, the reliability of model usage, and the relationship to perceptual measures of display quality. Finally a model-based approach for interface design is discussed as possible and desirable.
Models of human performance which include concepts of task or procedural complexity have been used to evaluate the design of specific procedures which are dictated either by the airline or the flight environment (such as a specific airport). The procedures and environment as they currently exist can be modeled producing a profile across time of the output variable of the model. The variable that has been of most interest to us is pilot workload. One way in which we are using these modeling procedures is to compare a complex departure procedure with another departure procedure which is considered to be typical of most departures. Pilot workload profiles were obtained for the pilot-flying and the pilot-not-flying for each departure. A comparison was made of the profiles from the two departures and it was indicated that the more complex departure greatly increased the workload of the pilots, especially the pilot-flying. The complex departure procedure was analyzed looking particularly at the requirements that produced large peaks in pilot workload for either pilot, and recommendations are being made for changes to the procedure based on this analysis. The value of using such a modeling procedure in the airline environment will be discussed including other possible application areas.
A study was performed to assess pilot workload associated with the employment of an air-to-air weapon system integrated onto an attack helicopter. Mental workload was assessed using the Subjective Workload Assessment Technique (SWAT). Pilots performed simulated engagements against an airborne target under varying conditions of engagement type, time of day, target background, and target range. The results indicated significant differences in SWAT ratings as a function of time of day and engagement type. To a lesser degree, SWAT ratings were also sensitive to changes in target background and range. These results are consistent with laboratory and simulation studies which have shown SWAT to be sensitive to changes in task demand and further demonstrate the utility of SWAT for assessing operator workload in the less structured test and evaluation environment.
Human factors research of automobile driver behavior often calls for timing in-car manual tasks. The present study was designed to compare the accuracy, bias, and consistency of various techniques for measuring in-car manual task durations. Additionally, this research was intended to reveal how closely these techniques approach the preciseness of the frame-by-frame video analysis method, which is time-consuming and expensive to perform. Six subjects were required to use an electronic stopwatch to measure "hand-off-wheel" times for 30 driver tasks. Each subject performed this procedure three times: while sitting as an observer in the back seat of a research vehicle, while watching a real-time video recording of task performance, and while watching a one-sixth real-time video recording of task performance. Timing Method (three levels), Duration of in-car task (three levels), and Subject (six levels) served as independent variables. Dependent measures gathered were raw timing error (a measure of response bias), absolute timing error (a measure of response accuracy), and squared timing error (a measure of response consistency). Timing error was obtained by subtracting the measured time for a particular task from the "true" task time obtained by using the frame-by-frame video analysis technique. Analysis of the data indicated a significant effect of method on response bias. Specifically, use of the slow-motion video technique resulted in overestimation of in-car task durations, and use of the two real-time techniques resulted in estimates of task durations that were either equal to or less than the true durations. Significant effects of Subject, Gender, and Subject x Method were also revealed. The results suggest that the on-road timing technique should be used in the future, since this procedure requires little in terms of cost and implementation time, and errors are small when compared with the frame-by-frame technique.
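The three dependent measures named in the abstract above (raw, absolute, and squared timing error) can be illustrated with a short sketch. The sign convention follows the abstract: error is the "true" frame-by-frame time minus the stopwatch time. Averaging each measure over tasks, along with the function and key names, is an assumption made for illustration; the abstract does not say how the errors were aggregated.

```python
from typing import Dict, Sequence


def timing_error_summary(
    true_times: Sequence[float],
    measured_times: Sequence[float],
) -> Dict[str, float]:
    """Summarize stopwatch timing error against frame-by-frame reference times.

    All times are in seconds. Each returned value is a mean over tasks
    (the averaging step is an assumption, not taken from the abstract).
    """
    if len(true_times) != len(measured_times) or len(true_times) == 0:
        raise ValueError("need two equal-length, non-empty sequences of times")
    # Error = true time minus measured time, as stated in the abstract,
    # so a negative raw error means the task duration was overestimated.
    errors = [t - m for t, m in zip(true_times, measured_times)]
    n = len(errors)
    return {
        # Signed errors can cancel, so their mean reflects response bias.
        "mean_raw_error": sum(errors) / n,
        # Magnitudes cannot cancel, so their mean reflects response accuracy.
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        # Squaring penalizes large misses, which the abstract uses as a
        # measure of response consistency.
        "mean_squared_error": sum(e * e for e in errors) / n,
    }
```

Under this convention, the slow-motion video method described in the abstract would be expected to give a negative mean raw error, because it overestimated task durations.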
Although use of the mental model construct has proliferated in recent research, the construct lacks a clear definition and an agreed upon method of measurement. Furthermore, the reliability and validity of the different measurement techniques in use have not been established, thereby making generalizations across studies of mental models difficult. The purpose of the current project was to assess several methods of measuring mental models in terms of their reliability/stability over time. Subjects' mental models of the automobile engine system were elicited on two occasions separated by one week, using seven different knowledge elicitation techniques. Subjects' level of experience was also measured to allow comparisons between experts and novices. The results indicate that each of the measurement techniques tended to be reliable for both experts and novices. However, reliability tended to be greater for experts than novices. Additionally, experts tended to agree with each other more than did the novices. Some evidence also indicated that the results from the similarity ratings and subsequent Pathfinder analysis converged with those from the structured interviews.
This experiment, conducted at the OECD Halden Reactor Project, Halden, Norway in the spring of 1991, aimed to assess the effect on nuclear power plant operators' diagnostic behaviour when using a rule-based diagnostic expert system. The rule-based expert system used in the experiment is called DISKET (Diagnosis System Using Knowledge Engineering Technique) and was originally developed by the Japan Atomic Energy Research Institute (JAERI). The experiment was performed in the Halden man-machine laboratory using a full scope pressurized water reactor simulator. Operator performance in terms of quality of diagnosis is improved by the use of DISKET. The use of the DISKET system also influences operators' problem solving behaviour. The main difference between the two experimental conditions is that DISKET users follow a direct and narrow strategy during the diagnosis process, whereas non-DISKET users conduct a much broader and less focused search when trying to diagnose a disturbance.
The purpose of the study was to compare a team usability testing paradigm with that of the typical single user paradigm in terms of the quantity and quality of the user's verbalization (i.e. thinking out-loud) and performance. The study employed a three group design in which the type of usability paradigm (Single, Observer, Team) was manipulated. Users first learned to use an off-the-shelf database management package by means of a short tutorial and then engaged in six structured tasks. While engaging in the tasks, the users either thought-out-loud alone (Single condition), in the presence of an observer (Observer condition), or as participants of a team working on the tasks together (Team condition). Results indicated that there were no significant differences among the three conditions in terms of performance nor any extensive differences in their subjective evaluation of the software. However, users in the Team condition spent more total time verbalizing than those in the Single or Observer condition. More importantly, results of a verbal protocol analysis revealed that the Team spent more time making statements which had high value for designers than did the other two conditions (which did not differ from one another). When broken out by individual users in the Team, there were no significant differences between individual team members and users in the other two conditions in making high value comments. The results suggest that the Team paradigm may be more efficient in extracting high value information without any noticeable differences in performance or subjective impression of the software.
The need to scan and integrate sources of information in multifunction displays (MFDs) forces consideration of the relationships between the screens in the database. This paper develops a spatial metaphor that can be used to explore these underlying relationships. Three metrics of distance -- navigational (the number of choice points lying between screens), organizational (the structure of the hierarchy), and cognitive (the user's mental representation of information relatedness) -- were identified and empirically examined by using a simulated, hierarchically arranged, menu-driven MFD in an aviation context.
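Of the three distance metrics in the abstract above, navigational distance is the easiest to make concrete. The sketch below counts the menu steps on the path between two screens in a hierarchy, going up to their nearest shared menu and back down. Treating each step along that path as one choice point is an assumption, as are the function names and the screen names used in the usage note; the abstract does not give the exact counting rule or the menu structure.

```python
from typing import Dict, List, Optional


def path_to_root(screen: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the screens from `screen` up to the top-level menu, inclusive."""
    path = [screen]
    current = parent[screen]
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def navigational_distance(
    a: str,
    b: str,
    parent: Dict[str, Optional[str]],
) -> int:
    """Count menu steps between screens `a` and `b` in a hierarchical MFD.

    `parent` maps each screen to the menu screen above it (None for the
    top-level menu). The path climbs from `a` to the nearest menu shared
    with `b`, then descends to `b`.
    """
    steps_up_from_a = {s: i for i, s in enumerate(path_to_root(a, parent))}
    for steps_up_from_b, s in enumerate(path_to_root(b, parent)):
        if s in steps_up_from_a:
            # First shared ancestor found while climbing from b.
            return steps_up_from_a[s] + steps_up_from_b
    raise ValueError("screens do not belong to the same hierarchy")
```

For a hypothetical hierarchy in which `MAIN` leads to `NAV` and `ENGINE`, `NAV` leads to `MAP` and `WAYPOINTS`, and `ENGINE` leads to `FUEL`, sibling screens such as `MAP` and `WAYPOINTS` come out two steps apart, while `MAP` and `FUEL`, which share only the top-level menu, come out four steps apart.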
Although usability tests are typically conducted with a purpose of making products less stressful for people to use, the usability testing process itself can be stressful for many test participants. The combination of trying to use a new product, being videotaped, and being watched by others, is a potentially stressful environment for many people. Although the Subject Precautions section of the Human Factors Society Code of Ethics clearly states that "the exposure of human or animal research subjects to ... stress" should be "commensurate with the significance of the problem being researched," the Code of Ethics does not provide guidance for reducing exposure of human subjects to stress. This paper describes several practical extensions to the Subject Precautions that can help reduce stress associated with participating as a subject in a usability test. The recommendations in this paper are based on anecdotal evidence gathered in numerous usability tests conducted in both laboratory and field settings. Recommendations are included for preparing the test environment, recruiting test participants, and interacting with participants during testing.
The purpose of the present study was to evaluate the effect of evaluator intervention, task structure, and user experience on the users' subjective evaluation of software usability. The study employed a 2 X 2 X 2 factorial between-subjects design with two levels of Evaluator Intervention (Intervention vs. Non-Intervention), two levels of Task Structure (Guided-Exploration [free-form] vs. Standard Laboratory), and two levels of User Experience (Novice vs. Experienced). The users were asked to learn to use and then subjectively evaluate a restricted subset of 12 common word processing features over four hours of participation. Day 1 was a training day and Day 2 was a test day. The major finding was that the users' subjective impression of the software was affected by both user Experience and evaluator Intervention. For difficult-to-use word processing features, experienced users rated the features as more difficult to use under the intervention than under the non-intervention condition. For novice users, this difference was in the opposite direction but not significant. The same pattern of results was obtained for the subjective rating of ease of learning, overall evaluation of the software, and confidence in ability to use the software. These results were interpreted within the context of attribution theory. The effect of structure, although not as prevalent, interacted with user experience in the evaluation of screen features and system capabilities. The relative lack of task structure effects was attributed to the difficulty in implementing free-form learning and the number of problems encountered in use of the software under Guided Exploration, which counteracted any of its benefits.
Concurrent verbal protocols are gaining wide acceptance in software usability testing. In this study, the impact concurrent verbalization has on task performance during a software usability test was investigated. Subjects randomly assigned to two levels of verbalization were asked to complete four tasks of varying difficulty using a disk utility package. Subjects in the verbalization condition were asked to provide an explanation for each step taken to complete a task. Subjects in the control condition were allowed to complete each task silently. Dependent variables were task time, error frequency, and responses to subjective measures of mental workload and ease-of-use. Subjects in the verbalization condition committed fewer errors and consumed less task time than subjects in the silent condition. Further, the mean difference in error frequency and task time between conditions increased with task difficulty. These results were extremely important in revealing a potential method bias in usability tests.
Humans are the "raison d'etre" for human factors, yet what do we really know about the characteristics of those who serve as our subjects and on whom our science is built? What do we need to know? Most authors gloss over the topic briefly and tersely describe the subjects as "10 male and 10 female college age students." The articles then move on to what many consider to be the real action: the experimental design, test procedures, and statistical analysis. A conclusion is reached and generalized to the population. When is this appropriate/inappropriate? What population do the subjects (Ss) represent? What are the characteristics of our current Ss? Are subject differences even relevant? What, if anything, can be gained by examining subject by condition interactions? What techniques do we have which will allow us to go beyond performance data and examine the subjects' cognitive processes? What changes can we expect to see in the worker/user population which should influence our subject selection strategies? The four papers presented in this symposium will address these issues, provide some answers, and certainly raise some questions.
Humans are the "raison d'etre" for human factors, yet what do we really know about the characteristics of those who serve as our subjects and on whom our science is built? What do we need to know? This article addresses issues related to subject selection and data reporting, provides some recommendations, and hopefully raises some questions. Our subjects, as volunteers, differ from the population they are drawn from in many ways; specifically, individuals volunteer as a function of the type of experiment. Volunteers have a greater sense of personal responsibility than non-volunteers. They tend to be higher in the need for social approval, and more social than non-volunteers. Volunteers for experiments involving risk taking score significantly higher in risk taking/arousal seeking, and are less anxious than non-volunteers. They are also less authoritarian and less conforming than non-volunteers. Finally, volunteers tend to be better educated than non-volunteers. Articles that reported findings on human subjects and were published in Human Factors (N=84) or Ergonomics (N=64) between August 1989 and April 1991 were reviewed. Forty percent of the Human Factors articles did not provide sufficient detail for the reader to determine if the subjects were fulfilling a course requirement, paid or unpaid. Our literature seems to be based on individuals between 18 and 30 years of age. In the reviewed issues of Human Factors, among those articles reporting data derived from subjects, the gender of 42 percent of the subjects could not be determined from the article. In 27 percent of the Human Factors articles, demographic data were not reported. It is recommended that authors provide additional information on the characteristics of their subjects, so that researchers and practitioners alike can develop an informed opinion about the applicability/limitations of the findings.
As a minimum, details on age (mean, SD, median and range), gender, and specific demographics should be reported.
The opportunities presented by subjects-condition (SxC) interactions are discussed after an introduction to their nature. Operator Strategy Differences (SDs), Scale-of-Measurement Effects (SOMs), and Condition Requirement Differences (CRDs) are each seen as potential sources of SxC interactions. It is shown that SxC interactions can (1) frequently be detected using an analysis of "error" variances approach, (2) be characterized in terms of their nature, and (3) enhance the utility of research results (once characterized). It is recommended that subjects-condition (SxC) interactions be routinely evaluated in human factors research.
Failure to debrief test subjects (Ss) is dangerous, because Ss may be responding to the measurement situation in a highly idiosyncratic way which could produce corrupted results. Much of S's behavior, particularly in advanced problem-solving systems, is covert and so must be reported directly by S. Since verbalization during measurement is inadmissible, a method is proposed of debriefing S following the test.
The already diverse workforce in America is expected to diversify at an even greater rate over the next decade. Projected workforce changes include those of age, gender, and race. The recently passed Americans with Disabilities Act also ensures that a growing number of persons with diverse physical needs will enter the workforce. Data from Moroney and Reising (1992) provide some clear indications of the types of subjects currently used in human factors experiments. Not surprisingly, these subjects represent a range of persons that is much narrower than the range represented in the current and projected workforce. If not corrected, the differences between human factors subjects and the American workforce will increase at a magnified rate. To ensure that the results produced from human factors experiments are useful and valid, researchers should first analyze the diverse characteristics of their intended users and select subjects who possess these characteristics.
There is an emerging effort in the automotive industry to explore a new horizon of quality, e.g. sensory comfort, beyond the traditional measures of reliability and durability. This is evidenced by the birth of a new engineering field, referred to here as sensory engineering. The objective is to evaluate and characterize human feelings and incorporate these findings into the engineering and design of the product. In this paper, engine sounds from various passenger vehicles were examined using this approach. Five sound samples under wide-open-throttle acceleration conditions and five under constant speed conditions were evaluated using the semantic differential method. Results showed that subjects' perception of these sounds can be very well characterized in a semantic space made of three factor axes. Significant differences in mean factor scores appeared along the 'smooth, reliable, & desirable', 'loud/whining', and 'special & modern' axes. These results can be used to refine the engine design to achieve a better acoustic quality of the engine.
Future space vehicles such as the Space Station Freedom will be equipped with computers that have direct manipulation capabilities. The human factors challenge is to provide an optimal human-systems interface which will accommodate a wide range of users and tasks in a microgravity environment. A series of experiments have been conducted by the Man-Systems Division at Johnson Space Center to resolve anthropometric issues related to human reach capabilities and limitations impacting workstation design. To facilitate this goal, two approaches, "Performance-based" and "Model-based" analyses, were integrated to investigate the human reach mapped onto the workstation display panels. Microgravity maximum reach sweep data were collected onboard NASA's KC-135 Reduced Gravity Aircraft. A three-dimensional (3-D) interactive graphics system, PLAID, was used to generate anthropometrically correct human computer models. Video tapes recorded during the flights were used to extract information for positioning each human representation in the computer model relative to the workstation. The approach, findings and implications of the evaluations are discussed in the paper.
This paper discusses the results of the first phase of a research project concerned with developing methods and measures of user-system interface effectiveness for command and control systems with graphical, direct manipulation style interfaces. Due to the increased use of prototyping user interfaces during concept definition and demonstration/validation phases, the opportunity exists for human factors engineers to apply evaluation methodologies early enough in the life cycle to make an impact on system design. Understanding and improving user-system interface (USI) evaluation techniques is critical to this process. In 1986, Norman proposed a descriptive "stages of user activity" model of human-computer interaction. Hutchins, Hollan, and Norman (1986) proposed concepts of measures based on the model which would assess the directness of the engagements between the user and the interface at each stage of the model. This first phase of our research program involved applying three USI evaluation techniques to a single interface, and assessing which, if any, provided information on the directness of engagement at each stage of Norman's model. We also classified the problem types identified according to the Smith and Mosier (1986) functional areas. The three techniques used were cognitive walkthrough, heuristic evaluation, and guidelines. It was found that the cognitive walkthrough method applied almost exclusively to the action specification stage. The guidelines were applicable to more of the stages evaluated but all the techniques were weak in measuring semantic distance and all of the stages on the evaluation side of the HCI activity cycle. Improvements to existing or new techniques are required for evaluating the directness of engagement for graphical, direct manipulation style interfaces.
Usability evaluators used an 18-item, post-study questionnaire in three related usability tests. I conducted an exploratory factor analysis to investigate statistical justification to combine items into subscales. The factor analysis indicated that three factors accounted for 87 percent of the total variance. Coefficient alpha analyses showed that the reliability of the overall summative scale was .97, and ranged from .91 to .96 for the three subscales. In the sensitivity analyses, the overall scale and all three subscales detected significant differences among the user groups; and one subscale indicated a significant system effect. Correlation analyses support the validity of the scales. The overall scale correlated highly with the sum of the After-Scenario Questionnaire ratings that participants gave after each scenario. The overall scale also correlated moderately with the percentage of successful scenario completion. These results are consistent with the hypothesis that these alternative measurements tap into a common underlying construct. This construct is probably usability, based on the content of the questionnaire items and the measurement context.
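The reliability figures above are coefficient alpha values. As a minimal sketch of how such a value is computed, the function below applies the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), to a small table of ratings. The function name `coefficient_alpha` and the sample ratings are hypothetical and chosen only for illustration; they are not data from the study.

```python
from statistics import variance


def coefficient_alpha(ratings):
    """Coefficient (Cronbach's) alpha for a list of respondents.

    ratings: list of rows, one per respondent, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance(column) for column in zip(*ratings)]
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 4 respondents x 3 items (not data from the paper).
sample = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = coefficient_alpha(sample)
print(round(alpha, 3))
```

When items rise and fall together across respondents, the variance of the total score is large relative to the summed item variances and alpha approaches 1.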
A study was conducted to assess the capabilities and limitations of the DataGlove, a lightweight glove input device that can output signals in real time based on hand shape, orientation, and movement. The DataGlove was used as an input device to control the Proto-Flight Manipulator Arm (PFMA), a large telerobotic arm with an 8-foot reach. Twelve volunteers (six males and six females) participated in a 2x3(x2) full-factorial experiment in a simple retraction, slewing, and insertion task. Two within-subjects variables, time delay (0, 1, and 2 seconds) and PFMA wrist flexibility (rigid/flexible), were manipulated. Gender served as a blocking variable. Retraction, insertion, and slew times, as well as total task time, were collected as the dependent variables. An analysis of variance found a main effect of time delay for slewing and total task times. A post hoc Newman-Keuls pairwise comparison of the means was performed for the significant effects. Slew times with no time delay were significantly faster than slew times with either 1- or 2-second time delays. Total task time with no time delay was significantly faster than total task time with a 2-second time delay. PFMA wrist flexibility had no significant main effect on the ability of the subject to accurately and effectively operate the PFMA with the DataGlove. It was concluded that the DataGlove is a legitimate teleoperations input device that provides a natural, intuitive user interface and should be considered in future trades in teleoperation system designs.
This paper examines the application of order-of-processing networks to the simulation of performance of a complex skill, the copying of high-speed Morse code. A sequence of processing stages and memory buffers is described that is presumed, on the basis of earlier work, to represent the task. Two models of this sequence, distinguished by their assumptions regarding concurrent processing of characters, are also presented. Simulations were run on these models to find the parameters that yielded the best fit to performance data from 19 students undergoing the early stages of military Morse code training. The implications of the results for an analysis of early performance and the potential benefits of applying the same technique to data obtained from students late in training are discussed.
This research has developed a theoretically-based cognitive model and design framework for Integrated Decision Aiding/Training Embedded Systems (IDATES). Based on a review of empirical studies of novice-expert differences and of theoretical and computational models of skill acquisition, we defined a three-stage cognitive hierarchy model as the basis for our IDATES framework. The levels of novice, intermediate, and expert are discrete stages which differ along two primary dimensions: problem representation and problem-solving procedure. Both decision aiding and training must be targeted to the problem representation and cognitive processes of the user/trainee. Thus, there must be three levels of decision aiding targeted to novice, intermediate, and expert decision makers. Furthermore, there are two types of training: incremental training to improve performance within each of the three expertise levels, and representational training to elicit a jump to the next higher level of problem representation. Two implications arise from the IDATES cognitive model. First, integrated cognitive/behavioral task analyses are able to drive both the embedded training requirements and the decision aiding requirements, although the three skill levels must be separately addressed. Second, a single integrated architecture can underlie all the decision aiding and embedded training components of a given IDATES application.
Cognitive task analysis and Computer Science have revolutionized training technology with intelligent tutoring systems (Wenger, 1987). However, some key assumptions determine the success of such systems: 1) Student knowledge is rule-based, so that performance may be evaluated according to the presence or absence of rules and 2) The computer and the student have access to the same information about the problem context. In the instructional task domain we are addressing -- typeface selection -- neither assumption is appropriate. Each selection emerges as an interaction with the parameters and contingencies of the particular problem (Suchman, 1987). Furthermore, an important property of typeface is its evocative or emotional power -- a property that cannot be represented adequately in a computer. Our objective is to develop a satisfactory compromise using computer-aided instruction, specifically for the domain of typeface selection. Following Clancey (1983) and Winograd & Flores (1986), we recognize that some of the knowledge we seek to train will simply not "be in" the computer. However, it may be reflected in the design and organization of training exercises, which set up a sensitivity to the important dimensions of the problem. We take advantage of the computer medium to demonstrate typeface and layout transformations of preprogrammed text examples, as well as text examples entered on-line by the student. In addition, a hypertext style menuing system allows the student to access any part of the system from any point. In this paper we provide a description of the system we have built for training in the domain of typeface selection, and discuss the relevance of this system for two applications concerns in human factors: 1) The design of messages for public display and 2) The training of context sensitive skill.
When managing complex systems, cognitive demands or problem-solving situations can appear in different ways. In some situations, problems surface gradually while being recognised, identified and treated. This category of situations has been labelled as 'going sour' incidents. Within these incidents, there are a number of interesting and unique features warranting special attention. The present research project attempts to depict the task demands associated with going sour incidents. After initial analysis of complexity and some field observations, initial hypotheses were generated. Subsequent field study has provided support for the hypotheses. Major findings on going sour incidents include (1) problem-solving spans a long period of time and requires synthesis of information over this period; (2) trouble spots have to be checked repeatedly as the environment is likely to change over time; (3) interventions are required before obtaining sufficient number of signs; and (4) multiple hypotheses must be maintained and examined as the underlying problem changes appearance slowly from one form to another.
In recent years, there has been an increase in the number and power of calls for a systematic development of new principles for the design of training simulators (e.g. Baudhuin, 1987; Donchin, 1989; Gopher, Weil, Bareket and Caspi, 1988; Lintern, 1991). Such principles may replace the long prevailing physical fidelity approach, which has been enshrined by its compelling appeal to Folk Psychology (Flexman and Stark, 1987). The guiding principle of physical fidelity is that the closer the resemblance between a training simulator and the real system, the better it is as a training device. An alternative approach considered in this panel draws on contemporary concepts and models in human performance and learning theory. Departure from the physical fidelity principle is called upon by the reality of modern technology as much as it is motivated by enhanced scientific knowledge and improved methodology. With the rapid advance of technology, the constraints and limitations of the physical fidelity approach become clearer and more prohibitive. On the one hand, the increased sophistication of engineering systems, their much enhanced performance envelope and the extreme operational environments (e.g. air, space, underwater, nuclear), preclude on the job training. On the other hand, development of high fidelity simulation becomes either impossible or a difficult and costly undertaking. Consequently, the vast majority of existing training simulators represent a compromise. The extent of the compromise and its impact on the value of training and transfer are difficult to assess. Modern microprocessor technology and the development of rich, colorful and challenging computer game environments, provide powerful tools with which the foundations of a new approach can be studied and tested. 
Indeed, this was the rationale that has guided an international research collaboration directed towards the development of training strategies embedded in a complex computer game named Space Fortress (Donchin, Fabiani & Sanders, 1989). The three studies reported in the panel, are an outgrowth of this work. All three employed a modified version of the Space Fortress game (SF-II) which was developed at the Human Engineering Laboratory of the Technion - Israel.
A study was conducted at the flight school of the Israeli Air Force to test the transfer of skills from a complex computer game to flight. The context relevance of the game to flight was argued on the basis of a skill-oriented task analysis, anchored in contemporary models of the human processing system. The influence of two embedded training strategies was compared, one focusing on the specific skills involved in performing the game, the other designed to improve the general ability of trainees to cope with the high attention load of the flight task. Flight scores of two groups of cadets who received 10 hours of training in the computer game were compared with those of a matched group of cadets without game experience. Both game groups performed significantly better than the no-game group in the subsequent test flights. They also had a higher final percentage of graduation from the flight training program. The game has now been incorporated in the regular training program of the air force.
A field study was conducted at the US Army Aviation Center to determine whether workload-coping and attention-management skills developed through structured video game experience would generalize to flight training. Three groups of 24 trainees were compared: (1) the first received 10 hours of training on an IBM-PC version of Space Fortress, replicating an earlier study; (2) the second played a commercial video game (Apache Strike) for 10 hours, which also required tracking, monitoring, situation assessment, and memory; (3) the third, a matched group, received no game training. Flight school records were monitored during the next 18 months to compare performance of the three groups during initial flight training. Check ride ratings began to show an advantage for the group trained with Space Fortress by the Instrument stage of training, as predicted. Furthermore, attrition rates were lower for this group, replicating the results of an earlier study conducted by Gopher (1990) in the Israeli Air Force Flight School.
We are utilizing Space Fortress in a basic research program that is designed to integrate cognitive and social learning theory in the development of group protocols for training complex skills. We present evidence that groups of 2, 3 and 4 can learn Space Fortress as well as individual trainees while using 1/2, 1/3 and 1/4 the trainer time and resources, respectively. We also present preliminary empirical steps towards individualizing training within groups according to individual differences in selective attention. We discuss implications for developing automated instruction that is designed for small groups rather than for individuals.
This experiment investigated the workload associated with both a consistently mapped (CM) and variably mapped (VM) version of a memory/visual search task that required the processing of spatial pattern information representative of that found with some Command and Control (C²) systems. A secondary loading task paradigm which required concurrent performance of an additional spatial pattern search task was employed. The results demonstrated superior dual-task performance relative to single-task baselines on both the primary and secondary tasks when the CM version of the task was performed. The results indicate that the development of automatic processing through training can reduce the workload associated with processing spatial patterns of the type employed by C² operators.
The theoretical and practical importance of search paradigms has been well established. This experiment was designed to extend understanding of learning processes in search tasks. Subjects trained under memory, visual, or hybrid memory/visual search conditions and then either transferred to a different search condition (e.g., train on memory, transfer to visual search) or served as controls (e.g., train on memory, transfer to memory search). Asymmetrical transfer was observed. These results have implications for current theories of attention as well as applicability in training situations.
Analyses are continuing in the Man-Systems Division of NASA Johnson Space Center, on the restructured Space Station Freedom (SSF). Viewing requirements for the SSF indicate that assembly and extravehicular crew operations should be viewed by direct means whenever possible. To analyze the extent to which the Cupola meets this requirement, positions on the port side of Node 2 and the zenith side of Node 1 were evaluated. These analyses utilized the tasks from Mission Build (MB) 6 through 16 to investigate these two Node positions. The analyses conducted were based on a 4-position rating scale (Excellent, Good, Marginal, and Inadequate) which solicited data from both expert crewmembers and specialists in the field of robotic assembly operations. To remedy the potential direct viewing problems identified through this investigation, it was recommended that additional camera ports be placed along the truss and on the modules to provide indirect orthogonal viewing for berthing operations and pressurized module attachment. It was also recommended that additional Node positions be investigated to determine the optimal location for the Cupola based on crew considerations, direct viewing requirements, lighting issues, and operational tasking issues.
The present experiment investigated the effect of varying the degree of task consistency on the performance and maintenance of skill in a semantic category visual search task. It is well established that for a wide variety of tasks, skill development is a function of the degree of task consistency. However, the effect of inconsistency on established skills has not been investigated to date. The present experiment included a consistent Training Phase, an Adjusted Consistency Phase, and a Retraining Phase. Subjects were trained for 6,000 Consistently Mapped (CM) trials on two different categories. Subjects then performed 4,000 trials in which one of the previously trained categories remained 100% consistent, while the other category became either 100, 67, 50, or 33% consistent. Task consistency was then restored and participants performed another 4,200 CM trials. The Retraining Phase included a New CM category. Results indicated that performance was disrupted by inconsistency, and that disruption increased as consistency decreased. Upon the return of task consistency, performance improved rapidly, although some performance disruption was still evident. The results are discussed in terms of visual search theories, and for their relation to training design.
This experiment examined the effects of three methods of presentation, one massed and two distributed, on recognition of complex visual stimuli (military aircraft). Also examined was whether the effects of these methods differ as a function of the view at test (same or different from the studied view). In the massed presentation, aircraft were exposed once for eight seconds with each exposure separated by a blank interval of 20 seconds. In the successive distributed condition, each target aircraft was presented four times in a row for two seconds with each exposure separated by blank intervals of five seconds. In the random distributed condition, the aircraft were presented for the same on-off time intervals as the successive distributed condition, but the sequence of the study list was random. Results showed that recognition performance, as assessed by measures of hits, false alarms, and discrimination accuracy, was significantly better when the same view was given at study and at test versus a different view. While presentation method did not produce an effect by itself, it did interact with test view. With a different view at test, distributed presentation showed a small, but significant, improvement in recognition performance compared to massed presentation. These results are discussed with regard to the high likelihood that most real-world visual stimuli are seen in different views at subsequent exposures. Distributed presentation may be a useful way to prepare individuals for a different view at a later time.
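The timing figures above imply that the massed and distributed conditions equate total exposure and total blank time per aircraft. The short sketch below checks that arithmetic. It assumes one blank interval accompanies each exposure, which the abstract does not state explicitly; the class name `Schedule` and its fields are hypothetical names for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """Per-aircraft study schedule (assumes one blank interval per exposure)."""
    exposures: int
    on_seconds: float   # duration of each exposure
    off_seconds: float  # duration of each blank interval

    def total_on(self):
        return self.exposures * self.on_seconds

    def total_off(self):
        return self.exposures * self.off_seconds


# Values taken from the abstract.
MASSED = Schedule(exposures=1, on_seconds=8, off_seconds=20)
DISTRIBUTED = Schedule(exposures=4, on_seconds=2, off_seconds=5)

for name, schedule in (("massed", MASSED), ("distributed", DISTRIBUTED)):
    print(name, schedule.total_on(), schedule.total_off())
```

Under this assumption, both schedules give 8 seconds of exposure and 20 seconds of blank time per aircraft, so the conditions differ in how study time is spaced rather than in how much there is.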
The Federal Aviation Administration has embarked on a major curriculum redesign effort to improve the training of en route air traffic controllers. Included in this effort was a cognitive task analysis. One component of the task analysis was an analysis of operational errors, to obtain insights into cognitive-perceptual factors contributing to controller decisionmaking error. The data suggest that a failure to maintain situation awareness is the primary cause of controller error. These results highlight the importance of the controller task "maintain situation awareness", and are consistent with the findings of the other analyses. An approach for training situation awareness skills is presented in relation to models of expertise developed from other analyses: an expert mental model of air traffic control, and a task decomposition listing thirteen primary controller tasks. The findings and training paradigm have implications for training other complex high-performance tasks performed in a real-time, multi-tasking environment.
The Federal Aviation Administration has embarked on a major curriculum redesign effort to improve the training efficiency of en route air traffic controllers. Included in this effort was a comprehensive cognitive task analysis conducted in several phases, spanning several years. Eight different types of data collection and analysis procedures were used, resulting in an integrated model of controller expertise. This paper provides a description of controller expertise, and describes the training program under development. This is one of the first examples of cognitive task analysis being applied to study expertise in complex cognitive tasks performed in time-constrained, multi-tasking environments.
The purpose of this effort was to model expert pilot performance and decision making in one-versus-one (1v1) air-to-air combat. Several knowledge-elicitation techniques were used to extract air combat expertise from a former fighter pilot, who served as the subject-matter-expert (SME). Unstructured and then structured interviews were used to elicit the goals and sub-goals of air-to-air combat, plus some of the pilot behaviors necessary to accomplish the goals. The SME also flew a number of combat sorties against another former fighter pilot in the Simulator for Air-to-Air Combat (SAAC) to demonstrate pilot performance required to accomplish the goals of air combat. Based on the SME's verbal protocols, a group of air combat rules were developed. A rule-based production system was then designed to incorporate the resulting knowledge base. The production system was also designed to be capable of analyzing an existing data base of air combat engagements. Expert system development required additional input from the SME to identify specific values of flight parameters required by the production system. Upon completion and SME verification of the expert model, it will be validated by comparing its performance to that of our SME in simulated air-to-air combat. If the model can successfully describe expert pilot performance, the model will be used to provide diagnostic performance feedback in conjunction with SAAC training.
Experienced subjects participated in four consecutive experiments in which they performed a simulated low-level flight task. The study spanned several months, and various motivational techniques were employed with each experiment. Since the task involved low-level flight, accurate altitude control was desirable, and crash rates were of major concern. Based on both verbal and written subject debriefings, it was concluded that (1) providing lists of top scores promoted competition and motivated the subjects to improve their altitude control performance, (2) penalizing scores and negative reinforcement in the form of posted crash lists were effective in reducing crash rates, and (3) monetary awards were a minor source of motivation but were not considered a primary incentive to the subjects.
The objective of this symposium is to examine the measurement of teamwork skills (i.e., team process), and the impact of these measurement capabilities on team training. These skills are one of the most difficult components of team performance to both measure and train, because they are not readily quantifiable like team inputs and outputs. Therefore, the papers included in this symposium examine the measurement of team process from the standpoint of theories, methodologies, applications, and psychometric properties.
What if we took seriously the fact that team performance is not synonymous with individual performance? Although teams appear to be the new workhorses of economic and social goal accomplishment, the processes by which they accomplish their goals remain relatively unexplicated and not well understood. In this paper, we argue that coordination is an important unifying construct for defining, measuring, researching, and training effective team performance.
The majority of aviation incidents and accidents are attributable to human error (Billings & Reynard, 1984). Most of these human errors involve the ineffective use of team process factors, which are often referred to as Crew Resource Management (CRM) skills in the commercial aviation literature (Helmreich & Foushee, in press). In addition to these applied concerns, a revised version of McGrath's (1964) theory of group performance (Foushee & Helmreich, 1988) suggests that one must analyze the process (i.e., team process) by which a group's inputs (e.g., personality, attitudes) are transformed into group outcomes (e.g., task performance, mission safety) in order to understand how a task-oriented group functions. Therefore, team process attracts theoretical as well as practical interest. The NASA/UT/FAA Line/LOS Checklist (LLC; Helmreich, Wilhelm, Kello, Taggart, & Butler, 1991) is one measure of team process that has proven useful in assessing CRM skills in training and in actual line operations. This paper briefly reviews concepts in team process and summarizes the LLC research findings pertaining to the use of CRM skills in commercial aviation.
The purpose of this research was to establish the construct validity of a behaviorally anchored rating scale developed to measure team process behaviors. This scale contains six skills (i.e., leadership, assertiveness, decision making/mission analysis, situation awareness, communication, adaptability/flexibility) that were identified through a prior needs analysis with training specialists and subject matter experts. Student and instructor pilots (104 individuals, 51 teams) participated in two team tasks (simulated aviation tasks) which were designed to elicit the team process behaviors identified for the rating scale, and were rated on their behaviors. A multitrait-multimethod analysis (Campbell and Fiske, 1959) was conducted on the resulting ratings. Evidence of convergent and discriminant validity, as well as some method bias, was found when the method investigated was team task. Implications for the use of the team process scale in training are discussed.
The evaluation of team training within the Navy often relies on instructor assessments of human performance. Often, assessments are subjectively derived and may, therefore, contain biases. Consequently, a method for objectively measuring Navy team performance was developed in an attempt to supplement commonly found subjective assessments. The technique is based upon collecting observable indicators of effective and ineffective behaviors across several critical functions of the anti-air warfare (AAW) team. The effective and ineffective indicators are mathematically combined to form a performance index ranging from 0.00 (low) to 1.00 (high) to reflect the team's overall performance level. The AAW Team Performance Index (ATPI) provides a systematic, consistent, and objective measurement approach linked to specific exercise events. The development and use of the ATPI are described.
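The abstract does not state how the effective and ineffective indicators are combined. As a minimal sketch only, assuming the index is the share of observed indicators that were effective, a 0.00 to 1.00 index could be computed as follows. The function name `performance_index` and the formula are illustrative assumptions, not the ATPI itself.

```python
def performance_index(effective: int, ineffective: int) -> float:
    """Hypothetical index: fraction of observed indicators that were effective.

    This is an assumed combination rule for illustration; the actual ATPI
    formula is not given in the abstract.
    """
    if effective < 0 or ineffective < 0:
        raise ValueError("indicator counts must be non-negative")
    total = effective + ineffective
    if total == 0:
        raise ValueError("at least one indicator must be observed")
    # Round to two decimals to match the 0.00 (low) to 1.00 (high) reporting range.
    return round(effective / total, 2)


if __name__ == "__main__":
    # Hypothetical exercise event: 3 effective and 1 ineffective behaviors observed.
    print(performance_index(3, 1))
```

Under this assumption, a team showing only ineffective behaviors scores 0.00 and a team showing only effective behaviors scores 1.00.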
Manned spaceflight missions result in human exposure to reduced gravity environments, during which the human body undergoes some pronounced physiological changes. Exercise has been identified as a practical and operationally acceptable countermeasure to the physiological responses to "zero-gravity". At the National Aeronautics and Space Administration's Johnson Space Center, a new treadmill is under development for use on Shuttle flights. One of the main challenges of this project is the development of an effective restraint system. The restraint system must place a body weight load on the subject while the subject exercises in zero-gravity. Additionally, the restraint system must allow the subject to exercise in zero-gravity at various percent grades (treadmill slopes). This paper discusses the restraint system of a prototype treadmill and zero-gravity test results. The results indicate the manually operated, prototype restraint system has some limitations and that a real-time feedback system utilizing a servo operated adjustment mechanism would significantly enhance performance.
This panel discussion will examine the proposition that the field of human factors has technology relevant to two national problems: work force competitiveness and education. Specific examples of relevant technology will be presented and discussed.
A fundamental purpose of a display format is to allow the human operator to construct and maintain an accurate representation of reality. In order for display designers to know how to portray spatial information, they must understand how humans represent and use spatial relationships. The purpose of this study was to determine the effective use of four different types of spatial display formats in the performance of a spatial discrimination task. Forty subjects initially viewed a display portraying simulated radar returns representing the relative position of two other aircraft (in formation), and then chose which of two spatial alternatives portrayed the true spatial relationship viewed previously. Results showed that subjects' ability to discriminate between the spatial alternatives was affected by the type of display format used and by the degree of distortion of the true spatial relationships. The results are interpreted in terms of the resolution of one's mental representation of spatial relationships.
Air Force pilots and control subjects were tested on a visual "mental rotation" task. Nine of the 16 pilots, as well as all of the 16 control subjects, required more time to rotate greater angular distances. The performance of the other 7 pilots was unique: their response time did not increase with greater angular rotations. The results suggest that visual mental rotation can be accomplished by at least two different processes. One process involves incremental object rotations in a multi-step mapping -- like an actual physical rotation of an object -- going through intermediate stages. This process requires more time to rotate greater angular distances. The other process involves direct translation in a single-step mapping. In this process, the starting position transforms into the final position in one mapping without any intermediate steps, and thus does not require more time to rotate greater angular distances. The lack of intermediate stages, which may allow small perturbations in location to be corrected, affects the accuracy of this process; this is particularly apparent when more complex stimuli are rotated. The pilots who did not show incremental rotation effects had different and distinct error patterns: their errors increased when rotating the more complex stimuli.
The purpose of this experiment was to determine if there is a relationship between the development of a perceptual skill and the visual field of presentation for verbal and spatial stimuli. Subjects performed an extended practice Sternberg task in which targets were presented in either the left visual field (LVF) or right visual field (RVF). Both verbal (letters) and spatial (3x3 grid patterns) stimuli were used. The results indicated that visual field was not a significant factor for simple verbal stimuli. However, there was an initial LVF, or right hemisphere (RH), advantage for spatial stimuli that switched to an RVF, or left hemisphere (LH), advantage after the skill developed. These data support an analytic role for the LH, which may be the focus for feature detection expertise. Another finding was that individual differences in cerebral dominance may influence the development of perceptual skill. Together these data shed light on possible biological constraints of human information processing models.
An experiment was conducted in a fixed-base driving simulator which manipulated the time-to-arrival (T_a) of an oncoming vehicle, the viewing distance to that vehicle, and the type of oncoming vehicle to determine the perceptual basis for a left-turn decision. Forty-eight participants were randomly assigned to a group where either a motorcycle, a compact car, a full-size car, or a delivery truck represented the oncoming vehicle. There were an equal number of participants of each gender in the four groups. As T_a was increased, underestimation of vehicle arrival time also increased. Significant main effects were found for T_a, gender of participants, vehicle type, and viewing distance, and for the interactions gender x T_a and gender x vehicle type. Males and females differed in their accuracy of judgments for vehicle types, where males were more accurate in estimating the arrival of delivery vans and motorcycles than their female peers. The pattern of results for the size of the approach vehicle was consistent with a margin-of-safety explanation which argues that driver underestimation of the arrival times of larger vehicles generally allows larger margins-of-safety than for smaller vehicles. The importance of these findings for the development of advanced in-vehicle collision avoidance and warning systems is briefly considered.
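For readers unfamiliar with the measure, time-to-arrival for a vehicle approaching at constant speed is the viewing distance divided by the closing speed, and underestimation means the judged arrival time is shorter than the actual one. The sketch below illustrates that arithmetic; the function names and the example values are hypothetical and are not taken from the experiment.

```python
def time_to_arrival(distance_m: float, closing_speed_mps: float) -> float:
    """Time in seconds for a vehicle at constant closing speed to cover the distance."""
    if closing_speed_mps <= 0:
        raise ValueError("closing speed must be positive")
    return distance_m / closing_speed_mps


def estimation_error(estimated_s: float, actual_s: float) -> float:
    """Signed error in seconds; a negative value means arrival time was underestimated."""
    return estimated_s - actual_s


if __name__ == "__main__":
    # Hypothetical case: oncoming vehicle 100 m away, closing at 20 m/s.
    actual = time_to_arrival(100.0, 20.0)
    # Hypothetical driver judgment of 4 s.
    print(actual, estimation_error(4.0, actual))
```

A negative error corresponds to the conservative bias described by the margin-of-safety explanation: the driver expects the vehicle sooner than it actually arrives.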
RAPCOM (rapid communication) displays involve temporal presentation of information in the same spatial location and have been suggested to have useful potential for human-computer interactions involving high information transfer rates (cf., Matin and Boff, 1988). An experiment was conducted to evaluate the relative effectiveness of various spatial and temporal display formats for presenting information pertaining to the likelihood of aircraft stall using the simulated dynamics of a light aircraft. Specific spatial and temporal characteristics of the display formats were based on the proximity compatibility principle (PCP), which attempts to integrate findings regarding the benefits and limitations of displaying multiple sources of information in similar or "proximal" ways (Wickens and Andre, 1990; Carswell and Wickens, 1990). The effectiveness of these display formats was compared for judgments which required the integration of three display parameters (airspeed, bank, and flap angle) to determine stall probability with those requiring focused attention necessitating the recall of the specific value of one of the parameters. For the complex monitoring task used in this experiment, temporal display formats were generally associated with the most accurate performance. Furthermore, the overall pattern of results was not consistent with design guidelines suggested by the PCP, and suggests difficulties when attempting to define "proximity" in terms of physical metrics based on spatial or temporal parameters.
Information formatting in terms of optimal spatial and temporal parameters has become an important issue with the advent of computer automated displays. One temporal format involving sequential presentation of information, termed RAPCOM (for rapid communication; cf., Matin and Boff, 1988), has the potential to increase performance in situations involving high information transfer rates. The present study investigated the relative contributions of two spatial parameters comparing RAPCOM with more conventional spatial formats involving simultaneous presentation of information. The parameters of character size and spatial separation were examined because they are important determinants of display legibility and visual search, respectively. Performance was assessed in terms of speed and accuracy for a task that required observers to recall integers presented in either an analog or digital format. The findings showed that accuracy performance decreased as the information became spatially separated. Specifically, RAPCOM formats produced the best performance and the large spatial separation the worst performance. A different pattern of results was obtained for character size, depending on whether the display indicators were analog or digital. For analog dials, character size had no systematic effect on performance. However, for digital dials, character size produced an interaction in that the fastest and most accurate performance of all conditions was associated with the spatial format consisting of large characters and small spatial separation. In other words, under conditions associated with high legibility and relatively low visual search, more traditional spatial formats exceeded performance levels associated with the RAPCOM format. These findings are relevant for designers when trying to evaluate the relative merits inherent in spatial versus temporal display formats.
Recent work in graphical perception has attempted to identify the mental operations used by an observer when extracting information from a graphical display (e.g., Hollands and Spence, in press; Simkin and Hastie, 1987). The current research varied the alignment, scaling, and size of proportions shown in pie charts and divided bar graphs. Subjects were required to discriminate between two proportions (i.e., which proportion is larger?), each shown relative to its own whole. Response times and errors were measured. Results from Experiment 1 show that for both pies and divided bars, the time penalty for discriminating unaligned proportions was dependent on the size difference between the two proportions, with a greater penalty with a smaller percent difference. Results from Experiment 2 show that different scaling slowed subjects considerably, especially when the size difference was small, and especially with divided bars. The results are interpreted in terms of hypothesized alignment, scaling, and discrimination operations. The practical implications for graphical design are also discussed.
The secondary task technique was used to test two alternative explanations of dual task decrement: outcome conflict and resource allocation. Subjects time-shared a continuous tracking task and a discrete Sternberg memory task. The memory probes were presented under three temporal predictability conditions. Dual task performance decrements in both the tracking and memory tasks suggested that the two tasks competed for some common resources, processes, or mechanisms. Although performance decrements were consistent with both the outcome conflict and resource allocation explanations, the two explanations propose different mechanisms by which the primary task could be protected from interference from the concurrent secondary task. The primary task performance could be protected by resource allocation or by strategic sequencing of the processing of the two tasks in order to avoid outcome conflict. In addition to examining the global trial means, moment-by-moment tracking error time-locked to the memory probe was also analyzed. There was little indication that the primary task was protected by resequencing of the processing of the two tasks. This, together with the suggestion that predictable memory probes led to better protected primary task performance than less predictable memory probes, lends support to the resource explanation.
The purpose of the present study was to examine the utility of the resource notion, which is the basis for the secondary task technique of workload assessment. The unbiased optimum-maximum method proposed by Navon (1984) was used to manipulate task priority without conveying to the subjects that time-shared performance must trade off. Three task pairs that fell on a continuum of degree of shared resources were tested. The data showed that performance tradeoff is not an experimental artifact. Moreover, the data suggested that increased degree of shared resources led to increased resource allocation optimality and decreased time-sharing efficiency, as predicted by multiple resource theories. The present data suggest that resource theories are useful in explaining dual task performance, and that the secondary task can be a useful workload assessment tool.
The present study evaluated dual-task performance as a function of the vertical separation between a tracking task and a discrete-response task, to provide data relevant to the positioning of aircraft head-up display (HUD) information. The data were consistent with Sanders' (1970) research on visual scanning where a nonlinear decrease in performance as a function of the horizontal separation between two displays was observed. Performance is equivalent across a range of visual angles from superimposition to 6.4° vertical separation between displays. The cost to performance is increased for moderate vertical separations (9.6° to 22.5°) where visual scanning is required. At larger separations, the performance cost increases linearly with visual angle, where head movements may begin to supplement eye movements in order to access information. The function which describes the cost of vertical separation was observed to be larger at both small and moderate visual angles when the information in the two displays required integration. The data suggest that nonconformal HUD information may be placed a few degrees down from a superimposed position without a significant performance loss.
This experiment investigated whether mnemonic strategy training, occurring over a two-month period, would result in improved memory performance when combined with reattribution training. It was also hypothesized that the old and young may differ in their ability to perform nonverbal and verbal mnemonics. Therefore, age-related differences in memory performance were investigated as a function of whether the mnemonic was verbal (Alphabet Search Method) or non-verbal (Method of Loci), and whether or not reattribution training was combined with mnemonic training. Subjects were 34 old (Mean age = 69.5) and 34 young (Mean age = 22.8) adults. Memory performance was measured on the California Verbal Learning Test, the Nelson-Denny Vocabulary Test, the Beck Depression Inventory and four memory span tasks, prior to and following a two-month period of weekly mnemonic strategy training sessions. A third of the subjects were trained with the Method of Loci, a third with Alphabet Search, and the remaining third served as the waitlist control group. In addition, half the young and old subjects from each mnemonic group did, and half did not, participate in a reattribution training workshop. Results clearly showed that mnemonic strategy training was useful for the old and young. However, the combination of reattribution and mnemonic strategy training only enhanced old, not young, memory scores when the type of strategy required verbal skills (Alphabet Search). The implication was that mnemonic strategy training may be more effective for the old if it is combined with reattribution training and if the mnemonic requires verbal rather than non-verbal skills.
Fifteen male volunteers participated in a dual-task study in which the central processing load of visual memory and tracking tasks and the physical load of the tracking task were orthogonally manipulated to produce varying levels of task difficulty. Multiple modes of assessment were used to measure mental workload (MWL) across difficulty levels, including: performance, subjective, cardiovascular, and metabolic. To our knowledge, this study is the first to demonstrate metabolic change with manipulations of cognitive task difficulty; others have found only baseline-to-task changes. The relation of the metabolic demands of the task to central processing resource utilization provided support for a structural energetic model of attention that may help to explain measure dissociations. The results of the present study indicated that heart period was only sensitive to central manipulations of task difficulty that affected energetic resources. Performance and subjective MWL were sensitive to all cognitive components of the tasks. We suggest that cardiovascular measures will associate with other measures only when the manipulations of task difficulty require energetic adjustment, and would expect these measures to dissociate when energetic adjustment is not required.
The purpose of this symposium is to provide a forum for technical exchanges between vision researchers and those interested in visual target acquisition. The symposium will also introduce those in the audience who are not specialists in these areas to important concepts from modern vision research, and provide an introduction to models of visual target acquisition. These topics play an important role in many applications in which human factors specialists are called upon to provide inputs (e.g., computer graphics, image quality, aviation displays, camouflage). However, information on these topics is not readily available. Few reports on visual target acquisition modeling have been published in the open literature, and the topic is not covered in most human factors texts. Basic vision research and linear-systems models are available in the open literature, but this material requires considerable background before it can be used effectively. This, in fact, is probably the major reason why there has been little communication between the applied community and vision researchers to date. This symposium will help to remedy this problem, and encourage the transfer of knowledge from those involved in basic research to those concerned mainly with applications.
Two traditions of vision modeling have coexisted for many years with little or no transfer of information between them. Those interested in models of visual target acquisition for real-world scenarios have developed engineering models, which are essentially empirical summaries of visual performance data. On the other hand, basic researchers in visual psychophysics and neurophysiology have developed quantitative models of pattern perception. The basic research models have increased in generality and scope to the point that they are potentially powerful tools for addressing certain real-world needs that have recently come to the fore. The needs include quantitative, theory-based methods for evaluating target signatures, effects of background clutter, and observer false alarm rates. This paper reviews the shortcomings of existing target acquisition models, and reports work in progress to develop an improved model of target acquisition that incorporates a model of pattern perception from basic vision research.
The relationship of human target acquisition times and detection probabilities to electronically measured visual clutter was investigated. Ninety computer-generated scenes simulating infrared imagery and containing different levels of clutter and zero, one, two, or three targets were produced. Targets were embedded in these scenes counterbalancing for range and position. Global and local clutter were measured using both statistical variance and probability of edge metrics. Thirty-three aviators, tankers, and infantry soldiers were shown still-video images of the 90 scenes and were instructed to search for targets. Analyses indicate differences between the aviators and tankers in search times and types of errors. Results of multiple regression analyses of global clutter, local clutter, range, target dimension, target complexity, number of targets, and experience on search times are given and discussed in terms of search strategies.
A model consisting of multiple tuned and oriented spatial filters followed by non-linear transducer functions is described. The model was originally derived to account for human perception of contrast while viewing isolated stimuli. The model can also account for human estimates for the image sharpness of spatially filtered real world scenes. The model has several shortcomings uncovered by recent experimental results involving suppression of the apparent contrast of a foveally presented grating patch by a peripheral grating.
Visual target acquisition (TA) often involves detecting targets against natural backgrounds that have complex luminance distributions. The purpose of this study was to evaluate a simple technique that controls target contrast in the presence of varying backgrounds. Target contrast was measured by the root mean square (rms) method and was controlled by adjusting only the target luminance, leaving the background unchanged. The technique was tested in a TA paradigm in which observers searched for an aircraft that was embedded in 1) a uniform background, 2) a natural terrain background, or 3) a moving natural terrain background. Four target contrast levels were tested. The results showed that TA time varied with background and target contrast. Significant differences in TA time were observed among the different backgrounds for targets of the same physical contrast, especially at low contrast levels. Although contrast had a systematic effect on TA performance, factors other than contrast influenced TA performance. It was concluded that background structure increased TA time by camouflaging targets and by introducing distractors to the task. Such an approach could be used to model TA performance under conditions where target and background complexity are an inherent feature of the TA task.
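The abstract does not say which form of the rms method was used. One common definition of rms contrast is the standard deviation of luminance over a region divided by the mean luminance of that region; the sketch below assumes that definition. The function name `rms_contrast` and the luminance samples are illustrative, not the study's procedure or data.

```python
from statistics import fmean, pstdev
from typing import Sequence


def rms_contrast(luminances: Sequence[float]) -> float:
    """RMS contrast under an assumed definition: population std / mean luminance."""
    if len(luminances) == 0:
        raise ValueError("at least one luminance sample is required")
    mean_luminance = fmean(luminances)
    if mean_luminance <= 0:
        raise ValueError("mean luminance must be positive")
    return pstdev(luminances) / mean_luminance


if __name__ == "__main__":
    # Hypothetical luminance samples (cd/m^2) from a region containing a target.
    print(rms_contrast([5.0, 15.0]))
    # A perfectly uniform region has zero rms contrast.
    print(rms_contrast([10.0, 10.0, 10.0, 10.0]))
```

Because this measure depends only on the luminance distribution, two scenes can share the same rms contrast while differing in spatial structure, which is in line with the abstract's finding that factors other than contrast influenced TA performance.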
Unmanned Aerial Vehicles (UAVs) are used to conduct a variety of reconnaissance missions with human operators interpreting the transmitted imagery at ground stations. Current UAV data link designs require limited capacity which will result in a cost to the operator. Two common techniques exist to reduce video data rates: data compression, and simple data reduction such as lowering of frame rate and resolution. The objective of this research was to determine the degree to which data volume can be reduced in terms of frame rate, spatial and grey-scale resolution, while retaining sufficient information to support human performance. Two studies were conducted to examine the influence of frame rate, resolution, and compression trade-offs. Experiment I utilized real mission imagery to assess operator performance in target detection, recognition, and designation. Experiment II used a simulation with dynamically manipulated UAV parameters to assess the influence of frame rate and resolution on target designation and tracking. Results indicate that frame rate has a greater influence than resolution on human performance in all four tasks. Overall, operators can perform tasks at rates reduced to 4 frames per second. Half resolution over the total display does not adversely affect performance except in recognition tasks. When resolution is calculated as a function of dynamically-controlled UAV parameters, 8 TV lines across the target appear to result in the best performance; however, these data are not as consistent as those in Experiment I.
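To make the trade-off concrete, the uncompressed data rate of a video link is the product of frame rate, pixels per frame, and bits per pixel. The sketch below shows that arithmetic only; the 30 frames per second baseline, the 640 x 480 frame size, and the 8-bit grey scale are assumed values for illustration and do not come from the study.

```python
def video_data_rate_bps(frames_per_second: float, width_px: int,
                        height_px: int, bits_per_pixel: int) -> float:
    """Uncompressed video data rate in bits per second."""
    if min(frames_per_second, width_px, height_px, bits_per_pixel) <= 0:
        raise ValueError("all parameters must be positive")
    return frames_per_second * width_px * height_px * bits_per_pixel


if __name__ == "__main__":
    # Assumed baseline: 30 frames/s, 640 x 480 pixels, 8-bit grey scale.
    baseline = video_data_rate_bps(30, 640, 480, 8)
    # Same assumed imagery at the 4 frames/s rate mentioned in the abstract.
    reduced = video_data_rate_bps(4, 640, 480, 8)
    print(baseline, reduced, baseline / reduced)
```

With everything else held fixed, the data rate scales linearly with frame rate, so going from the assumed 30 frames per second to 4 frames per second cuts the uncompressed rate by a factor of 7.5.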
Many experimental and real-world viewing situations provide a context in which the target stimulus is displayed against a background set at a different but determinate distance. Conversely, other situations occur where the background distance is indeterminate, i.e., a textureless background. There has been evidence accumulating over the past two decades to suggest that the assumption of accurate visual accommodation will not be sustained under all these circumstances. Although earlier assumptions held that the centrally located stimulus would determine the level of accommodation, this experiment tests that assumption by varying the cues to background distance (well-textured, lighted, distant background and the same background unilluminated) and the distance to the target stimulus. Two groups of six participants observed targets (2 deg.) at six distances (0.9, 1.8, 3.7, 7.3, 14.6 and 29.3 m) and their visual accommodation was measured with a laser optometer. Results indicated that the group viewing the visible distant background evidenced a more distant accommodative response with the typical lag of accommodation. These results indicate that conditions of accommodation in the natural environment may have a profound effect on accommodative accuracy. In turn, this inaccuracy has been shown by others to correlate with inaccuracies in the perception of size and distance. Inaccurate accommodation has been found to delay target detection appreciably as well. Ameliorative approaches are discussed.
The present study evaluated a new aircraft attitude display concept. The new symbology format, or Theta display, was developed by integrating the features of the conventional attitude/direction indicator (ADI) and head-up attitude reference display (HUD) into a single format. Number of trials to reach a specific performance criterion and tracking performance were collected as dependent variables on an attitude maintenance task. The results show that performance and training time were better with both the Theta display and the ADI than with the HUD. The findings support the hypothesis that an attitude display formed of the integration of ADI and HUD symbology will demonstrate a performance benefit over a pure HUD format.
Two experiments were conducted to investigate the mechanisms which underlie the learning in consistently mapped (CM) memory search. In Experiment 1, old and young adults were trained in both CM and variably mapped (VM) category search. The training results replicate previous findings by Fisk and Rogers (1991). Even though older adults are initially at a disadvantage relative to young adults, the comparison times of young and old adults are near zero after CM training. For VM, older adults remain at a disadvantage relative to younger adults, even after extensive training. A full reversal manipulation was implemented in Experiment 2 to investigate the learning in memory search. Initially, the young subjects were less affected by the full reversal condition compared to the performance of the older adults. However, older subjects quickly recovered and both young and old were performing at trained CM levels within 60 trials of additional practice. These results suggest: (a) attention is not being trained in CM memory search; (b) automatic category activation does not contribute much, if at all, to the performance improvement in memory search; and (c) age-invariant learning mechanisms account for performance improvement in CM memory search.
Performance effects of using different display information formats for the detect, diagnose and correct task components of fault management were evaluated in this preliminary study. Data for accuracy and response times were collected for a detect task, a detect and diagnose task, and a detect, diagnose and correct task across three levels of display information format. Levels of display information format included a digital format, an analogue format, and a combined (digital and analogue) format. Predictions for the appropriate level of display information format for the fault management tasks were based on the multiple information format concept. In general, the results obtained in this study failed to support the predictions of the multiple information format concept.
This study compared the relative effectiveness of three computer-based formats for displaying Navy system status data. Response speed and accuracy data were collected for each format on four tasks typically performed in a shipboard Combat Information Center (CIC). The three presentation formats were character readout (CRO), text-only, and text-graphics. Results showed the text-only and text-graphics formats produced faster, more accurate performance than the CRO on count and compare tasks; however, no reliable performance differences were found between presentation formats for identify and criterion tasks. Predictions concerning an advantage for the text-graphics format over the text-only format on certain types of tasks were not supported by the study findings. The practical applications and design implications of these findings are discussed.
Interest in the study of attention control under dichoptic conditions is motivated by the contemporary development of night-vision systems based on single-eye helmet-mounted displays. Two experiments were conducted to investigate the concurrent performance of a tracking task and letter classification under dichoptic display conditions. Subjects were required to fly a simulated helicopter path while classifying letter pairs presented intermittently. Experimental instructions in Experiment A specifically emphasized a two-dimensional interpretation of the visual field. Under these instructions, the presentation of a common visual axis to the two eyes provided by the flight-tunnel did not aid subjects, and their performance deteriorated in dichoptic conditions. In Experiment B, the instructions to subjects were changed to advocate a three-dimensional interpretation of the display. Under these instructions, dichoptic performance levels were substantially improved when the tunnel was present. These results imply that the presence of a common visual axis is not automatically beneficial. In order to improve performance, attention should be intentionally directed to utilize information supporting a three-dimensional frame of mind. These findings have important implications for understanding the dynamics of performance with single-eye helmet-mounted displays, and the training of pilots in their use.
The illumination and pupillary dilation requirements for calibration on an eyegaze response interface computer aid (ERICA) were studied. The purpose of this study was to determine whether decreases in ambient illumination level would facilitate calibration and increase the probability of use by subjects. Monocular versus binocular calibration was also studied to determine whether the occlusion of one eye would cause the pupil of the other to dilate, therefore allowing the use of a higher level of illumination during calibration. Twenty subjects (10 monocular and 10 binocular) were tested at four ambient illumination levels (10, 50, 100, and 210 lux) in both ascending and descending orders of presentation. Analyses of frequency and pupil diameter data revealed a statistically significant increase in calibration at lower levels of illumination. An increased frequency of calibration for monocular (versus binocular) viewing conditions was also found.
This investigation was conducted to more fully define the physical characteristics of individuals engaged in ordinary reading tasks. Eye to display viewing distances were measured for subjects reading from both a handheld configuration and from a structurally fixed configuration which approximated an electronic display. Estimates of each subject's resting point accommodation were also obtained and compared to observed viewing distances. Findings revealed significant differences between handheld and fixed configuration displays. Relationships between display viewing distance and resting point accommodation were not apparent. The resting posture of accommodation and seated posture are discussed as potential contributors to determination of viewing distance preferences.
Forty subjects responded to a set of 64 different combinations of linear displays and rotary controls presented by photographic slides. The subject's task was to rotate a control to increase the numerical value on the display. It was expected that response time for an arrangement having a strong stereotype would be faster than one with a weaker stereotype. Data showed there was a strong relationship between these two measures of compatibility for horizontal displays with controls either on the top or bottom of the display; there was no significant relationship for any of the vertical layouts. Comparing horizontal and vertical displays, the average response times were 1.25 and 1.55 seconds and average stereotype strengths were .86 and .73, respectively. Thus on both criteria, horizontal displays were superior to vertical displays. Response time was found to be dependent on the magnitude of the component principle making the greatest contribution to the strength of the overall stereotype. In the case of horizontal displays this was the clockwise-to-right principle; for vertical displays it was Warrick's principle or, if this was not applicable, the scale-side principle.
Three variables were manipulated in an attempt to determine the conditions of optimal performance using object-like displays. Uniquely color coding the vertices of the object did not appear to cause a significant change in separate or integral task accuracy. The introduction of a display based on the Gestalt law of closure, in which the middle third of each side of the object was removed, improved separate task accuracy relative to the object display. Separate task accuracy for the closure display was not as good as for the bar display. Integration task accuracy was not harmed by this manipulation. The validity of the emergent feature for information integration was manipulated. Lower levels of validity reduced integration task accuracy for all displays equally. Thus, if information integration is the operator's primary task, display designers should consider using the closure display in place of the object display. The usefulness of both object and closure displays may be limited since the emergent feature may be less than 100% valid for the information integration task in many real world situations. This is due to constraints in the geometry of object displays.
A study was performed to test the hypothesis that color coding can be used to enhance the speed and accuracy of performance on a focused attention task when object displays are employed. Subjects performed both a focused attention and an integration task while viewing a rectangle display that represented the readings of four system parameters. The object displays were presented to subjects in one of four color coding conditions: (1) monochrome; (2) parameter type; (3) parameter state; or (4) system state. Study results indicated that the system state color code significantly reduced integration task response time without degrading integration task accuracy. For the focused attention task, there was no significant difference between monochrome and the remaining color code conditions for either response time or accuracy.
Thirty adult subjects studied each of eighteen single-function line graphs for self-determined periods. The structural complexity of the stimulus graphs was varied in three ways: through addition of data points, reversal of trends, and elimination of symmetry. Subjects provided written interpretations immediately following examination of each graph. Indirect indices of comprehensibility (i.e., increased graph study times and increased content in the written interpretations) suggested that trend reversals were the primary determinant of complexity. While the number of data points and the presence or absence of symmetry were not associated with longer study times or greater overall content production, varying these structural features did lead to strategic shifts in the interpretive emphasis on global versus local features of the graphically-displayed data. Specifically, the presence of symmetry or the addition of data points led to increases in global content and decreases in local content. Lastly, cognitive style of subjects was systematically related to graphical interpretation. Impulsive subjects were less likely than reflective subjects to interpret local features of the graph, and were also less sensitive to variation in structural characteristics.
This study sought to: (1) analytically separate the components of a graphical display which contributed to performance on integrated and separable tasks; and (2) determine the effect of the number of dimensions of information which had to be integrated. To that end, the study employed a 7 X 3 mixed design with seven displays manipulated between-subjects and the number of information dimensions (three, six, and nine) manipulated within-subjects. The seven displays examined included two bar graphs (non-object and object formats), two midline displays (non-object and object formats), a direct graphical display, and two numerical displays (numerical separable and numerical integrative). Based upon propositions generated from emergent feature theory, the ability to integrate information in these displays should be a function of the faithfulness, saliency, and directness of mapping the decision statistic onto the display. Results indicated that the displays which directly represented the integrated decision, the numerical integrative and the direct graphical displays, resulted in the best performance. Intermediate performance was obtained on those displays (i.e. the object bar graph, the non-object midline, and the object midline) which incorporated faithfulness, saliency, or both, respectively. The worst performance on the integrated task was exhibited for those displays (i.e. the numerical separable and the non-object bar) which did not represent directness, faithfulness, or saliency. For both the integrated and separable tasks, accuracy increased as the number of information dimensions increased. The unexpected direction of this effect was attributed to subjects' investing more resources in performing the task at the six or nine cue levels due to the perceived increase in difficulty of the task.
The present study was designed to examine the role of boredom, perceived mental workload, and perceived control in vigilance. Subjective estimates of boredom and mental workload were measured before and after a 40 minute vigil during which movements of a computer mouse were monitored. In addition, subjects were administered Rotter's (1966) locus of control inventory. Subjects who made progressively more movements over time reported the highest levels of boredom and workload. In addition, the subjects with the highest performance levels were the most cautious in their responding, had an internal locus of control, and tended to experience less frustration. Significant, positive correlations were also observed between the boredom and workload scores suggesting that boredom may be an important contributor to mental workload in sustained attention.
The purpose of the current project was to examine the nature of the age-related differences on the Raven's Advanced Progressive Matrices (APM). Three components were hypothesized to be involved in correctly solving the APM problems. These included a rule-identification component, a rule-application component (involving a one-rule spatial transformation), and a rule-coordination component. The project was designed to examine the influence of each of the hypothesized components on the age-related variance on the APM. Two tests presumed to measure each hypothesized component were presented to 183 adults between the ages of 21 and 83. Hierarchical regression analyses indicated that although all of the hypothesized components accounted for a significant amount of the variance on the APM (approximately 50% each), only performance on the tasks measuring rule application accounted for a unique proportion of the age-related variance on the APM. Implications of the results with regard to following symbolic instructions in assembly of objects and in driving are discussed.
This study examined the effects of exposure to intermittent jet aircraft noise played through stereophonic speakers (70 dBA or 95 dBA maximum intensity) on performance efficiency and perceived workload in a 40-min visual vigilance task. The noise featured a Doppler-like quality in which planes seemed to approach from the monitor's left and recede to the right. Performance in noise, measured in terms of perceptual sensitivity (d'), was significantly poorer than in a quiet condition. Moreover, in comparison to subjects performing in quiet, those who operated in noise were less able to profit from knowledge of results (KR) regarding performance efficiency. In addition to its negative effects upon signal detectability, noise significantly elevated perceived workload, as indexed by the NASA-TLX. This effect was robust; it was not mitigated by KR, even though KR served generally to reduce the overall level of perceived workload in the study. The consistency of the effects of noise in regard to both performance efficiency and perceived workload challenges a recent conclusion offered by Koelega and Brinkman (1986) that lawful relations are not observable in studies of the effects of noise on vigilant behavior.
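For readers unfamiliar with the sensitivity index d' named in the preceding abstract, the following is a minimal sketch of one common way to compute it from detection counts. It is not taken from the study: the function name `d_prime`, the log-linear correction, and the example counts are all assumptions made for illustration.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Perceptual sensitivity d' = z(hit rate) - z(false-alarm rate).

    Hypothetical helper, not the study's procedure. A log-linear
    correction (add 0.5 to each count) keeps both rates strictly
    between 0 and 1 so the inverse normal CDF is always defined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Made-up counts for two observers with the same false-alarm rate:
# the one with more hits has the higher sensitivity.
quiet = d_prime(hits=80, misses=20, false_alarms=10, correct_rejections=90)
noise = d_prime(hits=60, misses=40, false_alarms=10, correct_rejections=90)
print(quiet > noise)
```

Because d' separates sensitivity from response bias, a lower value in noise indicates poorer discrimination of signals rather than merely more cautious responding.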
Two studies examined the effects of automation reliability and task complexity on the monitoring of automation failures during performance of a flight-simulation task. In the first study, 24 students performed tracking and resource management tasks while an automation routine monitored for system malfunctions over four 30-minute sessions. Detection of automation failures was significantly higher for variable reliability automation (mean = 81.6%) than for constant reliability automation (mean = 32.7%), indicating that constant-reliability automation induced complacency in monitoring. The effect of automation reliability was eliminated when 16 more subjects were required to complete the monitoring task only. Neither group of subjects exhibited a vigilance decrement. In the second study, monitoring performance and vigilance decrement were examined for a situation in which only one automation failure occurred during a session. Thirty-six students were randomly assigned to one of three task groups: simple (visual discrimination task), single-complex (monitoring only) or multi-complex (tracking, resource management, and monitoring). In both the simple and the multi-complex tasks, more subjects detected the automation failure in the first ten minutes of a session than in the last ten minutes of a session (67%-17% and 75%-42% respectively). Subjects in the single-complex condition detected the automation failure equally well in both time periods (92%-83%). The results point to two areas of potential costs in the automation of a task: (1) constant patterns of automation reliability can lead to inefficiency in monitoring automation failures, and (2) infrequent automation failures in multi-task conditions can lead to a vigilance decrement. While these costs should not prohibit the implementation of automation, they should be considered in the design of any automated system.
Although the amount of clinical data available through critical care monitoring systems has steadily increased, little integration of that data occurs. Consequently, higher order relationships cannot be obtained directly. An ecological psychology approach to display design that attempts to reduce the cognitive load for the clinician by directly displaying the functional relationships between parameters is compared with a traditional approach in a monitoring and control task. Analysis of performance by Critical and Non-Critical Care Nurses and Novices suggests that the integrated display facilitates performance for all groups.
Demographics indicate that the population in the United States and other industrialized nations is growing older, and that the number of older workers and systems users can be expected to increase substantially over the next several decades. In order to assess possible differences between age groups, the mental workload experienced by older adults as compared to that experienced by younger adults was investigated. Two tasks were utilized to assess short term memory (continuous recognition) and psychomotor (first-order unstable tracking) performance. The workload of each task was assessed with the Subjective Workload Assessment Technique (SWAT). Memory task performance measures and subjective workload ratings indicated a decrement in performance and an increase in workload for the older group relative to the younger group. Psychomotor task performance measures and subjective workload ratings indicated no difference between the age groups. It is hypothesized that the memory task makes greater demands on central processing resources than the psychomotor task used in this study. In support of this hypothesis, an analysis of the changes in ratings on the individual SWAT dimensions of time, mental effort and psychological stress revealed that an increase occurred only on the mental effort dimension for the memory task. This study implies that designers should (1) reduce memory-laden task demands on older workers, or provide design features that lessen them, and (2) give more weight to the reduction of central processing resource requirements in trade-off studies.
Spatial localization has been identified as an age-sensitive process in selective attention. Because visual search in driving involves uncertainty concerning the location of information necessary for maneuvering decisions, an experiment was conducted to examine the effects of age and target location uncertainty on a simulated driving task. Seventeen younger subjects (aged 30 to 45 years) and 13 older subjects (aged 65 to 75 years) completed three tasks including two reaction-time tasks and a simulated driving task. The reaction-time tasks included three conditions (simple left, simple right, and two-choice) in a laboratory and in a stationary vehicle. The simulated driving task was conducted on a closed driving course while subjects sat in a stationary vehicle. Subjects were required to select one of two lanes using information presented either on a changeable-message sign or on traffic signals. In the high-certainty condition, subjects were told where to look for relevant information; in the low-certainty condition, they were told that information could appear in either place. Response times were measured from sign or traffic signal onset to the subject's activation of the vehicle turn signal. The results indicated small non-significant differences between age groups for the reaction-time tasks. Significant age-related differences were found in the simulated lane-selection task. Older subjects were 15% slower overall than the younger subjects. Uncertainty concerning the location of relevant information slowed decision-making speed for all subjects, but proportionately more for the older subjects (16% versus 11% for the younger age group). Uncertainty slowed responses to the changeable message sign more than to traffic signals for subjects in both age groups. The results are consistent with the spatial localization hypothesis, and suggest that older drivers may have more difficulty than younger drivers locating targets in visual search while driving. The results also suggest that effective use of changeable-message signs requires placement in locations with high expectancy, and allowing drivers sufficient time to locate the sign before reading the scrolling message.
Workload assessment has become a common part of system evaluation. Workload assessment is an important adjunct to performance measurement because the operator is sometimes flexible enough to disguise excessively demanding systems by expending additional effort to overcome optimal information processing limits. This is often referred to as the problem of determining a "workload redline." The present paper recounts an evaluation of a proposed redesign of the KC-135 tanker aircraft cockpit. The current KC-135 cockpit has three crew positions: pilot, copilot, and navigator. As part of a proposed redesign, modern automation capabilities to replace the navigator were considered. Ten operational KC-135 crews and two KC-10 crews were studied while performing missions of differing levels of workload in a high-fidelity simulator. Three main classes of data relevant to the redline issue were collected: performance data, Subjective Workload Assessment Technique (SWAT) ratings, and Subjective WORkload Dominance (SWORD) ratings. Evaluation of the performance results demonstrated that the redesigned cockpit could be flown in accordance with regulations. This was a necessary first step, but could not ensure that acceptable workload had been obtained. Taken together, the SWAT and SWORD results strongly suggested that acceptable performance can be achieved at acceptable levels of workload. In conclusion, the present study is a prototypical example of using available assessment tools to determine system acceptability. These tools should be useful for many other system evaluations.
The purpose of this study was to develop a multivariate model with cross-sectional data that defined the decline in VO2max over time, and cross-validate the model with longitudinal data. The cross-sectional sample consisted of 1,608 healthy men who ranged in age from 25 to 70 years. VO2max was directly measured during a maximum Bruce treadmill stress test. Regression analysis showed that the cross-sectional age and VO2max relationship was linear, r = 0.45, and the age decline in VO2max was 0.48 ml/kg/min/year. Multiple regression developed the multivariate model from age, percent body fat (%fat), self-report physical activity (SR-PA), and the interaction of SR-PA and %fat (R = 0.793). Accounting for the variance in percent body fat and exercise habits decreased the influence of age on the decline of VO2max to just -0.27 ml/kg/min/year. This showed that much of the decline in maximal physical working capacity was due to physical activity level and percent body fat, not aging. The multivariate equation was applied to the data of the longitudinal sample of 156 men who had been tested twice (mean age Δ = 3.1 ± 1.2 years). The correlation between the measured and estimated change in VO2max over time (ΔVO2max) was 0.75. The results of the study showed that changes in body composition and exercise habits had more of an influence on changes in maximal physical working capacity than aging. The developed model provides a useful way to quantify the changes in physical working capacity with aging.
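The two age slopes reported in the preceding abstract can be compared with simple arithmetic. The sketch below is only an illustration, not the authors' multivariate model (whose coefficients the abstract does not give). It assumes both slopes are declines, so they are written as negative numbers, and the function names are hypothetical.

```python
# Age slopes from the abstract, in ml/kg/min per year.
# Assumption: both are declines, so both are written as negative values.
AGE_ONLY_SLOPE = -0.48   # age alone (cross-sectional regression)
ADJUSTED_SLOPE = -0.27   # age after accounting for %fat and SR-PA


def expected_change(slope, years):
    """Linear projection of the change in VO2max over a span of years."""
    return slope * years


def slope_reduction_fraction(unadjusted=AGE_ONLY_SLOPE, adjusted=ADJUSTED_SLOPE):
    """Fraction by which the age slope shrinks after adjustment."""
    return (unadjusted - adjusted) / unadjusted


# Projected change over the longitudinal sample's mean 3.1-year interval.
print(expected_change(AGE_ONLY_SLOPE, 3.1))
print(expected_change(ADJUSTED_SLOPE, 3.1))
print(slope_reduction_fraction())
```

Read this way, the age slope shrinks by a little under half once body fat and activity are accounted for. This is a rough reading of the two reported slopes only, not a result stated in the abstract.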
This study was conducted to develop a design concept for an electronic memory device to enhance medication compliance in older users. The effort was supported by a Phase I Small Business Innovation Research (SBIR) grant from the National Institute on Aging (NIA). A user-oriented approach was used to develop a design concept for a memory device for older users. One hundred seniors were interviewed to identify their physical, physiological and cognitive capabilities and limitations, as well as their preferences for memory aid functions. Specific design requirements were gathered from user testing of six currently available memory aids with 30 of the original 100 elderly subjects. The interview and user testing results were consolidated to provide the basis for tradeoff criteria for memory and interface concepts, and for the development specifications for an optimal interface design for a memory aid designed specifically for the elderly user. A design concept was developed for a medication device that would be easy to use, would reduce the likelihood of scheduling errors, and would be non-threatening to older users who might otherwise be intimidated by an electronic device. The Phase I effort focused on enhancing medication compliance, which is a priority issue with the senior population.
Using common household products is often difficult for people with neuromuscular disorders, spinal cord injury, or arthritis. We need to better understand their capabilities when designing and adapting products that are easier for them to use. In this study, individuals with movement impairments used two experimental home control thermostats with features that allowed easier positioning and viewing. The participants employed a variety of grasping and manipulation strategies, including some that were not anticipated by the designers. Participants' preferences indicated that the appearance of the product, not just effective control design, was an important factor in their judgments. We discuss the implications of the study results for universal design and adaptation of traditional products for the elderly and those with disabilities.
Research in the area of computer anxiety has traditionally concentrated on the younger adult. In this study older adults (55 years and over) were compared to younger adults (30 years and under) on levels of computer anxiety and computer experience. Subjects in the study completed a demographic and computer experience questionnaire, and two computer anxiety scales. Previous research findings indicating a negative relationship between computer anxiety and computer experience were replicated for both young and older adults. Additional findings indicated that older adults were less computer anxious and had less computer experience than younger adults. Furthermore, older subjects indicated more liking for computers than younger subjects. However, while young males liked computers more than young females, no differences between older males and older females were found on the computer liking subscale. Some discrepancies between the two computer anxiety scales suggest further research is needed to validate computer anxiety scales for use with older adults.
This experiment investigated the demands synthetic speech places on short term memory by comparing performance of old and young adults on an ordinary short term memory task. Items presented were generated by a human speaker or by a text-to-speech computer synthesizer. Results were consistent with the idea that the comprehension of synthetic speech imposes increased resource demands on the short term memory system. Older subjects performed significantly more poorly than younger subjects, and both groups performed more poorly with synthetic than with human speech. Findings suggest that short term memory demands imposed by the processing of synthetic speech should be investigated further, particularly regarding the implementation of voice response systems in devices for the elderly.
A major challenge for human factors is designing specialized systems to mesh with the domain expertise and mental representations of the system users. Systems for computational chemistry are an important example. Chemists use chemical structure diagrams (CSDs) as part of their basic language of communication, along with chemical formulae, narrative text, graphs and charts. Current computer tools for communicating with CSDs in written contexts are often found by chemists to be incomplete, difficult to use, and offering inappropriate functionality. This problem arises from ineffective human-computer interfaces that, in turn, can be traced to a lack of understanding of how skilled chemists use, think about, and communicate with CSDs. A formal cognitive human factors analysis is applied to address this deficiency. The analysis is developed through critical incident interviews, question-answering protocols, and thinking aloud protocols. Its results include a GOMS-type cognitive model of the drawing/manipulation process and of the conceptual structures underlying that process. Existing CSD drawing tool interfaces can be seen to provide little support for the conceptual structures or cognitive processes identified. The model developed in this research is being used to develop a user-oriented HCI for chemical structure manipulation systems.
This paper describes a documentation writing methodology developed and used by the author to address some of the issues of consistency in documentation and product function, redundancy of research and solution, and product usability (including timeliness of delivery and quality of support) for a software product engineered, developed and deployed in a multi-organizational or corporate environment. The methodology is compatible with technical systems engineering, development and testing documentation requirements, and is applicable to software products for which there are existing or anticipated "user guides". The method used to accomplish these goals is the incorporation of existing user guide formats, wherever possible, in the documentation of technical specifications for detailed engineering, development and testing requirements. This paper describes the "cycle of documentation" methodology employed, identifies opportunities to use this methodology, and describes some of the benefits derived from using the methodology (both initially intended and later discovered).
Methods are needed for implementing findings of theoretical research early in the design phase and tracing them through to final designs. This paper describes one such approach in applying what is known about cognitive psychology, human factors, and development techniques to interface design. The basic technique used to provide a design framework was an adaptation of the Quality Function Deployment (QFD) house of quality. This paper describes the QFD structure and how it was adapted to provide that critical link between theoretical research findings and resulting interface design concepts. The discussion focuses on three topics: basic concepts within the house of quality, the house of quality adapted for interface design, and application to the design process. A number of benefits are realized from use of this approach. First, it describes directly the relationship between human processing characteristics, design requirements, and design solutions. Second, it characterizes the nature of conflicts among alternative design solutions. Third, it indicates areas of potential applied research. Finally, it provides a single, hierarchical construct that carries through from the initial conceptual design to final product evaluation. The benefit of this approach to interface design is that a broad spectrum of theoretical and experimental research is summarized into a manageable design tool, which may provide insights to human factors practitioners, design engineers, and subject matter experts alike.
Call handling duration in a telephone service center was substantially improved by providing users easier access to mainframe data systems. Switching from a system that allowed access to only one system at a time to one that provided a separate window for each system produced an estimated 7% improvement in call handling duration. The improvement was most marked for those calls which required multiple system access.
The terms "Virtual Reality", "Artificial Reality", and "Cyberspace" have been prevalent in the popular press recently. There has also been considerable professional interest. (See Table 1 for recent conferences.) This field is an outgrowth of three factors: (1) increases in available technologies of display, storage, and CPU, along with new interface devices; (2) increases in the awareness of the importance of the "user interface"; and (3) an increase in the awareness of the need for better means of collaboration. While "Virtual Reality" is arguably not completely new, it is only in the last few years that the technological and social trends mentioned above have resulted in the growth of "Virtual Reality" as a field. We argue that "Virtual Reality" is an important phenomenon for the human factors community in at least three distinct ways. First, like other new technologies, Virtual Reality requires human factors research to reach its full potential. Second, Virtual Reality offers the human factors professional an important new tool of investigation. Third, as a tool of communication and collaboration, Virtual Reality may serve as a medium for collaborative design and/or a means for communicating the results of human factors issues.
We present the results of a laboratory study comparing three styles of audio menus. One of these styles is the technique predominantly employed in interactive voice response (IVR) systems today. Two alternatives to this Standard technique were evaluated in this study. One of these alternatives was first proposed in Resnick and Virzi (1992), which they called Skip and Scan menus. This new style was hypothesized to be superior to Standard menus for intermediate users, but was expected to show limitations for one-time callers and expert users. The third menu alternative we evaluated combines elements of the Standard and Skip and Scan menus and was hypothesized to be superior in a broad range of usage conditions. Performance was measured over 36 tasks and two IVR applications. In all but the first few trials, the Skip and Scan menu style reported in Resnick and Virzi led to performance equal to or better than the other two menu styles. Standard menus showed a performance benefit for the first few trials of the first application only: this benefit was not present in the second application. There were no differences among the techniques in the trials simulating expert behavior.
Nuisance or unwanted calls have always been a problem to subscribers of phone services. One possible solution is a network-based service that allows subscribers to control the calls they receive by using a call acceptance list. When the call acceptance list is activated, all callers not on the list would be automatically routed to a voice messaging system. Those callers on the list would be allowed to ring the subscriber's telephone. This study assessed the effectiveness of call acceptance lists in reducing unwanted telephone calls. Participants used a prototype telephone-based interface to establish a list of telephone numbers from which they would always accept calls. At the same time, they logged each of their incoming calls in a diary, recording the telephone number that originated the call, and whether they wished to receive the call. The call acceptance list significantly reduced the number of unwanted calls.
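The routing rule described in this abstract can be sketched in a few lines. This is only an illustration of the stated behavior (listed callers ring through, all others go to voice messaging while the list is active); the function name, the return labels, and the example telephone numbers are hypothetical and do not come from the study or from any real network service.

```python
def route_call(caller_number, acceptance_list, list_active=True):
    """Decide where an incoming call goes under a call acceptance list.

    Hypothetical sketch: names and return labels are assumptions.
    """
    # With the list deactivated, every call rings the subscriber as usual.
    if not list_active:
        return "ring"
    # With the list active, only listed callers ring through;
    # everyone else is routed to the voice messaging system.
    if caller_number in acceptance_list:
        return "ring"
    return "voice_messaging"


# Example with made-up numbers.
accepted = {"555-0100", "555-0101"}
print(route_call("555-0100", accepted))  # listed caller
print(route_call("555-0199", accepted))  # unlisted caller
```

A set is used for the list so that membership checks stay fast regardless of how many numbers the subscriber stores.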
In the present studies, a scale was developed for measuring attitudes toward automation technology that reflect a potential for complacency. In the first, developmental study, a 20-item questionnaire consisting of statements concerning various aspects of automation was administered to 139 undergraduates at Catholic University. Factor analysis of the complacency potential rating scale (CPRS) revealed five independent factors, namely: general, confidence-, reliance-, trust-, and safety-related complacency. The internal consistency reliability coefficients of the five factors and the scale as a whole were found to be high, and the scales revealed satisfactory test-retest reliabilities. The pattern of correlations among CPRS score, age, gender, computer use, and computer experience was consistent with previous studies examining attitudes toward microcomputer usage (Igbaria and Parasuraman, 1991). In the second, validation study, the 20-item CPRS was cross-validated on a sample of 175 undergraduate students at Drexel University. Factor analysis similarly revealed five factors with high alphas. The results indicate that the potential for complacency can be evaluated by assessing attitudes towards automation technology.
Many voice phone services are complicated to use, and several human factors obstacles may explain their lower than expected penetration and usage rates among subscribers. Usability difficulties with conventional telephones include: deciding which services are appropriate, knowing which services are available for a particular call, remembering complicated commands, and executing commands via conventional phones (e.g., network-based speed dialing). Bellcore sought to determine whether network services could be more efficient or acceptable to users if engineered to interface with more intelligent Customer Premises Equipment (CPE). For these reasons, it compared service usability of conventional CPE with an experimental screen-assisted telephone designed to help overcome these human factors obstacles. The experimental telephone provided a "context-sensitive" visual display showing available services, context-sensitive "soft keys" that accessed these services with a single key press, and a visual display of telephone numbers on list services. Usability tests showed that the experimental screen-assisted telephone provided significant gains for two services with respect to conventional CPE. The gain for Three-Way Calling (90% vs. 40% success) appeared due to automating the switch hook flashes that the service requires. The gain for Speed Calling (69% vs. 25% success) appeared due to eliminating the need to assign and remember dialing codes, and to providing feedback when an entry was incorrect. The experimental telephone also provided an average gain over all services tested (75% vs. 61% success), but this did not reach statistical significance. The results are an encouraging early attempt at creating richer visual displays for more usable network services.
The purpose of this paper is to present a case for the development of a user interface design guideline or standard for interactive voice response applications, to be widely disseminated throughout business and industry. A number of sample problems are cited, based on the author's consulting experience in this area, which serve to demonstrate that many of the problems encountered in IVR application development, particularly in scripting/dialogue design and use of automated speech recognition as a front-end, are not only solvable, but easily avoidable, given the current human factors knowledge base. The paper also discusses the Specification Document developed by the Voice Messaging User Interface Forum (1990, April), and the reasons why it cannot be applied, as written, to the user interface design of more complex IVR applications. Finally, the author proposes an approach to developing the proposed guideline/standard.
The tone used in the new service Caller ID on Call Waiting (CIDCW) must be reliably detected by customer premises equipment (CPE), so that it can prepare to receive caller ID data, and it should alert customers to new calls but without being annoying. To help select a tone, the first experiment of this study examined the acceptability of a series of high-frequency dual tones that might be able to perform the required CPE signaling. Subjects were presented with the tones under circumstances in which they would typically be heard (while talking and listening over the telephone) and rated the sound quality of the tones. Long bursts of tones were presented as well as short bursts prepended or appended to the 440-Hz tone used in Call Waiting service. The results suggested that customers may find high-frequency dual tones acceptable. To determine acceptable parameters of tones, in the second experiment, subjects rated the loudness of selected tones as their length and power were varied.
For the purpose of designing a method to control the main speech parameters for keyword emphasis in a text-to-speech synthesizer, the relation between speech parameters and emphasis level is determined from experiments. Twelve subjects are instructed to modify keyword emphasis to achieve natural sounding speech from three sentences. An interactive speech editor with a graphical user interface is developed for the experiments. The editor allows the subjects to control speech intensity, speech rate and average fundamental frequency of the keyword, and of the other sentence components. Furthermore, subjects can also control pause (silence) duration preceding and following the keyword. Extracted relations between prosodic feature parameters and emphasis level show that speech intensity and speech rate are independent of sentence content. Speech intensity increases linearly and speech rate decreases linearly with emphasis level. On the other hand, average fundamental frequency and pause duration depend on sentence content, and relatively large changes are required to strongly emphasize keywords using pause insertion and increased fundamental frequency.
Auditory communication is critical for the successful completion of many tasks that require information to be transmitted among crew members. The purpose of the present program of research is to determine the impact that speech communication has on performance of such tasks. As guidance for this program, a model of auditory communication has been developed. This model describes performance as a function of three factors: transmission, linguistic, and individual. The model assumes that variables affecting these three factors alter the level of auditory workload and that task performance is a consequence of this workload (Peters, 1991). The present paper describes the effects of two transmission factors: speech intelligibility and communication structure. Speech intelligibility was measured using the Modified Rhyme Test. Communication structure was defined as command, interrogative, and discussion levels. Three studies have been completed in this research program. The focus of the present paper is the most recent study, completed at the Ft. Knox Close Combat Test Bed, an M1A1 tank simulator facility. The results of this study are described first; the results of all three studies are then reviewed and found to be consistent with the auditory-performance model proposed by the authors.
Video mediated communication alters our perception of the way in which we interact and communicate. In contrast to face to face or audio only (e.g., telephone) communication, there is relatively little systematic research on the effect of video conferencing on communication within groups of people at dispersed locations (Harrison, 1991b; Harrison et al, 1992b; Sellen, 1992; Wolf, 1988; Cohen, 1982; Short, Williams, and Christie, 1976). In this paper we describe a study of how participants at three distant locations perceived differences between face to face (within site) and video mediated (between site) communication. Results indicate that participants perceived between site, mediated communication to be unnatural and uncomfortable. They felt there were problems with gaining floor control and with conversation flow. Additionally, participants perceived the between site, mediated communication to be less interactive, less social, and less enjoyable than the face to face, within site communication. The insights gained through this and other case studies, summarized here, will be used to guide our future research. This study is one in a series of field trials and controlled experiments aimed at understanding the human factors issues associated with video communication and the design of such systems.
Auditory perception involves the human listener's awareness or apprehension of auditory stimuli in the environment. Auditory stimuli, which include speech communications as well as non-speech signals, occur in the presence and absence of environmental noise. Non-speech auditory signals range from simple pure tones to complex signals found in three-dimensional auditory displays. Special hearing protection device (HPD) designs, as well as additions to conventional protectors, have been developed to improve speech communication and auditory perception capabilities of those exposed to noise. The thoughtful design of auditory stimuli and the proper design, selection, and use of HPDs within the environment can improve human performance and reduce accidents. The purpose of this symposium will be to discuss issues in auditory perception and to describe methods to improve the perception of auditory stimuli in environments with and without noise. The issues of interest include the perception of non-speech auditory signals and the improvement of auditory perception capabilities of persons exposed to noise.
In some environments, there is a serious mismatch between the perceived (psychoacoustic) urgency of a warning and its situational urgency. This pilot study investigated the effect of pulse format, pulse duration, and time between pulses on the perceived urgency of warning signals. The intent was to determine the best combination of variables and levels of variables to use in a formal study on the perceived urgency of warning signals. The results indicated that only pulse format and time between pulses were significant. Subjects rated sequential pulses as being less urgent than any other format. Signals with shorter inter-pulse intervals were rated as significantly more urgent. Pulse format and time between pulses were determined to be variables which should be used in future research.
The development of optimal three-dimensional auditory displays requires a more complete understanding of the interactions among spatially separated sounds. Free-field masking was investigated as a function of the spatial separation between signal and masker sounds within the horizontal, frontal, and median planes. The detectability of filtered pulse trains in the presence of noise maskers was measured using a cued, two-alternative, forced-choice, adaptive staircase procedure. Signal and masker combinations in low (below 2.3 kHz), middle (1.0-8.5 kHz), and high (above 3.5 kHz) frequency regions were examined. As the sound sources were separated within the horizontal plane, signal detectability increased dramatically. Similar improvement in detectability was observed within the frontal plane. As suggested by traditional binaural models, interaural time cues and interaural intensity cues are likely to play a major role in mediating masking release in both the horizontal and frontal planes. Because no interaural cues exist for stimuli presented within the median plane, traditional models would not predict a release from masking when the stimuli are separated within this plane. However, with high frequency signals, masking release similar to that observed in the horizontal and frontal planes could be observed in the median plane. The current literature suggests that sound localization in the median plane may depend on direction-specific spectral cues that are introduced by the pinna at high frequencies. The masking release observed here may also depend on these "pinna cues."
Conventional hearing protection devices have often been implicated in compromised auditory perception, degraded signal detection, and reduced speech communication abilities. Recent technological developments have been used to augment hearing protectors in an attempt to alleviate these problems for the user, while at the same time providing adequate attenuation. Operational characteristics, design features, performance data, and applications for active noise reduction, sound transmission, frequency-selective, adjustable attenuation, amplitude-sensitive, and uniform attenuation devices are discussed.
Mode errors are one kind of breakdown in human-computer interaction. The concept was developed originally in the context of relatively simple reactive computerized devices such as word processors. When a device possesses multiple modes, where something is done one way in one mode and another way in another mode, there is increased potential for erroneous actions. In this paper we extend and expand the concept of mode error to supervisory control of automated resources in event-driven situations such as pilot interaction with cockpit automation. In this type of situation, the state of the automated system can change in response to either operator input, situation factors or system factors. This creates complexities in tracking system mode changes over time, surprises created by "uncommanded" mode changes, and the possibility of errors of omission as well as commission in managing multiple system modes. Progress in our understanding of mode error in the context of highly automated systems is important in our ability to develop effective countermeasures for mode-related problems in human-computer cooperation.
Previous research suggests that the temporal pattern of dissimilar sounds may be a basis for confusion. To extend this research, the present study used complex sounds formed by simultaneously playing components drawn from four sound categories. Four temporal patterns, determined by sound duration and duty cycle, were also used, producing a total of 16 basic components. The density (i.e., number of components played simultaneously) ranged from one to four. Subjects heard a sequence of two complex sounds and judged whether they were the same or different. For trials in which the sounds differed, there were three possible manipulations: the addition of a component, the deletion of a component, and the substitution of one component for another. Overall accuracy was 94 percent across the 144 dissimilar sound complexes. As density increased, a significantly greater number of errors occurred for all classes of manipulations. Changes in individual temporal patterns across a variety of manipulations of sounds involving adding, deleting and substituting components were accurately discriminated. Subjects were least accurate in detecting substitutions of a pattern. A single sound category was identified in error-prone sequences which was most often involved as the changing component from first to second sound presentation. Suggestions for the design of easily discriminated sounds are discussed.
This paper presents a preliminary evaluation of a computer data entry device called the Infogrip Chordic Keyboard. Learning time, typing speed and accuracy, and operator discomfort are examined. The results show that the 30 chords can be learned in two hours. The subjects could type 49 characters per minute with an accuracy of 98.3 percent after two hours of training.
For use with computers, the traditional QWERTY keyboard has been enlarged to more than 100 keys. This has generated postural and motoric challenges for the user, including cumulative trauma disorders. Among the proposed ergonomic solutions is the Ternary Chord Keyboard (TCK), which has only eight keys. Its evaluation posed use and research issues. TCK operation requires fast and finely controlled force and displacement by the fingertips in a horizontal plane, i.e., "rocking" of keys instead of their familiar "tapping". Associated mental tasks include memorization of the chords for each character. Experiments were performed (a) on TCK prototypes to measure the time needed to memorize and learn its operation, and to assess keying performance; and (b) on specially designed experimental apparatus to measure finger mobility, strength, and speed. The results indicate that finger mobility, strength and tapping performance were not well correlated with keying performance. All subjects were able to learn to operate the TCK, requiring memorization of 58 chords, within two to ten hours. After about 10 additional hours of use, they were inputting averages of 70 characters per minute, or more, with an accuracy of better than 97 percent. These results indicate that key operation such as with the TCK, which is rather different from traditional QWERTY keyboard use, is feasible.
The goal of this study was to determine an equation ("learning function") that describes long-term learning of a new keyboard. Five subjects learned 18 characters on a chord keyboard, then improved keying speed by inputting typical numeric keypad text for about 60 total hours. Their performance, in characters typed per minute, was recorded for every trial. Of the various functions that were considered to describe performance, the best fitting equation was a Log-Log relationship of the form $CPM_i = e^{b_0} T_i^{b_1}$, where $CPM_i$ is the performance in characters per minute on the i-th trial ($T_i$) and $b_0$ and $b_1$ are fitted coefficients. A second goal was to investigate how many trials of performance are needed before the entire learning function can be determined. The coefficients of the Log-Log function were determined using only the first 25, 50, 75, 100, 125, 150, 175, and 200 of the initial performance points (out of about 550 total actual data points). The mean squared error (MSE) was calculated for each of these fits and compared to the MSE of the fit using all points. From the results of MSE data, it appears that at least 50 performance data points are required to reduce the prediction error to an acceptable level.
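A Log-Log learning function of the form CPM_i = exp(b_0) * T_i^(b_1) becomes a straight line after taking logarithms: ln(CPM_i) = b_0 + b_1 ln(T_i). The sketch below fits that line by ordinary least squares on the first N trials and then computes the MSE over all trials, mirroring the comparison described in the abstract. The abstract does not state which fitting procedure the authors used, so least squares on log-transformed data is an assumption here, and the function names and the synthetic data are hypothetical.

```python
import math


def fit_log_log(trials, cpm):
    """Fit ln(CPM) = b0 + b1 * ln(T) by ordinary least squares (assumed method)."""
    xs = [math.log(t) for t in trials]
    ys = [math.log(c) for c in cpm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict_cpm(b0, b1, trial):
    """Evaluate CPM = exp(b0) * T**b1 for a given trial number."""
    return math.exp(b0) * trial ** b1


def mse(b0, b1, trials, cpm):
    """Mean squared error of the fitted function over the given trials."""
    errors = [(predict_cpm(b0, b1, t) - c) ** 2 for t, c in zip(trials, cpm)]
    return sum(errors) / len(errors)


# Synthetic, noise-free data (illustrative only, not the study's data).
trials = list(range(1, 201))
cpm = [math.exp(2.0) * t ** 0.3 for t in trials]

# Fit on the first 50 trials, then score the fit against all trials.
b0, b1 = fit_log_log(trials[:50], cpm[:50])
print(b0, b1, mse(b0, b1, trials, cpm))
```

With real, noisy data, repeating the fit for increasing values of N and comparing each MSE with the MSE of the all-points fit would reproduce the kind of analysis the abstract describes.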
The IBM Design Center in Boca Raton studied the operating-point key force for a portable computer keyboard. Alden, Daniels, and Kanarick (1972) report that typists prefer operating-point key forces of between 25 and 150 grams. We compared different key forces that fell within the range recommended by Alden et al. The only difference between the keyboards we studied was the amount of force required to activate the keys. The first keyboard (58 keyboard) required 58 grams of force to activate the keys. The second keyboard (74 keyboard) required 74 grams of force to activate the keys. Sixteen skilled typists used both keyboards to enter text. Input speed was significantly faster on the 58-gram keyboard. A significant number of typists preferred the 58-gram keyboard. The results suggest that the optimal key force for portable computer keyboards is less than 74 grams.
Performance of a rule-based handwriting recognition system is considered. Performance limits of such systems are defined by the robustness of the character templates and the ability of the system to segment characters. Published performance figures, however, are typically based on pre-segmented characters. Six experiments are reported (using a total of 128 subjects) that tested a state-of-the-art recognition system under more realistic conditions. Variables investigated include display format (grid, lined, and blank), surface texture, feedback (location and time delay), amount of training, practice, and effects of use over an extended period. Results indicated that novice users writing on a lined display (the most preferred format) averaged 57% recognition performance. When subjects were given continuous feedback of results and training, and after about 10 minutes of use, the system averaged 90.6% character recognition. Following three hours of interrupted use and with performance incentives, subjects achieved an average 96.8% accuracy with the system. Future work should focus on improving the ability of the recognition algorithm to segment characters and on developing non-obtrusive interaction techniques to train users, to provide feedback and to correct mis-recognized characters.
Two studies were performed to test the efficacy of using three different automated speech recognition devices in parallel to obtain speech recognition accuracies better than those produced by each of the individual systems alone. The first experiment compared the recognition accuracy of each of the three individual systems with the accuracy obtained by combining the data from all three systems using a simple "Majority Rules" algorithm. The second experiment made the same comparison, but used a more sophisticated algorithm developed using the performance data obtained from experiment 1. Results from the first experiment revealed a modest increase in speech recognition accuracy using all three systems in concert along with the Simple Majority Rules (SMR) algorithm. Results from the second experiment showed an even greater improvement in recognition accuracy using the three systems in concert and an Enhanced Majority Rules (EMR) algorithm. The implications of using intelligent software and multiple speech recognition devices to improve speech recognition accuracy are discussed.
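The idea of combining three recognizers with a "Majority Rules" vote can be illustrated as follows. This is a minimal sketch of plain majority voting, not the authors' SMR or EMR algorithm: the abstract does not say how disagreements among all three devices were resolved, so the fallback to one designated recognizer is an assumption, and the function name and parameter are hypothetical.

```python
from collections import Counter


def simple_majority(hypotheses, fallback_index=0):
    """Return the word that more than half of the recognizers agree on.

    If no word has a strict majority, fall back to the output of one
    designated recognizer (an assumed tie-breaking rule).
    """
    counts = Counter(hypotheses)
    word, votes = counts.most_common(1)[0]
    if votes > len(hypotheses) / 2:
        return word
    return hypotheses[fallback_index]


# Two of three recognizers agree, so their output wins.
print(simple_majority(["seven", "seven", "eleven"]))
# All three disagree, so the designated recognizer's output is used.
print(simple_majority(["seven", "eleven", "heaven"], fallback_index=2))
```

A more elaborate scheme could weight each recognizer by its measured accuracy per word, which is one plausible reading of how performance data from a first experiment might inform an enhanced rule, though the abstract gives no such detail.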
Touchscreens have been demonstrated as useful for many applications. Although a traditional mechanical keyboard is the device of choice when entering alphanumeric data, it may not be optimal when only limited data must be entered, or when the keyboard layout, character set, or size may be changed. A series of experiments has demonstrated the usability of touchscreen keyboards. The first study indicated that users who type 58 wpm on a traditional keyboard can type 25 wpm using a touchscreen and that the traditional monitor position is suboptimal for touchscreen use. A second study reported on typing rates for keyboards of various sizes (from 6.8 to 24.6 cm wide). Novices typed approximately 10 wpm on the smallest and 20 wpm on the largest of the keyboards. Users experienced with touchscreen keyboards typed 21 wpm on the smallest and 32 wpm on the largest. We then report on a recent study done with more representative users and more difficult tasks. Thirteen cashiers were recruited for this study and were required to complete ten trials in which they typed names and addresses with punctuation. Results indicate that the users improved rapidly from 9.5 wpm on the first trial to 13.8 wpm on the last trial, reaching their fastest performance after only 25 minutes. Although custom interfaces will be preferred for special types of data (e.g. telephone numbers, times, dates, colors) there will always be situations when limited quantities of text must be entered. In these situations a touchscreen keyboard can be used.
Tullis, Thomas S. and Kodimer, Marianne L. (1992): A Comparison of Direct-Manipulation, Selection, and Data-Entry Techniques for Reordering Fields in a Table. In: