The last two decades have witnessed a rapid growth in the introduction of automatic devices into aircraft cockpits, and elsewhere in human-machine systems. This was motivated in part by the assumption that when human functioning is replaced by machine functioning, human error is eliminated. Experience to date shows that this is far from true, and that automation does not replace humans, but changes their role in the system, as well as the types and severity of the errors they make. This altered role may lead to fewer, but more critical errors. Intervention strategies to prevent these errors, or ameliorate their consequences, include basic human factors engineering of the interface, enhanced warning and alerting systems, and more intelligent interfaces that understand the strategic intent of the crew and can detect and trap inconsistent or erroneous input before it affects the system.
The objective of this project was to perform a comprehensive analysis of underground mining injury accidents to determine the relative contribution of various causal factors, including human error. A paper presented at the Human Factors Society 31st Annual Meeting (Shaw and Sanders, 1985) described the methodology for assessing the relative contribution of various factors to accident causation and presented preliminary findings. A Bureau of Mines Technical Report (Sanders and Shaw, 1988) completely documents the research project and presents final findings and recommendations. The present paper focuses on the final data set to explore the multiplicity of contributing causal factors, and the underlying reasons for involvement of these factors, in underground mining accidents.
This paper outlines a three-year-long, fifteen person-year effort to achieve a safer and more productive plant through the integration of human factors into the design and assessment process for a new plant. The implementation of the human factors and human reliability programme is at the half-way point. This programme addresses issues such as adequacy of the VDU monitoring and control system, adequacy of panels in the Central Control Room and on local plant, and staffing arrangements, using techniques ranging from task analysis and audit techniques, to the running of experiments to determine acceptable VDU information density, and the application of techniques of human error analysis and workload assessment. The human reliability side of the programme has involved the development of a new human reliability management system, based partly on two experiments which comparatively evaluated error identification and error quantification techniques. The programme, as carried out so far, on the design and proposed operational structure, and future work proposals, are briefly outlined.
The Human Operator Simulator (HOS-IV) is a general purpose simulation tool. It can be used to simulate the dynamic interactions of the environment, the hardware/software system, as well as the operator. HOS-IV provides time and accuracy data for a core set of cognitive, perceptual, and psychomotor processes. The focus of this paper is the HOS-IV mechanism that is used to simulate global task management. A sample application that demonstrates HOS-IV task management is also presented.
A human factors analyst visited facilities involved in medical use of nuclear byproduct material. The purpose was preliminary identification of factors which can contribute to human errors leading to medical misadministrations. Results indicated a broad range of such factors. The analyst's observations are being used to guide further research into human factors and the medical use of byproduct material.
Three NO DIVING signs were placed at one middle and one high school in suburban Buffalo and one middle and one high school served as controls (no signs). A total of 864 students participated in the study. It was found that males were more likely than females to notice the signs, but that males tended to perceive less danger associated with shallow water diving than females. High school males were much more likely than females to dive into the shallow end of their school's pool, especially when the NO DIVING signs were present. In addition, students with a history of diving into the shallow end of their school's pool were much more likely to notice the NO DIVING signs than students who never dove into the shallow end of the pool. Moreover, compared to students who never dove into the shallow end of their school's pool, students with a history of diving into the shallow end of their school's pool tended to perceive less danger and were more likely to dive into the shallow end of the pool again. It appears that warning signs are less effective with high school students than with middle school students.
The Savannah River Site (SRS), located in South Carolina, is a key Department of Energy production and research facility for nuclear materials. Incident investigations performed at the Savannah River Site showed the cause of approximately 75% of all operating incidents in non-reactor facilities to be human error. The technical incident reporting system in place required the investigator to list the cause of an incident in broad terms (i.e., Personnel Error, Equipment Error) and to categorize it according to subclassifications (i.e., Operator Error, Supervisor Error, Mechanic Error). The reporting system, using three classifications, tended to emphasize "what happened" during an incident and "who was involved" instead of getting to the details of "why" an incident occurred. The high rate of human error as the cause of incidents indicated that further analysis was in order. Human factors personnel in the Facility Safety Evaluation Section (FSES -- an oversight organization with emphasis on non-reactor facilities) wanted to determine the causes of human error in a way that would identify more precisely why the errors occurred. To satisfy these needs, FSES is implementing a root cause analysis program for SRS. Root cause analysis consists of two parts: the first is Events and Causal Factor (E & CF) Charting, and the second is Root Cause Coding using a Root Cause Tree. The objectives were to provide a systematic method for identifying the root causes of a given incident in order to make detailed recommendations for preventing its recurrence, and to provide a database of incident root causes for identifying problem areas across incidents. Root cause analysis would guide the incident investigator to state "why" an incident occurred using detailed cause codes (e.g., Incomplete Training, Labels Less Than Adequate). Root cause trending would enable FSES to track the causes of human error, recommend solutions, and track corrective actions.
FSES developed a one-day workshop to train several hundred incident investigators at SRS to perform investigations using the root cause analysis method. This presentation will discuss the development and implementation of the root cause analysis system at SRS by FSES human factors professionals.
Use of safety devices concerns human factors and safety personnel both as a practical matter of reducing injuries and saving lives and as a basis for studying theories of human behavior. Many reasons are given for non-use of these devices. Seatbelt use provides a good model for examining generally what factors affect safety behaviors. Slovic, Fischhoff, and Lichtenstein (1978) suggested that failure to use seatbelts resulted from fear extinction, in that the effort required to fasten the belt was not reinforced and ultimately habit strength was reduced. A test of this hypothesis provided evidence for fear as a factor. Professed seatbelt use was an increasing function of distance driven. In addition, other hazards examined generally showed that the greater the experience with a hazard, the lower the perception of risk, supporting an extinction explanation.
Problems in incident survey and analysis in Japan are discussed and a new method of 'survey and analysis of presumed accidents' is proposed. The survey is conducted with an anonymous questionnaire method on workers in 3 Japanese Railways factories. Workers are asked to describe the presumed accident to which the highest priority of measures should be given, and the accidents are analyzed statistically. As a result, it is shown that an accident pattern can be selected to focus on in safety management. This method can be used for substantial preventive safety management based on features of workplaces.
The Navy ship constitutes one of the most complex weapon systems in the US defense arsenal. It is a multi-personnel system which conducts multi-operations in multi-warfare environments (AAW, ASW, ASUW, EW and strike), as an independent combatant, a member of a squadron, or an element of a battle force. The demands on the ship design from a human factors point of view are unique in the breadth of their scope and the depth of requirements. This paper describes the status of the Integrated Human Factors Program in the Naval Sea Systems Command including the Program objectives, accomplishments, research thrusts, and plans.
Enhanced HARDMAN constitutes the Navy's implementation of the DoD Directive 5000.53 "Manpower, Personnel, Training and Safety (MPTS) in the Defense System Acquisition Process." Enhanced HARDMAN integrates the domains of human engineering, manpower, personnel and training (MPT), and life support and safety through: 1) a front-end analysis applicable to all domains and to the integration of domain requirements; 2) a consolidated data base applying to all domains; 3) acquisition of lessons learned for all domains; and 4) application of Enhanced HARDMAN measures of effectiveness and T&E activities addressing all domains. The elements of Enhanced HARDMAN are: a standardized and formalized Enhanced HARDMAN process addressing MPTS activities and products at each phase of the weapon system acquisition process; a consolidated Enhanced HARDMAN data base; automated Enhanced HARDMAN analysis tools; Enhanced HARDMAN analyst productivity tools; and a report generator for producing Enhanced HARDMAN plans and reports.
Over the past three years, the U.S. Army Research Institute (ARI) and its support contractor, Horizons Technology, Inc. (HTI), have been involved in a wide-ranging MANPRINT (Manpower and Personnel Integration) effort for the Army's Forward Area Air Defense System (FAADS). Elements of this effort have included: program planning, acquisition management support, studies and analyses, test and evaluation support, and support to the source selection process. The paper presents the observations of several key members of the support team on the requirements for a successful operational MANPRINT program. Several additional perspectives regarding MANPRINT concept definition, research and development needs, and methodological considerations are also provided.
Preliminary work is described toward the development of moderator functions for systems analysis that reflect human behavioral limitations and tendencies. In particular, a model of human-machine interaction dynamics in complex systems is introduced to give a moderating influence on overall operator/decision aid performance. A key input to this model is the operator's current workload. To further ground the moderator function in human behavioral considerations, a multiple resource theory-based workload assessment technique is used to provide this input.
The Army developer community needs to be attuned to the need for addressing operator workload issues within the framework of the materiel acquisition process. Brief descriptions are given of: (a) the major acquisition approaches; (b) the Manpower and Personnel Integration (MANPRINT) program; and, (c) operator workload. The interrelationships between these three areas are considered. Workload is important because it affects the ability of the operator to perform required tasks; hence, system performance can be affected by workload. MANPRINT provides a framework for addressing operator workload issues and for formalizing workload analysis requirements.
Humans are complicated devices. Thus, systems in which people are embedded necessarily are complex. In order to better develop such systems, a means to organize and understand human complexity is required. Theoretical models of human information processing are one cognitive-engineering tool to help system development. This paper discusses the kinds of models that might be effective in solving practical problems. Suggestions are given for selecting a useful model from the plethora of available theoretical models. These issues are illustrated in the context of current research aimed at providing a general model of human cognition and action for application to the development, operation, and maintenance of nuclear power plants in Japan.
The use of protocol analysis and behavioral classification for evaluating advanced display concepts is described. Three experienced nuclear power plant operators solved problems in a full-scale simulator which employed an alarm suppression logic in the annunciator display. Verbal protocols and behavioral actions were collected as operators solved the problems, and were compared with protocols in which the alarm suppression logic was not used. The results indicated that more observations of discrete values and more control actions were made in the condition employing the alarm suppression scheme, suggesting more effective diagnosis and control with this type of man-machine interface.
Research time in state-of-the-art flight simulators is in heavy demand. The demand is even more acute when future aircraft and future cockpit systems are the targets of the investigation. Configuring the simulator properly and then training pilots to function in that configuration are substantial time drains. We believe that we have solved many of the problems associated with limiting the cockpit crew station designs to optimal configurations before committing simulator resources to further investigation. Further, we have also resolved a major training requirement in a way that we believe will minimize the need for in-simulator training time.
Safe and efficient use of modern technology often hinges upon the ability of persons operating these systems to perform effectively under a wide variety of conditions. This paper describes several tools developed to investigate the influence of psycho-social variables on cognitive performance under stressful conditions. These tools include indirect, non-obtrusive video recording equipment to capture real-time cognitive behavior, and several multi-dimensional and multi-method techniques to measure cognitive ability and psycho-social conditions. These techniques are used to bridge the gap between basic laboratory research and field observation.
The 1990's will present several challenges for human factors in the research, development and deployment of surface navy systems. For several ship classes, Arleigh Burke (DDG-51) Destroyers, AEGIS Cruisers and NATO Frigates, new workstations and upgrades are planned. These system upgrades present several windows of opportunity to the Navy for improvements in human engineering. This symposium addresses the scope and nature of several research challenges demanding results from human factors. These results must be prepared to address design questions as early as 1991-1992, to influence workstations to be fielded in the mid to late 1990's.
As we approach the 1990's the surface navy is facing critical procurement decisions for the design of consoles for shipboard combat information centers. Studies are being conducted to identify the impact of current designs on performance, and to construct and test prototypes for future designs. The goal is to develop guidelines for future consoles which are performance based and which will guide both near and long-term design strategy. A methodology is being applied which incorporates job description, procedural simplification, and display re-design. Although much progress has been made, and potential design improvements identified for a single user-type in an anti-air warfare capacity, the scope of this effort leaves many design questions unanswered for the numerous types of combat information center operations and personnel.
Very little human performance data has been collected for current navy shipboard consoles. This project identified a source of user behavioral data through the reduction and analysis of data captured with the Aegis shipboard data extraction programs. The data was used to construct a preliminary human performance database for a combat team. This database was used as input to a procedural network model, providing an analysis tool which allows task construction and performance prediction without the difficulty associated with acquiring ship or land-based combat system time. The database and models are currently being evaluated for their predictive validity.
Fourteen U.S. Navy personnel with Aegis Combat System, Naval Tactical Data System (NTDS), and Non-NTDS operational experience participated in an experiment designed to investigate the impact of proposed workstation designs on operator performance, system usability, and training. Human performance data were collected on a sample of operational procedures typically performed in a Combat Information Center (CIC) for a current Navy Combat System and a prototype workstation. The prototype was developed using specific human factors design principles with the goal of reducing training time, improving operator retention of skills for system operation, reducing errors in system operation, improving operator efficiency (e.g., speed and accuracy of performance), and improving users' satisfaction with the user-computer interface. This paper reports only the preliminary results for data collected from seven subjects who performed procedures using the prototype workstation.
The objective of this effort is to identify and resolve some of the human factors issues of tactical symbology in the context of high resolution color raster displays such as those for upcoming navy workstations. The focus of this presentation is the establishment of evaluation criteria and the use of empirical data in that evaluation process. A pilot test was devised and administered to provide data in determining the relative effectiveness of three alternative tactical symbol sets that had been developed by members of an international standards group. This paper discusses the methods and results of this pilot test. Based on those results, an experiment was developed to further test the competing symbol sets. Results of the experiment are presented in the context of naval tactical symbol sets.
The Human Factors Engineer (HFE) is sometimes excluded from the requirements analysis phase of a project when other engineers do not understand how the HFE can contribute to system definition. The Traceability and Engineering Analysis Methodology (TEAM) combines all engineering disciplines, including Human Factors Engineering, into an integrated methodology for systems analysis. TEAM provides a structured mechanism for inter-discipline communication during the early phases of a project. Human Factors Engineers have successfully used TEAM to contribute to requirements analysis early in a project life-cycle. This paper presents the TEAM Concept and identifies how the Human Factors Engineer uses TEAM.
This paper discusses a technique for predicting human workload which is based on task network modeling. Task network modeling allows task analyses to be simulated on a computer to study dynamic system behavior through the addition of information, primarily task time and sequencing. A technique was developed by McCracken and Aldrich (1984) and modified by Drews, Laughery, Kramme, and Archer (1985) which permits the inclusion of workload information in a task network model. From these workload models one can make predictions about where points of excessive operator overload are likely to occur. However, the technique has undergone only limited empirical validation. In addition to presenting the basic technique, this paper will briefly describe a software tool for using the technique as well as the perceived theoretical shortcomings of the technique in its current form.
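The idea of attaching workload information to a task network can be illustrated with a small sketch. It is not the authors' tool: the channel names, task list, load ratings, and overload threshold below are all hypothetical, and start times are fixed by hand rather than derived from network sequencing. The sketch sums per-channel load over the tasks active at each time step and reports where the sum exceeds the threshold.

```python
from dataclasses import dataclass

# Hypothetical channel names; the abstract does not list the channels used.
CHANNELS = ("visual", "auditory", "cognitive", "psychomotor")


@dataclass
class Task:
    name: str
    start: int      # start time in seconds (assumed already known)
    duration: int   # task time in seconds
    load: dict      # per-channel workload rating (illustrative values)


def workload_profile(tasks, horizon):
    """Sum per-channel load over all tasks active at each time step."""
    profile = []
    for t in range(horizon):
        totals = {c: 0.0 for c in CHANNELS}
        for task in tasks:
            if task.start <= t < task.start + task.duration:
                for c in CHANNELS:
                    totals[c] += task.load.get(c, 0.0)
        profile.append(totals)
    return profile


def overload_points(profile, threshold):
    """Return (time, channel) pairs where summed load exceeds the threshold."""
    return [(t, c) for t, totals in enumerate(profile)
            for c in CHANNELS if totals[c] > threshold]


# Illustrative tasks only; none of these values come from the paper.
tasks = [
    Task("monitor display", 0, 6, {"visual": 4.0, "cognitive": 2.0}),
    Task("radio call", 3, 4, {"auditory": 5.0, "cognitive": 4.0}),
    Task("adjust control", 4, 2,
         {"visual": 3.0, "psychomotor": 4.0, "cognitive": 2.0}),
]
profile = workload_profile(tasks, horizon=8)
overloads = overload_points(profile, threshold=7.0)
print(overloads)
```

With these made-up numbers, the only flagged points are on the cognitive channel during the two seconds when all three tasks overlap. A fuller model would generate the start times by simulating the network's sequencing rules.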
The role played by human factors engineering in the research and development of complex weapons systems is critical to satisfactory operational performance of the system. The role of the human factors engineering manager is no less critical; this individual is not only charged with designing, implementing, and reviewing the human factors engineering effort, but also providing leadership in resolving conflicts, building consensus, and championing unpopular positions. This paper chronicles the impact of the human factors engineering program in a moderately large R&D project. The details of the project are analyzed using an organizational model that focuses on the interactions among the organizational, technical, and political logics prevalent at the time. Using this approach, several insights regarding the role of the HFE manager are revealed, and some guidelines for managing human factors engineering research and development programs are offered.
This paper presents the systems approach to designing command centers. This is done by integrating all related disciplines into command center prototyping activities for computer-based system design. Central to this approach is an integrated laboratory environment. Our laboratory is a Command Center Laboratory (CCL). This laboratory is a powerful engineering resource that merges formal systems engineering methodology with rapid prototyping capabilities. System prototyping activities span the design and development process of complex system design. Included is a discussion of the laboratory facilities and roles necessary to accomplish this approach. Finally, we discuss how the laboratory aids all system engineering disciplines in implementing the systems approach to command center prototyping.
Since the accident at Three Mile Island, many efforts have been initiated to provide human factors guidance and standards in the power industry. These guidelines and criteria, along with military standards and specifications and general guidelines and criteria, now exist in a multitude of documents that have been developed over a twenty-year time frame. This paper describes efforts to consolidate this guidance into a single set of design criteria for the development of new reactor plants and to identify "gaps" in human factors standards relating to nuclear reactor design.
This report presents the findings of a Task-Operator study for the Primary Flight Control (Pri-Fly) major operating stations aboard Tarawa class (LHA) ships. The LHA carries a variety of attack and cargo helicopters, plus AV-8A Sea Harrier jet aircraft. Pri-Fly is the area of the ship which controls the landing and recovery of aircraft, as well as flight control when aircraft are in the immediate vicinity of the ship. Two main positions were examined by this study, the Air Administrator (Air-Boss) and the Assistant Air Administrator (Mini-Boss). The purposes of this study were to perform a task-operator study of Pri-Fly personnel task requirements, to identify human-equipment interface design problems given the existing configuration of Pri-Fly within LHAs, and to provide general design recommendations based on the findings of the study. Seven tasks were undertaken to meet the objectives of the project. Overall, the review identified numerous human engineering design problems in Pri-Fly, many of which severely limit the performance of Pri-Fly personnel. Based on this review, it is asserted that significant improvement can be realized, in terms of air operations safety and efficiency, by instituting a Pri-Fly improvement program.
Changing flight tactics and increased use of Night Vision Goggles (NVGs) have focused attention on the limited Field of View (FOV) of the Army UH-60A Black Hawk Helicopter. To improve the FOV in the next generation Black Hawk, the U.S. Army tasked an independent contractor to investigate the problem and propose alternatives. The study involved a comprehensive review of Army requirement documents, existing FOV studies, and accident data. Close attention was given to dynamic flight characteristics that affect FOV. Also, the study team collected technical data related to military rotary wing design, administered a survey to pilots, and interviewed users and other technical experts. The study revealed the current UH-60A design meets the requirements of MIL-STD-850B under static conditions. The only exception is the obstructed view that the door and windshield vertical structures create. However, under dynamic conditions the UH-60A cockpit design and normal flight characteristics substantially reduce the FOV in critical areas. The study produced eleven options that can improve and/or enhance the next generation Black Hawk's FOV if incorporated into the new design. Each option is presented and discussed.
Two operator workload (OWL) subjective rating scales were used to obtain judgments of workload during 48 hours of operation. The Task Load Index (TLX) and Overall Workload (OW) scales were administered to two crews during 48-hour operations. A 16-item symptoms rating scale was also administered to investigate motion sickness and other physical ailments. Results indicated that workload increases across time. Factor analysis on the symptoms found three significant factors: (1) Heat; (2) Eyestrain/Headache; and (3) Allergy/Dust. Regression analyses suggest that OWL scores can be described as a combination of hour into mission and job being performed. These findings are discussed in the context of a methodology for assessing.
This paper describes the development and validation of software to automate the authoring of training materials written in controlled English, such as Simplified English (SE). In SE, writers must adhere to many writing rules. While such materials are easy to read, they are very difficult to write. For example, use of short names for equipment must be consistent; also use of synonyms is not allowed. The software provides feedback regarding adherence to stipulated vocabulary and writing rules. Algorithms contained in the software include sentence parsing routines to verify that words are used according to their defined part of speech.
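The vocabulary feedback described above can be sketched in a few lines. This is a minimal illustration, not the software in the paper: the approved word list and the synonym map are hypothetical, and the part-of-speech parsing the abstract mentions is left out entirely.

```python
import re

# Hypothetical approved vocabulary and synonym map; a real controlled-English
# dictionary is far larger and also records each word's part of speech.
APPROVED = {"turn", "the", "valve", "to", "open", "position",
            "close", "remove", "cover"}
DISALLOWED_SYNONYMS = {"rotate": "turn", "shut": "close", "detach": "remove"}


def check_sentence(sentence):
    """Return a list of (word, message) findings for one sentence."""
    findings = []
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word in DISALLOWED_SYNONYMS:
            # A known synonym of an approved word: suggest the approved form.
            findings.append((word, f"use '{DISALLOWED_SYNONYMS[word]}' instead"))
        elif word not in APPROVED:
            findings.append((word, "not in approved vocabulary"))
    return findings


print(check_sentence("Rotate the valve to the open position."))
print(check_sentence("Shut the hatch."))
```

The first sentence yields one finding (the synonym "rotate"), and the second yields two (the synonym "shut" and the unlisted word "hatch"). Checking that a word is used in its defined part of speech would need a parser on top of this lookup.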
The crew requirements definition system (CRDS) is a computer-based methodology designed to minimize the time required to accomplish any set of tasks while using the fewest resources. It enables analysts and researchers to study in a timely and cost effective manner the effects of varying crew size, task start times (and hence task sequencing), and task allocation to crewmembers or equipment items during the performance of designated missions without the need to observe crews actually performing their duties. The CRDS is programmed in C-language and is designed to be used on an "XT" or faster class of personal computer. The basis of the system is several automated PERT, GANTT, and critical path method calculations. In addition, the system produces other automated calculations and summaries to aid the user. The user should have some knowledge of these operations research techniques to use the system effectively. Also needed is an understanding of the tasks to be performed, the personnel and equipment items available to perform the tasks, each task's duration, and any requirements for task sequencing. The U.S. Army Research Institute for the Behavioral and Social Sciences (ARI) developed the CRDS for the Force Design Directorate at the U.S. Army Combined Arms Combat Development Activity. However, the system is useful in any military or civilian situations in which there is a need to design and evaluate alternative small unit organizational structures. The system can be used whenever the user has some knowledge, or is willing to venture some guesstimates, of the tasks that need to be performed and the capabilities of various assets to perform those tasks.
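For readers unfamiliar with the critical path method mentioned above, the following sketch shows the standard forward and backward pass on a tiny task set. It is written in Python for brevity and is not CRDS code (the system itself is described as a C program); the task names, durations, and precedence relations are invented for illustration, and crew or equipment limits are ignored.

```python
def critical_path(durations, predecessors):
    """Forward and backward pass of the critical path method.

    durations: task -> duration. predecessors: task -> list of tasks that
    must finish first. Tasks must be listed so that every predecessor
    appears before its successors (dict insertion order is used as-is).
    """
    order = list(durations)

    # Forward pass: earliest finish of each task.
    earliest_finish = {}
    for task in order:
        start = max((earliest_finish[p] for p in predecessors.get(task, [])),
                    default=0)
        earliest_finish[task] = start + durations[task]
    total = max(earliest_finish.values())

    # Backward pass: latest finish that does not delay the overall end.
    latest_finish = {task: total for task in order}
    for task in reversed(order):
        latest_start = latest_finish[task] - durations[task]
        for p in predecessors.get(task, []):
            latest_finish[p] = min(latest_finish[p], latest_start)

    # Tasks with zero slack lie on a critical path.
    critical = [t for t in order if latest_finish[t] == earliest_finish[t]]
    return total, critical


# Hypothetical small-unit task set (durations in minutes).
durations = {"load": 4, "radio": 2, "aim": 3, "fire": 1}
predecessors = {"aim": ["load"], "fire": ["aim", "radio"]}
total, critical = critical_path(durations, predecessors)
print(total, critical)
```

In this example the radio task has slack, so shortening it would not shorten the mission, whereas any delay to the load, aim, or fire tasks would.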
The Operator Workload Knowledge-based Expert System Tool (OWLKNEST) is a tool that provides guidance in selecting the most appropriate technique(s) for estimating or predicting Operator Workload (OWL). This demonstration will provide hands-on usage for interested parties in utilizing OWLKNEST to determine the most appropriate OWL technique for their particular situation, interpreting the resulting outputs, and performing sensitivity analysis to assess the impact of changing responses.
This note is in connection with a live demonstration of ITS. ITS is aimed at providing fast prototyping of user interfaces in new computer applications (within a few hours of when a designer begins work); greatly reducing the workload in designing, implementing, testing computer applications; insuring excellent, consistent, well-tested interface styles. ITS is a new, comprehensive approach to application development (see in this proceedings Gould, Boies, Bennett, and Green for references). ITS provides software tools for user interface and application development, and a run-time environment for application execution. There are four key concepts. First, ITS separates the style of an application from the content of an application.... Second, ITS envisions four general work roles in application design and development: application (content) experts, application (content) programmers, style experts, and style programmers.... Third, our informal analysis of computer applications indicates that end users do four operations: make choices, fill in forms, manipulate lists, and read information blocks. All information that flows across the user interface can be thought of in terms of these four operations.... Fourth, ITS aims at creating software tools for each role.... If successful, ITS will: (a) Reduce the main source of errors in application development today, namely poor customer-programmer communication, by allowing content experts to become implementers (not just interviewees). (b) Reduce the risks and major resistance in carrying out interface design today. Separating user interface style and user interface content allows each to be tested independently without unforeseen, dangerous side-effects. (c) Speed up application development through code re-use and productivity enhancing tools. (d) Relieve severe skill shortages of outstanding programmers and not enough usability people. The best work will be leveraged. 
(e) Provide a framework for formulating human factors work and insuring that it has impact. In contrast to user interface guidelines, which are instantiated in a book, ITS results are instantiated in a computer-based toolkit.
A well-designed user interface is important for the success and acceptance of any software product. Some experts believe that user interface design can be improved through the application of specific rules translated from general design guidelines. Derivation of design rules from guidelines can be aided by computer tools. But storing guidelines in a computer may offer no advantage over printed text unless the computer also provides aids for selecting and applying design guidelines. DRUID development has been sponsored by The MITRE Corporation as a tool for improving user interface design. DRUID is based on the 944 design guidelines proposed in Smith and Mosier's 1986 Guidelines for Designing User Interface Software. But DRUID's capabilities extend beyond that original text and provide further aids for user interface design. Initial DRUID capabilities demonstrated in 1988 support the review of design guidelines as an "electronic book", enabling a user to navigate through structured hypertext to find specific guidelines, to find functionally related guidelines, and to browse through guidelines at will. DRUID also permits ready retrieval of related guideline material by cross referencing and via a topical index. Newly developed DRUID capabilities extend that electronic book and move toward a computer-based design tool. DRUID users can now specify relevant guidelines for a system design application and rate the relative importance of those selected guidelines. Proposed future DRUID capabilities will provide functions to rate design compliance with those selected guidelines, to aid the translation of guidelines into specific design rules, and to develop rule-based templates to support modular design of user interface software. DRUID is implemented on the Apple Macintosh II computer with HyperCard software. The user interface for DRUID is designed to accommodate both expert and novice users. 
A DRUID user can accomplish sequence control either by pointing (via mouse) or by keyed command entries. DRUID computer aids promise to help expedite and reduce the cost of the development of user interface software. Those aids should also help improve the quality and consistency of user interface software through rule-based design.
At the 28th Annual Meeting of the Human Factors Society, a paper entitled THE FORGIVING STAIR introduced the idea that deaths and injuries caused by stair falls can be greatly reduced. The author suggested that, using the model of the automobile interior, stairs can be designed without those elements that are potentially injurious in the event of a fall, and that the stair treads, particularly, can be designed to attenuate the forces of a fall to the degree that is necessary to reduce the severity of an impact. As stair falls are a common cause of death and injury in the home and in the workplace in all those countries that maintain accident records, a proposal to reduce morbidity from this cause may be of considerable importance. A grant to explore aspects of THE FORGIVING STAIR has led to three interwoven but discrete projects, and these three projects are the subject of the symposium. Cumulatively, the three projects contribute to a theory of injury reduction from falls, and this is the objective of the symposium. To reduce injuries, the force of the body impacting the stair must be attenuated. Therefore, the parts of the body that strike the stair, and the magnitude of the impact, must be understood.
A numerical simulation of a mathematical model of a human falling (plane motion) down a stairway is described. The simulation begins with an arbitrary initial state of the falling object, and numerically integrates equations of motion for the object as it falls and repeatedly strikes the stairway. Realistic simulation requires the use of nonlinear resistance to joint rotations. This nonlinearity is being investigated.
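The integrate-and-strike loop described above can be illustrated with a much simpler stand-in. The sketch below is not the authors' model: it replaces the jointed human body with a single point mass, ignores joint resistance entirely, and assumes hypothetical values for tread depth, riser height, restitution, and time step. It only shows the general pattern of starting from an arbitrary initial state, stepping the equations of motion, and recording each strike on the stair.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def stair_height(x, tread=0.28, riser=0.18):
    """Surface height (m) of a descending stair at horizontal position x.

    The top landing is at height 0; tread and riser sizes are assumed values.
    """
    steps_down = max(0, math.floor(x / tread))
    return -riser * steps_down


def simulate_fall(x, y, vx, vy, dt=0.001, t_end=3.0,
                  restitution=0.3, min_speed=0.1):
    """Integrate a point mass with semi-implicit Euler steps.

    Returns a list of (time, x, impact_speed) tuples, one per strike whose
    vertical speed exceeds min_speed. Horizontal velocity is left unchanged
    by impacts (a simplifying assumption).
    """
    impacts = []
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        vy -= G * dt          # update velocity first (semi-implicit Euler)
        x += vx * dt
        y += vy * dt
        surface = stair_height(x)
        if y <= surface and vy < 0.0:
            if -vy > min_speed:
                impacts.append((step * dt, x, -vy))
            y = surface                 # place the mass back on the tread
            vy = -restitution * vy      # partially elastic rebound
    return impacts


if __name__ == "__main__":
    # Hypothetical initial state: mass 1 m above the landing, moving forward.
    for t, x_hit, speed in simulate_fall(x=0.0, y=1.0, vx=1.5, vy=0.0):
        print(f"t={t:.3f} s  x={x_hit:.2f} m  impact speed={speed:.2f} m/s")
```

A planar multi-segment body with nonlinear joint torques would replace the single state vector here with one per segment, but the stepping and impact-detection structure would be similar.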
This research investigated the effect of providing three different simulations of ground terrain on the ability of subjects to accurately determine the aimpoint during a final approach. Several simulations were created to model a straight-in final approach (3 degree glideslope) to a standard FAA runway from several distances. The three levels of terrain realism ranged from a homogeneous surface to farmlands with hills. The subject's task was to estimate the aimpoint which represented an extrapolation of the flightpath to its point-of-contact with the ground as well as the altitude at nine different distances from threshold. The results indicated that increased levels of realism lead to better performance in judging altitude and predicting aimpoint during a simulated final approach.
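The geometry behind the altitude and aimpoint judgments can be worked through with a little trigonometry. The sketch below assumes a flat runway plane and measures distance from the aimpoint rather than the threshold; the function names and the sample distances are illustrative only and are not the nine distances used in the study.

```python
import math

GLIDESLOPE_DEG = 3.0   # glideslope angle stated in the abstract
FT_PER_NM = 6076.12    # feet per nautical mile


def height_on_glideslope(distance_nm, glideslope_deg=GLIDESLOPE_DEG):
    """Height (ft) above the aimpoint at a given ground distance (nm)."""
    return distance_nm * FT_PER_NM * math.tan(math.radians(glideslope_deg))


def aimpoint_distance_ft(height_ft, glideslope_deg=GLIDESLOPE_DEG):
    """Ground distance (ft) ahead at which the flight path meets the ground."""
    return height_ft / math.tan(math.radians(glideslope_deg))


if __name__ == "__main__":
    # Hypothetical sample distances, chosen only for illustration.
    for d in (0.5, 1.0, 2.0, 4.0):
        print(f"{d:.1f} nm out: {height_on_glideslope(d):.0f} ft")
```

On a 3 degree path the aircraft is roughly 318 ft above the aimpoint for every nautical mile of distance, which gives a reference against which a subject's altitude estimate can be scored.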
This study is part of a research program directed at reducing stair injuries by absorbing much of the impact of a fall. Little is known about human kinetics during falls. The paper describes a laboratory stair that induces subjects to fall, but terminates the tumble before the stair is struck. The trajectory of the falling subjects provides insights into the nature of stair falls, and makes it possible to predict the forces that would be generated at impact.
In this paper, research leading to the development of standardized test techniques for energy-absorbing material is discussed. The classification of energy-absorbing material is related to the prevention of impact injuries resulting from stairway falls. Standardized tests and their usefulness to a stairway designer are discussed.
This demonstration presents the utility of USI prototyping (of computer-based systems) as a human factors engineering design tool. We will present our USI prototyping tool, its composition, and a sample application. Throughout the demonstration we will illustrate how prototype USIs can be generated quickly and efficiently for user evaluation and immediate insertion into system design. Our tool for prototyping USIs is called a "USI Prototyping System (USIPS)." USIPS is divided into four components: Imagetool, Fonttool, Dynatool, and On-Line Help. Imagetool is used to build static images of text and graphics. Fonttool is used to design the fonts used in the images. Dynatool is used to link these static images into user- and event-driven USIs that interface to real and/or simulated databases. As a result, these USIs appear to work as they would in the target system. On-Line Help is used to provide unfamiliar users with information on how to operate USIPS. We will describe each of these components in the demonstration. USI prototyping is used to formally and informally study design options for the USI before coding takes place. As the system design is being developed, different ways of interacting with users and displaying information are studied at relatively little cost. These studies yield an effective USI design which can then be implemented. A large bonus of USI prototyping is that it enables early and congenial interaction with future users of the system. User working groups can be formed and included early in the USI design process. Since the user is actively involved in the USI design process, user acceptance problems are kept to a minimum. USI prototyping is also used to provide rapid answers to questions arising during the system development process.
A rapid growth of expert system development in various fields of study will likely occur in this decade. Two prerequisites are needed in order for this to happen: strong social need and technical feasibility. Given that both factors presently exist, a few areas where expert systems can help significantly include: (1) providing an interactively accessible source of updated and well-organized knowledge, and (2) assisting a user in decision making. The current research reviews areas of Artificial Intelligence that relate to the process of knowledge acquisition for expert systems. Until very recently, the primary technique for knowledge acquisition has been the time-consuming process of interviews. Typical techniques include structured and unstructured interviews, questionnaires, and verbal reporting which incorporates protocol analysis. The functions involved in one or more of the techniques encompass extraction of meaning, data inference, and rule induction coupled with retrospective comment analysis and behavioral observations. The purpose of the current research is to explore different avenues for data acquisition when dealing with multiple knowledge sources, with the objective of developing an automated technique for knowledge acquisition. The Delphi Technique is the primary technique investigated in this study, and the result is the Delphi Manager algorithm, which is based on the original version of the Delphi Exercise modified to benefit the expert system development process. Other uses of the algorithm include: (1) model verification and validation, (2) forecasting, and (3) opinion polls for policy decision making. Although there are additional uses, the Delphi Manager is primarily formulated for the expert system development process. The Delphi Manager was validated by using an existing knowledge base (KB) that was compiled by a paper and pencil version of the Delphi Technique. This existing KB was part of a dissertation by Randall F. Scott entitled "A Computer Programmer Productivity Prediction Model." The Delphi Manager has the potential to reduce significantly the time needed to collect and analyze new data. In addition, its user-friendly interface reduces the need for an advanced computer user either to build a questionnaire or to install a help facility. The program provides context-sensitive help, which is input by the developer through a series of templates. The Delphi Manager is also flexible enough to accommodate anyone from a novice to an advanced programmer. Improvements are suggested that are designed to provide additional program functionality and applications.
Design and development of software for the Airwing LSO version of the Automated Performance Assessment and Readiness Training System (APARTS) is described. APARTS is a carrier landing training aid designed to assist Landing Signal Officers (LSOs) in their shipboard recovery functions. Pilot and LSO landing performance data are recorded, analyzed and described by APARTS and presented to the LSO in a series of graphic displays which are used to evaluate carrier landing proficiency. Landing performance can be analyzed by pilot, LSO, squadron, aircraft or any combination, across time.
IDEA is an automated system, running on an Apple Macintosh under HyperCard, which provides the HFE/MANPRINT analyst a high-productivity means of applying HFE/MANPRINT early in the materiel acquisition process and throughout a system's life cycle. MANPRINT is an Army initiative directed toward assuring total system effectiveness by the full and complete integration of system personnel considerations and requirements in system acquisition.
The expansion of microcomputer technology and the expectation of a student and professional population literate in its use have stimulated interest in how and where this PC technology can be applied in training and education. In human factors, commercial and personally developed software designed for research, analysis, and problem solving is becoming available. The purpose of this demonstration is to present four projects showing the application of the microcomputer and computer-aided instruction (CAI) to graduate and undergraduate training programs. These demonstrations include the use of microcomputers to: 1) collect ergonomics data and simulate performance and tracking tasks; 2) present CAI for a workload lifting program; 3) provide an interactive videodisc and computer-based training program; and 4) utilize artificial intelligence software for problem-solving skills and applied research.
The new generation of inexpensive, powerful, handheld computers allows ergonomists to collect field data more easily and reliably. Typical programs are described for data collection by questionnaires, event timing, and occurrence sampling. These include SEARCH.BA, which tests human visual search capabilities and could be used to estimate visual lobe size; LINES.BA, which tests visual judgment capabilities; HICK.BA, which measures choice reaction time (RT); and FITTS.BA, which measures performance in the Fitts tapping task. Further programs have evaluated basic human capabilities using the keyboard and screen as control and display. None of these programs is complex, and all should be within the programming skills of most ergonomists. In addition, a general purpose tracking task simulator will be demonstrated. This was developed for teaching man-in-the-loop control, and includes options for input forcing function, system order, gain, lag, and course preview.
PC-based analytical design, assessment and simulation tools for human factors are coming into increased use. However, this software, designed for application and the specialist, is usually not well adapted for educational use by the inexperienced student. The purpose of this project was to focus on the instructional interface between available computer-based human factors programs and the student. With the aid of prototype software tools, a computer-aided instruction (CAI) program has been developed that is user-friendly and provides an interface compatible in format with the program software. The CAI program that will be demonstrated is an instructional interface to a workload assessment program. The CAI also provides definitions of workload assessment parameters and information on interpretation of analysis results. The workload analysis program was developed to enable users to efficiently analyze the stresses put on workers required to lift objects of various weights, origins, and destinations. Pilot tests have been conducted, and the preliminary results indicate that the CAI serves as a good refresher prior to executing the workload program. In addition to demonstrating this CAI program, the developers will be available to discuss the use of prototype software tools for facilitating CAI.
Little empirical evidence exists to assist interactive videodisc and computer based training developers in determining optimum user-interfaces. Mouse and keyboard response modes have different instructional, development and cost factor advantages and disadvantages. This paper overviews a presentation related to the author's dissertation experiment covering these issues.
An integrated software system was designed within the Department of Behavioral Sciences and Leadership for the purpose of applying artificial intelligence technology to the teaching of problem-solving skills in applied research. Tutorials on the subject of research design and analysis were developed by cadets and integrated into the software system. The software is capable of collecting and storing tutorial performance data to build student learning histories. Using data in the learning histories, the software is capable of tailoring instruction to the unique strengths and weaknesses students have in the academic domain. Voice recognition and digitized speech technology are used by the software to facilitate user control and input.
This demonstration program shows how human factors design and evaluation principles can be applied to the area of medical device and healthcare systems. The objective is to provide examples of evaluations and new designs for healthcare products which reduce human error and improve medical devices and instructional materials. International performance and design standards incorporating human factors principles are gaining more attention because of the efforts of the European medical device industry to standardize products.
A study of errors associated with the use of portable blood glucose meters, completed in 1989 by the Food and Drug Administration, showed many examples where human factors principles could have an impact on the accuracy of the blood glucose values. This demonstration will show how human factors principles can be applied to the evaluation of the instruction manuals, meter design features, and interface considerations. A 12-minute video highlighting these principles and showing footage of diabetics using these meters will be presented along with mock ups of potential design features and changes in instructions that could enhance the user's ability to obtain more accurate readings. This work may have broader generic application to user error problems for all medical devices and healthcare systems.
The workload of an anesthesiologist has been compared to that of an airline pilot; high stress periods during takeoff (putting the patient to sleep) and landing (waking the patient) that bracket a long period of relative inactivity and boredom. Both the tasks of piloting an aircraft and controlling the condition of an anesthetized patient are subject to human errors that can be life threatening. American Institutes for Research has worked with an international medical product manufacturer to help them design patient monitors, for use by anesthesiologists, that reduce the likelihood of human error and increase the usability of the overall product. Proposed modifications to the manufacturer's existing design process have included conducting more extensive user studies, including (1) focus groups, (2) experiments involving potential users interacting with rapid prototypes, and (3) formal usability tests of the final design to determine the quality of the human interface to the monitor. To underscore the benefit of these changes, a usability test was conducted on behalf of the manufacturer that required an anesthesiologist to try to use a commercially available patient monitor without having prior instruction or experience with the product. This test exposed both obvious and subtle opportunities for improving the usability of the product and reducing the chance of error due to interface design deficiencies. A videotape of the usability test will be followed by a short discussion of the steps followed to integrate human factors engineering processes into the manufacturer's design process.
Accidental breathing system disconnections have been cited as a significant problem in intensive care units. Inadequate attention to human factors design plays a significant part in the creation of these problems. As well, design restrictions caused by construction of new intensive care units in existing structures may compound problems. Examples of three potential causes for breathing circuit disconnection are shown with corrective measures for each problem.
A computer-based wheelchair simulation will be demonstrated. Such a wheelchair simulator could be used in rehabilitation both to prescribe controls for patients and to develop new types of wheelchairs. The selection of a wheelchair controller and control dynamics for a specific patient currently involves the actual use of a variety of power wheelchairs. A valid simulator could reduce costs by eliminating the need to test and evaluate a set of power wheelchairs for each patient, increase safety of the patient by eliminating the risk associated with learning to operate a new power wheelchair, and ease collection of performance data by providing automated data collection. Further, the simulation could be used to test control dynamics as related to the user's perspective view when developing new power wheelchair products. The simulation runs on a personal computer with low resolution color display. The realism of the display is augmented by the use of a Fresnel lens to increase the three-dimensional effect. The display is updated frequently to ensure accurate control feedback. The performance measures used to test the simulation include both time and accuracy to move through a computer simulated course and an identical physical course. The initial results from user testing are being used as the basis for an iterative redesign process before formal testing is initiated.
Designing products for the visually impaired presents a different set of problems than those encountered when designing for the general population. Several research techniques and an integrated design methodology were utilized to address specific user-based issues. Visually impaired diabetics were interviewed throughout the development process to determine their concerns, needs and expectations. Concurrently, the research and design staff were sensitized to the condition of retinopathy so that the experience of being visually impaired directed their design-decision making process. The result, as this demonstration will show, was the development of a superior blood glucose monitoring system for the targeted user group. Design decisions and evaluations based on the extensive research will be demonstrated through the use of models and 35 mm slides -- showing the beneficial results of this process.
This paper describes the development of a new generation of psychophysiological test battery to replace our first battery, the Neuropsychological Workload Test Battery (NWTB). The new battery, the Psychophysiological Assessment Test System (PATS), has a much improved user interface, expanded capabilities for use in simulator facilities, enhanced data reduction and management capabilities, and includes the ability to do statistical analysis.
Many problems encountered during operational test and evaluation of Air Force systems are described in service reports, an important element in the communication of test findings to the system acquisition community. Since many service reports cite deficiencies in the man-machine interface, service report data are an important source of human factors information. However, they are not readily integrated with other test data because of their qualitative character. This paper describes attempts to classify service reports from two complex command, control and communications type systems in order to make the data more usable. Classification schemes emphasizing (1) likely engineering solutions to the problems and (2) human activities affected by the problems were applied. Classification results were consistent with known characteristics of the system, and revealed interesting trends in the data. Some support was obtained for both the reliability and validity of the classification schemes as well. Implications of the results for service report analysis are discussed.
Concept of Operations Experiments (COOPEXs) are conducted at the Naval Underwater Systems Center to evaluate submarine combat system operability through structured walkthroughs of submarine missions in a full-scale replica of the combat system environment. Data were collected from one COOPEX for the purpose of piloting human factors engineering methodologies. Partial results based on a different COOPEX scenario are reported and compared. The data reduction and analysis procedures of digraph analysis, Q-analysis, multidimensional scaling and crew density were applied to assess combat system information flow and configuration effectiveness. Results revealed the potential for significantly enhancing submarine combat system performance when the procedures are applied to larger, more complex data sets. Plans for subsequent research are discussed.
To date, testing and evaluation of whole-body vibration in ground vehicle systems have not always fully utilized appropriate experimental design methodology, applicable statistical tests, or relevant criteria. A test design and evaluation methodology was developed to eliminate these oversights. This methodology uses inferential statistics, questionnaires, and a comparison of vibration data with representative mission scenarios. The methodology was employed in the evaluation of two alternative tracked ground vehicle designs. The independent variables were track type, terrain, vehicle speed, and crew position. The dependent variables were International Standards Organization (ISO) 2631 whole-body vibration exposure limit times at the lateral, transverse, and vertical axes. Two different multivariate analyses of variance (MANOVAs) performed on the exposure limit data indicated that all main effects, as well as several interactions, were significant (p < .01). A comparison of exposure limits to a representative mission scenario indicated that both track types would exceed ISO 2631 exposure, comfort, and fatigue limits during expected travel over cross-country terrain. Crew questionnaires also indicated crew discomfort when exposed to this type of terrain. The experiment demonstrated that the procedure was useful in helping to determine the extent that vehicle vibration permits the performance of the vehicle mission, within limits dictated by safety, efficiency, and comfort.
Six years ago the U.S. Army Natick Research, Development and Engineering Center created a program to obtain soldier feedback on its products -- those currently in the field as well as those under development. This feedback is obtained through field tests and systematic user surveys and is augmented by a number of other procedures. The program is effective: it has spurred design changes and has given the Army's product developers substantive databases to use in their decision making.
An in-flight experiment was performed to investigate the effects of time delay on manual flight control and flying qualities. The experiment was conducted using the USAF/FDL variable-stability NT-33A aircraft. Pure time delay was added equally to the pitch and roll flight control systems. Evaluation tasks were presented on a head-up display (HUD). Instrument meteorological conditions (IMC) were simulated, which limited the visual cues available to the pilot to the 20 degree foveal scene provided by the HUD. The in-flight time delay data were generated with full-fidelity, unlimited-range motion cues. Using the same cockpit and a digital aerodynamic simulation, the in-flight experiment was completely replicated as a fixed-base ground simulation. Thus, the effects of extreme conditions in motion cuing (i.e., full motion versus no motion) were examined for constant visual cuing.
A basic research study was conducted on a sample of twelve right-handed young males to measure pull actions of the upper limb on a gauge handle. The general purpose is to compile an atlas of forces for French males, useful for ergonomics studies. Different conditions were tested beforehand to select a standard protocol. The main difficulties concern the elimination of the lower limbs' contribution, the stability of the posture, the motivation of the subject, and the choice of parameters for the measurement. Intra-individual variability in
The present study investigated the use of a classic laboratory paradigm, paired-associate (PA) learning, to assess the ease of learning and transfer of command mnemonics. This paradigm was applied to the ease of learning a text editing command language, where the stimulus was a command (e.g., Delete Block) and the response was the keystroke sequence associated with that command (e.g., ^DB). Two types of command keystroke sequences were employed: meaningful (M) abbreviations, which were mnemonically related to command names (e.g., Delete Block - ^DB), and nonmeaningful (NM) abbreviations, which were not mnemonically related to command names (e.g., Delete Block - ^LK). There was evidence for differential transfer for the average number correct measure but not for the trials-to-criterion measure. For both first and second list learning, it took significantly fewer trials to criterion to learn the M than the NM keystroke sequences. The present results point toward the use of the PA paradigm to standardize the assessment of ease of learning of command languages in software usability testing. It may be concluded that the trials-to-criterion and average number correct measures are sufficiently sensitive metrics to differentiate the ease of learning of good and bad command mnemonics.
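The two dependent measures named above can be computed from a per-trial record of correct responses. The sketch below is an illustration only: the abstract does not state the criterion used, so one perfect pass through the list is assumed, and the sample scores are invented for the example rather than taken from the study.

```python
def trials_to_criterion(correct_by_trial, n_items, criterion=1.0):
    """Return the 1-based trial on which the proportion correct first
    reaches the criterion, or None if it is never reached.

    The default criterion (every item correct on one trial) is an assumption.
    """
    for trial, n_correct in enumerate(correct_by_trial, start=1):
        if n_correct / n_items >= criterion:
            return trial
    return None


def average_number_correct(correct_by_trial):
    """Mean number of correct responses per trial."""
    return sum(correct_by_trial) / len(correct_by_trial)


if __name__ == "__main__":
    n_items = 10
    # Hypothetical learning curves for one subject (not data from the study).
    meaningful = [4, 7, 9, 10]
    nonmeaningful = [2, 4, 5, 7, 8, 10]
    for label, scores in (("M", meaningful), ("NM", nonmeaningful)):
        print(label,
              trials_to_criterion(scores, n_items),
              average_number_correct(scores))
```

Because the two measures summarize the learning curve differently (one marks its endpoint, the other its average height), they need not agree, which is consistent with the differential transfer result appearing on only one of them.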
Military Standard 1472C is the prevailing standard for human engineering in military systems, possessing literally thousands of design criteria. The breadth and detail of these criteria often prove to be impediments to their effective application. This paper describes a prototype software tool, the Computer-Aided Checklist for Human Engineering (CACHE), designed to automate the generation, administration, and analysis of human engineering compliance checklists.
In a competitive development environment, a method is needed to quantify the usability characteristics of an interface. This quantification provides a basis for making human factors design recommendations. A methodology for comparing the usability characteristics of product interfaces with those of the competition is presented. The discussion details the steps of a competitive evaluation methodology: (1) definition of interface objectives, (2) development of a flow chart for each product interface, (3) determination of the categories of comparison based on salient and quantifiable characteristics of the interface, (4) derivation of the metrics used for comparison, and (5) the resulting comparative evaluation. The methodology was used to compare the panel (screen) format and navigation characteristics of two network controllers. The value of this methodology and its impact on the way human factors engineers contribute to product development are also discussed.
This report presents the lessons learned from a software usability test for an external customer. An initial evaluation with naive users revealed problems in the user interface and that the customer's objectives were not being met. After initial resistance to making changes in the software, the customer decided to delay release of its product to implement some of the recommendations and changed the focus of initial release to experienced users. The results of a second evaluation conducted on the revised product with experienced users were positive. Several lessons can be learned from the above evaluation: (1) Usability evaluation should be incorporated earlier in the software development cycle to minimize resistance to changes in a hardened user interface; (2) Organizations should have an independent usability evaluation of software products to avoid the temptation to overlook problems to release the product; (3) Multiple categories of dependent measures should be employed in usability testing because subjective measurement is not always consonant with user performance; and (4) Even though usability testing at the later stages of development may not impact software changes, it is useful to point out areas where training is needed to overcome deficiencies in the software.
This paper discusses methods with which one can simultaneously counterbalance immediate sequential effects and pairing of conditions and stimuli in a within-subjects design using pairs of Latin squares. Within-subjects (repeated measures) experiments are common in human factors research. The designer of such an experiment must develop a scheme to ensure that the conditions and stimuli are not confounded, or randomly order stimuli and conditions. While randomization ensures balance in the long run, it is possible that a specific random sequence may not be acceptable. An alternative to randomization is to use Latin squares. The usual Latin square design ensures that each condition appears an equal number of times in each column of the square. Latin squares have been described which have the effect of counterbalancing immediate sequential effects. The objective of this work was to extend these earlier efforts by developing procedures for designing pairs of Latin squares which ensure complete counter-balancing of immediate sequential effects for both conditions and stimuli, and also ensure that conditions and stimuli are paired in the squares an equal number of times.
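The earlier result mentioned above, a single Latin square that counterbalances immediate sequential effects, can be sketched with the well-known Williams construction for an even number of conditions. This is not the paired-square procedure the paper develops; it only illustrates the single-square building block, and the function names are hypothetical.

```python
from collections import Counter


def williams_square(n):
    """Build an n x n Latin square (n even) in which every condition
    immediately follows every other condition exactly once.

    Rows are presentation orders; columns are serial positions.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even n >= 2")
    # First row interleaves low and high labels: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row is the first row shifted by one more (mod n).
    return [[(c + r) % n for c in first] for r in range(n)]


def is_latin(square):
    """True if every row and every column contains each condition once."""
    n = len(square)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in square)
    cols_ok = all({row[j] for row in square} == full for j in range(n))
    return rows_ok and cols_ok


def immediate_pair_counts(square):
    """Count how often each ordered pair (a, b) occurs with b right after a."""
    return Counter((row[j], row[j + 1])
                   for row in square for j in range(len(row) - 1))


if __name__ == "__main__":
    for order in williams_square(4):
        print(order)
```

Checking `immediate_pair_counts` against the n(n-1) possible ordered pairs confirms the balance property. For an odd number of conditions a single square of this kind is not sufficient, and the usual remedy is to add the mirror-image square.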
The present experiments were designed to test predictions from a model of mental workload. The model predicts non-linear increases in mental workload as perceived distance from a task goal grows and effective time for action is reduced. Diminution of mental workload is achieved by application of effort which brings the task goal into the region of acceptable time/distance constraints for successful resolution. Two experiments are reported which tested these assertions using the timepools performance task. Timepools is unique as a performance task in that it generates a spatial representation of a shrinking temporal target. The independent effects of path length, i.e., the number of sequential targets to be acquired, and shrink rate, i.e., the collapse time during which the circle is halved in area, may be assessed using performance variables such as reaction time (RT), movement time (MT), error rate (E), and the subjective perception of workload. Data from Experiment 1 indicate systematic effects for task-related factors across performance and workload measures, although such a pattern was not isomorphically mapped to the a priori assumed difficulty of the task. In Experiment 2, shrink rate and path length had independent effects on RT and MT respectively, which were reflected in components of the individual workload scales. The ramifications with respect to the model are elaborated.
As a precursor to functional analysis, a content analysis was done to guide improvements for the interface design of a printed circuit board design system. Content analysis, as a design tool, requires users to judge the usefulness of task information and then prioritize it based on one or more specified attributes. For this application, content analysis was completed using 11 judges experienced in printed circuit board layout. All judges were asked to work within the context of a particular printed circuit board example. Two attributes, task relevance and order, were considered by judges as they sorted tasks. An inter-rater reliability check was performed and one judge was eliminated from further analysis. From the remaining pool of 10 judges, 32 tasks central to the activity of board layout were discovered. A model was built using cluster analysis and MDS algorithms which was based on relevant tasks, task order and task concurrence. The model was then compared to the current menu structure of a mature design layout interface and recommendations for interface modifications were made. Notes on using content analysis to do interface design and evaluation as well as recommendations for further use are discussed.
The increasing role of automation in human-machine systems requires modeling approaches which are flexible enough to systematically express a large range of automation levels and assist the exploration of a large range of automation issues. A General Model of Mixed-Initiative Human-Machine Systems is described, along with a corresponding automation taxonomy, which: provides a framework for representing human-machine systems over a wide range of complexity; forms the basis of a dynamic, pseudo-mathematical simulation of complex interrelationships between situational and cognitive factors operating in dynamic function allocation decisions; and can guide methodical investigations into the implications of decisions regarding system automation levels.
An experiment was conducted to examine the potential negative effects of automatic task components in situations requiring re-use or inhibition of those components. Participants trained on a category search task for 8,400 trials in three consistent (CM) and one varied mapping (VM) conditions. Following training, 2,352 trials were completed in seven transfer conditions. Results suggest that skill transfers to similar task situations. However, the data demonstrate that if the transfer situations are incompatible or prior learning must be inhibited, performance is disrupted. Although each condition improved after 336 transfer trials, performance did not reach pre-transfer levels in incompatible or inhibited conditions. The present data are useful for predicting transfer performance when skill components are trained to automaticity using a part-task methodology.
A study was performed to examine the role of consistency in the development and transfer of automatic processing. Subjects performed a rule-based memory search task in which they compared multidimensional probes to either one, two or three memory set rules. Results indicated that learning occurred in the absence of consistency at lower levels of task description (e.g. mapping of individual task components to responses) as long as higher level consistencies existed in the task (e.g. consistent mapping of task components to a conceptual framework). High positive transfer was obtained despite replacement of the exemplars of the memory set rules, suggesting that learning was not specific to the items encountered during training. On the other hand, the magnitude of positive transfer was reduced when the rules were replaced suggesting that most of the learning took place at the level of specific rules. Some evidence was also obtained for more general process-based learning.
It has been demonstrated that highly trained, automatic processes can transfer across certain memory search tasks; the degree to which these processes may be exhibited in visual search tasks has not been established, however. We examined this issue by testing the transfer of highly trained, automatized components of a semantic category, visual search task to stimulus situations of varying degrees of relatedness. We developed an adaptive version of the multiple-frame detection task (Schneider and Shiffrin, 1977) in order to test performance at the limits of visual search capacity. During training, frame-time was the dependent variable and was determined by each participant's performance ability. Each received 6,090 trials on exemplars from a single semantic category. Transfer consisted of two sessions, 330 trials per session. Transfer performance reveals that participants became highly proficient at the
This investigation addresses fundamental aspects of the reliability and stability of both basic cognitive functions and automatic processing components of skills. In the present experiment we investigated the pattern of component skill retention (or decay) exhibited after training on automatic and controlled processing task components. Participants were trained on a hybrid memory/visual, semantic-category search task and received low (720 trials), moderate (2,160 trials) and high (4,320 trials) amounts of consistently mapped (CM) training plus variable mapped (VM) training (720 trials). Retention was tested at five time intervals: one day, 30 days, 90 days, 180 days, and 365 days following training. Critical data for this investigation involve the pattern of performance decay across conditions and retention intervals. After an initial decline in the first 30 days following training, performance in CM conditions remained stable from Day 30 to Day 365. VM performance was erratic. The present research has practical and theoretical significance for elucidation of the mechanisms underlying long-term retention of automatic processes. Specification of these mechanisms is essential in the attempt to predict performance after a period of inactivity, to determine what constitutes appropriate refresher training, and to designate which skill components to emphasize during training.
Deep space missions such as Voyager rely upon a large team of expert analysts who monitor activity in the various engineering subsystems of the spacecraft and plan operations. Senior team members generally come from the spacecraft designers, and new analysts receive on-the-job training. Neither of these methods will suffice for the creation of a new team in the middle of a mission, which may be the situation during the Magellan mission. New approaches are recommended, including electronic documentation, explicit cognitive modelling, and coached practice with archived data.
Operational studies have revealed a need to focus attention on team training, and a need for effective teamwork skills for successful training performance. The present study was designed to develop an assessment scale that can be used by instructors of various training situations, which will yield a measure of the degree of teamwork required in their situations. Data obtained from the scale show that it has psychometrically sound properties (high internal consistency and high item-total correlations) and initial validity (the ability to distinguish various training situations as to the extent of teamwork that is required). Recommendations for future research are also discussed.
This study investigates the usefulness of existing performance measures for evaluating the outcome effectiveness of team tasks. It describes a method to identify the measures most appropriate for evaluating training on different types of tasks and under different performance conditions. Six prototype team tasks served as rating stimuli that were used to evaluate 15 objective and 23 subjective team performance measures. Raters (N=33) assessed the usefulness of these performance measures for evaluating performance on each team task under three different scenarios. These scenarios asked how useful the measure would be for: (1) evaluating the performance of teams that want to improve and develop skills; (2) evaluating the performance of teams that have learned the task and need to maintain performance; and (3) helping a consultant to appraise the performance of the team. Results indicated reliable panel ratings; factor analyses of each objective and subjective performance measure correlation matrix revealed five-factor solutions for each domain, and these solutions were consistent across tasks and scenarios. Performance rating means varied significantly by task type, but generally were consistent across scenarios. The ratings are sensitive to task type and can be used systematically to specify relevant dimensions of team evaluation.
A Measure of Effectiveness (MOE) is a criterion. A system that scores well according to the criterion will be accepted as effective, meaning that it achieves what is intended. This seems simple, but in fact well-meaning managers can fail to find an adequate measure, are unclear about what is intended for the system, and may even misidentify the system. The method described here is a means to help avoid common mistakes. With the method, a manager or analyst (a user) builds a model of his or her own effectiveness assessment strategy using specially designed interactive software. The user enters data prompted by the software and views feedback consisting of graphs and ordered lists describing the user's inputs in various ways. Feedback gives the users alternative viewpoints for their inputs. As a case study, an MOE was constructed for a Soviet artillery unit within an attacking regiment.
Embedded training (ET) is the delivery of instruction utilizing the resources (computational, display/control, etc.) of an actual system. The design and implementation of ET in military systems, especially those with tactical missions, presents unique challenges to the training developer. These challenges relate to issues that run the gamut from training effectiveness through system safety and readiness to user acceptance. Because ET is viewed as a cost-effective training option it is being given high priority in military system development. The objectives of this symposium are to: 1) present a discussion of issues in the authoring and implementation of ET; and 2) provide current examples of how ET is being developed and applied in the three military services.
This paper summarizes lessons learned from several projects related to embedded training (ET) and describes functional characteristics of an embedded training authoring system. Both desired and mandatory features of an ET authoring system are discussed for several applications. The relationship between embedded training and paperless technical manuals is also discussed as are engineering constraints imposed by the host system.
This paper describes a model for the implementation of embedded training (ET) which may find applications in a variety of military systems. In addition, several of the "lessons learned" during the development of this methodology are summarized. Finally, recommendations for model enhancement are discussed.
The ever-increasing capabilities of computers have resulted in a new generation of man-machine systems in which the machine acts in an intelligent manner to enhance the operator's decision-making capabilities in real-time multi-tasking (RTMT) situations. In such situations, the operator's information needs constantly change as he/she must attend to several events simultaneously and often switch from one decision-making task to another. The ability of the intelligent systems to aid humans in a flexible interactive fashion depends on the capability of the machine to predict these switches and the resulting changes in the human's information needs at a given time. These systems must therefore incorporate a model of the human operator's tasks based on information about the individual tasks and the dynamic relationships between the tasks and the occurrence of outside events. This paper focuses on the construction of such a model in the context of mission management problems of airborne Tactical Coordinators (TACCOs) in anti-submarine warfare (ASW). The model is built as a Cognitive Network of Tasks (COGNET) and is based on the integration of GOMS notation and the Blackboard architecture and is now being used to develop an adaptive intelligent interface for TACCOs.
Newly developed cost and training effectiveness models are being used by training developers to control costs and to ensure systematic training device design. The problem for the user is how to select the appropriate design aid. Unfortunately, there are no quick objective methods on which to base this selection. The selection decision for a particular application can be made based on three issues. The first issue is how the design aid addresses device instructional and fidelity features. The second issue is how the design aid formalizes the device design decision process. The third issue is how the design aids compare on ease of implementation. Two decision aids are analytically evaluated on their approach to training device design: OSBATS (Optimization of Simulation Based Training Systems), which is in prototype development, and ASTAR (Automated Simulator Test and Assessment Routine), which is ready to be fielded. These decision aids are based on differing theoretical approaches to formalizing training device design. OSBATS's taxonomy of fidelity features relates instructional features to individual tasks. OSBATS contains a tradeoff function which uses historical cost and benefit values for individual features. It uses large amounts of detailed information to drive its algorithms. ASTAR is a management tool which organizes the diverse interests of a design group to address design issues. ASTAR obtains judgments about instructional approach and device similarity for each training objective. ASTAR facilitates communication between members of a design team and ensures a consensus on the issues.
A major problem in training device design is specifying the appropriate level of fidelity and required instructional features for learning. This research effort was designed to acquire detailed information about tasks and training device fidelity features. The standard method for developing information about task and fidelity relationships has been to conduct research into training methods using varying degrees of fidelity, or to extrapolate from evaluations of training programs based on newly developed training devices. The rotary-wing operations domain was selected as the basis for gathering detailed relationship data. A Training Device Fidelity analysis was conducted on many of the devices at the Army Aviation School at Ft. Rucker. A survey was then developed that crossed the tasks being trained on the AH-64 CWEPT (Cockpit, Weapons, and Emergency Procedures Trainer) and the UH-1 CPT (Cockpit Procedures Trainer) with the device characteristics present on those training devices. The survey was administered to instructors using the training devices. The survey responses were categorized, and the consensus results are being used in developing expert system rules. The conclusion drawn is that adequate data can be collected using surveys to generate experience-based (versus opinion-based or device evaluation-based) rules for determining necessary and sufficient fidelity aspects for training devices. The method can be used in any training domain that requires training devices, where guidance is inferential and opinion-based, and where those devices are costly and/or need to be very effective.
Currently accessible technologies are providing entirely new display concepts for enhancing helicopter navigation. Yet the effectiveness of such displays depends on the extent to which they are configured according to principles from research on human performance. Computer generated map displays in the present study were configured according to previous research on maps, navigational problem solving, and spatial cognition in large scale environments. Interest centered on the representation of different spatial relationships that would best support helicopter navigational problem solving. One map display emphasized the global relationships between objects in the environment. The other map showed the pilot's relationship to objects as he travelled through the environment. Twenty skilled pilots used the maps to complete several navigational tasks that occurred within a realistic simulation program tailored for helicopter navigation. Findings indicate that the type of task and mode of flight (low level or Nap of the Earth (NOE)) are important determinants of map display effectiveness.
The objective of this investigation was to identify air combat mission tasks that could be trained using existing multiship simulator technology. Forty-two mission ready F-15 pilots and 16 tactical air controllers rated their need for additional training on 41 air combat tasks. These pilots and controllers then participated in four days of air combat training using McDonnell Aircraft Company's simulation facility. This training allowed the participants to practice two-ship tactics in an unrestricted combat environment which included multiple air and ground threats, electronic combat, and real-time kill removal. Following training, the participants rated the value of their current unit training and training provided by the multiship simulation. Pilots rated the multiship simulator training superior to their current unit training for 22 of the 41 air combat tasks. Pilots also rated their need for additional training in those 22 combat tasks from "very" to "extremely" desirable. The controllers indicated that all combat tasks were better trained in the multiplayer simulation than in their current unit training program. Interviews and questionnaires also identified a number of strengths and weaknesses of the simulation that provide "lessons learned" for the development and use of future multiplayer air combat simulations.
This paper draws upon both an extensive review of the literature, and a series of experiments manipulating motion-based (videotaped) versus static (35-mm slide) presentations of instructional material across a variety of instructional conditions. Performance measures in the experiments included both hands-on tasks and conceptual knowledge tests. Results indicated that electromechanical maintenance performance did not differ significantly between statically and dynamically trained groups across a variety of types and complexities of electromechanical maintenance tasks and instructional strategy conditions.
Six college-age male subjects performed one hundred, two-minute trials on a second-order tracking task. After each trial, subjects estimated perceived workload using both the NASA TLX and SWAT workload assessment procedures. Results confirmed an expected performance improvement on the tracking task which followed traditional learning curves within the performance of each individual. Perceived workload also decreased for both scales across trials. While performance variability significantly decreased across trials, workload variability remained constant. One month later, the same subjects returned to complete the second experiment in the sequence which was a retention replication of the first experiment. Results replicated those for the first experiment except that both performance error and workload were at reduced overall levels. Results in general affirm a parallel workload reduction with performance improvement, an observation consistent with a resource-based view of automaticity.
Three tank gunnery trainers were studied to determine learning transfer over repeated trials. Devices included the TOPGUN trainer, a part-task, reduced-fidelity tank gunnery trainer; the Videodisk Gunnery Trainer (VIGS), another part-task, limited-fidelity trainer; and the Conduct-of-Fire Trainer (COFT), a full-fidelity trainer. The objective was to determine the degree of gunnery skills transfer between the part-task gunnery trainers and the full-fidelity simulator. COFT criterion performances were examined for two pretraining groups (either TOPGUN first, then VIGS, or VIGS first, then TOPGUN) and a control group in order to determine which pretraining sequence leads to better performance. Each training group, composed of 20 subjects, received two multiple-mission engagement trials on four consecutive days (2 VIGS-2 TOPGUN, or vice versa) before COFT transfer. Results showed significant Group and Trial effects for transfer between TOPGUN and VIGS and significant transfer to COFT performance regardless of the prior sequence of training.
Two experiments demonstrated that people who receive specific instructions (SI subjects) for using a word processor are able to accomplish initial tasks more quickly than people who receive more general instructions (GI subjects). Experiment 1 found, however, that SI subjects were unable to do a novel transfer task unless they received hints while GI subjects had no trouble with the transfer task. A production rule analysis was used to guide a revision of the specific instructions so that those instructions promoted generalization. Experiment 2 used these revised specific instructions and found that SI subjects were now able to do a novel transfer task about as well as GI subjects. These results suggest that a production system is a useful tool for analyzing instructions and predicting user performance and that specific instructions designed to promote generalization may be the most effective type of instructions.
This study attempted to determine if training and familiarization with a face composite system would improve the quality of the produced composites. Subjects were trained in the use of the Mac-a-Mug Pro system over two sessions during which they constructed eleven composites (six from memory and five with the face in view). The results indicate that the composites produced while the target face was in view were significantly better than the composites produced from memory, and that both improved with practice. Initial training with the composite system prior to exposure to the first face or after the first face did not affect composite quality. These results have implications for the training of personnel at high risk of witnessing a crime.
The area represented by this title is far too broad to cover in a short article. Therefore, rather than trying to summarize the status of the field, I will provide pointers to three recent books in the area that very adequately convey the status of the field. Some major omissions of the current research will be covered under the topic of future directions.
Advanced technologies, including artificial intelligence (AI), hypertext, and natural language processing (NLP), are transforming the Mind/Machine Interface. This presentation focuses on two large development projects underway that use these technologies in unique ways. Their use is guided by the three natural means of communication between people: saying, coaching, and showing; as metaphors for using advanced technology interfaces. The two projects are aimed at developing job and training aids for the Army. The most complete example is the Maintenance Aid Computer for HAWK -- Intelligent Institutional Instructor (MACH-III). This is the largest and most successful implementation of an ITS to date (Psotka, Massey, and Mutter, 1988). MACH-III was developed by Bolt, Beranek, and Newman (BBN), to provide training in organizational maintenance of the main radar of the HAWK air defense guided missile system. Its core is a huge qualitative simulation of the radar. The complexity of the simulation and the troubleshooting problem space demand a unique hypertext interface, whose structure and function are only beginning to be understood. Some preliminary evaluation results from the U.S. Army Air Defense Artillery School (USAADASCH), Ft. Bliss, Texas are beginning to show its effectiveness. The other project, Building Robust Dual Grammar Exercisers (BRIDGE), will begin to explore the architextual structure of hypertext systems within the context of advanced technologies for military machine translation and military foreign language training. From this perspective, hypertext is a bridging technology that links the existing strengths of qualitative simulations with the future power of natural language processing.
Grace is an intelligent tutoring system (ITS) that will be used to teach programming in Cobol to about 300 NYNEX Service Company employees a year. It is the first ITS to be built by an industry laboratory for use within that industry. Grace has been a successful development project primarily because of the focus on usefulness and the use of iterative design. This paper describes Grace as a case study of finding a place for an ITS, ensuring that the users find it useful, and using prototype-evaluate cycles.
Voice input for control of camera functions was investigated in this study. Objectives were to (1) assess the feasibility of a voice-commanded camera control system, and (2) identify factors that differ between voice and manual control of camera functions. Subjects participated in a remote manipulation task that required extensive camera-aided viewing. Each subject was exposed to two conditions, voice and manual input, with a counterbalanced administration order. Voice input was found to be significantly slower than manual input for this task. However, in terms of remote manipulator performance errors and subject preference, there was no difference between modalities. Voice control of continuous camera functions is not recommended. It is believed that the use of voice input for discrete functions, such as multiplexing or camera switching, could aid performance. Hybrid mixes of voice and manual input may provide the best use of both modalities. This report contributes to a better understanding of the issues that affect the design of an efficient human/telerobot interface.
Grace, the NYNEX COBOL tutor, is being built in a corporate environment following the philosophy of iterative design and test. Grace and the student interact in a mixed-initiative dialogue. Grace's side of the dialogue is controlled by a simulation based upon the ACT* theory of cognitive skill acquisition (Anderson, 1983, 1987b). This simulation is theory-driven and largely, but not completely, embodied in a production system architecture. The student-tutor dialogue is mediated by an interface whose design is empirically driven and embodied in a multi-media system of windows, text, hypertext, mouse gestures, menus, node selections, typing-in, and so on. Construction of the simulation and the tutor interface are being tested and revised through a series of user trials. The trials are conducted at one of the sites at which the tutor will be used. Students participating in the trial are from the same population as our target audience.
This study presents a critical analysis of the state of current technologies, methods, and tools used in cognitive task-analysis. Methods for cognitive task-analysis, derived from methods used in cognitive science, are relatively new and have not been systematized. Current methodologies demand considerable time and expertise to conduct properly and often yield data which is difficult to readily translate into practical application. This paper examines these problems and proposes some directions for future research and training program development.
Humans have the ability to monitor and control their conscious cognitive processes. This ability, called metacognition, implies that people can learn to optimize their cognitive processes. Recent research in metacognition provides new ways of accelerating learning and skill transfer through an improvement in the decision-making, problem-solving, and attentional skills of trainees. This paper provides a review of recent research in metacognition and presents recommendations for assessing and facilitating metacognitive skill in trainees.
This is an introduction to cognitive task analysis as it may be used in Naval Air Systems Command (NAVAIR) training development. The focus of a cognitive task analysis is human knowledge, and its methods of analysis are those developed by cognitive psychologists. This paper explains the role that cognitive task analysis can play in the development of advanced training systems and presents the findings from a preliminary cognitive task analysis of airborne weapons operators. Cognitive task analysis is a collection of powerful techniques that are quantitative, computational, and rigorous. The techniques are currently not in wide use in the training community, so examples of this methodology are presented along with the results.
As more and more work makes use of computers, the need for simple usable methods for analyzing and documenting computer-based operator tasks is increasing. Computer-based tasks can be very complex and difficult to analyze, describe, and train. Traditional methods for describing tasks are often inadequate. The purpose of this paper is to present two cases where methods were borrowed from software requirements definition and design and applied to analysis and documentation of operator tasks for complex software-based systems. The situation associated with each case is described. The methodology borrowed and adapted is then described and comments are made concerning the effectiveness of the approach. Finally, some summary comments are made.
Everyone knows that the colors of smaller objects are less distinct than the colors of larger images. This fact is of practical importance in the design of visual display formats. Color is useful to speed visual search and to organize categories of information. Because display space is precious, symbols are made as small as possible. The display designer must make a tradeoff between symbol size and operator performance. The purpose of this paper is to provide a quantitative basis for the tradeoff. Methods to calculate the effect of symbol size are evaluated, design tricks are highlighted, and the reader is alerted to pitfalls.
A Signal Detection paradigm was utilized in a symbol recognition experiment designed to determine how far apart, in CIE/UCS color space, symbol and background chromaticities must be in order for observers to reliably recognize the symbol. Hits were found to increase significantly and false alarms to decrease significantly as a function of increased distance between symbol and background chromaticities. The d' measure of sensitivity was generally found to be 3.0 or greater for symbol/background chromaticity differences of 0.06 units in 1976 UCS color space. However, d' was considerably lower for symbol/background pairs for which increasing distance between symbol and background chromaticity was associated with the background chromaticity having an increasing blue component. The area of application of the research results is in the design specification of color coded symbology to be overlaid on moving-map situational awareness displays.
A symbol recognition experiment was conducted, with and without PLZT goggles, to determine how far apart in color space symbol and background colors must be in order for the symbols to be reliably recognized. Spectral transmittance data showed a reduction of approximately 78 percent in display luminances to the operator wearing PLZT goggles, which was almost uniform across the visual spectrum. All chromaticities, over the entire CRT display gamut, were found to shift markedly toward green when measured through the goggles. This shift was as much as 0.064 1976 UCS units (for fully saturated blue). No criterion shift (beta) was found between the goggle/no goggle conditions. The measure of sensitivity (d') was significantly reduced (from 3.788 without goggles to 2.910 while wearing the goggles). The probability of hits also decreased significantly (from 0.945 to 0.863) and the probability of false alarms increased significantly (from 0.044 to 0.109) between the no goggle and PLZT cases (all p < 0.05). The effects of the PLZT goggles on the symbol recognition task were lessened as the symbol-to-background chromaticity distance was increased. These results support the development of specialized color display symbol sets in workplaces where PLZT flashblindness protection is worn by the operator.
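For readers unfamiliar with the signal detection measures named in the two abstracts above, the following is a minimal sketch of how sensitivity (d') and the criterion (beta) are conventionally derived from a hit rate and a false-alarm rate. It assumes the textbook equal-variance Gaussian model, which the abstracts do not state was the one used; the function names are illustrative only. Note that feeding in the pooled hit and false-alarm probabilities quoted above need not reproduce the reported d' values, since those may have been computed per observer or per condition and then averaged.

```python
from math import exp
from statistics import NormalDist

# Standard normal distribution, used for the z-transform (inverse CDF).
_STD_NORMAL = NormalDist()


def z(p: float) -> float:
    """Return the standard normal quantile of a probability 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(p)


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity under the equal-variance Gaussian model: z(H) - z(F)."""
    return z(hit_rate) - z(fa_rate)


def beta(hit_rate: float, fa_rate: float) -> float:
    """Likelihood-ratio criterion: signal density over noise density
    at the observer's decision cutoff, i.e. exp((z(F)^2 - z(H)^2) / 2)."""
    zh, zf = z(hit_rate), z(fa_rate)
    return exp((zf ** 2 - zh ** 2) / 2)


if __name__ == "__main__":
    # Pooled probabilities quoted in the abstract; illustration only.
    conditions = [("no goggles", 0.945, 0.044), ("PLZT goggles", 0.863, 0.109)]
    for label, h, f in conditions:
        print(f"{label}: d' = {d_prime(h, f):.3f}, beta = {beta(h, f):.3f}")
```

A beta of 1.0 indicates an unbiased criterion; values above 1.0 indicate a conservative observer and values below 1.0 a liberal one.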
This experiment tested a detection theory model of visual signal detection and recognition. The task employed a visual display consisting of analog gauges arranged in a horizontal line. The signals to be detected and identified were three unique patterns of gauge values embedded in noise. After viewing the display the observers either reported that any of the signals had occurred (1-of-m signal detection) or specified which of the signals (if any) had occurred (1-of-m signal recognition-detection). The results indicated that performance on 1-of-m recognition and detection tasks can be predicted from performance on the component single-signal detection tasks.
Two experiments were conducted to determine the human's ability to acquire and memorize the spatial locations of stimulus targets using a helmet-mounted display. The experimental task was a two-phase search and replace task in which the size of the field-of-view (FOV) on the helmet-mounted display and the memory load (number of targets) were manipulated. In Experiment 1, all stimulus targets were removed after the search phase. In Experiment 2, only the three stimulus targets to be replaced were removed, leaving the subjects with some contextual information regarding the overall pattern of targets. Results of both experiments showed that: 1) search time increased significantly as the size of the FOV became smaller, and 2) subjects' ability to replace a stimulus target in its original location in space was adversely affected by increases in memory load. These results indicate that the size of the FOV affects one's ability to acquire spatial information of one's surroundings, but once this information has been mapped into spatial memory, humans can use that information independently of the size of their "window" to the world. However, subjects' spatial memory has some limitations, since the ability to remember precise locations becomes poorer as the amount of information to remember increases. The effects of additional context provided in Experiment 2 resulted in a slight increase in the precision with which subjects could remember specific target locations. The results of these studies have implications in two areas: human spatial cognition, and the design of helmet-mounted displays.
The effects of pressure gloves on human hand capabilities are a major concern in the performance of extravehicular activity (EVA) for space maintenance and construction missions. The effects of EVA gloves on six hand performance domains were investigated in this NASA-sponsored research: range of motion, strength, tactile perception, dexterity, fatigue, and comfort. All tests were designed to be performed in a glove box using the bare hand as well as the glove at 0 and 4.3 pressure differentials. Ten subjects participated in the test in a repeated measures design. The results of the experiments are summarized in this paper.
Two experiments examined the abilities of 10 subjects to visualize directions shown on a perspective display. Subjects indicated their perceived directions by adjusting a head-mounted cursor to correspond to the direction depicted on the display. This task is required of telerobotic operators who use map-like pictures of their workspace to determine the direction of objects seen by direct view. Results show significant open-loop judgment biases that may be composed of errors arising from misinterpretation of the map geometry and overestimation of gaze direction.
Subjects were given a "god's eye" view of an air battle involving seven aircraft: two were friendly, either one or three were hostile, and the rest were neutral. In one condition (Consistent FFN), the identity of each aircraft as friend, foe, or neutral was consistent throughout a trial. In another condition (Variable FFN), the identity of each aircraft changed randomly within a trial. In general, subjects' spatial awareness was best for enemy aircraft and worst for neutral aircraft. Increasing the number of enemy aircraft from one to three degraded spatial awareness for enemy aircraft in both FFN conditions. FFN awareness was also affected. These results are interpreted in terms of a limited capacity model of attention and subjects' attentional priorities.
Head-up display (HUD) research has centered on modifications to the basic aircraft control symbology -- the pitch-ladder lines. Although some of these modifications have led to minor improvements in attitude recognition, major problems still exist: pilots continue to experience spatial disorientation and to complain of occlusion due to the HUD symbols. This experiment compared four variations of a basic HUD pitch ladder: Display A, double articulation; Display B, single negative articulation; Display C, single negative articulation with gradually increasing thickness; and Display D, single negative articulation with gradually increasing thickness in a global arrangement. Accuracy of bank recognition was best when pitch-ladder symbology incorporated noticeable asymmetry. Double articulation and graduated thickness were associated with greater accuracy of pitch recognition. Studies under dynamic conditions are recommended.
One objective of the project was to compare two analytic algorithms for converting judgment matrices into subjective workload ratings. The original eigenvector algorithm used in Saaty's Analytic Hierarchy Process (AHP) was compared with an algorithm based on calculating geometric means. Also, three methods of identifying excessively inconsistent matrices were compared. Data from nine previous experiments were re-examined in the present analysis. There were no differences between the AHP ratings and the geometric mean ratings in terms of their sensitivity to the experimental manipulations. However, two of the inconsistency measures were successfully used to cull the data sets of inconsistent matrices and improved the statistical sensitivity of one set of ratings. These findings suggest that: (1) the computationally simpler geometric means algorithm can be used as an alternative to the eigenvector algorithm, and (2) culling inconsistent matrices can sometimes improve rating sensitivity. These findings, along with previous research, demonstrate that judgment matrices can be a very valuable workload assessment tool. The essential steps for the proper use of judgment matrices in workload assessment are reviewed. A user's guide and software are also being prepared to aid researchers and practitioners.
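The two algorithms compared in this abstract can be sketched as follows. The example uses a hypothetical 3 x 3 reciprocal judgment matrix and computes workload weights both by row geometric means and by the principal eigenvector, the latter found by power iteration. Saaty's consistency index, CI = (lambda_max - n) / (n - 1), is included as one familiar inconsistency measure. The abstract does not say which three measures were compared, so its use here is an assumption, as are the function names and the matrices.

```python
import math

def geometric_mean_weights(matrix):
    """Weights from normalized row geometric means."""
    means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(means)
    return [m / total for m in means]

def eigenvector_weights(matrix, iterations=100):
    """Principal eigenvector (normalized to sum 1) and lambda_max by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    # Estimate lambda_max as the mean ratio (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max

def consistency_index(lambda_max, n):
    """Saaty's consistency index; zero for a perfectly consistent matrix."""
    return (lambda_max - n) / (n - 1)

# Hypothetical judgments: condition 1 rated 2x condition 2 and 6x condition 3.
consistent = [[1.0, 2.0, 6.0],
              [1 / 2, 1.0, 3.0],
              [1 / 6, 1 / 3, 1.0]]
# Same matrix with one judgment altered so it is no longer consistent.
inconsistent = [[1.0, 2.0, 6.0],
                [1 / 2, 1.0, 1.0],
                [1 / 6, 1.0, 1.0]]

gm = geometric_mean_weights(consistent)
ev, lam = eigenvector_weights(consistent)
_, lam_bad = eigenvector_weights(inconsistent)
print("geometric mean weights:", [round(x, 3) for x in gm])
print("eigenvector weights:   ", [round(x, 3) for x in ev])
print("CI (consistent):  ", round(consistency_index(lam, 3), 4))
print("CI (inconsistent):", round(consistency_index(lam_bad, 3), 4))
```

For a perfectly consistent matrix the two algorithms return the same weights, so any differences between them arise only from inconsistency in the judgments.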
Thirty tank crews were tested in the Ft. Knox COFT tank simulator. The COFT simulator is a gunnery training facility. The crew's task was to shoot specified enemy targets. Each crew consisted of a tank commander and a gunner. The commander told the gunner, via an intercom system, which enemy object was the next target. Performance and subjective workload were measured as a function of the speech intelligibility transmitted by the intercom system. Five levels of intelligibility were tested. The measures of operational effectiveness were the number of targets correctly fired upon and the gunner's latency. Subjective workload was measured using the Subjective Workload Assessment Technique (SWAT). Gunner performance and subjective workload covaried across intelligibility levels. Performance was not significantly
The purpose of this study was to evaluate the effects of sentry duty time on the soldier's speed of detection of visually presented targets, his ability to hit targets (rifle marksmanship), and his mood. Prior to the test day, each of eight subjects was Simulator and was familiarized with the targets to be presented during testing. The test session lasted three hours, during which time the subject assumed a standing foxhole position and monitored the target scene of the Weaponeer. The Weaponeer M16A1 modified rifle lay next to the subject at chest height. When a pop-up target appeared, the subject pressed a telegraph key, lifted the rifle, aimed, and fired at the target. Speed of target detection was measured in terms of the time required by the subject to press the telegraph key in response to the presentation of the target. Marksmanship was measured in terms of number of targets hit. Target detection time and rifle marksmanship were averaged every 30 minutes. At the end of the test session, the subject completed the Profile of Mood States rating scale. The results showed that target detection time deteriorated with time on sentry duty; impairments were not evident within the first hour but were clearly evident by 1.5 hours. Marksmanship remained constant over time; soldiers were just as accurate in hitting the targets at the end of the 3 hours of sentry duty as they were at the beginning. Whereas the soldier's predominant mood during baseline practice sessions was one of vigor, during sentry duty the predominant mood was one of fatigue. The results of this study suggest that sentry duty performance may be optimized if it is limited to one hour or less.
The effects of extra-task demands and long hours of work on the performance of simultaneous (comparative judgment) and successive (absolute judgment) type vigilance tasks were assessed in a simulated work environment. For three consecutive 12 hour days, subjects engaged in four 1-hour vigilance sessions interspersed with work at a heavy-load (20 codes/min) or a light-load (10 codes/min) data entry task. For both types of vigilance tasks, performance efficiency varied inversely with the auxiliary workload confronting the subjects. In addition, the quality of vigilance performance improved over the work week in the context of the light auxiliary workload and declined in the context of the heavy load. Subjects reported becoming more drowsy, strained and fatigued and experienced more somatic complaints over the work day and the work week. These mood effects were maximal with the successive task and a heavy auxiliary workload, suggesting that in order to maintain performance standards in the successive task, subjects expended more processing resources which led to a greater cost in fatigue and strain.
With the increased complexity of aircraft systems and their environment, 3-D stereoscopic system/control displays will offer great advantage over conventional two-dimensional (2-D) displays by presenting information more consistent with the pilot's 3-D perceptual experience and stereotypes. For such displays the interaction of chromostereopsis (perceived depth created by hues) and stereopsis (depth effect created by disparity between the left and right visual fields of the observer) is important. The purpose of this research was to evaluate the interaction of chromostereopsis and artificially stimulated stereopsis on a stereoscopic CRT by determining how accurately subjects can interpret the relative depth differences of adjacent symbols containing one of a combination of six levels of hue and seven stereoscopic disparities. This research demonstrated that hue, disparity, and the interaction of hue and disparity significantly influenced the perception of depth on a stereoscopic monitor, and that designers of stereoscopic 3-D display formats should exercise caution when choosing hues to represent images located in close proximity on a stereoscopic display.
This study investigated the temporal sensitivity of crossed and uncrossed stereoscopic mechanisms of 48 observers using stimuli created from dynamic random-dot stereograms. The results showed thresholds were lower and depth was more veridical in the crossed than in the uncrossed direction.
The objective of this preliminary investigation was to quantify the effects of spatially displaced visual feedback on the operation of a camera-viewed remote manipulation task. Operators performed a remote manipulation task while exposed to the following different viewing conditions: direct view of the work site (baseline condition); normal camera view (zero-degree displacement); reversed camera view (180-degree lateral displacement); inverted/reversed camera view; and inverted camera view. The task completion times were analyzed with a repeated measures analysis of variance, which showed a significant main effect of viewing condition (p < 0.05). A Newman-Keuls pairwise comparison test then revealed that performance times for the inverted camera view condition were significantly (p < 0.05) worse than those for all of the other viewing conditions. The results obtained in this study were not quite as would be expected from the direct manipulation/displaced visual feedback literature. The difference observed in this evaluation was that the reversed camera view was ranked third out of the four camera viewing conditions, while previously conducted studies have stated that the inversion/reversal was ranked third. The reversed viewing condition not only took over a minute longer, on average, to complete than the inversion/reversal condition, but it was also significantly worse than the normal viewing condition. The differences obtained in this evaluation could be due to the fact that the remote manipulation task used in the present study involved axes of movement different from those involved in the direct manipulation tasks reported in the literature.
An informal analysis was conducted on the direct and normal viewing condition data and it was determined that the normal viewing condition was significantly slower than the direct viewing condition. This study clearly illustrates the deleterious effects that can accompany the performance of remote manipulation tasks when viewing conditions are less than optimal. An important finding in this evaluation is concerned with the extent to which results from previously performed direct manipulation studies can be generalized to remote manipulation studies. This evaluation has demonstrated that generalizations to remote manipulation applications based upon the results of direct manipulation studies are quite useful, but they should be made cautiously.
The ability of human subjects to mobilize attention and cope with task requirements under dichoptic and binocular viewing was investigated in an experiment employing a target search task. Subjects were required to search for a target at either the global level, the local level, or at both levels of a compound stimulus. The tasks were performed in a focused attention condition in which subjects had to attend to the stimuli presented to one eye/field (under dichoptic and binocular viewings, respectively) and to ignore the stimulus presented to the irrelevant eye/field, and in a divided attention condition in which subjects had to attend to the stimuli presented to both eyes/fields. Subjects' performance was affected mainly by attention conditions which interacted with task requirements, rather than by viewing situations. An interesting effect of viewing was found for the local-directed search task in which the cost of dividing attention was higher under binocular than under dichoptic viewing.
On-orbit servicing of payloads is simplified when a spacecraft has been designed for serviceability. A key design criterion for a serviceable spacecraft is standardization of electrical connectors. The following research investigated the effects of extravehicular mobility unit (EMU) glove size, connector size, and connector type on usability of electrical connectors. An experiment was conducted exploring participants' ability to mate and demate connectors in an evacuated glovebox. Independent variables were two EMU glove-sizes, five connector size groups, and seven connector types. Significant differences in performance times and heart rate changes during mate and demate operations were found between EMU glove sizes, among connector types, and connector sizes. Subjective assessments of connectors were collected from participants with a usability questionnaire. The data were used to derive design recommendations for a National Aeronautics and Space Administration (NASA) recommended EMU-compatible electrical connector.
Five subjects participated in an experiment designed to test if people could selectively attend to either edge rate (frequency of passing texture units) or flow rate (optical velocity of texture units) as the optical basis for controlling their own forward speed. Subjects continued to use edge rate as the basis for controlling forward speed, even when instructed to use flow rate and given feedback about their success in using it. The results are interpreted as evidence of inflexibility in selectively attending to information for self-speed.
Two experiments were performed to investigate the perception of peripherally presented apparent motion as a function of eccentricity of the stimulus, ambient illumination, gender, athletic ability, age, stimuli pattern (diamond, square), and angular extent of stimuli presentation. The experiment task for both studies was to determine the direction of apparent motion for a lighter than background stimulus target presented on a Braumbach perimeter. The results from experiment one indicated main effects for subjects, eccentricity, and age. The results from experiment two indicated main effects for subjects, eccentricity, and angular separation of the apparent motion.
The main objective of this research was to investigate the effects of foveal load on sensitivity in the peripheral visual field. The first experiment was presented at a previous meeting of the Human Factors Society. There, foveal load was manipulated by comparing the fixation of a cross with a simple first-order compensatory tracking task display. Peripheral sensitivity was determined simultaneously for light flashes presented at different eccentricities along the horizontal meridian. In general, the results showed no losses in peripheral sensitivity or a "tunnel vision" effect under the experimental conditions employed. Three more experiments have been carried out since that presentation. More complex tracking tasks have been employed in order to vary foveal load, and the difficulty of the perimetry task has also been manipulated in one experiment by including lights on the vertical meridian. Whether a loss or a gain in peripheral sensitivity occurs depends upon the complexity of the foveal task and, to some extent, the perimetry task. Results are discussed in terms of arousal and resource theory.
The present study investigates the applicability of an information integration hypothesis developed by Wickens and Boles (1983), to display format and response configuration. Twenty paid subjects performed either a dual-task or an integration task. The tasks were similar in all respects with the exception of information integration requirements. Proximity was manipulated via display format and response configuration. Results of the display format manipulation supported a multiple resources interpretation while the effects of response configuration were consistent with the integration hypothesis. These results point to a possible limitation in applying the integration hypothesis to resource demands of displays, but suggest that the hypothesis may apply to response configuration.
The purpose of this research was to evaluate the effect of system state uncertainty and data reliability on a diagnostic decision task when system data was presented in three different display formats (Digital, Bargraph and Configural). Properties of an actual process control system were simulated in the experiment by varying both system state uncertainty and data reliability. Classification strategy emerged as a major determinant of classification performance across display conditions.
This paper outlines results, both behavioral and methodological, of a pilot study whose objective was to develop a method for learning why experienced technicians' diagnoses of a supposedly self-diagnostic avionics system appeared
A workshop entitled "Visually Guided Control of Movement" was held at NASA Ames Research Center on June 26 - July 14, 1989. The workshop brought together individuals with diverse backgrounds related to the areas of the visual perception and control of motion. During the workshop, participants designed and conducted experiments using NASA Ames flight simulation research facilities. These studies contrasted participants' alternative theoretical approaches to the visual control of self motion. Panel members, drawn from the workshop's participants, will discuss their approaches to the study of the control of self motion and will present interpretations of the outcomes of the workshop.
Projected manpower declines coupled with increases in personnel costs and battlefield sophistication have prompted an increased reliance on high technology equipment in new Army systems. This advanced technology often features highly automated functions and promises substantially increased human and system productivity. However, potential enhancements to system performance may not be realized because the new technology frequently increases human perceptual, cognitive and psychomotor requirements to the point where the system operator may be said to be overloaded. Such a condition not only endangers the mission, but also threatens the safety of the soldier. As a result of these concerns, the Army Research Institute (ARI) has initiated a long-term research program aimed at controlling excessive operator/crew workload in emerging Army systems. The objective of a recently completed three-year work unit of the workload research program was to validate operator workload measures on three Army systems and use the results to develop guidance for controlling operator workload in new Army systems. This research work unit -- the Operator Workload (OWL) Program -- has developed a number of products which contribute to the Army's initiative for Manpower and Personnel Integration (MANPRINT) during the acquisition and continuing development of materiel systems. The objective of this symposium is to present an overview of the approach and accomplishments of the OWL program, highlight two examples of experimental and analytical work which have been completed, describe an expert system developed to provide practical guidance on how best to assess workload levels for a given set of circumstances, and identify several areas for future research. With guidance provided by the discussant and input from members of the audience, the desired impact of this symposium will be a heightened awareness of the importance to the Army MANPRINT initiative of this and other continuing research programs.
The long term objectives of these research efforts should be to develop reliable and valid methods which: (1) forecast the impact of operator workload on the design and performance of new Army systems, (2) effectively allocate workload-imposing tasks among soldier, hardware, and software components of systems and assess the influence of workload factors on the organizational design of Army units, and (3) establish procedures for the selection, classification, and training of soldiers to effectively cope with operator workload in operational situations.
The Operator Workload (OWL) Program is a just-completed, three-year, basic and applied research effort sponsored by the Army Research Institute (ARI). As part of the Army's research thrust into workload, the OWL Program was directed to establish guidance for the assessment of OWL associated with the operation of Army systems. Its intent was to identify and integrate the most relevant of workload research into a set of practicable workload assessment methods for Army developers, and then apply and validate these methods on selected Army systems. Lessons learned from OWL studies of these systems formed the basis for guidance for Army system developers. This paper overviews the objectives, the accomplishments, and the future prospects of the OWL Program.
Operator workload (OWL) scales were used to obtain ratings of generic mission scenarios and tasks for a mobile air defense missile system (LOS-F-H) following a candidate-selection field evaluation. NASA TLX, SWAT, Overall Workload (OW), and the Modified Cooper-Harper (MCH) ratings were obtained from both crew and Subject Matter Experts (SMEs). Jackknife factor analysis revealed the presence of only a single "OWL" factor for both crew and SMEs (explaining 75.9% and 82.6% of the respective total variances) and indicated a significant (p < 0.00005) ordering of the mean factor loadings: TLX (0.924) was significantly greater than OW (0.905) and MCH (0.904), which were greater than SWAT (0.778). Subsequent analysis of OWL factor scores indicated that the crew and SMEs yielded essentially equivalent evaluations of OWL for the system variables investigated. This analysis also indicated that the highest levels of OWL were obtained for the track-to-intercept task during dual Rotary-Wing (RW) and Fixed-Wing (FW) attacks although the ID/IFF task during a dual RW attack was almost as high. These findings are discussed in the context of a methodology for assessing OWL.
An empirical study was undertaken to collect real-time workload estimates of pilots and copilots performing a resupply mission in a UH-60A flight simulator. Overall and peak workload (OW and PW) ratings were collected for twelve mission segments. These ratings were compared with OW and PW values predicted by the Task Analysis/Workload (TAWL) simulation model. High correlations were found between TAWL-based predictions and crew results for OW (r = 0.82 to 0.95; p < .01). Lower correlations were found for PW (r = 0.62; p < .05).
The Operator Workload Knowledge-based Expert System Tool (OWLKNEST) is a microcomputer-based tool that provides guidance in selecting the most appropriate technique to use for estimating Operator Workload (OWL) for developing Army systems. OWLKNEST is based on twenty years of workload research and on knowledge gained in the three-year Army Research Institute OWL Program. The design approach is presented along with a general description of targeted users and knowledge representation scheme. The criteria used to evaluate available OWL techniques for inclusion in the system are also presented. Sample system applications are presented which illustrate how OWLKNEST can be used for a variety of needs.
Current economic constraints indicate the need for incorporating the satellite servicing philosophy of commonality within the design of spacecraft subsystems. This philosophy is essential for conserving resources including hardware/software development and implementation costs, on-orbit ground-based manpower, crew training/testing time, and documentation. In addition, spacecraft subsystem commonality may be coupled with standardization of operational procedures and of test and verification techniques for spacecraft design. Several spacecraft have adopted this practice, including Hubble Space Telescope, Space Station Freedom, and the Explorer Platform. As these and other programs continue, and if effective crew interfaces and procedures are clearly and consistently defined, crew retraining for similar spacecraft subsystems will lessen, and procurement efforts will diminish. A relatively high fidelity zero-gravity simulation using water immersion is available to establish crew interfaces economically. The flexibility and utility of this space simulation medium for planning and assisting on-orbit operations was exemplified by astronaut evaluations of potential extravehicular activity electrical connectors. The testing was conducted at a National Aeronautics and Space Administration underwater neutral buoyancy training facility.
A joint visual search and memory retrieval model of visual inspection is described here to allow improved prediction of inspection time and accuracy. The model includes search between and within regions of a part, and describes decision making as a series of comparisons between a potential defect and a series of probabilistically-ordered attributes. Inspection errors are expected when low probability attributes are not reliably checked, or when poorly organized.
Twelve subjects participated in a study of time stress in visual display search and the relationship between stress and other variables known to affect visual search, such as symbol density, color coding, and search type. Response time (RT) differed significantly for each of these variables and for their two-way interactions. Generally, time stress suppressed the effects of the other variables. Accuracy varied significantly only for the main effects of coding, search type, and density. In addition to RT and accuracy, several ocular measures were collected. Results for the number of eye fixations paralleled the RT results except for the stress and practice (day) variables. Fixations per second approached significance for day and search type effects. Differential patterns of significant effects were observed for eye blink and pupil diameter changes that reflected stress, cognitive load, and search difficulty.
A study of visual tracking was conducted to determine the influence of tracking direction and task type on the pattern of eye movements. The 3x4-factorial experiment with repeated measures showed that smooth eye movements were: (1) longer in the horizontal than vertical plane; (2) longer in the downward than upward direction; (3) influenced by the type of task. The results may be relevant for employee training and design of workstations in respect to electronic information display and hand coordination, visual inspection, and work under the microscope.
This paper describes two studies which used objective and subjective assessments to quantify the effect of target degradation on observers' recognition ability. 'Noise' inherent in a digital infra-red line scan system can result in a static line-to-line variation (pixel jitter) over the displayed imagery. The amount of target degradation is dependent upon both the amplitude and frequency of the pixel jitter. The results showed that, firstly, if an image is affected by pixel jitter, even with an amplitude of only 1 pixel, a significant interference in target recognition performance occurs. Secondly, the results from the subjective scaling closely mirrored the error data and therefore imply that this rating scale may have widespread utility in target acquisition studies. Finally, the effect of pixel jitter appears to be robust. The effect was found not to be specific to a particular type of imagery and is, therefore, likely to generalize to other types of target and other imaging systems. The implication of these results for user-system specification is discussed.
Older persons are a growing proportion of the population, of drivers, and of those involved in traffic accidents. Changes in the visual abilities of older persons are pertinent to glare in night driving. Vehicle headlighting and related factors that affect visibility and comfort in night driving are reviewed. Older drivers, in particular, would be aided at night by: increasing the reflectivity of objects, limiting the mounting height of headlamps, appropriate reflectivities of mirrors for control of glare, automatic headlamp alignment, automatic headlamp cleaning, and beam patterns that emphasize glare control.
This research was undertaken, in part, to determine the magnitudes of performance decrements associated with automotive instrument panel tasks as a function of driver age. Driver eye scanning and dwell time measures and task completion measures were collected while 24 drivers aged 18 to 72 performed a variety of instrument panel tasks as each drove an instrumented vehicle along preselected routes. The results indicated a monotonically increasing relationship between driver age and task completion time and the number of glances to the instrument panel. Mean glance dwell times, either to the roadway or the instrument, were not significantly different among the various age groups. The nature of these differences for the various task categories used in the present study was examined.
As the Baby Boom generation gradually moves into its later years, that movement will become a Senior Boom that will have a dramatic effect upon the design of products entering the marketplace. To respond to this market, engineers and designers will require a good understanding and awareness of the changes that take place in vision and cognition as a result of the aging process, and how these changes affect the interaction of older adults with their vehicle systems such as controls and displays, mirrors, entry and exit, and lighting. This paper is an attempt to bring that understanding to the designer and engineer, as based upon current research.
This experiment investigated whether well-learned "automatic" processes remain stable as a function of age, as well as whether the ability to modify automatic processes is disrupted for older adults. We used an arithmetic "Stroop" task. Nineteen young (mean age 22) and 19 old adults (mean age 75) participated in three sessions for a total of 450 trials. The young subjects had faster verification times, overall, than the old adults. Both young and old subjects showed significant Stroop interference. These results support the hypothesis that automatic processes, in this case access of addition and multiplication tables, are maintained for old adults. Furthermore, both groups reduced their RT with practice. For the young adults, there was a decrease in interference with practice, suggesting that they were learning to inhibit the automatic process of performing the arithmetical operation. However, the old adults showed no significant decrease in interference, which implies that they were impaired in their ability to inhibit automatic processes, even when those processes interfered with performance. Theoretical and practical training implications are discussed.
Using Sternberg's (1969) Additive Factors Method (AFM), previous investigations in search of the locus of age-related slowing in reactive capacity have found conflicting results, possibly due to inconsistencies in research methodologies. This experiment was conducted to examine age differences in the performance of AFM intratask manipulations of a reaction time task using both fixed and variable foreperiod conditions, with subjects tested at both naive and practiced skill levels. Twenty male subjects, ten young and ten old, performed a visual four-choice RT task with intratask manipulations of stimulus degradation, stimulus-response compatibility, and response-stimulus intervals (RSIs were fixed at 0, 2, and 5 sec and variable with random presentations at 0, 2, and 5 sec), once when subjects were naive and again when practiced. The results varied by level of practice and RSI, but clearly the older subjects had difficulty with the intratask manipulations. The older subjects took twice as long, on the average, to respond. Interactions of age by compatibility suggest that, according to the AFM, with age come inordinately long delays in the response selection stage of information processing. Conclusions are made with caution since this research points to limitations and methodological confounds which serve to explain many of the equivocal findings in previous studies.
Research indicates that older adults have difficulty acquiring text-editing skills. The data suggest that the cognitive demands associated with text-editing programs create problems for older learners given the age-related changes in cognitive abilities. This study compared the learning efficiency of older adults for three text-editing programs which varied in format and command structure. A total of 45 computer-naive women ranging in age from 40 to 70 years participated. The results indicated significant differences in learning efficiency as a function of text-editing program. Participants using a full-screen editor with pull-down menus demonstrated significantly better performance than did those using other programs. Data were also collected regarding the types of difficulties encountered by subjects during learning. This type of information can be used as input into the design of future software and training programs.
The Peripheral Vision Horizon Display (PVHD) is an expanded artificial horizon line that is intended to provide the pilot with orientation information through peripheral vision. The potential advantage is a reduction in the requirement to constantly refer to the attitude indicator (AI) in order to maintain awareness of orientation during instrument meteorological conditions (IMC). Four helicopter pilots flew two types of instrument approaches to determine whether the degree to which pilots rely on the AI would be altered when the PVHD was in operation. Only two pilots showed a reduction in the visual workload associated with the need to scan the AI. The general trend in the data indicated an increase in subjective workload with the PVHD. It was argued that motion of the PVHD distracted the pilots from their routine instrument scan, although this result might not generalize to pilots more experienced with the display. It was concluded that the PVHD might be of significant benefit in situations where the pilot must look outside the cockpit and stable orientation cues are not visible.
Human factors research, which focuses on matching human capabilities and limitations with different environmental and task demands, has been lacking with respect to the elderly population. The main objective of this research was to use a task analytic approach to identify the demand profiles for a list of daily activities of the elderly. Sixty-six independently living elderly persons were videotaped performing 25 separate activities of daily living. A computerized task analytic approach was used to analyze the activities. Tasks were described through a set of descriptors such as action (e.g., reposition), demand (e.g., carry), object (e.g., broom), body part (e.g., hand), posture (e.g., bend), location (e.g., wash room) and frequency. Crosstabulations were performed on the data to determine the pattern of relationships among the various task descriptors, both within and between activities. In terms of demands, a few activities account for a large proportion. Lifting/lowering and pushing/pulling appear to be the predominant actions. Relationships among task, posture, body part and demands were significant.
Demographics announce the rise of an array of small and bigger challenges, which cannot be taken up adequately by a single research discipline, industrial branch or central administration. The segregation and segmentation of our society causes limiting conditions in facing the most substantial and acute societal problems. The complexity and versatility of these problems require a policy that has to be conceptualized. The concept "gerontechnology" is introduced to cover and provide some coherent elements in order to establish a strategy, that is aiming at an efficient and effective use of essential resources, to match developments induced by an aging population. Normal aging processes can be described within this concept using the man-machine-environment interaction model that is loaded with sets of variables that are characteristic for an aging human function. Sets of variables are distinguished at three levels with increasing complexity from basic research on parameters of aging, via human factors research on grey human factors, to market research on daily consumer needs of the elderly. Two projects are presented as examples of respectively an industrial and a basic research approach in this domain.
The presentation features beginning efforts on a 5 year project concerned with the identification of home safety problems and technological solutions for older disabled persons. First, the multi-dimensional model guiding the project is presented. Second, pilot survey data concerned with 72 older persons' home accident histories, risk perceptions associated with everyday activities, and tendencies to engage in risky in-home behaviors are discussed. Finally, preliminary survey data from 30 "in-home assessment" professionals will be discussed. These data identify environmental, functional, and psychological in-home safety problems for elderly persons for the given daily activities stipulated by the project's model.
The normal process of swallowing is an extremely complicated and highly integrated process, only part of which is under voluntary control. It requires that the neuromuscular structure, the cartilaginous and bony elements, and their innervation be intact. A number of high technology and semi-technology procedures have been developed which help elucidate the cause of a swallowing problem. These include: videofluoroscopy, scintigraphy, manometry, fiberoptic endoscopy, ultrasound, and clinical auscultation. A deglutition team consisting of otolaryngologist, neurologist, speech-language pathologist, radiologist, nurse practitioner, and dietitian can have a major impact on determining the cause of the swallowing difficulty and correcting it.
Most populations of the world are aging. In developed countries the older population will double in the next 30 years; in underdeveloped countries it is expected to quadruple. This has led to world wide interest in the application of human factors to problems of aging (Smith, 1988). The purpose of this panel is to bring a perspective on research and design from outside the U.S.
Mobile telephony exhibits transmission characteristics and user-interface features distinct from traditional telephony. To study these differences in systems designed for use in commercial airplanes, trains, and automobiles, we used a variety of techniques, including both laboratory and field observations. We found that mobile telephony, viewed from the user's perspective, is quite different from traditional telephone service. In the present paper, we review the assessment techniques that we employed, and consider their strengths and weaknesses for characterizing the performance of mobile telecommunication systems. Our results indicate that there are five major sources of potential user-interface problems in mobile telephony: (1) use of credit cards; (2) system delays; (3) lack of coordination among multiple sources of feedback; (4) the mechanism for completing multiple calls without credit-card reentry; (5) voice dialing. Because solving the problems we have identified does not require new or overly expensive technology, solutions are fairly straightforward to implement during the early design period. However, once units have been manufactured and installed, it can be both very difficult and very expensive to recover from the problems we have identified.
This paper describes the Human Factors evaluation of a new generation of trader turret, a highly specialized telephone used by brokers on the floor of the New York Stock Exchange. The evaluation consisted of three parts: (1) environmental and acoustical studies at the New York Stock Exchange, (2) job analyses, task analyses, and interviews that helped designers determine and satisfy the brokers' needs, and (3) laboratory experiments at Bell Labs in Holmdel that evaluated alternative designs and features.
Network and system management tasks revolve around the activities necessary to establish and maintain data communication networks. These can include tasks such as monitoring the network hardware and software components for problems and performance information; identifying and correcting problems in the network; and planning and making changes to the network. Task analysis was performed on data available from a customer case study database (Beith, Moore, Pendley, and Percival, 1985). Models were developed for several network and system management tasks. This paper discusses one of the task models, problem determination, and presents recommendations for information structuring and task flow.
This paper presents Living Systems Theory (Miller, 1978) as a conceptual framework for human-computer interface (HCI) design. Many researchers and practitioners in the field of HCI design have used systems terms and concepts in their work; however, it is not clear that an integrated systems approach has been taken in the field of HCI design. Living Systems Theory (LST) is proposed as the means for obtaining a conceptual framework for the study of the HCI. Miller clearly defines terms and concepts that can serve as a "common language" to improve communication within and across disciplines. It is likely that a multidisciplinary field such as HCI design could benefit from LST. Specifically, by adopting this "common language", researchers and practitioners in the field of HCI could improve communication with other disciplines which could facilitate the sharing of information across disciplines.
A case is made for using low-fidelity prototypes early in the design phase of new services. The rationale for this is based upon (1) a model of how user interface designs progress and (2) a call to expediency. The design process is viewed as the successive application of constraints that serve to prune the space of all user interfaces. Some constraints are external (i.e., placed on the service by limits of technology or cost). Other constraints are derived by application of heuristic design principles. Even after these constraints have been applied, the design is still not fully constrained and the designer must make high-level design decisions. At these choice points, I propose that low-fidelity prototyping is an appropriate means of gathering design information as it is an expedient solution and may serve as a method of testing the central tendency of entire classes of user interfaces.
Previous literature has consistently shown that affirmative messages are understood more quickly and accurately than negative messages and that redundancy facilitates comprehension and response selection. However, it was unclear which of these two variables is the most important in the design of system-status messages that must be understood and acted upon. In our experiment, we compared affirmative and negative system-status messages when negative messages contained a word redundant with the appropriate action. The results indicated that affirmative messages were responded to more quickly and more accurately than negative messages only when the negative messages did not contain a redundant word. Redundant negative messages were more accurate than affirmative non-redundant messages.
Sixteen military pilots flew simulated air-to-air and air-to-ground missions in a simulated fighter-attack cockpit. Three of the five color CRTs in the cockpit were capable of displaying retinal disparity and the major independent variable was presence or absence of disparity. Performance, workload, and opinion data were collected. A second objective of the study was to continue development of the display formats, which had evolved through earlier projects. The disparity results and the recommended format revisions are presented.
As digital voice data is increasingly replacing analog in system applications, user interface requirements supporting this technology must be established. This experiment was conducted to determine whether system response time affected a user's ability to control movement of recorded speech while keying in a verbatim report of the speech content. Experienced subjects performed a transcription task under four different response times. Upon completion of the task, the subject ordered the response times from shortest to longest and rank-ordered their preferences for response delay times. Performance data was collected to discover if response time differences affected performance. Subjects were unable to identify the response time delays correctly; and, based on the preference rankings, the subjects were most satisfied with a response time delay range between 100 ms and 150 ms and least satisfied with a response time delay of 250 ms. Subjects stated that with the longest and shortest response time delays they had trouble positioning in the audio. Response time delays did not affect subject performance, although other significant results were found.
The purpose of the present experiment was to investigate the demands placed on the short term memory system by synthetic speech. We compared performance in a typical auditory short term memory task as a function of whether the items were presented by a human voice or by a text-to-speech computer voice generator. Immediate serial recall of digit strings was significantly poorer when presented by synthetic speech than when presented by natural speech. The results are consistent with the idea that comprehension of synthetic speech imposes increased resource demands on the short term memory system.
Computer programming is one of the earliest topics addressed by studies of the human factors of computer systems and studies of how software systems are developed remain one of the most difficult areas of investigation. Early work in the psychology of programming focused on comparisons of time-sharing and batch modes, studies of programming team organization, studies of debugging, and investigations of the differences between novice and expert programmers. As new theories and experimental methodologies were developed, further areas were researched. This panel looks at current research in the psychology of computer programming. Topics include studies of programmer behavior, studies of software design, tools for programmers, and experimental methods. Audience members will have an opportunity to describe other areas of study.
Perhaps the one thing that user interface designers most want is tools that will help them (a) quickly visualize their work; (b) carry it out faster and more efficiently; (c) do iterative design; and (d) do more work without the need for programmers. An on-going research project (called ITS) is responding to these challenges by developing software tools for user interface and application development, together with a run-time environment for application execution. There are four key concepts. First, ITS separates the style of an application from the content of an application. Human-computer interface styles are general, rule-based, under parameter control, and designed to handle a variety of applications. Second, ITS envisions four general work roles in application design and development: content experts, content programmers, style experts, and style programmers. Third, end users do four operations: make choices, fill in forms, manipulate lists, and read information blocks. Fourth, ITS aims at creating software tools for each work role.
As a result of the popularity of using HyperCard to rapidly prototype equipment and computer interfaces on Macintosh personal computers, the need ensued to evaluate prototype usability by collecting subjects' interactive performance data in real time. Sandia National Laboratories, in collaboration with Stone Design Software, has developed ProtoTymer, a HyperCard stack that can time and record users' interactive sessions with prototypes developed using HyperCard. While operating in the background, ProtoTymer records the times, locations, and targets (objects clicked) of a subject's inputs during an interactive session. At the conclusion of the session, the resultant data file can be reviewed, summarized, printed, or transferred to a spreadsheet for statistical or graphical analysis. This paper describes ProtoTymer's design approach, features, limitations, and considerations for future versions.
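The kind of session data described above (times, locations, and targets of a subject's inputs) lends itself to simple post-session summaries. The following Python sketch is only an illustration of that idea: the record layout, the `ClickEvent` fields, and the `summarize` function are hypothetical and do not reflect ProtoTymer's actual file format or features.

```python
from dataclasses import dataclass


@dataclass
class ClickEvent:
    """One logged input; the field names are assumptions for illustration."""
    time_s: float  # seconds since the start of the session
    x: int         # horizontal click location in pixels
    y: int         # vertical click location in pixels
    target: str    # name of the object that was clicked


def summarize(events):
    """Return total session time, mean inter-click interval, and clicks per target."""
    ordered = sorted(events, key=lambda e: e.time_s)
    intervals = [b.time_s - a.time_s for a, b in zip(ordered, ordered[1:])]
    clicks_per_target = {}
    for event in ordered:
        clicks_per_target[event.target] = clicks_per_target.get(event.target, 0) + 1
    return {
        "total_time_s": ordered[-1].time_s - ordered[0].time_s if ordered else 0.0,
        "mean_interval_s": sum(intervals) / len(intervals) if intervals else 0.0,
        "clicks_per_target": clicks_per_target,
    }


# Hypothetical three-click session.
session = [
    ClickEvent(0.0, 40, 60, "Start"),
    ClickEvent(1.5, 200, 120, "Menu"),
    ClickEvent(4.0, 210, 180, "Menu"),
]
print(summarize(session))
```

A summary of this form could then be exported to a spreadsheet for the statistical or graphical analysis the abstract mentions.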
Because of the increasing complexity and size of systems for which user interfaces must be designed, manual analysis of user and system requirements is inadequate. Methods for employing database tools in top-down design strategies have been developed to manage design information in the development of user interfaces for large and complex systems. These methods have been useful in the design of user interfaces that are internally consistent with the user's model of the system and that are consistent across related software applications.
Increasingly, the design of interactive computing systems appears to be a process of iterative design and re-design. One important factor in successful iterative design is iterative evaluation -- evaluation as part of each design cycle. This paper argues that different evaluation-design cycles may require different types of methodologies and different types of questions or measures to fully satisfy differing evaluation goals. Furthermore, evaluation procedures and measures themselves need to be designed and re-designed, a process more easily accomplished during system development. Examples based upon design projects illustrate some of the ways in which the nature and uses of evaluation procedures and information may change in different cycles of iterative evaluation.
A user interface management system, or UIMS, is an interactive system for supporting the design, production, and execution of human-computer interfaces. This paper reports on the development and empirical testing of an evaluation procedure to produce quantifiable criteria for evaluating and comparing UIMS. The form-based evaluation procedure results in quantitative ratings along two dimensions: functionality and usability. Specification/implementation techniques used by a UIMS are also quantitatively rated. An empirical study has indicated that the procedure produces reliable, useful results.
Early development of a tutorial fostered a joint effort between human factors professionals, software developers and training consultants that resulted in early resolution of many problems during the development of Wang Freestyle, a new multimedia communication system. It was decided that new users of Freestyle should be able to use the basic annotation features without referring to any hardcopy documentation. To ensure this, iterative tests of evolving prototypes of (1) the software and (2) an on-line tutorial that was designed to teach any features of the system that were not immediately intuitive were carried out. Changes were made in the software and the tutorial, resulting in improvements to both. The methods used and some of the lessons learned from this initial experience with iterative tutorial development are discussed.
This paper reports the results of three iterative usability tests of a security application as it evolved through the application development process and highlights the use of several methodological techniques: 1) reusable color foil prototypes of application panels as an alternative to developing online prototypes during short development cycles, 2) field tests as a complement to laboratory tests, 3) iterative testing of an evolving prototype, and 4) analysis of dollar value of usability work. The techniques used represent an attempt to apply usability engineering to system design (Whiteside, Bennett, and Holtzblatt; 1988) and to provide management with a dollar value estimate of human factors work (Mantei, 1988). Significant improvements in end user performance and satisfaction occurred across the three iterative tests (field prototype test, laboratory prototype test, and laboratory integration test) conducted across 7 months with 27 participants. The product usability objective was met during the third test. By using the reusable foil prototypes of the interface panels, usability staff were able to efficiently and effectively identify problems, make design changes, and retest the panels. The field test furnished unique data necessary to understanding end user issues. Iterative testing provided the opportunity to test the impact of changes made to the interface and a reliability check on previous results. The methodology for computing the value of usability work provided a feasible way of analyzing the cost benefit of the human factors work.
In the history of human factors in computer systems, one of the most significant events of the past decade was the work on GOMS and keystroke models (cf. Card, Moran, and Newell, 1983). While a clear success in causing software developers to focus on the importance of interface design and attracting researchers to this area, GOMS approaches have not significantly improved the quality of the systems that are developed. Why has this work, which has had a great theoretical impact, had so little practical impact on existing systems? Is it that the GOMS formalism is not valid outside of laboratory contexts? Is it that it misses important aspects of behavior such as how people learn to use systems? Is it that GOMS was developed in the context of computer systems that are less powerful and interactive than we have today? Or are there other reasons? In this panel, we argue that additional cognitive science approaches are needed to improve the quality of developed systems. Dr. Gray extends this approach by reporting the first "real world" test of the GOMS style of system modeling. Dr. Polson extends these models to how people learn to use systems. Dr. Fischer extends this style of research by focusing on cooperative, rather than passive, computer systems. Audience members will have an opportunity to describe other approaches to developing theoretical models of system design.
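For readers unfamiliar with the keystroke models mentioned above, the following Python sketch shows the basic arithmetic of a keystroke-level estimate: a task is written as a sequence of operators and the predicted time is the sum of the operator times. The operator times are commonly cited approximate values from the keystroke-level model literature and are used here only for illustration; the task sequence and the function name `klm_estimate` are hypothetical.

```python
# Approximate, commonly cited operator times in seconds (illustrative only).
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key or button (average typist)
    "P": 1.10,  # point at a target with a mouse
    "H": 0.40,  # move the hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_estimate(sequence):
    """Sum the operator times for an operator string such as 'MHPK'."""
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical task: prepare mentally, reach for the mouse, point at a field,
# click (counted as one K), return to the keyboard, then type four characters.
menu_task = "MHPKH" + "K" * 4
print(f"Estimated time: {klm_estimate(menu_task):.2f} s")
```

Such an estimate covers only error-free expert execution, which is one reason the panel asks whether models of this kind capture enough of real system use.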
This study examined the effect of varying the amount of information that is presented in either an alphanumeric or iconic display and its effect on how efficiently a pilot can utilize the data. The results from 12 subjects, under self-paced presentation length conditions, indicated that for a small quantity of data (2 or 4 bits) there is no difference in response times between iconic and alphanumeric displays. As the quantity of data presented increases (8, 16, or 32 bits), subjects perform better using iconic displays.
Using the GOMS model (Card, Moran, and Newell, 1983), a help system was developed which was complete and well structured. The content of this help system was determined from the goals, operators, methods, and selection rules needed to perform HyperCard authoring tasks. The index to these methods, which was an integrated part of the system, was determined from the hierarchical goal tree provided by the GOMS analysis. To determine the effectiveness of using GOMS as a design aid for help systems, the GOMS help system was compared to a state-of-the-art interface developed by Apple Computer which was modified slightly for experimental purposes (Original help system). Two groups of 14 users, each using one of the two help systems, retrieved help information about 56 tasks separated into 4 sessions. The results indicated that the GOMS users were significantly faster than the Original users, with the largest speed difference occurring in the first session. However, no reliable differences were found for retrieval accuracy between the two groups. This is not surprising since the Original help system was found to have 85.9% of the procedural information contained in the GOMS help system. Interestingly, participants subjectively rated the GOMS help system higher than the Original help system. Overall, the results from this information retrieval study suggest that a GOMS model can aid in the development of help systems which are easy to use, easy to learn, and well liked.
COGNET (Cognitive Network of Tasks) is a model-building framework for real-time attention sharing cognitive processes. It is particularly designed for the construction of computational models of human-computer interaction. COGNET is unique in that it leads to context-sensitive models of attention switching based on the human operator knowledge of the real-world domain being modeled. A COGNET model combines an augmented version of the GOMS task analysis language with the blackboard architecture of control. This paper discusses the theoretical organization of the COGNET framework, as well as the augmented GOMS/blackboard tools used to build COGNET models.
This paper presents a conceptual discussion of four human operator models that are potentially useful for supervisory control applications: the operator function model (Mitchell, 1987), the problem behavior graph (Newell and Simon, 1972), the decision ladder (Rasmussen, 1986), and goal-means network (Woods and Hollnagel, 1987). These models are characterized along the dimensions proposed by Jones and Mitchell (1987) and are further examined in-depth with the use of verbal protocols collected concurrently with the performance of a supervisory control task.
The Tangora is a large-vocabulary, speaker-dependent, isolated-word speech recognition system. In this paper, we describe a study designed to test this system under a broader range of conditions than had previously been considered. The experiment itself consisted of four experimental sessions: Sessions 1 and 4 were enrollment sessions, and Sessions 2 and 3 were test sessions. During each test session, participants read 32 preselected sentences and dictated 40 sentences of their own composition. The results of this experiment indicated that (a) there was a high degree of inter-subject variability and a high degree of intra-subject consistency; (b) users did not improve with limited experience; (c) the style/content of the test sentences affected recognition performance; (d) recognition errors were more common following misrecognized words than following correctly recognized words; and (e) new users had little difficulty with isolated-word speech. We discuss the implications that these findings have for application selection, interface design, user training, and system evaluation.
This paper reports on a study of recognition performance for a group of new users during their first month of experience with the Tangora system. Tangora is a 20,000-word, speaker-dependent, isolated-word system which transcribes speech input into text in real time. Twelve users, six males and six females, participated in 21 sessions each, during which they read aloud unrelated sentences selected from a corpus of office correspondence. Their goal was to develop a speaking style which minimized Tangora's recognition error. To this end, starting with the third session, the experimenter generated hypotheses about each user's speech habits which may have resulted in high recognition error and made suggestions to the user on how to modify his/her speaking style. In addition, each user produced a new speech sample in each of the four weeks of the experiment, which was used to "train" the system to recognize the speaker. On average, recognition error decreased by 33% from the first to the fourth week. This improvement was attributable to "retraining" the system with, apparently, more representative speech samples. A number of speech habits brought by users to the recognition task were identified as contributing to poor recognition performance by Tangora. These included: (a) a speech rate that was too fast, (b) failure to pause between words, and (c) hyper-correct articulation of the final phoneme in words. Feedback relating to these speech habits was used successfully by a majority of the users to modify their speaking style into one more successfully recognized by the Tangora system.
A new method for hand printed data entry is proposed for those user tasks where only few characters have to be entered. The basic idea is to use a numeric keypad as writing surface for one finger. For this purpose the keys are equipped with touch sensors, and display elements are fitted to the keys and the intermediate spaces. By means of the arrangement of the keys in the keypad the writing surface has a structure of rows and columns which motivates the user to produce characters with a standard format and uniform line elements. In order to allow characters to be sloppily written to some extent, the method of dynamic programming was applied for a nonlinear adaptation of pattern and prototype vectors. A first series of experiments showed that error rates of the subjects (e.g. omissions, confusions) and of the pattern recognition system (errors, rejections) are very low.
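The abstract above does not give the details of its dynamic programming procedure. As a minimal sketch of the general idea (a nonlinear alignment between a written pattern and a stored prototype), the following Python code uses a standard dynamic time warping recurrence over sequences of touched keys. The (row, column) key coordinates, the city-block local distance, the prototype shapes, and the names `dtw_distance` and `classify` are all assumptions made for illustration, not the authors' method.

```python
def dtw_distance(pattern, prototype):
    """Dynamic-programming alignment cost between two sequences of (row, col) keys."""
    n, m = len(pattern), len(prototype)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i pattern points
    # with the first j prototype points.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (r1, c1), (r2, c2) = pattern[i - 1], prototype[j - 1]
            local = abs(r1 - r2) + abs(c1 - c2)  # city-block distance between keys
            cost[i][j] = local + min(
                cost[i - 1][j],      # pattern point repeats against the prototype
                cost[i][j - 1],      # prototype point repeats against the pattern
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(pattern, prototypes):
    """Return the label of the prototype with the lowest alignment cost."""
    return min(prototypes, key=lambda label: dtw_distance(pattern, prototypes[label]))


# Hypothetical prototypes on a keypad with 4 rows and 3 columns.
prototypes = {
    "1": [(0, 1), (1, 1), (2, 1), (3, 1)],
    "7": [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2)],
}
# A "1" written unevenly: the finger lingers on two of the keys.
sloppy_one = [(0, 1), (0, 1), (1, 1), (2, 1), (2, 1), (3, 1)]
print(classify(sloppy_one, prototypes))
```

Because the alignment allows points to repeat, a stroke that dwells on some keys still matches its prototype at zero cost, which is the sense in which such a method tolerates sloppily written characters.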
This work tested six techniques for the occasional entry of unstructured numeric data in the context of a primarily mouse-based, cursor-positioning, human-computer dialogue. Two of the techniques used a separate keypad for numeric data entry. The other four techniques used the mouse already being used for the cursor positioning dialogue. The keypad techniques were more efficient than the mouse techniques for all of the numeric sequence lengths considered. There were no significant differences in efficiency between the two keypad techniques. Among the mouse-based techniques, an approach based on a displayed image of a calculator keypad was consistently among the most efficient.
A three-part experiment was conducted to determine the accuracy, repeatability and linearity of a human hand manipulating the DataGlove. Accuracy and repeatability of finger flexure were investigated with repeated measurements of three calibration positions. Linearity of finger flexure was investigated with steady finger and thumb curling motions. Accuracy and repeatability of hand location and orientation were investigated with repeated measurements of six hand positions. Finger flexure mean accuracy was 6° for the four fingers and 11° for the thumb, repeatability was 3° for the four fingers and 9° for the thumb, and linearity varied from 2 to 5°. Although the mean location accuracy was 1 inch and the mean orientation accuracy was 17°, the position and orientation receiver was observed to twist on the glove back. Across all subjects, the location repeatability was 0.5 inch, while the orientation repeatability was 9°. However, the within-subject location repeatability was 0.13 inches, while the orientation repeatability was 2°.
Real-world applications of touch-input technology often do not occur under ideal conditions. Users often must contend with off-axis viewing and non-optimal positioning, introducing the possibility of vertical or horizontal bias error. In the present study the effects of screen angle relative to line of sight and positioning of targets were examined with a high-resolution (1 pixel or about 1/12 mm) resistive touch input device thought to have minimal parallax. Results replicated earlier findings of Beringer & Peterson (1985) in that a 17-degree declination of the touch surface below orthogonal to line of sight induced a high-touch bias error of 9 pixels (about 3/4 mm) whereas orthogonality of the interface to line of sight virtually eliminated bias. Both software and behavioral compensation strategies are discussed.
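The abstract above does not say which software compensation strategies were discussed. As one hypothetical possibility, a constant bias could be estimated from calibration touches on known targets and subtracted from later touches. The function names, the pixel coordinates, and the constant-offset model are all assumptions of this sketch.

```python
def estimate_bias(touches, targets):
    """Mean (dx, dy) offset, in pixels, of touches from their intended targets.

    Assumption: bias is modeled as a constant offset across the screen.
    """
    n = len(touches)
    dx = sum(t[0] - g[0] for t, g in zip(touches, targets)) / n
    dy = sum(t[1] - g[1] for t, g in zip(touches, targets)) / n
    return dx, dy


def compensate(touch, bias):
    """Subtract the estimated bias from a raw touch coordinate."""
    return touch[0] - bias[0], touch[1] - bias[1]


# Hypothetical calibration data: touches landing 9 pixels off vertically.
calibration_touches = [(100, 109), (200, 209)]
calibration_targets = [(100, 100), (200, 200)]
bias = estimate_bias(calibration_touches, calibration_targets)
corrected = compensate((50, 59), bias)
```

A constant offset is the simplest model. If the bias varied with target position on the screen, a position-dependent correction would be needed instead.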
User interface design has many components. Usable computer interfaces should be easy to learn and should result in high user productivity and high user satisfaction. A number of components in user interface design affect the usability of the interface. Within the human factors community we tend to emphasize the ergonomic and cognitive components of the computer interface. Another component is frequently ignored: visual interface design. This panel will present information on the visual component in various user-computer interfaces and will discuss the contributions of the visual designer to the interfaces and usability.
This study presents evidence that a prototype touch interface technology emulating basic interaction techniques of a mouse pointing device is comparable in overall usability to a conventional mouse for a direct manipulation, graphical windowing software environment. The touch technology prototype involves using either a stylus or finger, with an overlay sensitive to changes in capacitance. Users practiced each technique (mouse, stylus, finger, keyboard with no mouse), in the context of carrying out o