James R. Lewis

Has also published under the name of:
"J. R. Lewis"

Current place of employment:
IBM

Began college studies as a music major. Graduated in 1975 with a BM in music theory and composition and in 1978 with an MM in music composition. Switched to experimental psychology, graduating with a BA in 1978 and an MA (engineering psychology) in 1982 (all degrees from New Mexico State University). Began work at IBM in 1981 as a human factors engineer, with a primary focus on input methods (keyboards, mice, touchscreens, joysticks). Started work on speech input/output in the early 1990s. Graduated with a PhD in experimental psychology (psycholinguistics) in 1996 from Florida Atlantic University. In addition to scholarly publications, has over 50 patents issued by the US Patent Office and was designated an IBM Master Inventor in 2003.


Publications by James R. Lewis (bibliography)

2011
 
Sauro, Jeff and Lewis, James R. (2011): When designing usability questionnaires, does it hurt to be positive? In: Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems 2011. pp. 2215-2224.

When designing questionnaires there is a tradition of including items with both positive and negative wording to minimize acquiescence and extreme response biases. Two disadvantages of this approach are respondents accidentally agreeing with negative items (mistakes) and researchers forgetting to reverse the scales (miscoding). The original System Usability Scale (SUS) and an all positively worded version were administered in two experiments (n=161 and n=213) across eleven websites. There was no evidence for differences in the response biases between

© All rights reserved Sauro and Lewis and/or their publisher
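
The miscoding risk mentioned in the abstract above is easier to see with the scoring arithmetic written out. The following is a minimal sketch of the commonly published SUS scoring rule (odd-numbered items positively worded, even-numbered items negatively worded, total rescaled to 0-100). The function name `sus_score` and the `all_positive` flag are illustrative, and the scoring of the all-positive variant (every item scored as response minus 1) is an assumption, not something stated in the abstract.

```python
def sus_score(responses, all_positive=False):
    """Score ten SUS responses (each an integer 1-5) on a 0-100 scale.

    Standard SUS: odd-numbered items are positively worded (response - 1),
    even-numbered items are negatively worded and must be reversed (5 - response).
    all_positive=True is an assumed scoring for an all positively worded
    version, where no item is reversed.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten responses, each an integer from 1 to 5")
    total = 0
    for item_number, r in enumerate(responses, start=1):
        if all_positive or item_number % 2 == 1:
            total += r - 1   # positively worded: higher agreement is better
        else:
            total += 5 - r   # negatively worded: reverse the scale
    return total * 2.5


# Hypothetical respondent who rates the product as favorably as possible
# on the standard (mixed-wording) questionnaire.
best_case = [5, 1, 5, 1, 5, 1, 5, 1, 5, 1]
print(sus_score(best_case))                     # correct coding
print(sus_score(best_case, all_positive=True))  # what forgetting to reverse produces
```

Scoring the same responses without reversing the even-numbered items cuts the best possible score in half, which is the kind of miscoding error the paper is concerned with.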

2010
 
Sauro, Jeff and Lewis, James R. (2010): Average task times in usability tests: what to report? In: Proceedings of ACM CHI 2010 Conference on Human Factors in Computing Systems 2010. pp. 2347-2350.

The distribution of task time data in usability studies is positively skewed. Practitioners who are aware of this positive skew tend to report the sample median. Monte Carlo simulations using data from 61 large-sample usability tasks showed that the sample median is a biased estimate of the population median. Using the geometric mean to estimate the center of the population will, on average, have 13% less error and 22% less bias than the sample median. Other estimates of the population center (trimmed, harmonic and Winsorized means) had worse performance than the sample median.

© All rights reserved Sauro and Lewis and/or their publisher
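
For readers who want to try the estimator recommended in the abstract above, here is a minimal sketch of the geometric mean computed as the exponential of the mean of the log-transformed times. The function name and the sample data are hypothetical and are not taken from the paper.

```python
import math
import statistics


def geometric_mean(times):
    """Geometric mean: exponentiate the arithmetic mean of the log times."""
    if not times or any(t <= 0 for t in times):
        raise ValueError("task times must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(t) for t in times) / len(times))


# Hypothetical, positively skewed task times in seconds.
task_times = [22.0, 25.0, 31.0, 38.0, 47.0, 120.0]

print("arithmetic mean:", statistics.mean(task_times))
print("sample median:  ", statistics.median(task_times))
print("geometric mean: ", geometric_mean(task_times))
```

With skewed data like this, the one long time pulls the arithmetic mean up more than it pulls the geometric mean. Python 3.8 and later also provide `statistics.geometric_mean`, which could be used in place of the hand-written function.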

2009
 
Sauro, Jeff and Lewis, James R. (2009): Correlations among prototypical usability metrics: evidence for the construct of usability. In: Proceedings of ACM CHI 2009 Conference on Human Factors in Computing Systems 2009. pp. 1609-1618.

Correlations between prototypical usability metrics from 90 distinct usability tests were strong when measured at the task-level (r between .44 and .60). Using test-level satisfaction ratings instead of task-level ratings attenuated the correlations (r between .16 and .24). The method of aggregating data from a usability test had a significant effect on the magnitude of the resulting correlations. The results of principal components and factor analyses on the prototypical usability metrics provided evidence for an underlying construct of general usability with objective and subjective factors.

© All rights reserved Sauro and Lewis and/or ACM Press

2006
 
Lewis, James R. (2006): Sample sizes for usability tests: mostly math, not magic. In Interactions, 13 (6) pp. 29-33.

 
Lewis, James R. and Sauro, Jeff (2006): When 100% Really Isn't 100%: Improving the Accuracy of Small-Sample Estimates of Completion Rates. In Journal of Usability Studies, 1 (3) pp. 136-150.

Small sample sizes are a fact of life for most usability practitioners. This can lead to serious measurement problems, especially when making binary measurements such as successful task completion rates (p). The computation of confidence intervals helps by establishing the likely boundaries of measurement, but there is still a question of how to compute the best point estimate, especially for extreme outcomes. In this paper, we report the results of investigations of the accuracy of different estimation methods for two hypothetical distributions and one empirical distribution of p. If a practitioner has no expectation about the value of p, then the Laplace method ((x+1)/(n+2)) is the best estimator. If practitioners are reasonably sure that p will range between .5 and 1.0, then they should use the Wilson method if the observed value of p is less than .5, Laplace when p is greater than .9, and maximum likelihood (x/n) otherwise.

© All rights reserved Lewis and Sauro and/or Usability Professionals Association
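
The decision rule in the abstract above can be sketched in a few lines. The Laplace and maximum likelihood formulas are the ones given in the abstract. The Wilson point estimate is not spelled out there, so the form used here, (x + z²/2)/(n + z²) with z = 1.96, is an assumption, as are all function names.

```python
def mle(x, n):
    """Maximum likelihood estimate: observed completion rate x/n."""
    return x / n


def laplace(x, n):
    """Laplace estimate: (x + 1) / (n + 2), as given in the abstract."""
    return (x + 1) / (n + 2)


def wilson(x, n, z=1.96):
    """Assumed Wilson point estimate: midpoint of the Wilson score interval."""
    return (x + z * z / 2) / (n + z * z)


def recommended_estimate(x, n, expect_high_p=True):
    """Pick an estimator following the rule described in the abstract.

    expect_high_p=True means the practitioner is reasonably sure the true
    completion rate lies between .5 and 1.0; otherwise Laplace is used.
    """
    if not 0 <= x <= n or n <= 0:
        raise ValueError("need 0 <= x <= n and n > 0")
    if not expect_high_p:
        return laplace(x, n)
    observed = x / n
    if observed < 0.5:
        return wilson(x, n)
    if observed > 0.9:
        return laplace(x, n)
    return mle(x, n)


# Hypothetical small-sample result: 5 of 5 participants completed the task.
print(recommended_estimate(5, 5))  # Laplace pulls the estimate below 100%
```

The 5-of-5 case is the one in the paper's title: the observed rate is 100%, but the adjusted estimate is lower.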

2001
 
Lewis, James R. (2001): Current Issues in Usability Evaluation. In International Journal of Human-Computer Interaction, 13 (4) pp. 343-349.

In this introduction to the special issue of the International Journal of Human-Computer Interaction, I discuss some current topics in usability evaluation and indicate how the contributions to the issue relate to these topics. The contributions cover a wide range of topics in usability evaluation, including a discussion of usability science, how to evaluate usability evaluation methods, the effect and control of certain biases in the selection of evaluative tasks, a lack of reliability in problem detection across evaluators, how to adjust estimates of problem-discovery rates computed from small samples, and the effects of perception of hedonic and ergonomic quality on user ratings of a product's appeal.

© All rights reserved Lewis and/or Lawrence Erlbaum Associates

 
Lewis, James R. (2001): Evaluation of Procedures for Adjusting Problem-Discovery Rates Estimated From Small Samples. In International Journal of Human-Computer Interaction, 13 (4) pp. 445-479.

There are 2 excellent reasons to compute usability problem-discovery rates. First, an estimate of the problem-discovery rate is a key component for projecting the required sample size for a usability study. Second, practitioners can use this estimate to calculate the proportion of discovered problems for a given sample size. Unfortunately, small-sample estimates of the problem-discovery rate suffer from a serious overestimation bias. This bias can lead to serious underestimation of required sample sizes and serious overestimation of the proportion of discovered problems. This article contains descriptions and evaluations of a number of methods for adjusting small-sample estimates of the problem-discovery rate to compensate for this bias. A series of Monte Carlo simulations provided evidence that the average of a normalization procedure and Good-Turing (Jelinek, 1997; Manning & Schütze, 1999) discounting produces highly accurate estimates of usability problem-discovery rates from small sample sizes.

© All rights reserved Lewis and/or Lawrence Erlbaum Associates

 
Wang, H. and Lewis, James R. (2001): Intelligibility and Acceptability of Short Phrases Generated by Embedded Text-to-Speech Engines. In: Proceedings of the Ninth International Conference on Human-Computer Interaction 2001. pp. 144-148.

 
Lewis, James R. (2001): Psychometric Properties of the Mean Opinion Scale. In: Proceedings of the Ninth International Conference on Human-Computer Interaction 2001. pp. 149-153.

1999
 
Lewis, James R. (1999): Tradeoffs in the Design of the IBM Computer Usability Satisfaction Questionnaires. In: Bullinger, Hans-Jörg (ed.) HCI International 1999 - Proceedings of the 8th International Conference on Human-Computer Interaction August 22-26, 1999, Munich, Germany. pp. 1023-1027.

1995
 
Lewis, James R. (1995): IBM Computer Usability Satisfaction Questionnaires: Psychometric Evaluation and Instructions for Use. In International Journal of Human-Computer Interaction, 7 (1) pp. 57-78.

This article describes recent research in subjective usability measurement at IBM, focused on evaluating the psychometric properties of questionnaires designed for use in scenario-based usability evaluation. The questionnaires address evaluation at both a global overall system level and at a more detailed scenario level. The primary goals of this article are to (a) discuss the psychometric characteristics of IBM questionnaires that measure user satisfaction with computer system usability, and (b) provide the questionnaires, with administration and scoring instructions. For scenario-level measurement, the 3-item After-Scenario Questionnaire (ASQ) has excellent internal consistency, with coefficient alphas across a set of scenarios ranging from .90 to .96. For more global assessment, the Post-Study System Usability Questionnaire (PSSUQ) also has excellent internal consistency, with an overall coefficient alpha of .97. Preliminary principal factor analysis of 48 PSSUQ questionnaires suggested the presence of three factors named, after varimax rotation, System Usefulness, Information Quality, and Interface Quality, with corresponding coefficient alphas of .96, .91, and .91. Evaluation of 377 PSSUQ questionnaires (modified to allow mailing to respondents in their offices and referred to as the Computer System Usability Questionnaire, or CSUQ) confirmed the structure of the preliminary principal factor analysis. Consequently, usability practitioners can use these questionnaires to help them measure users' satisfaction with the usability of computer systems in the context of scenario-based usability studies.

© All rights reserved Lewis and/or Lawrence Erlbaum Associates
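
The coefficient alphas reported in the abstract above are internal-consistency estimates. As a minimal sketch of how such a value is usually computed (Cronbach's alpha from item variances and the variance of the summed scale), assuming one row of item ratings per respondent; the function name and the data are hypothetical and unrelated to the ASQ or PSSUQ data sets.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Cronbach's coefficient alpha.

    rows: one list of item scores per respondent, all the same length.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from five respondents on a three-item questionnaire.
ratings = [
    [1.0, 2.0, 1.0],
    [2.0, 2.0, 3.0],
    [4.0, 3.0, 4.0],
    [5.0, 6.0, 5.0],
    [7.0, 6.0, 7.0],
]
print(round(coefficient_alpha(ratings), 2))
```

Values close to 1 indicate that the items rise and fall together across respondents.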

1993
 
Lewis, James R. (1993): Multipoint Scales: Mean and Median Differences and Observed Significance Levels. In International Journal of Human-Computer Interaction, 5 (4) pp. 383-392.

Researchers in human-computer interaction (HCI) often use discrete multipoint scales (such as 5- or 7-point scales) to measure user satisfaction and preference. Many knowledgeable authors state that the median is the appropriate measure of central tendency for such ordinal scales, although others challenge this assertion. This article introduces a new point of view, based on a human factors consideration. When decision makers read a usability report or attend a briefing, they may make decisions based on the magnitude of the difference between the measures of central tendency for key dependent variables. A major criterion that should affect the choice of presenting means or medians is the strength of the relationship between this difference and the observed significance levels of appropriate statistical tests. The results from two series of "real-world" usability studies showed that the mean difference correlated more than the median difference with the observed significance levels (both parametric and nonparametric) for discrete multipoint scale data. Therefore, for these scales in this measurement context, the mean can be a better measure of central tendency than the median. The results also provided evidence that mean differences for 7-point scales correlate more strongly with observed significance levels than those for 5-point scales.

© All rights reserved Lewis and/or Lawrence Erlbaum Associates

 
Lewis, James R. (1993): Problem Discovery in Usability Studies: A Model Based on the Binomial Probability Formula. In: Proceedings of the Fifth International Conference on Human-Computer Interaction 1993. pp. 666-671.

Product developers want their products to be as easy to use as possible, but must consider constraints such as cost and schedule. The primary goal of many usability studies is to discover design problems. After discovery, designers can take steps to eliminate or minimize problem impact. This paper shows that problem discovery in usability studies is consistent with the binomial probability formula. The problem discovery curves from two recent studies lend empirical support to this problem discovery model. One practical application of the model is to help estimate appropriate sample sizes for problem discovery usability studies. This model can help usability researchers simultaneously consider cost (minimized by running as small a sample as possible) and risk (minimized by running as large a sample as possible) to maximize the efficiency of a study.

© All rights reserved Lewis and/or Elsevier Science
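
The abstract above does not write the formula out. The sketch below assumes the commonly cited cumulative binomial form, 1 - (1 - p)^n, for the chance that a problem with per-participant detection probability p is seen at least once among n participants, and inverts it to get a sample size. Function names and example values are illustrative only.

```python
import math


def proportion_discovered(p, n):
    """Probability that a problem with detection probability p
    is observed at least once in a sample of n participants."""
    if not 0 <= p <= 1 or n < 0:
        raise ValueError("need 0 <= p <= 1 and n >= 0")
    return 1 - (1 - p) ** n


def sample_size_for(p, goal):
    """Smallest n for which proportion_discovered(p, n) >= goal."""
    if not 0 < p < 1 or not 0 < goal < 1:
        raise ValueError("p and goal must be strictly between 0 and 1")
    return math.ceil(math.log(1 - goal) / math.log(1 - p))


# Hypothetical planning question: problems that affect 30% of users,
# with a goal of an 85% chance of seeing each of them at least once.
n = sample_size_for(0.3, 0.85)
print(n, proportion_discovered(0.3, n))
```

The trade-off named in the abstract shows up directly: lowering the goal or targeting more frequent problems shrinks n (cost), while rarer problems or higher goals grow it (risk reduction).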

 
Lewis, James R. (1993): Problem Discovery in Usability Studies: A Model Based on the Binomial Probability Formula. In: Smith, Michael J. and Salvendy, Gavriel (eds.) HCI International 1993 - Proceedings of the Fifth International Conference on Human-Computer Interaction - Volume 1 August 8-13, 1993, Orlando, Florida, USA. pp. 666-671.

1992
 
Lewis, James R. (1992): Psychometric Evaluation of the Post-Study System Usability Questionnaire: The PSSUQ. In: Proceedings of the Human Factors Society 36th Annual Meeting 1992. pp. 1259-1263.

Usability evaluators used an 18-item, post-study questionnaire in three related usability tests. I conducted an exploratory factor analysis to investigate statistical justification to combine items into subscales. The factor analysis indicated that three factors accounted for 87 percent of the total variance. Coefficient alpha analyses showed that the reliability of the overall summative scale was .97, and ranged from .91 to .96 for the three subscales. In the sensitivity analyses, the overall scale and all three subscales detected significant differences among the user groups; and one subscale indicated a significant system effect. Correlation analyses support the validity of the scales. The overall scale correlated highly with the sum of the After-Scenario Questionnaire ratings that participants gave after each scenario. The overall scale also correlated moderately with the percentage of successful scenario completion. These results are consistent with the hypothesis that these alternative measurements tap into a common underlying construct. This construct is probably usability, based on the content of the questionnaire items and the measurement context.

© All rights reserved Lewis and/or Human Factors Society

1991
 
Lewis, James R. (1991): Psychometric Evaluation of an After-Scenario Questionnaire for Computer Usability Studies: The ASQ. In ACM SIGCHI Bulletin, 23 (1) pp. 78-81.

A three-item after-scenario questionnaire was used in three related usability tests in different areas of the United States. The studies had eight scenarios in common. After participants finished a scenario, they completed the After-Scenario Questionnaire (the ASQ). A factor analysis of the responses to the ASQ items revealed that an eight-factor solution explained 94 percent of the variability of the 24 (eight scenarios by three items per scenario) items. The varimax-rotated factor pattern showed that these eight were clearly associated with the eight scenarios. The benefit of this research to system designers is that this three-item questionnaire has acceptable psychometric properties of reliability, sensitivity, and concurrent validity, and may be used with confidence in other, similar usability studies.

© All rights reserved Lewis and/or ACM Press

 
Lewis, James R. (1991): An After-Scenario Questionnaire for Usability Studies: Psychometric Evaluation Over Three Trials. In ACM SIGCHI Bulletin, 23 (4) p. 79.

 
Loricchio, David F. and Lewis, James R. (1991): User Assessment of Standard and Reduced-Size Numeric Keypads. In: Proceedings of the Human Factors Society 35th Annual Meeting 1991. pp. 251-252.

As technology improves, portable computers become smaller and more compact. A clear design challenge is to provide a system that is as compact as possible without degrading system usability. The keyboard is still the primary input device for compact computers. Previous research has indicated that reduced key spacing adversely affects skilled typing. Therefore, a portable computer system should provide a keyboard with full-sized keys in the primary typing area. The purpose of this study was to determine if reducing key size and spacing adversely affects the usability of a numeric keypad. Skilled keypad operators compared a standard-size numeric keypad to two keypads that had reduced center-to-center key spacing. One of these keypads achieved its reduction primarily by reducing the key spacing. The other reduced both key size and spacing. (Note that the small changes in key size and spacing have little effect on the overall device dimensions of a numeric keypad.) Operators typed numbers faster with and preferred the standard keypad over the keypad with both reduced key size and key spacing. If a numeric keypad is offered as part of a portable computer, every effort should be made to provide full-sized keys. If reduced key spacing is unavoidable, wide keys are preferable to narrow keys.

© All rights reserved Loricchio and Lewis and/or Human Factors Society

 
Lewis, James R. (1991): A Rank-Based Method for the Usability Comparison of Competing Products. In: Proceedings of the Human Factors Society 35th Annual Meeting 1991. pp. 1312-1316.

© All rights reserved Lewis and/or Human Factors Society

1990
 
Lewis, James R., Henry, Suzanne C. and Mack, Robert L. (1990): Integrated Office Software Benchmarks: A Case Study. In: Diaper, Dan, Gilmore, David J., Cockton, Gilbert and Shackel, Brian (eds.) INTERACT 90 - 3rd IFIP International Conference on Human-Computer Interaction August 27-31, 1990, Cambridge, UK. pp. 337-343.

In this paper we present a case study of a benchmark evaluation of integrated office systems. The case study includes developing scenarios, benchmark measures, and quantitative and qualitative analysis of user performance and user problems. We studied two systems, one loosely integrated windowing environment and one more tightly integrated (with respect to consistent graphical interface style). Multivariate analyses showed that significant differences were attributable to performance/analytical variables and to patterns of error impact classifications, but not to subjective ratings. Somewhat surprisingly, users experienced serious problems with the seemingly more integrated (consistent) system largely because of a handful of serious problems. This was taken as evidence that improvement of the poorer performing system should be based primarily on an analysis of errors. Some examples are presented to indicate the potential diagnostic value of analyzing problems and the development of testable behavioral objectives from benchmark measures.

© All rights reserved Lewis et al. and/or North-Holland

 
Lewis, James R. (1990): The Iowa Silent Reading Test's Comprehension Section: Local Norms and Predictive Validity for Usability Studies. In: Woods, D. and Roth, E. (eds.) Proceedings of the Human Factors Society 34th Annual Meeting 1990, Santa Monica, USA. pp. 922-926.

© All rights reserved Lewis and/or Human Factors Society

1989
 
Lewis, James R. (1989): Pairs of Latin Squares to Counterbalance Sequential Effects and Pairing of Conditions and Stimuli. In: Proceedings of the Human Factors Society 33rd Annual Meeting 1989. pp. 1223-1227.

This paper discusses methods with which one can simultaneously counterbalance immediate sequential effects and pairing of conditions and stimuli in a within-subjects design using pairs of Latin squares. Within-subjects (repeated measures) experiments are common in human factors research. The designer of such an experiment must develop a scheme to ensure that the conditions and stimuli are not confounded, or randomly order stimuli and conditions. While randomization ensures balance in the long run, it is possible that a specific random sequence may not be acceptable. An alternative to randomization is to use Latin squares. The usual Latin square design ensures that each condition appears an equal number of times in each column of the square. Latin squares have been described which have the effect of counterbalancing immediate sequential effects. The objective of this work was to extend these earlier efforts by developing procedures for designing pairs of Latin squares which ensure complete counter-balancing of immediate sequential effects for both conditions and stimuli, and also ensure that conditions and stimuli are paired in the squares an equal number of times.

© All rights reserved Lewis and/or Human Factors Society
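
The abstract above builds on Latin squares that counterbalance immediate sequential effects. The sketch below shows only that building block, using the standard construction for an even number of conditions (often called a Williams design), in which every condition immediately precedes every other condition exactly once. It is not the paper's procedure for pairing two squares, and the function name is illustrative.

```python
def balanced_latin_square(n):
    """Latin square of order n (n even) in which each ordered pair of
    distinct conditions appears in adjacent positions exactly once.

    Rows are presentation orders for participants; entries are condition
    numbers 0..n-1.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this construction needs an even n >= 2")
    # First row interleaves low and high condition numbers: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Each later row shifts every entry of the first row by one more, modulo n.
    return [[(c + shift) % n for c in first] for shift in range(n)]


for order in balanced_latin_square(4):
    print(order)
```

Each row and each column contains every condition once, and across the four rows all twelve ordered pairs of adjacent conditions occur exactly once.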

 

Page Information

Page maintainer: The Editorial Team
URL: http://www.interaction-design.org/references/authors/james_r__lewis.html

Publication statistics

Pub. period: 1989-2011
Pub. count: 22
Number of co-authors: 5



Co-authors

Number of publications with 3 favourite co-authors:

Jeff Sauro: 4
H. Wang: 1
David F. Loricchio: 1

Productive colleagues

James R. Lewis's 3 most productive colleagues in number of publications:

Robert L. Mack: 17
Jeff Sauro: 10
H. Wang: 6
 