Building a User-Defined Interface

Dennis Wixon, John Whiteside, Michael Good, and Sandra Jones

Digital Equipment Corporation
Maynard, Massachusetts, USA

Originally published in Proceedings of CHI '83 Human Factors in Computing Systems (Boston, December 12-15, 1983), ACM, New York, pp. 24-27. Included here with permission. Copyright © 1983 by ACM, Inc.

Abstract

A measurably easy-to-use interface has been built using a novel technique. Novices attempted an electronic mail task using a command-line interface containing no help, no menus, no documentation, and no instruction. A hidden operator intercepted commands when necessary, creating the illusion of a true interactive session. The software was repeatedly revised to recognize users' new commands; in essence, the users defined the interface. This procedure was used on 67 subjects. The first version of the software could recognize only 7% of all the subjects' spontaneously generated commands; the final version could recognize 76% of those commands. This experience contradicts the idea that people are not good at designing their own command languages. Through careful observation and analysis of user behavior, a mail interface unusable by novices evolved into one that let novices do useful work within minutes.

 

 

In this research project we attempted to build an interface that would meet a stringent test for ease of use. The interface was to allow a novice user to perform useful work during the first hour of use. However, the interface was to contain no help, no menus, no documentation, and no instruction. The interface was to be so natural that a novice would not have to be trained to use it.

Traditional design methods do not lead to this type of easy-to-use interface. Therefore we had to create a new software interface design method. The method we chose is based on two principles:

  1. An interface should be built based on the behavior of actual users.
  2. An interface should be evolved iteratively based on continued testing.

To carry out this method, we created a situation where users were given an electronic mail task which they were to perform on a computer. Since the users received no instruction on how to perform this task on the computer, they were forced to generate their own command language syntax and semantics. User commands which were not accepted by the software alerted a hidden operator who translated them into recognized system commands. This procedure gave users the illusion of dealing only with a computer while allowing them to issue commands spontaneously.
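The hidden-operator arrangement described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the software used in the study: the `SYNONYMS` vocabulary, the `parse` function, and the `Session` class are all hypothetical names, and the scripted `operator` callable merely stands in for the human operator.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical vocabulary: maps words a user might type to system commands.
# These entries are illustrative, not the vocabulary used in the study.
SYNONYMS = {
    "send": "SEND", "mail": "SEND",
    "read": "READ", "show": "READ",
    "delete": "DELETE", "erase": "DELETE",
}


def parse(command: str) -> Optional[str]:
    """Return a system command if the first word is known, else None."""
    words = command.lower().split()
    if not words:
        return None
    return SYNONYMS.get(words[0])


@dataclass
class Session:
    operator: Callable[[str], str]            # stands in for the hidden human
    log: list = field(default_factory=list)   # (user input, system command, parsed unaided?)

    def submit(self, command: str) -> str:
        result = parse(command)
        unaided = result is not None
        if not unaided:
            # Unrecognized input is routed to the operator; the user sees no difference.
            result = self.operator(command)
        self.log.append((command, result, unaided))
        return result


# A scripted stand-in for the operator, for demonstration only.
session = Session(operator=lambda text: "READ")
session.submit("show message 2")           # handled by the parser
session.submit("let me see my new mail")   # falls through to the operator
```

Recording whether each command was parsed unaided is what makes the session log usable for later revision of the parser.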

The computer system kept a complete log of each session. We made changes to the user interface based on analysis of these logs. In this way the software was continuously tested and modified; the effectiveness of our changes was measured by subsequent users' reactions to the new software. Thus, user input was the driving force behind the design of the software, creating a user-defined interface (UDI).

This method is based on work by Chapanis (1982) and his associates at Johns Hopkins. They applied this type of technique to the design of checkbook- and calendar-management programs. Gould, Conti, and Hovanyecz (1981) have used a hidden-operator scenario to simulate a listening typewriter.

Not counting the pilot subjects we ran to develop the procedure, we ran 67 subjects between April and October 1982. During this time, the software was regularly updated to recognize command forms used by previous subjects. The subjects were novice computer users recruited through posters placed in colleges and commercial establishments. Most of the subjects had less than one year's experience with computers; none had ever used electronic mail. Figure 1 shows the final version of the experimental task.

Welcome. For the next hour or so you will be working with an experimental system designed to process computerized mail. This system is different than other systems since it has been specifically designed to accommodate to a wide variety of users. You should feel free to experiment with the system and to give it commands which seem natural and logical to use.

Today we want you to use the system to accomplish the following tasks:

Figure 1: Task Used in the UDI Experiment

Our most important ease-of-use metric was the percentage of user commands that the software recognized correctly, on its own, without human intervention. We called this percentage the "hit rate." Over the course of the experiment the software grew in complexity, with the changes inspired by the behavior of previous subjects. Did the addition of these changes improve the hit rate?
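As a small illustration of the metric, the hit rate for one session can be computed from a per-command record of whether the operator had to step in. The function name and the example log below are hypothetical, not data from the study.

```python
def hit_rate(log):
    """Percentage of commands parsed without operator help.

    `log` is a list of booleans, one per user command: True if the software
    recognized the command on its own, False if the operator intervened.
    """
    if not log:
        raise ValueError("hit rate is undefined for an empty session")
    return 100.0 * sum(log) / len(log)


# Hypothetical session of ten commands, seven recognized unaided.
example_log = [True, True, False, True, True, False, True, True, False, True]
print(hit_rate(example_log))  # 70.0
```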

Figure 2 shows the relation between the hit rate and the number of rules and words (synonyms) in the UDI parser. Number of rules and words is a straightforward measure of parser size and complexity. The figure is a scatterplot showing where each subject fell in the space defined by the two variables. Notice the strong positive correlation shown in the plot; the more complex the parser, the higher the hit rate. The correlation coefficient between these two variables was .70. This means that 49% of the total variability in hit rate can be accounted for by the state of the parser.
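The 49% figure comes from squaring the correlation coefficient, which for a linear relation between two variables gives the proportion of variance in one accounted for by the other:

```latex
\[
  r = 0.70 \quad\Longrightarrow\quad r^{2} = (0.70)^{2} = 0.49 = 49\%
\]
```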

Figure 2: Relation Between Hit Rate and Number of Parser Rules

The lowest hit rate we observed was 0% (no commands at all recognized) with an early version of the parser. The highest was 90%, which we achieved near the end of testing. The mean hit rate for our first version was 3%, as opposed to 80% for the last version. The scatterplot shows that as the parser grew, the interface got better, as defined by hit rate. Therefore, the method of adding parser rules based on user behavior seems to work. What sorts of rules were added, and how effective were they individually?

Table 1 presents brief descriptions of the various changes that were made during the experiment. In Table 1, we also attempt to show the relative importance of each change. We show for each change the percentage of commands which required that change in order to be parsed. This data is based on the set of 1070 spontaneous commands generated by all the subjects during the course of the study.
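One way to tabulate percentages of this kind is sketched below. It assumes each logged command has been annotated with the set of change numbers it needs in order to parse; that annotation scheme, the function name, and the four-command sample are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def percent_requiring(commands):
    """Map each change number to the percentage of commands that need it.

    `commands` is a list of sets; each set holds the change numbers a
    command requires in order to be parsed (an assumed annotation).
    """
    counts = Counter(change for needed in commands for change in needed)
    return {change: 100.0 * n / len(commands) for change, n in sorted(counts.items())}


# Hypothetical sample: four commands, each tagged with the changes it needs.
sample = [{0}, {0, 1}, {1, 2}, {1}]
result = percent_requiring(sample)
print(result)  # {0: 50.0, 1: 75.0, 2: 25.0}
```

Because a single command can require more than one change, percentages computed this way need not sum to 100.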

Change #   Brief Description                                       % of commands that require this rule *
0          Version 1 (starting point)                              7
1          Include 3 most frequently used synonyms for all items   33
2          Specify msg by any header field (title, name, etc.)