Users of personal computers often perform a large number of individual steps to carry out routine tasks. We discuss approaches to simplifying routine tasks, and then describe in detail a program which automates iterative tasks.
Keywords: programming by example, demonstrational interfaces, user programming, intelligent interfaces, adaptive systems, agents, programmer assistants, automation.
Users of mainframes and workstations have traditionally worked with custom-tailored programs that were designed for their particular task, such as printing the payroll for company XYZ. Users of personal computers, on the other hand, typically run application programs that were designed to handle generic activities -- for instance, Microsoft WORD for word processing, Informix WINGZ and Microsoft Excel for spreadsheets, Aldus PageMaker and Quark XPress for desktop publishing. Personal computer users must find a way to perform their specific tasks using the pre-established commands that come with an application program. This inevitably means that various routine parts of the user's task will be performed not by pressing a single button, as would be the case with a customized program, but instead by performing a whole sequence of commands in a rote manner.
A major issue in improving personal computers, then, is to find ways for end-users to customize application programs.
The shift from batch processing to interactive computing is also responsible for a large amount of repetitive work performed by contemporary computer users. In batch processing, a long program performs all of the steps in a complicated task. The program makes all of the control decisions. In an interactive environment, the human user makes all of the control decisions, telling the computer at every step of the process what to do next. Essentially, the control structure of the mainframe program is replaced by a human, and the computer simply performs primitive actions in response to menu selections or clicks on icons. This is not a step backwards: it is a step forwards in making computers accessible to a wider audience of users -- people who do not have the option of having their entire task programmed for automatic execution by a computer. Before personal computers became prevalent, these people had to perform their tasks entirely by hand, with the aid, at best, of an arithmetic calculator.
Another important class of personal computer users consists of people who perform tasks that are fairly routine, but that contain some highly idiosyncratic and non-uniform steps. It would not be possible to foresee all of the different situations which may arise and to write a program to handle these varying situations. For instance, users who write papers on a word processor perform certain routine tasks, such as setting margins and formatting bibliographies, but the bulk of their work -- entering text -- is idiosyncratic and must be done interactively.
Currently available customizations
In general, customization refers to any capability that makes a generic program more suitable to a specific user need. Several types of customization are currently available in applications.
Preferences. The most prevalent customizing feature is Preference, or default, settings. This feature allows users to choose settings for a few of the variables in an application. The application remembers those settings and automatically applies them whenever the program is started. For instance, word processors often allow users to choose the placement of footnotes, whether words should be hyphenated, and the like. A preference customization saves the user from manually selecting each setting every time the application is used.
Preference settings are useful, but they offer only limited customization power. Their most notable limitation is that they only cover features that the application developer had the foresight to include as preferences. For instance, if a user wants to always use the Helvetica font, and the application developer did not include fonts as a preference, then the user must manually select Helvetica every time he or she starts the application.
Templates. Another limitation of preferences is that they are context-independent. It is common for users to have one set of settings that are appropriate, say, for typing a memo, and another set that is appropriate for technical reports. Preferences are only useful for settings that apply globally across all uses of an application program.
Context-dependent features can be customized through the use of templates. A template is an entire context which can be saved and restored. Users often create their own templates by saving a typical document and creating new documents by copying and editing the typical document. For instance, an accountant who uses a spreadsheet application to produce monthly financial statements may produce the statement for February by making a copy of January's statement and editing the numbers in the entries.
Several word processors have paragraph style templates which allow users to specify common sets of margin, tab, font, size, and indentation settings. A user could have one template which properly formats bibliography entries, one for tables, and another for standard paragraphs of text.
Templates allow the user to give a single "select template" command instead of performing the entire sequence of settings individually. Templates are a very effective means of customizing an application to meet the idiosyncratic needs of users.
Once again, templates are limited to feature sets that the application developer foresaw as being important. If a user wants the first word of a paragraph to be in large, bold type, the current paragraph style templates will not work, since they do not record information about individual words within a paragraph.
Brad Myers [Myers 91] is conducting research on using programming-by-example techniques to automatically determine complex paragraph styles from several examples of the style provided by the user.
Automating activities. Preference sheets and templates automate the setting of various variables that affect the way an application behaves. Beyond this, users often want to automate repetitive activities that they perform in an application program. If the accountant for company XYZ could buy a custom "XYZ Monthly Report" program, it would be possible to produce a monthly report by clicking on just a few buttons (e.g. a button to "Download Sales Data for February"). In reality, however, the accountant must use a generic spreadsheet program to produce the monthly report. Therefore, an activity like obtaining the monthly sales data will involve a sequence of commands: opening the February Sales Data document, selecting the relevant columns of data, and copying and pasting them into the appropriate place in the Monthly Report document.
Macros. The most common approach to automating sequences of user actions is to record macros. A macro is a recorded sequence of user commands that can be replayed to perform the sequence again. For instance, the sequence "Move down one cell; Paste" in a spreadsheet program could be repeated many times to initialize the data in a column.
Macro tools generally work by having the user select "Start Recording", perform the desired commands, and then select "Stop Recording". The macro tool then stores the sequence of commands so that the user can invoke it repeatedly, either immediately or in the future.
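The record-and-replay cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names MacroRecorder, Column, move_down, and paste are hypothetical and do not come from any real macro tool. The example replays the "Move down one cell; Paste" sequence mentioned earlier.

```python
class MacroRecorder:
    """Hypothetical recorder: stores a fixed command sequence and replays it verbatim."""

    def __init__(self):
        self.recording = False
        self.commands = []

    def start_recording(self):
        self.commands = []
        self.recording = True

    def stop_recording(self):
        self.recording = False

    def perform(self, command, target):
        """Run a command immediately and, while recording, remember it."""
        command(target)
        if self.recording:
            self.commands.append(command)

    def replay(self, target, times=1):
        """Re-run the stored commands in order, with no generalization."""
        for _ in range(times):
            for command in self.commands:
                command(target)


class Column:
    """Toy stand-in for one spreadsheet column with a cursor and a clipboard."""

    def __init__(self, size, clipboard):
        self.cells = [None] * size
        self.row = 0
        self.clipboard = clipboard


def move_down(column):
    column.row += 1


def paste(column):
    column.cells[column.row] = column.clipboard


column = Column(size=5, clipboard=0)
recorder = MacroRecorder()

recorder.start_recording()
recorder.perform(move_down, column)  # the user performs the sequence once
recorder.perform(paste, column)
recorder.stop_recording()

recorder.replay(column, times=3)     # the tool repeats it three more times
print(column.cells)
```

Because the recorder stores the commands exactly as performed, it can only repeat the same steps; it has no way to vary them from one repetition to the next.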
Macros are an extremely effective means of automating repetitive activities that involve exactly the same sequence of commands on each repetition. Telecommunications applications sometimes contain macro recorders to automate the process of logging on to a remote machine. Some spreadsheet applications contain macro recorders to automate rote formatting or data entry activities.
The Apple Macintosh has several system-level macro tools (e.g. CE Software's QuicKeys, Affinity Microsystems' Tempo, Apple's MacroMaker) that automate activities independent of the application involved. For instance, a user of a word processor that did not have a "paragraph styles" command could accomplish the same thing with a macro tool by recording the actions of selecting a font and font size, adjusting the margins on the ruler, and inserting tab stops. Replaying the macro would perform all of these steps automatically.
The IBM PC has a macro tool called Key Watch which constantly monitors user keystrokes. Whenever a sequence of keystrokes is immediately repeated, the tool beeps to alert the user to the repetition. The user may then press a single key which performs the entire sequence of keystrokes again.
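One way to detect an immediately repeated sequence is to check whether the most recent keystrokes are preceded by an identical copy of themselves. The Python sketch below illustrates that idea only; it is not Key Watch's actual algorithm, and the function name find_immediate_repeat and the min_len parameter are assumptions made for the example.

```python
def find_immediate_repeat(history, min_len=2):
    """Return the longest run that ends the history and is immediately
    preceded by an identical run, or None if there is no such run."""
    n = len(history)
    # Try the longest possible run first, then shorter ones.
    for length in range(n // 2, min_len - 1, -1):
        latest = history[n - length:]
        previous = history[n - 2 * length:n - length]
        if latest == previous:
            return latest
    return None


keys = list("xyabcabc")
print(find_immediate_repeat(keys))          # the repeated run, if any
print(find_immediate_repeat(list("abcd")))  # no repetition here
```

A tool built this way would run the check after every keystroke and alert the user as soon as it returns a run.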
Scripting. Another approach to providing customizability in applications is to include a scripting language with the application. A scripting language is a simplified programming language. Much of the simplicity comes from the fact that the language attempts to be similar to the user's natural language (e.g. English- or Russian-like syntax and vocabulary). For instance, in the HyperCard program, the location of the "message box" is specified in the scripting language by "the location of the message box", whereas in the programming language Pascal, it is specified by "GlobalToLocal(messageWindow^.portBits.bounds.topLeft)". Another source of simplicity is that the words in the scripting language refer to the objects and actions of that application, rather than to generic computer-science concepts. In the example above, the scripting language refers to the "location" of the application object "message box", while Pascal refers to the "bounds" of the "bits" of the "window" that corresponds to this box.
Scripting languages are a powerful way to allow for "End-User Programming". That is, they can bring the full power of programming to people who use application programs. The extremely significant drawback to this approach is that most end-users are not interested in programming; they are accountants, administrative assistants, and graphic artists who are far more concerned with accomplishing the task at hand than with learning how to program a computer. Nonetheless, scripting languages are easier to learn than conventional programming languages. As a result, there are fewer obstacles to learning how to write a script than there are to learning how to write a program, there is a greater possibility that a colleague in the office will know how to write a script, and it is more likely that an end-user can figure out how to modify someone else's script so that it will suit his or her needs. Bonnie Nardi [Nardi 90] has shown that scripting languages are very successful in the spreadsheet domain, and that this is due in large part to the cooperative way in which end users assist one another and modify each other's scripts.
The remainder of this paper describes a macro tool, called Eager, which automates repetitive user activities. The goal of the Eager program is to extend the capabilities of macro tools to include some of the power and flexibility that is currently only available through scripting languages. Thus, users will get some of the benefits of a scripting language without having to learn how to write scripts.
Eager is currently implemented for the HyperCard program which runs on Apple computers. Eager differs from other macro tools in that it is able to generalize user actions. This means that it can generate programs with loops and variables, whereas standard macro tools can only repeat fixed sequences of commands.
The Eager program is designed to detect situations where the user performs a sequence of actions and then immediately repeats the sequence. In this respect, it is like the Key Watch program mentioned above: it is always on, always monitoring the user's actions. When it detects a repeated sequence of actions, the Eager character (an icon of a cat) appears on the user's screen. Eager differs from Key Watch in that 1) it records high-level commands rather than keystrokes, and 2) it generalizes.
High-level commands. Rather than recording individual keystrokes and mouse clicks, Eager receives high-level descriptions of user actions from an application, after the application has processed the user actions and converted them into meaningful events. For instance, if the user uses the mouse to move a HyperCard button, the keystroke/mouse-click description of the action might be "MouseDown at (0,21); MouseUp at (80,21)". In contrast, the high-level event description of the action might be "Move button #1 from (0,21) to (80,21)". The low-level description does not even record that a button was located at the position where the mouse was clicked.
Generalizing. The Eager program compares user actions in the first sequence with user actions in the second sequence and applies a pattern-matching algorithm to determine whether two actions are "similar". This allows the program to identify repetitions that are not exactly alike, but that nonetheless follow some regular pattern. For instance, if the second sequence of user actions is the high-level event "Move button #2 from (16,50) to (96,50)", the Eager pattern matcher would infer that the ith sequence is "Move button #i from its current position to a position 80 pixels to the right".
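The inference in this example can be sketched in Python. The sketch is not Eager's pattern matcher: it handles only "move button" events, and the names MoveButton, MoveRule, displacement, generalize, and predict are hypothetical. The position (32,79) used for button #3 is an assumption made for the demonstration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class MoveButton:
    """A high-level event: one button moved from start to end."""
    button: int
    start: Point
    end: Point


@dataclass(frozen=True)
class MoveRule:
    """Generalization: button numbers advance by a step, positions shift by an offset."""
    first_button: int
    button_step: int
    offset: Point


def displacement(event: MoveButton) -> Point:
    return (event.end[0] - event.start[0], event.end[1] - event.start[1])


def generalize(first: MoveButton, second: MoveButton) -> Optional[MoveRule]:
    """Return a rule if both events share the same displacement, else None."""
    if displacement(first) != displacement(second):
        return None
    step = second.button - first.button
    if step == 0:
        return None
    return MoveRule(first.button, step, displacement(first))


def predict(rule: MoveRule, iteration: int, current_position: Point) -> MoveButton:
    """Instantiate the rule for a given iteration (1-based)."""
    button = rule.first_button + rule.button_step * (iteration - 1)
    end = (current_position[0] + rule.offset[0],
           current_position[1] + rule.offset[1])
    return MoveButton(button, current_position, end)


first = MoveButton(1, (0, 21), (80, 21))
second = MoveButton(2, (16, 50), (96, 50))
rule = generalize(first, second)
print(rule)
print(predict(rule, 3, (32, 79)))  # assumes button #3 currently sits at (32,79)
```

Representing events at this level is what makes the comparison possible: the raw mouse coordinates alone would not say which button was moved.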
Programming by Example
The technique of recording user actions and writing programs that generalize those actions is known as "programming by example". The first programming by example system was written by David C. Smith [Smith 77] in 1977. Called Pygmalion, it enabled users to construct programs (such as a "factorial" program) by giving an example of how the program should behave. Henry Lieberman extended this technique to generalize from multiple examples in his Tinker program [Lieberman 87]. Tinker was also the first programming by example system to automatically infer the location of conditional branches in a program. Laura Gould's "Programming by Rehearsal" [Gould 84] introduced an effective style of user interaction for programming by example, although it did not generalize from the user's actions. Dan Halbert's SmallStar program [Halbert 84] also did not generalize, but it introduced a mixed text-and-graphics language for displaying recorded actions and for enabling users to explicitly generalize the recorded actions. Brad Myers's PERIDOT program [Myers 88] used programming by example to enable users to construct graphical user interface devices such as menus and scroll bars. David Maulsby's MetaMouse [Maulsby 89] applied programming by example to a simple Draw program. MetaMouse was able to infer iteration and conditionals, and is the most powerful programming by example system to date.
Eager is distinguished from other programming by example systems in that it is very easy to use. It is designed so that it does not intrude on the user's normal interaction with the computer. In comparison, all other PBE (programming by example) systems require the user to interact extensively with the system. For instance, they require the user to explicitly indicate the start and end of each sequence, and to answer questions after each action about whether the system's generalization of the action is correct.
A user study [Karimi 89] showed that first-time users were able to figure out how to use Eager without receiving any instruction.
Ambiguity. A key problem for PBE systems is that there is often ambiguity involved in generalizing from examples. This is the classic problem of inductive inference. In the situation described earlier, where the user first performs the action "Move button #1 from (0,21) to (80,21)" and then performs the action "Move button #2 from (16,50) to (96,50)", it was suggested that the appropriate generalization was "Move button #i from its current position to a position 80 pixels to the right". An alternative generalization would be to "Move the button at (32,79) to a position 80 pixels to the right". Another alternative generalization would be to "Move button #i to the right of its current position until its horizontal position is 80 + 16 * (i-1)." Furthermore, if both button #1 and button #2 were named 'Next Step', another generalization would be to "Move all buttons named 'Next Step' from their current positions to a position 80 pixels to the right".
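The ambiguity can be made concrete with a small Python sketch. It considers only horizontal positions and only two of the candidate generalizations above, and the names examples, hypotheses, and consistent are hypothetical. Both candidates agree with the two demonstrated moves, yet they disagree about a third button whose starting position is assumed here to be x = 40.

```python
# Each example is (button number, starting x, ending x), taken from the two moves above.
examples = [
    (1, 0, 80),
    (2, 16, 96),
]

# Two candidate generalizations, each predicting the ending x for button #i.
hypotheses = {
    "shift right by 80 pixels": lambda i, start_x: start_x + 80,
    "move to x = 80 + 16 * (i - 1)": lambda i, start_x: 80 + 16 * (i - 1),
}


def consistent(predict):
    """True if the hypothesis reproduces every demonstrated example."""
    return all(predict(i, start_x) == end_x for i, start_x, end_x in examples)


surviving = [name for name, predict in hypotheses.items() if consistent(predict)]
print(surviving)  # both candidates fit the two examples

# Assumed third case: button #3 starts at x = 40 rather than on the 16-pixel grid.
predictions = {name: hypotheses[name](3, 40) for name in surviving}
print(predictions)  # the candidates now predict different end positions
```

Two examples are therefore not enough to choose between the candidates; only a further action by the user can separate them.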
Most PBE systems deal with ambiguity by asking the user to confirm each generalization that the system makes. Continually interacting with the PBE system distracts the user from the task of performing the example correctly, and can be both tedious and intrusive.
Anticipation. Eager addresses the ambiguity problem through a novel user interface technique called "anticipation". When the Eager program detects a repeated sequence of actions, it "anticipates" the next action that the user will perform by adding green highlighting to the appropriate item on the user's screen. For instance, a menu item, object, or text selection will appear in green to indicate that the user is expected to select that item. The Eager program does not actually select a menu item or click on a button; it simply changes the item's color. If the Eager program's inference is correct, the user will notice that all of his or her actions are anticipated with green highlighting. If the Eager program makes an incorrect generalization, the user's action will differ from the anticipated action. The program then uses the actual action as further information for the pattern matcher, and attempts to find a different generalization which conforms with this new information. Anticipation allows users to continue performing their repetitive activity without interruption, and the Eager program will continually modify its generalizations to conform with the user's actions (within the limits, of course, of the pattern matcher's capabilities).
When the user is confident that the anticipated actions correspond properly with the intended actions, the user can click on the Eager icon, and the Eager program will take over and complete the repetitive task automatically.
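The anticipate, compare, and revise cycle can be sketched in Python. This is a simplification rather than Eager's implementation: it assumes the candidate generalizations are enumerated in advance as an ordered list, whereas Eager is described as searching for a new generalization with its pattern matcher. The function anticipate_and_revise and the two candidate rules are hypothetical.

```python
def anticipate_and_revise(hypotheses, actual_actions):
    """hypotheses: ordered list of (name, predict), where predict(step) gives
    the action expected at that step. Returns the (anticipated, actual) log
    and the names of the hypotheses still consistent at the end."""
    candidates = list(hypotheses)
    log = []
    for step, actual in enumerate(actual_actions, start=1):
        # Anticipate with the preferred surviving hypothesis (this is where
        # the interface would highlight the item in green).
        anticipated = candidates[0][1](step) if candidates else None
        log.append((anticipated, actual))
        # Keep only hypotheses that agree with what the user actually did.
        candidates = [(name, predict) for name, predict in candidates
                      if predict(step) == actual]
    return log, [name for name, _ in candidates]


# Actions are button numbers the user clicks on successive iterations.
hypotheses = [
    ("every button", lambda step: step),
    ("every other button", lambda step: 2 * step - 1),
]
log, surviving = anticipate_and_revise(hypotheses, [1, 3, 5])
print(log)
print(surviving)
```

In this run the second anticipation is wrong, so the first candidate is dropped and the later anticipation comes from the remaining one. The user is never asked a question; the mismatch itself supplies the correction.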
There are two important advantages to using anticipation to inform the user about the generalizations made by the Eager program. First, the generalized actions are presented to the user in the same modality that the user employed to perform the actions in the first place. That is, if the user clicked on a button to perform an action, the anticipation highlights a button. In contrast, if the user were presented with the written message "Click on the button at (16,50)", the user would have to make a mental translation between this written description and the image on the screen. The user may not know that the objects on the screen are called "buttons", that the one last clicked on was located at "(16,50)", or that "(16,50)" refers to the horizontal and vertical distance in pixels between the top-left corner of an object and the top-left corner of the screen. The second advantage of anticipation is that it uses a concrete example to express an abstract generalization. Instead of presenting the generalization that on the ith iteration, the user is going to click on button #i, anticipation instantiates the generalization and presents the information that on the next iteration, the user is going to click on button #3. It is presumably easier for a user to understand a concrete example than an abstract generalization.
An example automation
Figure 1 demonstrates a repetitive task that is automated by the Eager program. The example shows part of the process of creating a "Help" card: the intent is that clicking on the button to the left of a topic will bring up a screen of information about that topic. In the example, the user changes the appearance of the buttons: they are to be rounded rectangles, displaying the letters A, B, C, etc. The user changes the buttons one at a time. The first button is changed with the following commands in HyperCard:
At this point, the user is confident that Eager has detected the correct pattern. The user clicks on the Eager icon, and the Eager program takes over and performs the iteration automatically on the remaining buttons. The final result is shown in Figure 1(d).
Eager currently does not handle conditionals, nested loops, repetitions that are spread out over time (e.g. inserting the date whenever you start a new letter), or tasks that include some steps that must be left up to the user. These restrictions help to constrain the sorts of patterns that the program must detect. If Eager's search fails to detect a match between an incoming event and the corresponding event in the previous iteration, it assumes that there is no repetitive task. In fact, the correct interpretation could be that there is a conditional branch in the program, or that there is an obscure pattern that must be left up to the user (e.g. addresses that do not contain the name of a corporation). Allowing for these possibilities would greatly increase the ambiguity in generalizing and the complexity of pattern-matching. Eager's success is partly due to the fact that it only tries to automate a limited, yet useful, range of tasks. Although its power is limited, it does handle almost all of the commands in HyperCard, unlike all other PBE systems which only work in limited, experimental domains.
[Cypher 91] Cypher, A. Eager: Programming Repetitive Tasks by Example. In Proceedings of CHI '91 (New Orleans, Apr. 28 - May 2). ACM, New York, 1991, pp. 33-39.
[Gould 84] Gould, L. and Finzer, W. Programming by Rehearsal. Xerox Palo Alto Research Center Technical Report SCL-84-1, May, 1984.
[Halbert 84] Halbert, D. Programming by Example. Xerox Office Systems Division Technical Report OSD-T8402, December, 1984.
[Karimi 89] Karimi, S. and Cypher, A. Eager: A User Study. Apple Computer Human Interface Group Technical Report 89-09, December, 1989.
[Lerner 89] Lerner, B. Automated Customization of User Interfaces. Carnegie Mellon University School of Computer Science Technical Report CMU-CS-89-178, September, 1989.
[Lieberman 87] Lieberman, H. An Example Based Environment for Beginning Programmers. In Artificial Intelligence and Education. Ablex, Norwood, N.J., 1987, pp. 135-151.
[Maulsby 89] Maulsby, D. and Witten, I. Inducing Programs in a Direct-Manipulation Environment. In Proceedings of CHI '89 (Austin, Apr. 30 - May 4). ACM, New York, 1989, pp. 57-62.
[Myers 88] Myers, B. Creating User Interfaces by Demonstration. Academic Press, San Diego, Calif., 1988.
[Myers 91] Myers, B. Text Formatting by Demonstration. In Proceedings of CHI '91 (New Orleans, Apr. 28 - May 2). ACM, New York, 1991, pp. 251-256.
[Nardi 90] Nardi, B. and Miller, J. The Spreadsheet Interface: A Basis for End User Programming. In INTERACT '90. Elsevier Science Publishers B.V. (North-Holland), 1990, pp. 977-983.
[Smith 77] Smith, D. Pygmalion: A Computer Program to Model and Stimulate Creative Thought. Birkhäuser, Basel, 1977.