Programming by Demonstration is a technique that can potentially solve this problem. Since by definition most of the steps in a repetitive task are the same as in the previous repetition, it should be possible for a computer program to automate the task by recording the steps and replaying them.
Figure 1. In this example, the user makes a list of the subjects of a stack of messages. The user selects (a) and copies the subject of the first message, goes to the list, types "1. " and pastes in the subject (b). Then the user navigates to the second message (c), selects (d) and copies its subject, goes to the list, types "2. ", and pastes in the second subject (e). After the user navigates to the third message, the Eager icon appears (f), indicating that Eager has detected a pattern in the user's actions. Eager does not immediately begin performing actions; rather, it uses green highlighting to "anticipate" what the user is going to do next. In (f), Eager is anticipating that the user will select the subject of the third message, which is correct. The user selects and copies the third subject, goes to the list, and Eager anticipates that the user will type "3. " (g). After the user does this, Eager anticipates that the user will paste (h). The user does the paste and then navigates to the fourth message, which is anticipated in (i). At this point, the user is confident that Eager can perform the task correctly and therefore clicks on the Eager icon (j). A dialog box appears, and the user selects the option "Finish The Task" (k). Eager writes and executes a program which completes the task for the user, and automatically terminates (l).
For integers, Eager searches for constants, consecutive integers, linear sequences, and linear sequences within a given tolerance (e.g. 50, 72, 91, ... is recognized as the sequence 20*i + 30, with a tolerance +/-2). Some of the numbers that HyperCard passes to Eager are: card numbers, screen coordinates for buttons, line numbers for text selections, and numbers as text and in card names.
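The tolerance test for integers can be illustrated with a short Python sketch. This is not Eager's implementation: the function names `fits_linear` and `linear_candidates` are hypothetical, and numbering iterations from i = 1 is an assumption chosen so that the example above works out. More than one (step, offset) pair can fit a sequence within a given tolerance; the text does not say how Eager chooses among them, so the sketch simply enumerates the candidates.

```python
from typing import List, Sequence, Tuple


def fits_linear(values: Sequence[int], step: int, offset: int,
                tolerance: int = 0) -> bool:
    """True if every values[i-1] is within `tolerance` of step*i + offset."""
    return all(abs(v - (step * i + offset)) <= tolerance
               for i, v in enumerate(values, start=1))


def linear_candidates(values: Sequence[int],
                      tolerance: int = 0) -> List[Tuple[int, int]]:
    """Enumerate integer (step, offset) pairs that fit within the tolerance.

    Hypothetical helper: a brute-force search, assumed for illustration.
    """
    if len(values) < 2:
        return []
    diffs = [b - a for a, b in zip(values, values[1:])]
    found = []
    # Any fitting step lies within 2*tolerance of every consecutive difference.
    for step in range(min(diffs) - 2 * tolerance,
                      max(diffs) + 2 * tolerance + 1):
        # Each value constrains the offset to an interval; intersect them.
        lo = max(v - step * i - tolerance for i, v in enumerate(values, 1))
        hi = min(v - step * i + tolerance for i, v in enumerate(values, 1))
        found.extend((step, offset) for offset in range(lo, hi + 1))
    return found
```

With the values 50, 72, 91 and a tolerance of 2, the pair (20, 30) is among the candidates returned; with a tolerance of 0 the list is empty, since the sequence is not exactly linear. Constants and consecutive integers fall out as the special cases step = 0 and step = 1.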
For textual data, Eager parses the text into substrings of contiguous alphabetic characters, numbers, delimiters, and spaces. It searches for constants, alphabetic or numeric order, changes in capitalization and spacing, and known sequences such as days of the week and roman numerals. Some of the textual items that HyperCard passes to Eager are: text selections, pathnames, and the names of cards, buttons, and fields. If the user has selected some text and replaced it, Eager looks for patterns not only between the new text and the old text, but also between the new text in the current iteration and the new text in the previous iterations.
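The parsing step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Eager's actual parser: the token kinds, the regular expression, and the helper names `tokenize` and `changed_positions` are all hypothetical, and only ASCII letters are treated as alphabetic.

```python
import re
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (kind, text)

# One alternative per kind of substring named in the text (assumed grammar).
TOKEN_RE = re.compile(
    r"(?P<alpha>[A-Za-z]+)"
    r"|(?P<number>[0-9]+)"
    r"|(?P<space>\s+)"
    r"|(?P<delimiter>[^A-Za-z0-9\s]+)"
)


def tokenize(text: str) -> List[Token]:
    """Split text into runs of letters, digits, whitespace, and delimiters."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def changed_positions(old: str, new: str) -> Optional[List[int]]:
    """Indices of tokens that differ between two iterations.

    Returns None when the two strings do not have the same sequence of
    token kinds, in which case this simple comparison does not apply.
    """
    a, b = tokenize(old), tokenize(new)
    if [kind for kind, _ in a] != [kind for kind, _ in b]:
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

For example, comparing "Report 1.txt" with "Report 2.txt" isolates the single numeric token that changes, which could then be handed to an integer test such as the consecutive-integer check. Recognizing known sequences (days of the week, roman numerals) or capitalization changes would require further tests on the individual tokens.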
In the example shown in Figure 1, the first two similar events are

Copy words 2 through 3 ("Trial info") of line 1 of background field 1 of card 2 of stack "Cali:Eager Demo:Mail Messages"

and

Copy words 2 through 5 ("Some more good ideas") of line 1 of background field 1 of card 3 of stack "Cali:Eager Demo:Mail Messages".

HyperCard sends Eager a considerable amount of ancillary contextual information. For instance, the information sent with the above events includes the facts that there are 3 words on line 1 of card 2, and 5 words on line 1 of card 3, which Eager uses to recognize that the selection is from the second through the last word of line 1.
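One way to picture how the word counts help is the following Python sketch. It is an assumption-laden illustration rather than Eager's method: the function name `generalize_bound` and the returned descriptions are hypothetical. Each end of the selection is tested first as a fixed position and then as a position counted back from the last word.

```python
from typing import List, Optional


def generalize_bound(positions: List[int], totals: List[int]) -> Optional[str]:
    """Describe one end of a word selection across several iterations.

    positions[k] is the word index used in iteration k, and totals[k] is
    the number of words on that line (the ancillary context).
    """
    if len(set(positions)) == 1:
        return f"word {positions[0]}"
    # Distance from the end of the line, which is only computable
    # because the word count was supplied with each event.
    from_end = {total - pos for pos, total in zip(positions, totals)}
    if len(from_end) == 1:
        k = from_end.pop()
        return "last word" if k == 0 else f"last word minus {k}"
    return None
```

For the two events above, the start positions (2, 2) generalize to "word 2", and the end positions (3, 5) with word counts (3, 5) generalize to "last word". Without the word counts, the end positions 3 and 5 would look like an arithmetic sequence and the next selection would be guessed wrongly.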
When the user clicks on the Eager icon, the system writes a program in a scripting language and sends it off for execution.
Eager specifies a format for application developers to provide information about their commands and their objects. For each object, the developer can specify the test to use for similarity. For instance, the similarity test for HyperCard cards is (OR cardNumber cardName cardID). So if Eager is testing to see whether two Cut Card commands fit into an iterative pattern, it first looks for a pattern in the numbers of the cards. Since cardNumber is specified by the developer to be an integer, the default tests mentioned above for integers are used. If the developer had wanted cardNames to have higher priority than cardNumbers, (OR cardName cardNumber cardID) would have been specified instead.
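The ordered OR test can be sketched in Python as below. This is a minimal illustration, not the format Eager actually specifies: the dictionary representation of cards, the function names, and the two simplified default tests (constant step for integers, equality for text) are all assumptions.

```python
from typing import Dict, List, Optional, Tuple, Union

Value = Union[int, str]


def integer_pattern(values: List[int]) -> Optional[str]:
    """Simplified integer test: a constant step between iterations."""
    steps = {b - a for a, b in zip(values, values[1:])}
    return f"step {steps.pop()}" if len(steps) == 1 else None


def text_pattern(values: List[str]) -> Optional[str]:
    """Simplified text test: the same string in every iteration."""
    return "constant" if len(set(values)) == 1 else None


def first_pattern(objects: List[Dict[str, Value]],
                  attribute_order: List[str]) -> Optional[Tuple[str, str]]:
    """Try each attribute in the developer's priority order.

    Returns (attribute, description) for the first attribute whose values
    form a pattern, mirroring an (OR a b c) similarity test.
    """
    for attr in attribute_order:
        values = [obj[attr] for obj in objects]
        if all(isinstance(v, int) for v in values):
            result = integer_pattern(values)
        else:
            result = text_pattern(values)
        if result is not None:
            return attr, result
    return None
```

Given three cards numbered 2, 3, 4 that all share one name, the order (cardNumber, cardName, cardID) reports a pattern in cardNumber, while the order (cardName, cardNumber, cardID) reports the constant name instead. The priority order therefore determines which generalization is chosen when several are available.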
For any command, the developer can specify other commands that it supersedes. When the user performs a command, Eager will then remove the previous command from the history if the current one supersedes it. In HyperCard, for instance, all navigation commands and all selection commands supersede any previous selection command. So, for instance, if the user selects a region of text, and then selects another region of text, the first selection is removed from the history.
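A history that honors such declarations might look like the following Python sketch. The table contents, the command kinds, and the `record` function are hypothetical names chosen for illustration; only the rule itself (drop the previous command when the new one supersedes it) comes from the text.

```python
from typing import Dict, List, Set, Tuple

Command = Tuple[str, str]  # (kind, detail)

# Hypothetical developer-supplied table: kind -> kinds it supersedes.
SUPERSEDES: Dict[str, Set[str]] = {
    "select": {"select"},
    "navigate": {"select"},
}


def record(history: List[Command], command: Command) -> None:
    """Append a command, first dropping the previous one if superseded."""
    kind = command[0]
    if history and history[-1][0] in SUPERSEDES.get(kind, set()):
        history.pop()
    history.append(command)
```

If the user selects words 1 to 2, then selects words 2 to 3, then copies, the recorded history contains only the second selection followed by the copy. Pruning abandoned selections in this way keeps them from being treated as steps of the repetitive task.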
Developers can also specify commands that "chain" together. For instance, there are many equivalent ways to move to a particular card in a HyperCard stack. There are commands to move to the next, previous, first, and last cards in a stack, to jump back to the card last visited, and to select from a miniature display of the 25 most recent cards. In order to detect patterns where users do not navigate in precisely the same way each time, the developer specifies that all of these navigation commands chain together. As a result, Eager will match any navigation sequences which have a pattern in their final destination, or in the relative number of cards moved.
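The effect of chaining can be sketched in Python as follows. This is a simplified illustration, not Eager's matcher: it assumes each recorded navigation event carries the number of the card reached, the names `net_move` and `navigation_pattern` are hypothetical, and only the two simplest patterns (same destination, same relative move) are checked.

```python
from typing import List, Optional, Tuple

NavEvent = Tuple[str, int]  # (command name, card number reached)


def net_move(start_card: int, events: List[NavEvent]) -> Tuple[int, int]:
    """Collapse a chain of navigation commands into (final card, cards moved)."""
    final_card = events[-1][1] if events else start_card
    return final_card, final_card - start_card


def navigation_pattern(moves: List[Tuple[int, int]]) -> Optional[str]:
    """Compare the collapsed moves from successive iterations."""
    if len(moves) < 2:
        return None
    finals = {final for final, _ in moves}
    deltas = {delta for _, delta in moves}
    if len(finals) == 1:
        return "same destination"
    if len(deltas) == 1:
        return "same relative move"
    return None
```

Suppose the user goes from card 2 to card 3 with a single Next in one iteration, and in the following iteration overshoots from card 3 to card 5 and backs up to card 4. Both chains collapse to a move of one card forward, so the two iterations match even though the individual commands differ.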
It is useful to note that Apple's AppleEvent architecture allows a recorder to query an application both before and after the application performs a user action. This feature is included in the architecture specifically to meet the needs of "intelligent" recorders like Eager. So, for instance, if an application's Set Name command does not include important contextual information such as the previous name of the object, it is possible for the recorder to query the application for this information in between the time when the user has requested the name change and the time when the application actually performs the change.
Eager uses high-level events when it looks for patterns and user actions when it displays anticipation feedback. Larry Tesler coined the term "mid-level events" to refer to user actions. David Kosbie's chapter on aggregate events (Chapter 22) discusses this matter in greater detail.
The study showed that first-time users were generally able to understand what Eager was doing and to figure out how to use it without instruction. Three subjects clicked on the Eager icon as soon as it appeared on the screen. Three subjects noticed the icon, performed several more steps by hand, and then clicked on the icon. The verbal protocols for these subjects indicated that they were able to figure out that the anticipation highlighting indicated what they were going to do next: "...indicates that I've been using it [the Recent command] and it's probably the next thing I'm going to do"; "It's almost as if it's suspecting what I want to do. Now I get it". One subject performed all of the tasks by hand and never clicked on the icon.
When subjects were able to perform the task correctly in HyperCard (17 tasks out of 21), Eager was able to detect the patterns in their actions, even though different subjects chose different strategies for performing the tasks, and some subjects performed the task somewhat differently on each iteration. For instance, one subject originally used the "Next" button to go to the end of the stack, and on a later iteration switched to using the "Last" command. Also, some subjects made and corrected minor mistakes, such as navigating past the desired card and then backing up.
The study also pointed out significant failings in the user interface. The most striking finding was that all subjects were uncomfortable with giving up control when Eager took over. Two changes were made: Eager now saves a copy of any stacks it is going to modify so that the user can recover from an incorrect automation, and a stepping mode was added, where Eager pauses for user confirmation before each action.
There were a few cases where Eager appeared prematurely, having detected a simple but insignificant pattern. The appearance of the icon is now postponed until two complete iterations have been performed, and for extremely simple patterns, it waits for three iterations.
Some subjects did not realize that the character that popped up on the screen was related to the items being highlighted in green. To remedy this, the icon now appears in green as well.
The original icon for this program showed a man sitting in a chair. When he anticipated an iteration correctly, he would begin clapping. Some subjects expressed confusion about the character, since they could not see what the representation had to do with automating repetitive tasks. The icon was therefore changed to the less evocative image of a cat -- when it takes over, the cat is shown with its paw moving the mouse.
Some subjects did not notice the icon when it first appeared, particularly if it appeared on a rich background. The cat icon now animates when it first appears.
Eager currently does not handle conditionals, nested loops, repetitions that are spread out over time (e.g. inserting the date whenever you start a new letter), or tasks that include some steps that must be left up to the user. These restrictions help to constrain the sorts of patterns that the program must detect. If Eager's search fails to detect a match between an incoming event and the corresponding event in the previous iteration, it assumes that there is no repetitive task. In fact, the correct interpretation could be that there is a conditional branch in the program, or that there is an obscure pattern that must be left up to the user (e.g. deleting the names of people who are no longer friends). Allowing for these possibilities would greatly increase the ambiguity in generalizing and the complexity of pattern-matching. Eager's success is partly due to the fact that it tries to automate only a limited, yet useful, range of tasks. Although its power is limited, it does handle almost all of the commands in HyperCard, unlike all previous PBD systems, which work only in limited, experimental domains.
Programming by Demonstration (PBD) systems go beyond macro programs by making generalizations about high-level events. This allows them to automate more complex tasks. The work most closely related to Eager is David Maulsby's Metamouse (see Chapter 7), a PBD system for a simple Draw application. Like Eager, Metamouse watches user actions and writes a program which generalizes those actions. Metamouse is more powerful than Eager in that it infers conditionals, but more restrictive in that the user must explicitly indicate the start of recording, must answer questions about how to properly generalize various steps, and must approve or reject each system action.
Eager is also related to Brad Myers' Peridot, described in Chapter 6. Peridot is a PBD system for building user interaction devices, such as menus and scroll bars. After the user places the first few items of a list into a menu, Peridot is able to infer that the entire list should comprise the menu. Like Metamouse, Peridot requires the user to confirm each inference it makes. Peridot was inspirational to me in showing the potential for PBD as a practical tool for end users.
The Predictive Calculator (Chapter 3) is similar to Eager in that it constantly monitors the user's actions. Barbara Staudt Lerner's Lantern [Lerner 89] system for automated customization is similar to Eager in its concern with minimizing user interaction. Lantern adopts the radical approach of performing customizations without asking for the user's consent. It abandons a customization if it detects the user undoing its effects.
The first user study investigated users' first experiences with Eager. To complement this data, a field study will be conducted to learn more about regular, experienced use of Eager. Most importantly, the study will investigate how often users perform tasks which Eager can automate. The study will also investigate how often Eager's automations do not coincide with the user's intentions, and how users react in such situations. The field study should also be valuable for determining whether experienced users of Eager have needs which the current user interface does not address. Finally, it seems significant that three of the subjects in the first user study clicked on the Eager icon as soon as it appeared, without following the anticipation highlighting. The field study will be able to determine whether some users remain unaware of the highlighting feature even after extensive use of Eager, and whether some users persist in invoking Eager without first validating its intended actions.
It would be possible to apply Eager's interaction style to Intelligent Help. If Eager detected a sequence of commands for which it knew a shortcut, it could guide the user through the shorter procedure. If it detected a pattern of activity that typified a common error, it could guide the user through the correct procedure. In either case, users would see a general principle applied to their particular situation -- Eager would do the instantiation, and they would not be required to puzzle over an abstract textual description or an example in an unfamiliar domain.
The technique of anticipation is important in meeting Eager's user interaction design goals. By turning an item green before it is selected, anticipation is unobtrusive, represents actions in the modality employed by the user, and presents generalizations concretely.
A user study showed that first-time users were able to use Eager without instruction.
Tasks within the domain: Repetitive tasks
Intended users: Non-programmers
Feedback about capabilities and inferences: Eager uses anticipation highlighting to show the user how it has generalized.
Types of examples: Eager bases its generalizations on multiple examples
Program constructs: Variables. Iteration: set iteration, iteration with a counter, and iteration until a condition is satisfied. For instance, if a loop creates buttons in a column, the loop will terminate when the bottom of the card is reached.
No nested loops. No conditionals.