SmallStar showed that programming by demonstration could be a valuable and practical addition to application software. In this chapter, I will describe the Star and SmallStar systems, and how users create programs by demonstration in SmallStar. SmallStar introduced the idea of data descriptions, which describe the data to be used when running a program recorded by demonstration. I will explain the motivation for data descriptions, and how they are used. I will also show how SmallStar allows conditionals and iteration to be added to a recorded program by editing its transcript. Finally, I will discuss some user tests of SmallStar, and summarize the results of this research.
In my original research, I used the term "programming by example". This book uses "programming by demonstration" to mean the same thing, and I will use that term in this chapter.
The pioneering Pygmalion system (Chapter 1) and the Tinker system (Chapter 2) demonstrated more sophisticated use of programming by demonstration, by showing how it could be used in general-purpose programming systems. Pygmalion uses a special visual language in which operations are performed using direct manipulation. Tinker uses Lisp as its base programming language. These systems assume some familiarity with programming and are not directed toward any specific application.
I had different users in mind. I wanted to show that programming by demonstration could be widely understood and employed by users of application software who were not programmers. With programming by demonstration, they could automate the repetitive tasks they performed in their daily work.
Figure 1 shows a typical Star desktop. The icons on the desktop include documents, folders, file drawers (directories on remote file servers), printers, and in- and out-boxes. Every icon has a name. Two icons may have the same name even if their contents differ. In Figure 1, a document and a folder have been opened; their contents are shown in the two windows.
Star documents can contain text and graphics, tables (with named columns and numbered rows), and fields, which are named regions in the text that hold data.
Along the top of the desktop is a one-line message area that displays errors and informational messages. To the right of the message area is a small box with three horizontal lines, which indicates that a pop-up menu is available at that spot.
Four keyboard commands can be applied to nearly all Star objects. These universal commands are Move, Copy, Delete, and Properties; they apply to the current selection. To delete some text, one selects the text and presses Delete. To copy a document, the user selects the document and then presses Copy. The mouse cursor changes into a small document icon. The user then sets down the copy by clicking the mouse over the desktop, or over, say, a folder; the latter action inserts the copy into the folder.
Pressing the Properties key brings up a dialog box that displays the properties of the current selection, and allows properties to be changed. For instance, if a paragraph is selected, pressing Properties will display a dialog box that describes the paragraph's margins, line spacing, and so forth.
This description does not begin to do justice to Star, which was remarkable in the richness of its functionality at its introduction. Even today, its seamless integration is enviable. Users did not think of it as a collection of different applications, but instead as a powerful, multifunctional whole.
IF CreditBalance > 0 THEN
MOVE THE Document WHOSE NAME IS "Please Pay"
TO
THE Printer WHOSE NAME IS "Gutenberg";
A Cusp program can be put into a Cusp button, which lives in a document, and which executes the program when pressed.
SmallStar also adds features which Star does not have. Most notably, it has program icons and buttons, which are used to hold programs written by demonstration. Figure 2 shows a typical SmallStar desktop.
SmallStar not only records the program, but also displays the resulting transcript in a human-readable form. Figure 3 shows a program window containing a very simple program. If the window is open during program recording, SmallStar adds user actions to the program transcript as they are performed.
Figure 3. A program to move a named document out of a folder.
The
transcript language is English-like and was inspired by Cusp. It imposes little
non-natural-language syntax or semantics on the user. Small pictographs stand
for icons and other Star objects, such as text selections, components of
tables, and the like. The transcript language is meant to be read, not written;
the user cannot write a program without invoking the recording mechanism.
SmallStar records only the user actions that cause state changes, and does not record simple selection of Star objects. For instance, when an icon is selected and deleted, SmallStar simply records the Delete action, instead of the selection of the icon and then the Delete. An earlier version of SmallStar did record Select as an action, but Select actions turned out always to be redundant to the operation of the program; they made the transcript longer but no clearer to the reader.
The user can edit the resulting program in simple ways, by recording new steps at any point in the program or by deleting existing steps. Such edits may make the program impossible to execute, but SmallStar does not attempt to ensure that it will be able to run an edited program.
To execute a recorded program, the user presses Run in the program window, or chooses Run from the program icon or button pop-up menu. Single-stepping the program is also possible, using the First Step (start from the beginning) and Step (continue stepping) commands. If the program encounters an error while running, it stops, displays an error message, and indicates which statement caused the error.
This problem is particularly acute when programming by demonstration in a direct manipulation system like Star. In a system with a typescript command language, a user may wish to delete all files ending in ".BAS", and can do so by typing something like "delete *.BAS". However, in a direct manipulation system, the user may simply select all the files ending in ".BAS" from a list and delete them. Suppose the list consisted of the files A.BAS, LETTER.TXT, and A1.BAS. The user would select A.BAS and A1.BAS, and then delete them. The user never gave an explicit search pattern for which files to delete. The system could make a guess, but the pattern could be "*.BAS", "A*.BAS", or even "A*.*". The system cannot know exactly what the user intended.
The problem can be stated in terms of the intensional (choosing by description) and extensional (choosing by direct reference) capabilities of the system. In a typescript system, the user is more likely to give intensional specifications (like "*.BAS"), which describe salient characteristics of the desired data objects. In a direct manipulation system, the user specifies most data objects extensionally by pointing at them.
A program that is recorded by demonstration in a direct manipulation system is therefore often incompletely specified. The user has chosen some data, but the computation to make the choice (a search, a pattern match, etc.) has been done in the user's head, and has not been communicated to the system. Yet it seems natural to the user, since it is exactly what the user would do when performing the task manually.
Somehow the user must communicate this missing intensional information to the system. One common method is to use inductive inference techniques. The system examines the user's choices, and tries to draw conclusions about the data the user chose. The user may do several examples, not just one. The system notes the differences and commonalities between the examples, and tries to determine what the user meant [Bauer 79] [Biermann 76b].
Inductive inference can work well, but if the number of possible interpretations is large, the user may have to give many examples. It may even be impossible for the user to make his or her intentions unambiguous just by giving examples. For instance, consider the list of files given in the example above. If the user deleted A.BAS in one example, and A1.BAS in another, the user's intentions would still be ambiguous, since a number of different search patterns could be derived from these two examples.
SmallStar programs, such as the one shown in Figure 3, consist of operations, like Move and Copy, and operands, like the documents and folders listed in the program. The operands do not refer to data objects directly; instead they point to data descriptions. There are no constants in a SmallStar program. Every data object has a corresponding data description. Data descriptions are editable; the user may change them after recording the program.
As a program is being recorded, the system creates a data description for every newly referenced object. The system chooses a reasonable default description for the object. The choice is fixed, and does not change based on the example being given. For instance, SmallStar always assumes that icons are chosen because of their names. In an earlier version of SmallStar, the user had to specify the choice as the program was being recorded. This turned out to be distracting and intrusive, so I adopted the current technique of making a default choice during recording.
The program in Figure 3 contains references to two different data descriptions created during program recording. One description specifies a folder named "Negotiations"; the other is for a document named "Treaty". SmallStar originally chose descriptions that specify these icons by name. However, the user could later change these descriptions to choose the icons by other criteria.
Descriptions are created for all the objects referenced by the user during program recording, including those that are referenced indirectly. For instance, if the user selects a document in a folder, SmallStar creates a data description for the folder as well as the document. Every item on the path to an object is included, and the descriptions on a path are linked. Thus, in Figure 3, the description for the document contains an (invisible) reference to a description for the folder that contains it. When the program runs, SmallStar looks for a document named "Treaty" inside a folder named "Negotiations".
If an object is used more than once in a program, a new data description is not created for each use of that object. Instead, the second and subsequent references to the object all point to the same data description. If the user selects a use of a data description, all other uses of that description are outlined in gray. In Figure 3, the first use of the description for "Negotiations" is selected, and its other uses are highlighted.
When a SmallStar program runs, it uses its data descriptions to find the data objects needed by the program. The first time a data description is referenced, SmallStar searches for an object that matches the description, and remembers that object. If more than one object matches the description, one is chosen arbitrarily. Later uses of that same description skip the search process and use the object already found. The reuse of data descriptions is a fundamental heuristic. If the user refers to a data object more than once, the system assumes the uses are connected in intent; it does not assume that a second use of the same object is accidental. This heuristic works very well. Rarely does a user refer to the same object for a completely different purpose. However, exceptions occur, and there is a way to defeat the heuristic. Suppose, for instance, a user records a program to print two documents on two printers, and uses the same printer to print both documents when giving the example. Subsequently, the user may wish to use the same program to print on separate printers. The user can edit the program and change the second use of the printer to refer to a new, separate description. Conversely, there is also a way to merge two different operands so that they use the same description.
A
number of different kinds of descriptions are provided for each kind of data
object. The choices are listed in the description sheet. Some kinds of
descriptions take parameters of various kinds, and those are listed as well.
Inapplicable choices are grayed out. As an example, Figure 4 shows a data
description sheet for a document. The main description choices are listed in
the Choose using item; they are NAME, POSITION,
PROMPT, and ANY.
If NAME is chosen, the system looks for a document whose name matches the pattern given in the Name Pattern parameter. The pattern may be a simple text string (here, "Treaty"), or may be a pattern such as "Chapter*", which would match any document whose name begins with "Chapter". Since icon names need not be unique, the Version choice further narrows the possibilities by allowing the user to specify date and time information constraining when the document contents were last changed.
If POSITION is chosen, the user can choose the FIRST or LAST document in a folder. Position on the desktop is not well-defined (there is no labeled grid), so this choice is displayed only when the document is inside a container such as a folder.
PROMPT requests SmallStar to ask the user to choose a document when the program runs. This makes the document a parameter in the program. The editable Prompt item specifies the text to use when asking.
Finally, ANY specifies that the user does not care which item is picked. For a single item, this is not very useful, but later I will show how this choice is used to indicate set iteration.
Consider this example of the use of a description sheet. Suppose the user wants to write a program that will move the first document out of the "Negotiations" folder. When the user first records the program, SmallStar will record the program shown in Figure 3, which specifies the document "Treaty" by name; that is the default. The user must change this description to reflect his or her intent. To do so, the user selects the description for "Treaty" in the program, opens its description sheet, changes the description from NAME to POSITION, and makes sure that Position is set to FIRST. After the user presses Done, the data description changes, and the program transcript is updated. The resulting program is shown in Figure 5.
Figure 5. A program to move the first document out of a folder.
As
mentioned, data descriptions offer a simple alternative to multiple-example
inductive inference, especially in systems where multiple examples are
inconvenient or impossible to demonstrate. During user trials of SmallStar, I
found that data descriptions were easy for users to understand.
One issue with data descriptions is that possible descriptions for a data object are chosen from a limited menu of possibilities. In SmallStar, I attempted to cover a number of possibilities on each description sheet, but the choices are by no means exhaustive. The user may want to use some description that is not covered by the description sheet.
A way around this problem, not implemented in SmallStar, would be to extend description sheets to allow new descriptions to be specified. However, some method must be provided for programming the search algorithms for new kinds of descriptions. If the system were powerful enough, the searches could be programmed by demonstration. But this will not always be possible in many systems, including SmallStar. Some other programming language may need to be provided. Unfortunately, users must then learn that language. The advantage of being able to program by demonstration, in the user interface of the system, will then be compromised.
As an example, consider again the simple program of Figure 3, which moves a single document out of a folder named "Negotiations". Suppose the user instead wanted to move all the documents out of the "Negotiations" folder. The user would start by recording a program that moved just one document out of the folder. The resulting program would resemble Figure 3. The user would then edit the data description for the document, choosing the ANY choice in the description sheet (see Figure 4).
After the user changed the description sheet, the program would look like Figure 6. When this program runs, the data description for the document will generate a set consisting of all the documents in the folder, since any and all documents match the description. But until the user adds a loop to the program, only the first document in the set would be moved by the program.
To edit the program to add a loop, the user selects the statements that form the body of the loop, and invokes Repeat... from the pop-up menu in the program window. The selected statements are wrapped in a loop, and the transcript is updated. The program with the added loop is shown in Figure 7.
In Figure 7, the loop statement is present, but the set to iterate over has not yet been specified; hence there is a blank space labeled -fill this in-. The user then indicates which description will generate the iteration set, by selecting the "Negotiations-any" document description in the program, pressing the Move or Copy key, and dropping the result into the -fill this in- space. The final program is shown in Figure 8.
Now there are two references to the "Negotiations-any" data description: one is in the Repeat statement, and the other is in the Move statement. Selecting one would highlight the other. Interestingly, there is no iteration variable like that found in a typical for-each construct. For instance, there is no variable like the "D" in "For each document D in Negotiations do Move D to the Desktop." The data description both substitutes for the variable and describes the iteration set.
Figure 6. Program to move any document out of a folder.
To solve this problem, some programming by demonstration systems require the user to give multiple complete examples. The user records different examples that cover all the alternate paths in the program, and specifies predicates that choose between the paths. I did not choose this method for SmallStar. Instead, the user records a single straight-line program, and then edits it to add if-then-style conditionals. Program steps that were not recorded in the initial example are added later as necessary by recording extra actions. This saves the effort of having to construct and record a complete example for the alternate program paths.
Consider a simple example. The user receives a filled-in mailing form ( Figure 9), which is to be printed for use in the mail room.
Normally, items are shipped by fourth-class mail. However, if an item weighs less than one pound, it is shipped by first-class mail. The user wants to write a program to automate this policy.
To do so, the user first writes a program that unconditionally changes "Fourth" to "First" in the Class field in the mailing form, and then prints the form. The user then selects the statements that should be executed conditionally. Figure 10 shows the program, with those statements selected.
Using the pop-up menu in the program window, the user wraps the statements in an if statement. The edited program is shown in Figure 11.
The user has not yet specified the predicate for the if statement. The user selects the predicate (- fill this in- = -fill this in-), and opens its property sheet.
Figure 9. A mailing form.
The
user then selects the contents of the Weight field in the mailing form
and invokes a Make Description command. The description created is
"Mailing Form-Weight-all"; it specifies all the text in the
Weight field. The user puts this description in the property sheet as
the first operand. The user changes the second operand to "1" (one pound), and
finally changes the operator from "=" to "<". The resulting property sheet
is shown in Figure 12.
When the user closes the property sheet, the program changes, and is shown in Figure 13.
Figure 14. The calculator window.
The
conditional property sheet only allows for simple comparisons. To construct
more complicated predicates, the user can use a special calculator that has
been augmented with relation and Boolean operators ( Figure 14), and then test
the result with an if statement.
SmallStar provides only for if-then, not if-then-else conditionals. If-then-else must be simulated by testing the negation of a predicate. Since only one branch of the conditional can be recorded at a time, the user must record the steps in the other branch at a later time.
SmallStar's provision for conditionals is adequate, but not entirely satisfactory. The user must decide which program steps will be executed conditionally. In addition, recording new steps in a program may require extra setup of SmallStar data objects.
Considerably more information about SmallStar can be found in my doctoral dissertation [Halbert 84]. It covers the evolution of SmallStar in detail, including earlier versions and blind alleys.
SmallStar shows how it is possible to add sophisticated, practical programming by demonstration to a richly functional, end-user commercial software system. I hope that future researchers and software designers will find its ideas and techniques useful in their work.
Intended users: Users of application software who are not programmers
Program constructs:
Variables (via data descriptions). If an object appears more than once in a program, all references to it use the same data description.
Set iteration and conditionals can be added by hand to a recorded program.