"Watch What I Do" Chronology


A Programming by Demonstration Chronology:
23 Years of Examples

David Maulsby
Alan Turransky

with contributions from

Allen Cypher, Bill Finzer, Henry Lieberman, Brad Myers and Ian Witten


The following chronological timeline presents a historical look at systems, techniques, and ideas which have served as significant milestones in the birth, maturity, and development of the programming by demonstration field throughout the past 23 years. Since programming by demonstration has drawn upon so many disciplines, we have included relevant work from other fields in order to provide an appropriate context in which to view this work. This timeline is an attempt to properly place work which has grown into programming by demonstration, in relation to other fields, revealing the many connections between them. Those systems mentioned in this book are highlighted in boldface. Other programming by demonstration systems are italicized and related machine learning research is underlined.

1970 - 1975

Learning structure descriptions from examples. Patrick Winston's early work on a machine learning concepts from human teachers [Winston 70].

Instant Turtle. Cynthia Solomon creates a "one-finger" LOGO language called TEACH to instruct children about problem solving. A vocabulary of single alpha-numeric commands manipulate a graphical turtle (See [Abelson 80]), where numbers are used to set default values. The system remembers the steps of a procedure [Papert 80].

Button Box. Radia Perlman creates a hardware input device for a robotic turtle to demonstrate and teach children about LOGO programs. The language used with this is known as TORTIS [Smith 75].

Pygmalion. David Canfield Smith creates the first system for programming by demonstration. It introduces icons and mixed initiative computing (the defining of conditional branches on demand) [Smith 75].


Autoprogrammer. A system for creating Lisp programs from traces. The user types in several examples in the form of a Lisp script, stating which values are constants or variables; the system matches similar steps. Mismatches are treated as branches and the user explicitly states the conditional test [Biermann 76a].


PURR-PUSS. A multiple context learning system which can be trained to respond to events on various streams of input. The system uses feedback of its own outputs into input to remember state [Andreae 77].

Thesys. A system for constructing Lisp programs from examples. Users present a series of transformations of an example and the system generates the code, including recursive calls [Summers 77].

QBE. Query by example. A system for specifying database queries by (possibly fictitious) examples. The user states field names and gives example values, specifying which are constant or variable. Variables are used to link one stage of a query with another [Zloof 77b].

EOM. An environment for writing computer code by creating animated sequences. It contains a two-dimensional way of specifying animation scripts, which eliminates number specifications and typing of initial conditions [Pangaro 77].


Programming by abstract demonstration. Programs containing only constants are demonstrated on concrete data; those with variables are demonstrated on icons standing for abstract values. The phrase "programming by demonstration" is coined [Curry 78].

EP. Exemplary programming. A system to automate sequences of commands to programs which use simple keyboard input. It queries the user for names and values of parameters, and substitutes variables for example data. Users demonstrate actions, explicitly stating control flow [Waterman 78].


VisiCalc. The first commercially available spreadsheet program [Frankston 80].

ID3. A polynomial algorithm for inducing DNF (Disjunctive Normal Form) concepts from examples [Quinlan 86].

Tinker. A system capable of evaluating user written Lisp code fragments on the fly (See [Brown 81]). It merges multiple examples into conditional structures, asking the user to state tests and distinctions between cases [Lieberman 80].

Programming by example with inferencing based on application knowledge. The system classifies example values as constants, variables, or inputs according to the operational context in which they occur [Bauer 79].


Tom Mitchell introduces the notion of "bias" in learning systems as necessary for forming generalizations [Mitchell 80].

Scribe. A text formatter. The system has knowledge of many complex document types (reports, etc.) and can automatically structure a document's text (including hyphenation, footnotes, tables of contents, indexes, and bibliographies) to fit into these structures. The system is also capable of producing documents formatted for different output devices [Reid 85].

Model Inference System. A theoretical framework for program debugging with a practical implementation of an interactive debugger. Contains diagnosis algorithms that can isolate an erroneous procedure given a program and an input on which it behaves incorrectly. The algorithms are interactive and can query the programmer for the correctness of intermediate results of procedure calls. It uses these queries to diagnose the error [Shapiro 86].

EMACS. A text editor with programming language and macro recorder built into it [Stallman 87].


ThingLab. A system containing graphical and demonstrational techniques for programming constraint networks [Borning 81].

Build. Dynamic program building. A user writes code fragments which the system evaluates on the fly (See Tinker [Lieberman 81]) while keeping track of "loose ends" such as empty branches or loops. The user is able to edit any part of the program at any time [Brown 81].

XPROBE. A experimental system for robotic programming by demon-stration (See NODDY [Andreae 85]) [Summers 81].

Autoprogramming calculator. The system infers variables from operational traces performed on a hand-held calculator and is capable of predicting subsequent steps (See Reactive Keyboard [Darragh 92]) [Witten 81].


PHD. Programming by homomorphisms and descriptions. The user forms queries for selecting data and iteration over sets using techniques similar to QBE (See [Zloof 77]), and creates procedures to be performed on selected data by writing code evaluated on the fly (See Build [Brown 81], Tinker [Lieberman 81]) [Attardi 82].

Generalization as search. Tom Mitchell formalizes the notions of generalization and specialization, and classifies learning as orderly search of a lattice, terminating when maximal and minimal descriptions coincide [Mitchell 82].


The phrase "direct manipulation" is coined [Shneiderman 83].

TED. A system which infers text transformations from input/ output examples by finding constants and variables in text patterns [Nix 83].

Kurt van Lehn introduces the notion of "felicity conditions," which are necessary for a teacher to meet in order to successfully instruct a system: A) show all steps, B) do all steps correctly, C) make invisible objects and relations visible, and D) introduce at most one new branch per lesson [van Lehn 83].

Menulay. A program for the demonstrational specification of interface components. A designer demonstrates graphical feedback to be given for specific user inputs [Buxton 83].

Theory of machine learning. Dana Angluin and C. H. Smith prove that concepts can be learned from algorithmic demonstrations which cannot be identified from input/ output pairs alone [Angluin 83].


Rehearsal World. A system for object-oriented programming by demonstration using the "theater" metaphor (See Appendix C). Enables users to define new constructs and demonstrate new behaviors [Gould 84].

SmallStar. The use of programming by demonstration in multiple applications with graphical user interfaces. The system uses the notion of data descriptions and icons in scripts. It records macros with high-level semantic information for selected objects and allows users to edit data descriptions and control structures [Halbert 84].

Juno. A system for the demonstrational programming of geometric constraints. Drawings are represented procedurally and new constraint procedures can be applied to different data [Nelson 84].

Trillium. A environment for designing the layout and logic interaction of Control/ Display operator interfaces for machines such as copiers. Specific items within the system are either primitive or composite, where composite items are formed from subassemblies of other items. The system contains editors for the descriptions of all items and provides simple tools for inferring the description of composite items from an example [Henderson 86].

Study of how people draw freehand. P. van Sommers establishes that people work outward from areas of high constraint; a standard way of decomposing computer procedures [van Sommers 84].

ThinkPad. A system for graphical programming with examples. Data structures are defined by drawing and labeling their components. Functions (which alter values or data structures) are defined by demonstrating edits on the graphics. The editor includes numerical and string operations, and commands for manipulating hierarchical structures [Rubin 85].


Reactive Keyboard. A program for predicting keyboard inputs based on previous entries [Darragh 92].

NODDY. A prototype robotic programming by demonstration system. "Justified generalization" (inferences involving extensive search) allocates resources and guides search according to multiple sources of sufficient evidence. Contains algorithms for inducing loops and branches, and a general (though generally intractable) algorithm for inducing functions [Andreae 85].


Grow. An object-oriented graphical interface builder. The system is capable of creating icons, adding interactivity and animation, and reusing interfaces with other applications [Barth 86].

MIKE. User interface management system which allows macros to be specified by examples over a wide range of interfaces [Olsen 86].

Peridot. A program which uses demonstrational techniques for creating user interface components. The system uses rule-based plausible inferencing to suggest pictorial constraints and infer active-value responses, number ranges, and set iteration [Myers 88].

L.E.G.O. A system which uses demonstrational techniques for programming geometric constructions (See Geometer's Sketchpad [Jackiw 92]). The system includes heuristics for defining variables by pointing and is able to produce parameterized Lisp macros from demonstrations [Fuller 86].

Snap-dragging. A demonstrational technique for applying constraints in graphics which uses heuristics to place guiding lines and circles relating probable mouse target locations to nearby objects [Bier 86].


Cobweb. An algorithm for mapping examples into classes (the matching problem in programming by demonstration) based on weighted counts of features common to examples [Gennari 90].

OfficeClerk. A instructible prototype system with a metaphorical apprentice for mail-sorting and window placement tasks. Users can "focus attention" on aspects of an example by pointing at them [MacDonald 87].

Theory of cognitive models of systems. "Physical," "design," and "intentional" stances are levels of abstraction from actual mechanisms [Dennett 87].

QuicKeys2. An application-independent macro recorder with semantic event handling (See Tempo II [Pence 88]). Users edit events to select objects by absolute or relative position, menu items by name or location. Sequences can be built one step at a time or by recording a user's actions [Negrino 92].

B. J. Krawchuk and Ian Witten develop machine learning techniques for asking users relevant questions about examples [Krawchuk 88].

Microsoft Word. A text editor capable of defining "styles" from user selected examples [Young 89].


Garnet. User interface development environment for creating graphical, direct manipulation, mouse-based interfaces [Myers 90d].

Graphical search and replace. A demonstrational technique for specifying, searching, and replacing complex graphical search patterns; a graphical form of the UNIX command "grep." Precursor of MatchTool2 (See [Kurlander 92]) [Kurlander 88a].

Metamouse. A system that learns repetitive graphical edits from demonstrations. It is designed to help the user meet felicity conditions for teaching (See [van Lehn 83]) [Maulsby 89b].

Tels. A system for programming repetitive text edits by demonstration which can infer loops and conditionals. Improves upon Nix's gap grammars (See [Nix 83]) by using "longest common subsequence" matching [Mo 89].

Programmer's Apprentice. A system for automating the programming process based on theories of how expert programmers create, describe, analyze, and synthesize programs. The overall goal of the project is to study the fundamental issues in knowledge representation and reasoning [Rich 88].

Tempo II. An application-independent macro recorder with some semantic event handling (See QuicKeys2 [Negrino 92]). Users can direct it to select menu items by name or location, and are able to edit loops and branches into a macro [Pence 88].

Perplex. A system for logic programming with a spreadsheet interface. Users can start with example values, replacing them with abstract expressions as their understanding of the problem increases. Constraints are bi-directional and all values are continuously updated (See C32 [Myers 91b]) [Spenke 89].

Reactive planning systems with an agent-centered sensory focus of attention. [Agre 88].


ETAR. Robot programming by demonstration. The system uses rules for focusing attention on relevant objects and key events. It improves upon NODDY's (See [Andreae 85]) algorithms for inducing control structures [Heise 89].

Clint. A dynamic bias learning system. The system tries to use the simplest theories first and maximize the relevance of questions asked to the user (See [Krawchuk 88]) [de Raedt 89].

Protos. A highly interactive case-based learning system [Porter 90].

NoPumpG. Graphical behaviors are specified within a spreadsheet (See C32 [Myers 91b]) [Wilde 90].

Jade. A system for automatically creating dialog boxes and other types of windows from high-level specifications [Vander Zanden 90].

Lapidary. The system extends Peridot (See [Myers 86a]) to allow application-specific graphical objects to be created, in addition to creating new widgets [Myers 89a].

Lantern. Automatic customization of structured text editors. The system uses heuristics to evaluate the usage of particular constructs over time. It proposes several kinds of customizations, including templates, re-ordering, and filling with default values. Its inference mechanism examines alternative hypotheses, ordering them according to confidence measures [Lerner 89].


Eager. A system for programming repetitive tasks by demonstration in HyperCard. The system infers a variety of textual and numerical relationships with minimal user effort to instruct [Cypher 91].

Turvy. A experiment in prototyping an instructible system by simulating it using the "Wizard of Oz" technique (See Appendix C). Contains a quasi-natural language interface and rules for inferring text structures (Maulsby, Ch.11).

Tourmaline. Text formatting by demonstration. Rule-based inference of text structures and properties [Myers 91b].

ART. An adaptive robot training program. Users write program sketches with partially filled in control structures, then give examples of conditional tests and straight-line procedures [Pauli 90].

Bruce MacDonald theorizes the unimportance of built-in intelligence or extensive use of inference in instructible systems [MacDonald 91].

Fumble. An environment for creating programs by demonstration using a visual language. It represents programs as nested tiles (See Pygmalion) which are evaluated only as required, so the user can create examples that include abstract elements. The system fills these in with generalizations based on run-time example values supplied by the user. [Lau-Kee 90].

OPUS. A user interface editor. Presentation components of a graphical interface are specified without explicit programming, using special interactive graphical notations. The system is capable of performing incremental computations (based on a series of "one-way" constraints) on a set of specified spreadsheet cells with named values [Hudson 90].

VPL. An incrementally interactive dataflow system for creating image processing programs by demonstration [Lau-Kee 90].

NoTech. Pre-processor to the text formatter TeX. Uses rules to parse input data [Lipton 90].

ChemTrains. A system for programming based on graphical rewrite rules. Users define rules by drawing one picture as a search pattern and another as its replacement (See MatchTool2 [Kurlander 92]). The pattern is a collection of polygons and their containment relations (whether object A is inside object B); other spatial relations are ignored. The user can force rules to fire in sequence by inserting invisible labels into patterns and the scene they match (See Triggers, Ch. 17) [Bell 91].


Geometer's Sketchpad. A geometrical drawing system which uses the "sketching" metaphor (See Appendix C). Includes automated figure labeling, an integrated scripting facility, expandable user construction set, and dynamic updating of constraints [Jackiw 92].

Constraints from multiple examples. An algorithm for determining scene constraints by comparing different versions of a graphic structure after various parts are moved. Invariant relationships are identified as constraints [Kurlander 91].

Macros by Example. Users create macros by selecting panels from an editable graphical history and declaring arguments [Kurlander 93].

Mondrian. An instructible graphical editor. Users declare arguments and demonstrate graphical edits. The system infers constraints, generalizes parameters, and is capable of re-using previously taught commands [Lieberman 92b].

Triggers. A non-inferential, application-independent system for programming by demonstration, using the "Find & Do" (See Appendix C) model of programming. Users define pixel patterns to be found and demonstrate macros to be performed. The system uses markers to remember locations and flags for program state (Potter, Ch. 17).

Gilt. A widget's graphical style and action responses are inferred from demonstrations in an interface builder [Myers 91e].

LEDA. A prototype visual programming-with-examples system. The user defines new data types in the form of graphical search patterns. To create a pattern, the user gives a sequence of search commands, each specifying a single attribute; together they form a conjunction [Mima 91].

Tango. An algorithm animator with demonstrational techniques for specifying some behaviors of animated objects [Stasko 91].

DEMO II. A system for creating user interfaces by drawing the components and specifying their behavior using stimulus-response demonstrations. The system can infer constraints from one or several examples [Fisher 92].

C32. Users specify constraint generalizations by giving example values in a continuously updated spreadsheet (See Perplex [Spenke 88]) [Myers 91b].


Instructible Unix shell. A shell capable of generalizing command parameter strings (filenames, etc.) by using an elaboration of the gap-grammar technique (See [Nix 83]) [Baltes 92].

Moctec. A working prototype of an instructible interface. Infers search patterns from multiple examples using feature saliency and natural language hints. Data description dialogs focus on features common to examples given [Maulsby 92a].

First workshop on programming by demonstration is held March 11 - 13 at Apple Computer Inc., Cupertino, CA. Sixteen people attend to discuss the issues and development of programming by demonstration, recapping past events and achievements (Allen Cypher, organizer).

Rockit. A system for inferring graphical constraints [Karsenty 92].

Voice input is incorporated into Mondrian (See [Lieberman 92b]) to provide users with a convenient means of guiding the execution of mouse actions, allowing them to influence the system's generalization process (Turransky, Ch. 24).

Work In Progress

PbD. A general architecture for programming by demonstration [Sassin 92].

Pursuit. A programming by demonstration system with abstracted graphical histories [Modugno 93].

Katie. A macros by example system based on abstract events [Kosbie 93].

AIDE. A workbench for constructing programming by demonstration systems (Piernot, Ch. 18).

RAD. Requirements Acquisition by Demonstration. A system which helps non-programming domain experts describe some aspects of a system by building animations of their behavior (Zorman).

Selina. A system for encoding graphic design knowledge by demonstration (Turransky).

Cima. Inference mechanisms for programming by demonstration (Maulsby).


The authors would like to thank Allen Cypher, Bill Finzer, Henry Lieberman, Brad Myers, and Ian Witten for their important contributions and wonderful historical memories.

The Department of Computer Science at the University of Calgary where David Maulsby works is sponsored in part by research grants from Apple Computer Inc., and the National Sciences and Engineering Research Council of Canada. The Visible Language Workshop at MIT, where Alan Turransky works is sponsored in part by research grants from Alenia Corporation, Apple Computer Inc., DARPA, Digital Equipment Corporation, NYNEX, and Paws Inc.

back to ... Table of Contents Watch What I Do