back to ... Table of Contents Watch What I Do

NOTE: Some of the images in this chapter could not be converted properly for use on the web




Chapter
14

Tourmaline:
Text Formatting by Demonstration

Brad A. Myers

Introduction

We have developed a text formatter that combines the best features of what-you-see-is-what-you-get (WYSIWYG) text formatters, such as MacWrite and Microsoft Word, with those of batch-oriented, embedded-command formatters such as troff, TeX, and Scribe. Our formatter is WYSIWYG, but in addition, the user can define styles by demonstration, using the regular user interface of the editor to describe how various portions of the document should look. These styles serve the same purposes as macros in embedded-command formatters, but users do not need to learn a complex programming language like those built into embedded-command formatters. They can still ensure that all parts of the document look consistent, and can easily edit the formatting of all parts at the same time.

For people who only want to follow a pre-defined style sheet, and therefore do not want to create their own styles, this editor should still be easier to use than conventional WYSIWYG editors, because demonstrational techniques make it easier to create tables and citations for bibliographic references, and to use the styles created by others.

Some word processors, such as Microsoft Word 4.0, provide a "styles" mechanism. The user selects a piece of text and gives the command "Define Styles..." This brings up a dialog box that shows all the properties of the selected text. The user can give a name to this set of formatting properties, and then apply them as a group to a different piece of text. However, this is too limited to solve the problem, since a style only represents a single set of formatting commands. The heading in Figure 1 would require at least three styles, which the user would have to create and maintain separately. (Another example is the chapter titles for this book.) Also, the styles mechanism usually cannot insert general text, such as the word "Chapter," so some other mechanism would have to be used if the user wanted to change it to "Section" in all the headings.

Figure 1. A chapter heading that includes different formatting for different parts and the constant word "Chapter."

Unlike most of the other systems in this book, the Text Formatting by Demonstration system does not create macros from a transcript of the user's actions. Instead, it looks at the final result (the example of the desired style), and infers the generalization from that. It is irrelevant what sequence of steps was used to create the example.

Furthermore, unlike systems such as Turvy (Chapter 11) and TELS (Chapter 8), this system contains significant knowledge about the typical formats used in documents. Whereas Turvy must be instructed how to find the author field in a bibliography by looking at the punctuation, the Text Formatting by Demonstration system has the concept of the author field built-in, and so it knows where to look for it. The use of this domain knowledge makes the inferencing in this system much more successful and more likely to be correct.

Nevertheless, the system does work by example, since it uses heuristics to infer generalizations from a formatted example of text. Therefore, the reader should not assume that all "by example" or "by demonstration" systems have to be based on macro recorders.

Design

Demonstrational techniques are used in a number of places in our text formatter. The following sections provide an overview of how demonstration is used.

Headings

The user can type in an example chapter or section heading and apply any desired formatting to it, as in Figure 1. This example can include special words, including "Chapter," "Section," "SubSection," or "Appendix" (and the list of special words is easily changed), numbers in many different formats (2, 4.3.2, ii, II, etc.), separators (such as colons, hyphens, commas, etc.), decorations (such as lines and boxes), and other text. The "other text" is assumed to be a parameter to the heading. The user selects the example heading, and gives the "Save Style" command (see Figure 2-a). This command prompts for a name, and then invokes the parser which tries to guess the parts of the example. If there are any ambiguities or problems with the parsing, the user is queried. Since there is very little variability in the content, the examples are rarely ambiguous. However, if the text is something like "The 1st 4 Chapters of Computer Graphics" the system reports that it is confused. (It is not using natural language understanding; only simple pattern matching.)
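The parsing step described above can be illustrated with a minimal Python sketch. This is not the Ness or WordBasic code of the actual system; the regular expression, the `parse_heading` function, and the rule used to flag ambiguity are assumptions made only for illustration. The sketch splits an example heading into a special word, a number, a separator, and the remaining text (the parameter), and flags the example as ambiguous when number-like or special-word tokens are left over in the parameter.

```python
import re

# List of special words taken from the text; the chapter notes it is easily changed.
SPECIAL_WORDS = ("Chapter", "Section", "SubSection", "Appendix")

# Assumed number formats: 2, 4.3.2, ii, II. Short words made only of
# Roman-numeral letters would also match; a real parser would need more care.
NUMBER = r"\b(?:\d+(?:\.\d+)*|[ivxlcdm]+|[IVXLCDM]+)\b"

HEADING = re.compile(
    r"^\s*(?:(?P<word>" + "|".join(SPECIAL_WORDS) + r")\s+)?"
    r"(?P<number>" + NUMBER + r")?"
    r"\s*(?P<separator>[:,.\-])?\s*"
    r"(?P<title>.*?)\s*$"
)

# Anything in the parameter that still looks like a number or a special word.
LEFTOVER = re.compile(r"\d|" + "|".join(SPECIAL_WORDS))


def parse_heading(text):
    """Split an example heading into its guessed parts (simple pattern matching)."""
    match = HEADING.match(text)
    if match is None:
        raise ValueError("not a single-line heading")
    parts = match.groupdict()
    # Hypothetical ambiguity rule: the "other text" should not itself
    # contain digits or special words.
    parts["ambiguous"] = LEFTOVER.search(parts["title"]) is not None
    return parts


print(parse_heading("Chapter 14: Tourmaline"))
print(parse_heading("The 1st 4 Chapters of Computer Graphics")["ambiguous"])
```

In this sketch the second example is flagged as ambiguous, which corresponds to the case where the system reports that it is confused and queries the user.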

From the results of the parsing, the system creates a template which is associated with the supplied name. The example is removed from the document, and replaced with a generated heading based on the template and the example text as the parameter. If the inferencing was performed correctly, this will not change the appearance of the text, except that the numbers might be updated. If the system has guessed wrong, the user can undo the guessing with a command, and supply a different example. The first example will then be ignored by the system. If the user simply cannot get the system to guess correctly, the standard WYSIWYG techniques are still available.

Each heading created with a particular style is marked with information about what style name is being used, and what the parts are. These markings are inserted into the document using invisible text so they will be preserved when the file is written out and retrieved.

All of the templates are added to the end of the document when it is written out, also in a hidden format. A command is available to write out only the styles (and not the text of the document), so the user can easily create a style sheet by example. Other commands can be used to make the templates visible at the end of the document, in case the user wants to copy them to a different document or explicitly edit them (but we do not expect that to be necessary).

Headings often use multi-level numbering schemes, so that, for example, sections in chapter two would be numbered 2.1, 2.2, etc. When the user gives an example of a heading, the system must know what level it is for (chapters are at level 1, sections are at level 2, subsections are at level 3, etc.). This can be specified explicitly by the user, or the system will try to guess using various heuristics. If the example number contains multiple parts (such as "2.2"), then the style is assigned to the appropriate level. If the number has only a single part (such as "2"), then if the style contains a special word (such as "chapter"), the level normally associated with that word is used. If none of these rules can be applied, then the number is assumed to be at level one. When the system renumbers the sections, a lower level number is always reset to zero whenever a higher level number is changed.
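The level heuristics and the renumbering rule can be sketched in Python as follows. The function names and the `WORD_LEVELS` table are hypothetical; only the rules themselves (multi-part numbers first, then special words, then a default of level one, with lower-level counters reset when a higher level changes) come from the description above.

```python
# Levels normally associated with special words, as stated in the text.
WORD_LEVELS = {"chapter": 1, "section": 2, "subsection": 3}


def infer_level(number, special_word=None):
    """Guess the heading level from an example number and optional special word."""
    parts = number.split(".") if number else []
    if len(parts) > 1:
        # A multi-part number such as "2.2" determines the level directly.
        return len(parts)
    if special_word and special_word.lower() in WORD_LEVELS:
        return WORD_LEVELS[special_word.lower()]
    # No rule applies: assume level one.
    return 1


def renumber(levels):
    """Return the number label for each heading, given the headings' levels in order."""
    counters = []
    labels = []
    for level in levels:
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        # Reset every lower-level counter to zero when a higher level changes.
        for i in range(level, len(counters)):
            counters[i] = 0
        labels.append(".".join(str(c) for c in counters[:level]))
    return labels


print(infer_level("2.2"))             # 2
print(infer_level("2", "Chapter"))    # 1
print(renumber([1, 2, 2, 1, 2, 3]))   # ['1', '1.1', '1.2', '2', '2.1', '2.1.1']
```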

After a style is created, the user can select a piece of text and invoke the command that applies the named style (see Figure 2-b). This will use the text as the parameter of the style (the chapter or section title), and add the other parts of the template around the text, such as the constant words and the number. The formatting of the text will also be changed based on the style (except that any explicit formatting commands applied to the text will not be overridden, in case, for example, the user wants a word in the title to be italic). The numbers used in the heading will be updated appropriately. In addition, the system searches for all other headings and updates their numbers appropriately.

Unfortunately, in this implementation, it is not possible to detect when the user has deleted a heading. Therefore, we include an explicit "Renumber" command that can be invoked to fix up the numbering of all headings. This command is automatically run before printing or saving a file.

Figure 2. Creating a style by example. (a) The user types in the example, selects it, then selects "Save Style" from the pop-up menu. This prompts for the style name. Later (b), the user can select a different piece of text and apply this style to it, to get (c).


Rather than apply the style to some selected text, the user can instead copy and paste an existing heading in that style, and then edit the title portion of the heading. In this case, the numbering of all sections will be automatically updated.

The user can also edit any of the headings at any time. If the user only wants to change the parameter (the title), then nothing else needs to be done. To change the formatting of the template, the user invokes the "Save Style" command again. If the user specifies a new style name, then a new style is created from the old one. If the same name is used, however, then the system asks the user to confirm that the style template should be modified. If this is confirmed, then the system re-parses the style and creates a new template. All the existing headings using that style are immediately updated. Note that the user can select any heading in the document and edit it to change the style; it is not necessary to edit a special prototype or style sheet. Also, the editing can use the standard editing and formatting commands directly on the example text. In the future, we also plan to add a facility for exceptions to previously defined styles, for example to handle a heading that happens to be at the top of a page, or a parameter string that is too long.

Tables

One of the most difficult parts of documents to format has been tables. In embedded-command formatters, complex commands are required. In WYSIWYG editors that support tables, such as Microsoft Word 4.0, tables must be formatted and constructed indirectly using many dialog boxes, rulers, and commands.

We have developed a prototype table editor that allows the users to draw tables in the same way they would draw them on paper: a few quick slashes for the lines, type the text into the fields, and the table is complete. The lines jump to appropriate places so they are connected nicely, and the text is always lined up inside the fields (see Figure 3). Also, the user can draw some of the fields, and then type in the data for the rest, and the same formatting will be applied. These uses of inferencing to automatically neaten the picture as it is being drawn, and to maintain the appropriate relationships if the picture is edited, differentiate our table editor from a conventional drawing tool. In fact, it is much quicker to use the table editor than a tool like MacDraw, since the user does not have to be careful to put the lines in the correct places.


Figure 3. The user specifies a table by drawing an example picture. As each line and string is drawn, it snaps to an appropriate position. The window at the bottom is visible if the "Explain" button is checked, and it describes the last inference.

If the user wants some lines to be thick and some thin, or double lines around some fields, they can simply be drawn. All fields do not have to have lines around them. The system will infer fields if separate strings are placed in different parts of the picture.

The table system uses a few simple rules to infer the properties of a table. Lines are assumed to attach if they are close together and perpendicular, rows and columns are assumed to be exactly the same size if they are nearly so, and the placement of strings is adjusted to be exactly centered or justified, depending on where they appear to be in the field. If the right edge of a column is close to the end of a string inside it, then the width of the column is assumed to be the width of the widest entry in it, and similarly for the rows. In that case, if the strings are edited, the lines will move. If the right edge is far from the widest string or if the widest string contains an explicit carriage return, then the column width is assumed to be fixed, and the text is word-wrapped inside it, if necessary.
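Three of these rules can be sketched in Python. This is only an illustration and not the Garnet implementation: the function names, the coordinate conventions, and all tolerance values (`tolerance`, `slack`) are assumptions, since the text does not say how "close" or "nearly" are measured.

```python
def equalize(sizes, tolerance=0.1):
    """Make rows or columns exactly the same size if they are nearly so.

    Sizes within `tolerance` (relative) of the first member of a group
    are replaced by the group's mean.
    """
    groups = []
    for i, size in enumerate(sizes):
        for group in groups:
            ref = sizes[group[0]]
            if abs(size - ref) <= tolerance * ref:
                group.append(i)
                break
        else:
            groups.append([i])
    result = list(sizes)
    for group in groups:
        mean = sum(sizes[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
    return result


def infer_alignment(field_left, field_right, text_left, text_right, slack=4):
    """Guess centered, left, or right placement from where the string was put."""
    left_gap = text_left - field_left
    right_gap = field_right - text_right
    if abs(left_gap - right_gap) <= slack:
        return "center"
    return "left" if left_gap < right_gap else "right"


def infer_column_mode(column_width, entry_widths, has_explicit_break, slack=8):
    """Decide whether a column tracks its widest entry or has a fixed width."""
    widest = max(entry_widths)
    if has_explicit_break or column_width - widest > slack:
        return "fixed"          # text is word-wrapped inside the column
    return "fit-to-widest"      # the lines move when the strings are edited


print(equalize([100, 104, 96, 200]))
print(infer_alignment(0, 100, 40, 60))
print(infer_column_mode(120, [60, 115], has_explicit_break=False))
```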

The inferencing for the tables is much more likely to be correct than for a conventional drawing program such as PED [Pavlidis 85] due to the restrictions on the ways tables are usually presented. All lines are horizontal or vertical, lines rarely are left dangling, and strings are only allowed in certain places. However, the system does occasionally guess wrong. Unlike Peridot (Chapter 6), the system immediately performs the inference, and provides "Undo Guess" and "Guess Again" buttons. These can be used to remove the inference and to try other guesses. This method is faster and seems warranted since the first guess is more often correct than in Peridot. Further study on the appropriate kinds of feedback is ongoing.

Once the table is created with example data, the user can just use it as it is (if the example data is the real data), or define it as a style. In this case, the system will allow the real data to have a different number of rows and columns than the sample data, and will replicate the example formatting as needed. The system tries to be smart about headings in the table, so if they do not appear to be supplied in the data, the example headings will be used. Also, if there are headings which span multiple columns or rows (as in Figure 3), the system will try to replicate them appropriately.

Bibliographic references

Defining the formatting for bibliographic references is quite difficult in most systems. In our editor, the user can specify the formatting by supplying examples of how different types of entries should look. For instance, to define the formatting for a conference article, the user might supply the following example:

From this, the system can infer the formatting of all conference proceedings: the key should be the author's last name, first names should be abbreviated and in forward order, there should be a period after the name, etc. Similar inputs can be given to define the formatting for other types of entries, such as books or journal articles.

The data used to drive the bibliography can come from a database in Unix Refer or Scribe format, for example:

@InProceedings(Sketchpad,
    Key="Sutherland",
    Author="Ivan E. Sutherland",
    Title="SketchPad: A Man-Machine Graphical Communication System",
    BookTitle="AFIPS Spring Joint Computer Conference",
    Volume=23,
    Year=1963,
    Pages="329-346")

This will be automatically converted to:

Alternatively, the user can create the bibliographic database by typing the entries using the full formatting. Having a database of bibliographic references allows the system to create a bibliography using different formatting, for example:

3. Sutherland, I. E. SketchPad: A Man-Machine Graphical Communication System. AFIPS Spring Joint Computer Conference, (1963), 329-346.
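As a rough illustration of why a database of references makes it easy to switch formats, the following Python sketch reads the Scribe entry shown earlier and renders it in the numbered style of the example above. It is not the system's implementation: the field-matching expression, the function names, and the hard-coded format are assumptions, and in the demonstrational formatter the format would be inferred from a user-supplied example rather than written as code.

```python
import re

SCRIBE_ENTRY = '''@InProceedings(Sketchpad,
    Key="Sutherland",
    Author="Ivan E. Sutherland",
    Title="SketchPad: A Man-Machine Graphical Communication
System",
    BookTitle="AFIPS Spring Joint Computer Conference",
    Volume=23,
    Year=1963,
    Pages="329-346")'''

# Assumed field syntax: Name="quoted value" or Name=bareword.
FIELD = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|(\w+))')


def parse_scribe_entry(text):
    """Collect the fields of one entry into a dictionary of strings."""
    fields = {}
    for name, quoted, bare in FIELD.findall(text):
        value = quoted if quoted else bare
        fields[name] = " ".join(value.split())  # join values split across lines
    return fields


def format_numbered(entry, index):
    """Render an entry as: N. Last, F. M. Title. BookTitle, (Year), Pages."""
    *given, last = entry["Author"].split()
    initials = " ".join(name[0] + "." for name in given)
    return (f"{index}. {last}, {initials} {entry['Title']}. "
            f"{entry['BookTitle']}, ({entry['Year']}), {entry['Pages']}.")


entry = parse_scribe_entry(SCRIBE_ENTRY)
print(format_numbered(entry, 3))
```

The sketch handles only a single author and the fields used in this one entry; a different inferred style would correspond to a different rendering function over the same dictionary.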

Other parts

In the same way as for chapter and section headings, the formatting for page headings and footings, entries in a table of contents and index, figure captions, itemized lists, and even the indenting and spacing for normal paragraphs, can be specified by demonstration. The user simply gives an example of the desired formatting, and the system tries to guess the appropriate style information.

Feedback

Since any inferencing system will sometimes guess wrong, it is important that the system supply appropriate feedback so the users are confident that they are in control and know what the system is doing. An important component of the research into the demonstrational formatter is to determine an appropriate form for the feedback.

If the "Explain" button is checked (Figure 3), then all the inferences in the table subsystem are reported to the user. There are also buttons available for undoing and trying other guesses. In the bibliographic subsystem, a window is popped up with the inferred formatting shown. The user can move around the labels and edit the definition, if necessary. In the headers subsystem, however, there is currently no feedback for the user, although we are considering a window like that used for the bibliography. Future research will be aimed at evaluating the best forms for all feedback.

Status and Future Work

The headers part of the formatter has been implemented twice. The first implementation used the Ness language [Hansen 90], which is part of the "ez" WYSIWYG text editor in the Andrew environment [Morris 86]. This is the version discussed in this paper and shown in Figure 2. Ness is an experimental interpretive language which supports text formatting and parsing. The Ness code can be attached to custom menu commands. This version was not very robust.

The second implementation, called Tourmaline, uses Microsoft Word for Windows and is implemented in the WordBasic extension language. Tourmaline stands for Text-formatting Ought to Use and Rely on Macrostyles And Layout Inferred Nicely by Example. This system uses the term "macrostyle" to refer to the complex, composite descriptions of entire headings. Tourmaline supports automatic layout for repeating fields [Werth 92]. For example, the heading used to define the style might have only one author (Figure 4-a), while the next chapter might have two authors (Figure 4-b). Tourmaline parses the example heading looking for various parts (the author, the title, the affiliation, etc.) using heuristics such as: the title is usually before the author, the title is often in a larger font than the author, words such as "Company" and "University" signify an affiliation, different parts are usually separated by newlines, tabs, or special characters like commas, etc.
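A few of these heuristics can be sketched in Python to show how they combine. This is not Tourmaline's WordBasic code: the `classify_lines` function, the font sizes in the example, and the extra affiliation word "Corporation" are assumptions introduced only for illustration.

```python
# "Company" and "University" come from the text; "Corporation" is an
# assumed addition so that the one-author heading of Figure 4 is covered.
AFFILIATION_WORDS = ("Company", "University", "Corporation")


def classify_lines(lines):
    """Label each (text, font_size) line of a heading as a part of the heading.

    Heuristics: affiliation words mark an affiliation, the title comes
    before the author, and the title is in the largest font.
    """
    largest = max(size for _, size in lines)
    seen_title = False
    seen_author = False
    labeled = []
    for text, size in lines:
        if any(word in text for word in AFFILIATION_WORDS):
            label = "affiliation"
        elif not seen_title and size == largest:
            label = "title"
            seen_title = True
        elif seen_title and not seen_author:
            label = "author"
            seen_author = True
        else:
            label = "other"
        labeled.append((label, text))
    return labeled


# The one-author heading from Figure 4; the font sizes are made up.
example = [
    ("INTERCHI'93 GUIDE FOR SUCCESSFUL SUBMISSIONS", 14),
    ("D. Austin Henderson", 12),
    ("Xerox Corporation", 10),
    ("3333 Coyote Hill Road", 10),
    ("Palo Alto, CA 94304", 10),
]
for label, text in classify_lines(example):
    print(f"{label:12} {text}")
```

A sketch like this will misclassify some headings, which is why the ability to correct the inferred classification of parts matters.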

A prototype of the table subsystem was implemented separately using the Garnet environment (Chapter 10). This was chosen because it allowed the table editor to be created very quickly.

Figure 4. Tourmaline allows styles to contain multiple parts. The user formats an example heading (a) using the regular editor commands, selects it, and defines it as a "macrostyle." For the next chapter, the user enters the text of the heading without formatting, and applies the macrostyle to it (b). Tourmaline uses the heuristic that, since the initial example centered the author, the new chapter heading should center both authors horizontally.

(a)

INTERCHI'93 GUIDE FOR SUCCESSFUL SUBMISSIONS

D. Austin Henderson
Xerox Corporation
3333 Coyote Hill Road
Palo Alto, CA 94304

(b)

INFERRING TEXT MACROSTYLES BY EXAMPLE

Andrew J. Werth                        Brad A. Myers
Bell Communications Research           School of Computer Science
444 Hoes Lane, Room RRC 4A-542         Carnegie Mellon University
Piscataway, NJ 08854                   Pittsburgh, PA 15213
E-mail: ajw1@navaho.cc.bellcore.com    E-mail: bam@cs.cmu.edu

We now want to look at how to make the system use inferencing in more places. For example, if the user types "Chapter 5: Conclusion", the system might automatically guess that the Chapter style should be applied to it, rather than requiring the user to explicitly apply it. It may also be possible to create a more general mechanism, so that users can create their own inferencing system for custom parts of documents. More work is also needed on feedback and editing. Although Tourmaline will allow the user to correct incorrect classification of parts (e.g., if an author is thought to be a title), there is currently no way to correct incorrect formatting (e.g., if the user did not want the two authors on the same line in Figure 4).

An important aspect of the future work will be investigating people's reactions to this system. We will provide various feedback mechanisms and various levels of guessing, to determine the right balance between helping and interfering with the user.

Conclusions

The demonstrational formatter described here combines the best features of WYSIWYG and embedded-command formatters, while often being easier to use than either. This editor, along with demonstrational systems in other domains, shows that it is possible to extract useful semantics from the surface form of the user's input. This is primarily because the systems are constrained in their domain, and the level of semantics to be derived is limited. Further research will be needed to determine the boundaries of this technique, and to what extent users are willing to accept a program that guesses what they are doing and occasionally makes errors. However, based on its successes so far, we believe that demonstrational techniques will be a significant advance beyond the direct manipulation interfaces of today. The demonstrational formatter described here will help explore this exciting new user interface paradigm.

Acknowledgments

This research was partially funded by NSF grant number IRI-9020089, and by Apple Computer, Inc., and Siemens Corporate Research. The development of Tourmaline was partially funded by Bell Communications Research.

The Andrew portions of the demonstrational formatter were implemented by Richard Chung. Many thanks to Fred Hansen who helped us understand the Ness system and added special features for us. Tourmaline was entirely implemented by Andrew Werth.

For help with this chapter, I want to thank Brad Vander Zanden, Brad Lincoln and Bernita Myers.

Tourmaline

Uses and Users

Application domain: Word processor

Tasks within the domain: Text formatting

Intended Users: Non-programmers

User Interaction

How does the user create, execute and modify programs?
The user creates an example of the desired style and then directs Tourmaline to parse the style and apply it elsewhere.
Any instance of the style can be edited to modify the style.

Feedback about capabilities and inferences:
Inferences about tables are applied immediately, so the user can see if they are correct. The user can also view a textual explanation of each inference, as in Peridot. For styles, a dialog box shows the inferred classification of the parts of the document.

Inference

Inferencing:
Works from a completed example of the desired style, not from a history trace of how the style was created.

Program constructs: Variables. Single parameter for headings.

Knowledge

Types and sources of information:
Knowledge about typical heading, table, and bibliography formats (e.g. how to locate the "author" field in a bibliography).

Implementation

Machine, language, size, date:
Headings in WordBasic for Microsoft Word for Windows. Tables in Garnet. 1990 and 1992.


