For people who only want to follow a pre-defined style sheet, and therefore do not want to create their own styles, this editor should still be easier to use than conventional WYSIWYG editors, because demonstrational techniques make it easier to create tables and citations for bibliographic references, and to use the styles created by others.
Some word processors, such as Microsoft Word 4.0, provide a "styles" mechanism. The user selects a piece of text and gives the command "Define Styles..." This brings up a dialog box that shows all the properties of the selected text. The user can give a name to this set of formatting properties, and then apply them as a group to a different piece of text. However, this is too limited to solve the problem, since a style represents only a single set of formatting commands. Figure 1 would require at least three styles, which the user would have to create and maintain separately. (Another example is the chapter titles for this book.) Also, the styles mechanism usually cannot insert general text, such as the word "Chapter," so some other mechanism would have to be used if the user wanted to change it to "Section" in all the headings.
Figure 1. A chapter heading that includes different formatting for different parts and the constant word "Chapter."
Unlike most of the other systems in this book, the Text Formatting by Demonstration system does not create macros from a transcript of the user's actions. Instead, it looks at the final result (the example of the desired style), and infers the generalization from that. It is irrelevant what sequence of steps was used to create the example.
Furthermore, unlike systems such as Turvy (Chapter 11) and TELS (Chapter 8), this system contains significant knowledge about the typical formats used in documents. Whereas Turvy must be instructed how to find the author field in a bibliography by looking at the punctuation, the Text Formatting by Demonstration system has the concept of the author field built-in, and so it knows where to look for it. The use of this domain knowledge makes the inferencing in this system much more successful and more likely to be correct.
Nevertheless, the system does work by example, since it uses heuristics to infer generalizations from a formatted example of text. Therefore, the reader should not assume that all "by example" or "by demonstration" systems have to be based on macro recorders.
From the results of the parsing, the system creates a template which is associated with the supplied name. The example is removed from the document, and replaced with a generated heading based on the template and the example text as the parameter. If the inferencing was performed correctly, this will not change the appearance of the text, except that the numbers might be updated. If the system has guessed wrong, the user can undo the guessing with a command, and supply a different example. The first example will then be ignored by the system. If the user simply cannot get the system to guess correctly, the standard WYSIWYG techniques are still available.
Each heading created with a particular style is marked with information about what style name is being used, and what the parts are. These markings are inserted into the document using invisible text so they will be preserved when the file is written out and retrieved.
All of the templates are added to the end of the document when it is written out, also in a hidden format. A command is available to write out only the styles (and not the text of the document), so the user can easily create a style sheet by example. Other commands can be used to make the templates visible at the end of the document, in case the user wants to copy them to a different document or explicitly edit them (but we do not expect that to be necessary).
Headings often use multi-level numbering schemes, so that, for example, sections in chapter two would be numbered 2.1, 2.2, etc. When the user gives an example of a heading, the system must know what level it is for (chapters are at level 1, sections are at level 2, subsections are at level 3, etc.). This can be specified explicitly by the user, or the system will try to guess using various heuristics. If the example number contains multiple parts (such as "2.2"), then the style is assigned to the appropriate level. If the number has only a single part (such as "2"), then if the style contains a special word (such as "chapter"), the level normally associated with that word is used. If none of these rules can be applied, then the number is assumed to be at level one. When the system renumbers the sections, a lower level number is always reset to zero whenever a higher level number is changed.
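The level-guessing rules above can be sketched in a few lines of Python. This is only an illustration of the three rules as described, not the system's actual code: the function name `infer_level` and the word table `SPECIAL_WORD_LEVELS` are hypothetical, and the real system may recognize other words or number formats.

```python
import re

# Hypothetical table of special words and the level each normally implies.
SPECIAL_WORD_LEVELS = {"chapter": 1, "section": 2, "subsection": 3}


def infer_level(heading: str) -> int:
    """Guess the nesting level of an example heading."""
    # Rule 1: a multi-part number such as "2.2" gives the level directly.
    match = re.search(r"\d+(?:\.\d+)+", heading)
    if match:
        return match.group(0).count(".") + 1
    # Rule 2: a special word such as "chapter" implies its usual level.
    for word in re.findall(r"[a-z]+", heading.lower()):
        if word in SPECIAL_WORD_LEVELS:
            return SPECIAL_WORD_LEVELS[word]
    # Rule 3: if nothing else applies, assume level one.
    return 1
```

For instance, under these assumptions `infer_level("2.2 Related Work")` yields 2, while a heading with a single-part number and no special word falls through to level one.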
After a style is created, the user can select a piece of text and invoke the command that applies the named style (see Figure 2-b). This will use the text as the parameter of the style (the chapter or section title), and add the other parts of the template around the text, such as the constant words and the number. The formatting of the text will also be changed based on the style (except that any explicit formatting commands applied to the text will not be overridden, in case, for example, the user wants a word in the title to be italic). The numbers used in the heading will be updated appropriately. In addition, the system searches for all other headings and updates their numbers appropriately.
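One way to picture applying a style is as a merge of formatting runs. The sketch below assumes a template is a list of parts (constant text, the number, and the parameter), each carrying a formatting dictionary; the representation and the name `apply_style` are illustrative assumptions, not the system's internal format. The key point it shows is that explicit formatting already on the title text wins over the style's formatting.

```python
from typing import Dict, List, Tuple

Format = Dict[str, object]
Run = Tuple[str, Format]                 # (text, formatting)
TemplatePart = Tuple[str, str, Format]   # (kind, constant text, formatting)


def apply_style(template: List[TemplatePart], number: str,
                title_runs: List[Run]) -> List[Run]:
    """Wrap the title (the parameter) in the template's other parts."""
    out: List[Run] = []
    for kind, text, fmt in template:
        if kind == "constant":
            out.append((text, dict(fmt)))
        elif kind == "number":
            out.append((number, dict(fmt)))
        elif kind == "parameter":
            for run_text, explicit in title_runs:
                # Style formatting first; explicit user formatting overrides it.
                out.append((run_text, {**fmt, **explicit}))
    return out


# Hypothetical "Chapter" style: bold label and number, larger title.
chapter_style: List[TemplatePart] = [
    ("constant", "Chapter ", {"bold": True, "size": 18}),
    ("number", "", {"bold": True, "size": 18}),
    ("constant", ": ", {"bold": True, "size": 18}),
    ("parameter", "", {"size": 24}),
]
title = [("The ", {}), ("Ness", {"italic": True}), (" System", {})]
heading = apply_style(chapter_style, "5", title)
```

Here the word "Ness" keeps its italic setting while picking up the style's size, which mirrors the exception described above for explicitly formatted words in a title.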
Unfortunately, in this implementation, it is not possible to detect when the user has deleted a heading. Therefore, we include an explicit "Renumber" command that can be invoked to fix up the numbering of all headings. This command is automatically run before printing or saving a file.
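A "Renumber" pass of this kind can be sketched as a single walk over the headings in document order. The function below is an assumption about how such a pass might work, not the implemented command: it takes only the level of each heading (1 for chapters, 2 for sections, and so on) and returns the number labels, resetting the deeper counters to zero whenever a shallower one changes.

```python
from typing import List


def renumber(levels: List[int]) -> List[str]:
    """Return the number label for each heading, given its level."""
    counters: List[int] = []
    labels: List[str] = []
    for level in levels:
        # Make sure a counter exists for every level up to this one.
        while len(counters) < level:
            counters.append(0)
        counters[level - 1] += 1
        # Reset all deeper levels, so the next section restarts at 1.
        for i in range(level, len(counters)):
            counters[i] = 0
        labels.append(".".join(str(c) for c in counters[:level]))
    return labels
```

Because the labels are recomputed from scratch, the result does not depend on knowing which heading was deleted, which is why an explicit pass is enough to repair the numbering.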
Figure 2. Creating a style by example. (a) The user types in the example, selects it, then selects "Save Style" from the pop-up menu. This prompts for the style name. Later (b), the user can select a different piece of text and apply this style to it, to get (c).
Rather than apply the style to some selected text, the user can instead copy and paste an existing style, and then edit the title portion of the heading. In this case, the numbering of all sections will be automatically updated.
The user can also edit any of the headings at any time. If the user only wants to change the parameter (the title), then nothing else needs to be done. To change the formatting of the template, the user invokes the "Save Style" command again. If the user specifies a new style name, then a new style is created from the old one. If the same name is used, however, the system asks the user to confirm that the style template should be modified. If this is confirmed, the system re-parses the style and creates a new template, and all the existing headings using that style are immediately updated. Note that the user can select any heading in the document and edit it to change the style; it is not necessary to edit a special prototype or style sheet. Also, the editing can use the standard editing and formatting commands directly on the example text. In the future, we also plan to add a facility for exceptions to previously defined styles, for example to handle a heading that happens to be at the top of a page, or a parameter string that is too long.
We have developed a prototype table editor that allows users to draw tables in the same way they would draw them on paper: a few quick slashes for the lines, type the text into the fields, and the table is complete. The lines jump to appropriate places so they are connected nicely, and the text is always lined up inside the fields (see Figure 3). Also, the user can draw some of the fields, and then type in the data for the rest, and the same formatting will be applied. These uses of inferencing, to automatically neaten the picture as it is being drawn and to maintain the appropriate relationships if the picture is edited, differentiate our table editor from a conventional drawing tool. In fact, it is much quicker to use the table editor than a tool like MacDraw, since the user does not have to be careful to put the lines in the correct places.
If the user wants some lines to be thick and some thin, or double lines around some fields, they can simply be drawn. All fields do not have to have lines around them. The system will infer fields if separate strings are placed in different parts of the picture.
The table system uses a few simple rules to infer the properties of a table. Lines are assumed to attach if they are close together and perpendicular, rows and columns are assumed to be exactly the same size if they are nearly so, and the placement of strings is adjusted to be exactly centered or justified, depending on where they appear to be in the field. If the right edge of a column is close to the end of a string inside it, then the width of the column is assumed to be the width of the widest entry in it, and similarly for the rows. In that case, if the strings are edited, the lines will move. If the right edge is far from the widest string, or if the widest string contains an explicit carriage return, then the column width is assumed to be fixed, and the text is word-wrapped inside it, if necessary.
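Two of these rules are easy to state as code. The sketch below is a guess at their general shape only: the function names and the pixel thresholds (`tolerance`, `slack`) are invented for illustration, and the source does not say what values or tests the prototype actually uses.

```python
from typing import List


def snap_equal(sizes: List[float], tolerance: float = 4.0) -> List[float]:
    """Make nearly equal row or column sizes exactly equal."""
    if not sizes:
        return []
    mean = sum(sizes) / len(sizes)
    if all(abs(s - mean) <= tolerance for s in sizes):
        return [mean] * len(sizes)
    return list(sizes)


def column_width_mode(column_right: float, string_rights: List[float],
                      has_explicit_break: bool, slack: float = 6.0) -> str:
    """Return "fit" if the column should track its widest entry, else "fixed".

    string_rights holds the x position where each string in the column ends.
    """
    widest = max(string_rights)
    if has_explicit_break or column_right - widest > slack:
        # Edge far from the text, or a carriage return: wrap inside a fixed width.
        return "fixed"
    # Edge hugs the widest string: the line moves when the strings are edited.
    return "fit"
```

With these assumed thresholds, three columns drawn 98, 100, and 102 units wide would all be snapped to 100, while a column whose right edge sits far beyond its longest string would be treated as fixed width.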
The inferencing for the tables is much more likely to be correct than for a conventional drawing program such as PED [Pavlidis 85] due to the restrictions on the ways tables are usually presented. All lines are horizontal or vertical, lines rarely are left dangling, and strings are only allowed in certain places. However, the system does occasionally guess wrong. Unlike Peridot (Chapter 6), the system immediately performs the inference, and provides "Undo Guess" and "Guess Again" buttons. These can be used to remove the inference and to try other guesses. This method is faster and seems warranted since the first guess is more often correct than in Peridot. Further study of the appropriate kinds of feedback is ongoing.
Once the table is created with example data, the user can just use it as it is (if the example data is the real data), or define it as a style. In this case, the system will allow the real data to have a different number of rows and columns than the sample data, and will replicate the example formatting as needed. The system tries to be smart about headings in the table, so if they do not appear to be supplied in the data, the example headings will be used. Also, if there are headings which span multiple columns or rows (as in Figure 3), the system will try to replicate them appropriately.
The data used to drive the bibliography can come from a database in Unix Refer or Scribe format, for example:
    @InProceedings(Sketchpad,
        Key="Sutherland",
        Author="Ivan E. Sutherland",
        Title="SketchPad: A Man-Machine Graphical Communication System",
        BookTitle="AFIPS Spring Joint Computer Conference",
        Volume=23,
        Year=1963,
        Pages="329-346")
This will be automatically converted to:
3. Sutherland, I. E. SketchPad: A Man-Machine Graphical Communication System. AFIPS Spring Joint Computer Conference, (1963), 329-346.
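To make the conversion concrete, the sketch below hard-codes the one citation format shown above. This is a deliberate simplification: in the system described here the format is inferred from an example rather than written by hand, and the helper names `invert_author` and `format_entry` are hypothetical. It also handles only a single author written as given names followed by a surname.

```python
from typing import Dict


def invert_author(name: str) -> str:
    """Turn "Ivan E. Sutherland" into "Sutherland, I. E."."""
    parts = name.split()
    last, given = parts[-1], parts[:-1]
    initials = " ".join(p[0] + "." for p in given)
    return f"{last}, {initials}" if initials else last


def format_entry(index: int, rec: Dict[str, object]) -> str:
    """Format one record in the numbered style shown in the example."""
    return (f"{index}. {invert_author(str(rec['Author']))} {rec['Title']}. "
            f"{rec['BookTitle']}, ({rec['Year']}), {rec['Pages']}.")


# The fields of the Scribe entry above, already parsed into a dictionary.
sketchpad = {
    "Key": "Sutherland",
    "Author": "Ivan E. Sutherland",
    "Title": "SketchPad: A Man-Machine Graphical Communication System",
    "BookTitle": "AFIPS Spring Joint Computer Conference",
    "Volume": 23,
    "Year": 1963,
    "Pages": "329-346",
}
citation = format_entry(3, sketchpad)
```

Note that the Volume field is present in the record but does not appear in the formatted citation, just as in the example output.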
If the "Explain" button is checked (Figure 3), then all the inferences in the table subsystem are reported to the user. There are also buttons available for undoing and trying other guesses. In the bibliographic subsystem, a window is popped up with the inferred formatting shown. The user can move around the labels and edit the definition, if necessary. In the headers subsystem, however, there is currently no feedback for the user, although we are considering a window like that used for the bibliography. Future research will be aimed at evaluating the best forms for all feedback.
A prototype of the table subsystem was implemented separately using the Garnet environment (Chapter 10). This was chosen because it allowed the table editor to be created very quickly.
Figure 4. Tourmaline allows styles to contain multiple parts. The user formats an example heading using the regular editor commands, selects it, and defines it as a "macrostyle." For the next chapter, the user enters the text of the heading without formatting, and applies the macrostyle to it. Because the initial example centered the author, Tourmaline's heuristic centers both authors of the new heading horizontally.
First heading in the figure (one author):
INTERCHI'93 GUIDE FOR SUCCESSFUL SUBMISSIONS
D. Austin Henderson
Xerox Corporation
3333 Coyote Hill Road
Palo Alto, CA 94304

Second heading in the figure (two authors, set side by side):
INFERRING TEXT MACROSTYLES BY EXAMPLE
Left author:
Andrew J. Werth
Bell Communications Research
444 Hoes Lane, Room RRC 4A-542
Piscataway, NJ 08854
E-mail: ajw1@navaho.cc.bellcore.com
Right author:
Brad A. Myers
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
E-mail: bam@cs.cmu.edu
We now want to look at how to make the system use inferencing in more places. For example, if the user types "Chapter 5: Conclusion", the system might automatically guess that the Chapter style should be applied to it, rather than requiring the user to explicitly apply it. It may also be possible to create a more general mechanism, so that users can create their own inferencing system for custom parts of documents. More work is also needed on feedback and editing. Although Tourmaline will allow the user to correct incorrect classification of parts (e.g., if an author is thought to be a title), there is currently no way to correct incorrect formatting (e.g., if the user did not want the two authors on the same line in Figure 4).
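As a rough idea of what guessing the style from typed text could look like, the sketch below matches a line against one pattern per known style. Nothing like this exists in the system as described; the pattern table `STYLE_PATTERNS` and the function `guess_style` are purely hypothetical, and a real version would presumably derive the patterns from the saved templates rather than list them by hand.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns, one per previously saved style.
STYLE_PATTERNS = {
    "Chapter": re.compile(r"^Chapter\s+(\d+):\s+(.+)$"),
    "Section": re.compile(r"^Section\s+(\d+(?:\.\d+)*):\s+(.+)$"),
}


def guess_style(line: str) -> Optional[Tuple[str, str]]:
    """Return (style name, title) if the line looks like a known heading."""
    stripped = line.strip()
    for name, pattern in STYLE_PATTERNS.items():
        match = pattern.match(stripped)
        if match:
            # Group 2 is the title, which becomes the style's parameter.
            return name, match.group(2)
    return None
```

A guess of this kind would still need the undo and feedback mechanisms discussed above, since ordinary body text can happen to start with a word like "Chapter".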
An important aspect of the future work will be investigating people's reactions to this system. We will provide various feedback mechanisms and various levels of guessing, to determine the right balance between helping and interfering with the user.
The Andrew portions of the demonstrational formatter were implemented by Richard Chung. Many thanks to Fred Hansen who helped us understand the Ness system and added special features for us. Tourmaline was entirely implemented by Andrew Werth.
For help with this chapter, I want to thank Brad Vander Zanden, Brad Lincoln and Bernita Myers.
Tasks within the domain: Text formatting
Intended Users: Non-programmers
Feedback about capabilities and inferences:
Inferences about tables are applied immediately, so the user can see if they are correct. The user can also view a textual explanation of each inference, as in Peridot. For styles, a dialog box shows the inferred classification of the parts of the document.
Program constructs: Variables. Single parameter for headings.