Abstract: This paper describes an on-going research project investigating the design of a user-interface toolkit composed of sonically enhanced widgets. The motivation for this work is the same as that behind graphical interface toolkits: to simplify the construction of sonically enhanced interfaces, allowing designers who are not sound experts to create them; to ensure the sonically enhanced widgets are effective and improve usability; and to ensure the widgets use sound in a clear and consistent way across the interface.
Why should sound be added to human-computer interfaces? There is a growing body of research indicating that the addition of non-speech sounds to human-computer interfaces can improve performance and increase usability (Brewster, Dix, Edwards and Wright, 1995; Gaver, O'Shea and Smith, 1991). Sound is an important means of communication in the everyday world, and the benefits it offers should be taken advantage of at the interface. Such multimodal interfaces allow greater and more natural communication between the computer and the user. They also allow the user to employ the appropriate sensory modalities to solve a problem, rather than just using one modality (usually vision) to solve all problems.
In spite of the increased interest in multimedia, little systematic research has been done on the most effective ways to combine graphics and sound, even though many computer manufacturers now include sound-producing hardware in their machines. Arons & Mynatt (1994) suggest one reason for this: "the lack of design guidelines that are common for the creation of graphical interfaces has plagued interface designers who want to effectively build on previous research in auditory interfaces."
Using sound can be beneficial; however, because this area is still in its infancy, sounds tend to be added in ad hoc ways by individual designers, and this can make them ineffective. The aim of this research is to help designers create effective sonically enhanced interfaces. This paper describes an on-going research project to construct a sonically enhanced interface toolkit.
Aims of the Toolkit
The four main aims of the toolkit are similar to those that motivated the development of graphical interface toolkits:
- To simplify the implementation of applications that include sound in their interfaces. Currently it is difficult to create sonically enhanced applications, as the code necessary to include the sounds is usually device-dependent and time-consuming to write. This is similar to the problem that graphical interface designers faced before graphical toolkits were available. Myers (1991) suggests that the use of graphical toolkits significantly reduces the development time of graphical interfaces. This toolkit is intended to do the same for sonically enhanced interfaces.
- To allow designers who are not sound experts to create sonically enhanced interfaces. Interface designers are not often skilled in sound design. A toolkit that has the sounds included would remove the need for detailed knowledge of sound design. This follows the same approach as graphical toolkits in that an interface designer without a detailed knowledge of graphic design can create an interface using a standard graphical interface toolkit.
- To ensure that the sounds added are effective and enhance the user’s interaction with the computer. The sounds added will not be gimmicks. Detailed investigations of usability problems will show where sound can help, and the sounds added will be designed to overcome those problems.
- To ensure the sounds are used in a clear and consistent way across the interface. This consistency will avoid the problem of each application having its own sounds, so that the same sound means different things in different applications. In graphical interface toolkits, the widgets look consistent across different applications; e.g., a scrollbar looks the same in any application where it is used. In the sonically enhanced toolkit, widgets will sound consistent across different applications.
This project brings together previous work on individual sonically enhanced widgets to form a complete interface toolkit. In each of the widgets, sound supports graphics. Part of the motivation for this research is that users’ eyes cannot do everything. The visual system has a small area of focus. If users are looking at one part of the display then they cannot be looking at another at the same time. In highly complex graphical displays the user must concentrate on one part of the display to perceive the graphical feedback, so that feedback from another part might be missed. Some information should be presented in sound; this will allow users to continue looking at the information required while hearing information that would otherwise not be seen, or would not be seen unless they moved their visual attention away from the area of interest, thereby interrupting the task they are trying to perform. Sound and graphics used together will exploit the advantages of each.
The non-speech sounds used for this investigation are based around structured audio messages called earcons (Blattner, Greenberg and Sumikawa, 1989). Earcons are abstract, synthetic tones that can be used in structured combinations to create sound messages that represent parts of an interface. Detailed investigations of earcons by Brewster show that they are an effective means of communicating information in sound (Brewster, 1994).
Overall Structure of the Toolkit
Each widget in a standard widget set will be enhanced with sound. The overall structure of the sounds will be as follows: each application will have its own timbre and spatial location (via stereo) as a base for all of its sounds. All widgets within an application will use these and modify them by changing the rhythm, pitch, etc. Figure 1 shows such a hierarchy. At level 1, the three applications all have different timbres and spatial locations. These are inherited by level 2 and modified with pitch, rhythm, etc. These modifications are constant across applications so that widgets in different applications sound consistent (similar to graphical widgets that look consistent across applications). It is hoped that, after using the system, users will come to associate a certain timbre with a particular application.
As an example, consider a button widget in the three applications in Figure 1. If a button were used in the Write application it would take the Write instrument (for example, an organ) and the Write stereo position (for example, on the left). The button would have its own rhythm and note structure, for example, a two-note chord. In the Write application this chord would be played by an organ on the left of the stereo space. If the Spreadsheet application had a piano timbre and a right stereo position, then the same two-note chord would be played, but with that instrument and stereo position. In this way the earcons for each widget would be consistent across the whole interface (the button would always use the same two-note chord) but would also fit with the sounds of the application of which it was part.
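The two-level hierarchy described above can be illustrated with a minimal Python sketch. It is not the toolkit's implementation: the class and function names (ApplicationSound, WidgetMotif, Earcon, make_earcon) and the particular notes chosen for the button chord are hypothetical. Only the organ/left and piano/right assignments and the two-note chord come from the example in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ApplicationSound:
    """Level 1: the base sound inherited by every widget in an application."""
    name: str
    timbre: str   # instrument, e.g. "organ"
    stereo: str   # stereo position, e.g. "left"


@dataclass(frozen=True)
class WidgetMotif:
    """Level 2: a widget's own note structure, constant across applications."""
    widget: str
    notes: Tuple[str, ...]   # pitches sounded together (a chord)
    duration_beats: float    # how long the chord is held


@dataclass(frozen=True)
class Earcon:
    """A widget motif rendered with one application's timbre and position."""
    application: str
    widget: str
    timbre: str
    stereo: str
    notes: Tuple[str, ...]
    duration_beats: float


def make_earcon(app: ApplicationSound, motif: WidgetMotif) -> Earcon:
    """Combine the application's base sound with the widget's motif."""
    return Earcon(app.name, motif.widget, app.timbre, app.stereo,
                  motif.notes, motif.duration_beats)


# Application sounds taken from the example in the text.
WRITE = ApplicationSound("Write", timbre="organ", stereo="left")
SPREADSHEET = ApplicationSound("Spreadsheet", timbre="piano", stereo="right")

# Hypothetical two-note chord; the text does not specify the pitches.
BUTTON = WidgetMotif("button", notes=("C4", "E4"), duration_beats=1.0)

write_button = make_earcon(WRITE, BUTTON)
sheet_button = make_earcon(SPREADSHEET, BUTTON)
```

Here write_button and sheet_button share the same notes and duration but differ in timbre and stereo position, which is the consistency property the hierarchy is meant to provide.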
The sounds will be controlled by MIDI. Almost all current computer systems either have built-in MIDI-controlled synthesisers (for example on sound cards or via DSP chips) or can easily be connected to them. Using MIDI will also provide an easy way for users to customise the sounds. Standard synthesiser control software can be used to change the timbre or intensity of the sounds in any widget.
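As a rough illustration of what MIDI control could involve, the sketch below builds the raw channel messages for one earcon: a program change to select the timbre, a pan controller change to set the stereo position, and note-on messages for the chord. The status bytes and the pan controller number are standard MIDI. The function names, the default velocity, and the use of General MIDI program numbers (19 for a church organ, counting from zero) are assumptions made for the example and are not taken from the toolkit.

```python
from typing import List, Sequence

# Standard MIDI channel-message status bytes (upper nibble).
NOTE_OFF = 0x80
NOTE_ON = 0x90
CONTROL_CHANGE = 0xB0
PROGRAM_CHANGE = 0xC0
PAN_CONTROLLER = 10  # controller number for pan: 0 = left, 64 = centre, 127 = right


def _check_range(channel: int, values: Sequence[int]) -> None:
    """Reject values outside the 4-bit channel and 7-bit data ranges."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    for value in values:
        if not 0 <= value <= 127:
            raise ValueError("MIDI data bytes must be in 0..127")


def earcon_messages(channel: int, program: int, pan: int,
                    notes: Sequence[int], velocity: int = 96) -> List[bytes]:
    """Messages that select timbre and stereo position, then start the chord."""
    _check_range(channel, [program, pan, velocity, *notes])
    messages = [
        bytes([PROGRAM_CHANGE | channel, program]),          # timbre
        bytes([CONTROL_CHANGE | channel, PAN_CONTROLLER, pan]),  # stereo position
    ]
    messages += [bytes([NOTE_ON | channel, note, velocity]) for note in notes]
    return messages


def release_messages(channel: int, notes: Sequence[int]) -> List[bytes]:
    """Messages that stop the chord."""
    _check_range(channel, list(notes))
    return [bytes([NOTE_OFF | channel, note, 0]) for note in notes]


# Button chord for a Write-like application: organ, panned hard left.
# Program 19 assumes a General MIDI synthesiser; notes 60 and 64 are C and E.
write_button = earcon_messages(channel=0, program=19, pan=0, notes=(60, 64))
write_button_off = release_messages(channel=0, notes=(60, 64))
```

Because the timbre and pan are ordinary program and controller values, changing the sound of a widget amounts to changing those numbers, which is in keeping with the customisation route described above.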
Sonically Enhanced Widgets
Earcons will be used to overcome usability problems in standard graphical widgets. The earcons will be designed using the guidelines proposed by Brewster (1994). Each widget in a standard toolkit will be analysed to discover any usability problems. From this analysis earcons will be created for the auditory feedback. These new widgets will then be experimentally tested to ensure the sonic enhancements improve their usability.
Figure 1: A hierarchy of sonically enhanced widgets across applications.
So far, sonically enhanced buttons, scrollbars, windows and menus have been implemented and tested, with promising results (Brewster et al., 1995; Brewster, Edwards and Wright, 1994). For example, users gave the sonically enhanced buttons a significantly higher overall preference rating than graphical buttons. The time and the number of mouse-clicks needed to recover from errors were both significantly reduced. There was also no difference in annoyance between standard buttons and the sonically enhanced ones.
References
Any references by Brewster can be found at his webpage.
Arons, B. and Mynatt, E. (1994). The future of speech and audio in the interface. SIGCHI Bulletin 26(4).
Barfield, W., Levasseur, G. and Rosenberg, C. (1991). The use of icons, earcons and commands in the design of an online hierarchical menu. IEEE Transactions on Professional Communication 34(2), 101-108.
Blattner, M., Greenberg, R. and Sumikawa, D. (1989). Earcons and icons: Their structure and common design principles. Human Computer Interaction 4(1), 11-44.
Brewster, S.A. (1994). Providing a structured method for integrating non-speech audio into human-computer interfaces. PhD Thesis, University of York, UK.
Brewster, S.A., Dix, A.J., Edwards, A.D.N. and Wright, P.C. (1995). The sonic enhancement of graphical buttons. In Proceedings of Interact’95 (pp. 43-48). Lillehammer, Norway: Chapman & Hall.
Brewster, S.A., Edwards, A.D.N. and Wright, P.C. (1994). The design and evaluation of an auditory-enhanced scrollbar. In Proceedings of CHI’94 (pp. 173-179). Boston, MA: ACM Press, Addison-Wesley.
Gaver, W., O’Shea, T. and Smith, R. (1991). Effective sounds in complex systems: The ARKola simulation. In Proceedings of CHI’91 (pp. 85-90). New Orleans: ACM Press, Addison-Wesley.
Myers, B.A. (1991). State of the art in user interface software tools. In Hartson, H. and Hix, D. (Eds.), Advances in Human-Computer Interaction. Ablex Publishing.
Portigal, S. (1994). Auralization of document structure. MSc. Thesis. The University of Guelph: Canada, 1.
Stephen A. Brewster
Glasgow Interactive System Group
Department of Computing Science
The University of Glasgow
Glasgow, G12 8QQ, UK
Tel: +44 (0)141 330 4966