Albright, L., A. J. Jackson, and J. Francioni. "Auralization of Parallel Programs." SIGCHI Bull. 23(4) (1991): 86--87.

Reasons why the auditory system excels at various tasks are outlined in this article describing parallel program debugging with sound.

Allen, J. B., and D. A. Berkley. "Image Method for Efficiently Simulating Small-Room Acoustics." J. Acoust. Soc. Am. 65 (1979): 943--950.

One of the core papers discussing the image model for the simulation of reverberant rooms, which has been applied to the interactive synthesis of virtual acoustic sources.

Arons, B. "Interactively Skimming Recorded Speech." In Proceedings of the User Interface Software and Technology (UIST) Conference. Reading, MA: ACM Press/Addison-Wesley, 1993 (in press).

A nonvisual user interface for interactively skimming speech recordings is described. SpeechSkimmer uses simple speech-processing techniques to allow a user to hear recorded sounds quickly, and at several levels of detail. User interaction through a manual input device provides continuous real-time control of speed and detail level of the audio presentation.

Arons, B. "Techniques, Perception, and Applications of Time-Compressed Speech." In Proceedings of 1992 Conference, American Voice I/O Society, held Sep. 1992, 169--177.

A review of time-compressed speech including the limits of perception, practical time-domain compression techniques, and an extensive bibliography.

Arons, B. "A Review of the Cocktail Party Effect." J. Am. Voice I/O Soc. 12 (Jul. 1992): 35--50.

A review of research in the area of multichannel and spatial listening with an emphasis on techniques that could be used in speech-based systems.

Arons, B. "Hyperspeech: Navigating in Speech-Only Hypermedia." In Hypertext '91, 133--146. Reading, MA: ACM Press/Addison-Wesley, 1991.

Hyperspeech is a speech-only (nonvisual) hypermedia application that explores issues of speech user interfaces, navigation, and system architecture in a purely audio environment without a visual display. The system uses speech recognition input and synthetic speech feedback to aid in navigating through a database of digitally recorded speech segments.

Arons, B. Hyperspeech (videotape). ACM SIGGRAPH Video Rev. 88 (1993). InterCHI '93 Technical Video Program.

A four-minute video showing the Hyperspeech system in use.

Arons, B. "The Design of Audio Servers and Toolkits for Supporting Speech in the User Interface." J. Am. Voice I/O Soc. 9 (Mar. 1991): 27--41.

An overview of audio servers and design thoughts for toolkits built on top of an audio server to provide a higher-level programming interface. Arons describes tools for rapidly prototyping and debugging multimedia servers and applications. He includes details of a SparcStation-based audio server, speech recognition server, and several interactive applications.

Asano, F., Y. Suzuki, and T. Sone. "Role of Spectral Cues in Median Plane Localization." J. Acoust. Soc. Am. 88 (1990): 159--168.

A study of localization cues using simulated transfer functions simplified via auto-regressive moving-average models in order to study what cues are critical for median plane localization. The conclusion was that macroscopic patterns above 5 kHz are used to judge elevation, and macroscopic patterns in the high frequencies as well as microscopic patterns below 2 kHz are used for front-rear judgment.

Astheimer, P. "Sonification Tools to Supplement Dataflow Visualization." In Third Eurographics Workshop on Visualization in Scientific Computing, held April 1992, in Viareggio, Italy. (Also in Scientific Visualization--Advanced Software Techniques, edited by Patrizia Palamidese, 15--36. London: Ellis Horwood, 1993.)

Astheimer presents a detailed concept for the integration of sonification tools in dataflow visualization systems. The approach is evaluated with an implementation of tools within the apE-system of the Ohio Supercomputer Center and some examples.

Astheimer, P. "Realtime Sonification to Enhance the Human-Computer Interaction in Virtual Worlds." In Proceedings Fourth Eurographics Workshop on Visualization in Scientific Computing, held April 1993, in Abingdon, England.

An overview of IGD's virtual reality system "Virtual Design." Several acoustic rendering algorithms are explained, covering sound events, direct sound propagation, a statistical approach, and the image-source algorithm.

Astheimer, P. "Sounds of Silence--How to Animate Virtual Worlds with Sound." In Proceedings ICAT/VET, held May 1993, in Houston, Texas, USA.

The author presents a concept for an audiovisual virtual reality environment. The facilities of IGD's virtual reality demonstration center and the architecture of the proprietary system "Virtual Design" are introduced. The general processing and data interpretation schema is explained.

Astheimer, P. "What You See is What You Hear--Acoustics Applied to Virtual Worlds." IEEE Symposium on Virtual Reality, held October 1993, in San Jose, California, USA. Los Alamitos, CA: IEEE Computer Society Press, 1993.

This paper concentrates on the realization and problems of the calculation of sound propagation in arbitrary environments in real time. A brief overview of IGD's virtual reality system "Virtual Design" and its basic framework is given. The differences between graphic and acoustic models and rendering algorithms are discussed. Possible solutions for the rendering and subsequent auralization phase are explained. Several examples demonstrate the application of acoustic renderers.

Ballas, J. A. "Delivery of Information Through Sound." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Ballas presents an overview of how different forms of information can be effectively delivered through nonspeech sound. The coverage is organized by linguistic devices. In addition, some details are presented on the importance of listener expectancy, and how it may be measured.

Ballas, J. A. "Common Factors in the Identification of an Assortment of Brief Everyday Sounds." J. Exp. Psych.: Hum. Percep. & Perf. 19 (1993): 250--267.

Ballas presents five experiments conducted to investigate factors that are involved in the identification of brief everyday sounds. In contrast to other studies, the sounds were quite varied in type, and the factors studied were acoustic, ecological, perceptual, and cognitive. Results support a hybrid approach to understanding sound identification.

Ballas, J. A., and T. Mullins. "Effects of Context on the Identification of Everyday Sounds." Hum. Perf. 4(3) (1991): 199--219.

The authors present the results of four experiments conducted to investigate the effects of context on the identification of brief everyday sounds. The sounds were nearly homonymous (i.e., similar sounds produced by different causes). Results showed that context had a significant effect, especially in biasing listeners against a sound cause that was inconsistent with the context.

Ballas, J. A., and J. H. Howard, Jr. "Interpreting the Language of Environmental Sounds." Envir. & Beh. 19 (1987): 91--114.

The authors present some comparisons between the perceptual identification of environmental sounds and well-studied speech perception processes. Comparisons are made at the macro level, as well as in the details.

Begault, D. R., and E. M. Wenzel. "Headphone Localization of Speech." Hum. Factors 35(2) (1993): 361--376.

An empirical study of subjects judging the position of speech presented over headphones using nonindividualized HRTFs. Subjects expressed their judgments by saying their estimate of distance and direction after each speech segment was played. Patterns of errors are described, and it is concluded that useful azimuth judgments for speech are possible for most subjects using nonindividualized HRTFs.

Bidlack, R. "Chaotic Systems as Simple (but Complex) Compositional Algorithms." Comp. Music J. 16(3) (1992): 33--47.

Bidlack describes his portrayal of nonlinear mathematical events within his music, much as earlier composers utilized such phenomena as prime numbers and the Fibonacci series.

Blattner, Meera M., Ephraim P. Glinert, and Albert L. Papp, III. "Sonic Enhancements for Two-Dimensional Graphic Displays." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

By studying the specific example of a visually cluttered map, the authors suggest general principles that lead to a taxonomy of characteristics for the successful utilization of nonspeech audio to enhance the human-computer interface. Their approach is to divide information into families, each of which is then separately represented in the audio subspace by a set of related earcons.

Blattner, Meera M. "Sound in the Multimedia Interface." In The Proceedings of Ed-Media '93, held June 23--26, 1993, in Orlando, Florida, 76--81. Association for the Advancement of Computing in Education, 1993.

The focus of this article is on recent developments in audio; however, the motivation for the use of sound is to provide a richer learning experience. This article begins with a description of the flow state, that state of mind in which we are deeply involved with what we are doing, and proposes some techniques for achieving the flow state through our use of audio.

Blattner, M. M., and R. M. Greenberg. "Communicating and Learning Through Non-speech Audio." In Multimedia Interface Design in Education, edited by A. Edwards and S. Holland. NATO ASI Series F, 133--143. Berlin: Springer-Verlag, 1992.

This article begins with an examination of the way structured sounds have been used by human beings in a variety of contexts and goes on to discuss how the lessons from the past may help us in the design and use of sound in the computer-user interface. Nonspeech sound messages, called earcons, are described with an application to the study of language.

Blattner, M. M., R. M. Greenberg, and M. Kamegai. "Listening to Turbulence: An Example of Scientific Audiolization." In Multimedia Interface Design, edited by M. Blattner and R. Dannenberg, 87--102. Reading, MA: ACM Press/Addison-Wesley, 1992.

The authors discuss some of the sonic elements that could be used to represent fluid flow.

Blattner, Meera M., and R. B. Dannenberg, eds. Multimedia Interface Design. Reading, MA: ACM Press/Addison-Wesley, 1992. To be published in Chinese by Shanghai Popular Press, 1994.

Eight of the 21 chapters of this book are focused on sound in the multimedia interface. Many of the other chapters consider the role of sound as one of the component elements of the multimedia interface.

Blattner, M. M., D. A. Sumikawa, and R. M. Greenberg. "Earcons and Icons: Their Structure and Common Design Principles." Hum.-Comp. Inter. 4(1) (1989): 11--44.

This article describes earcons, auditory messages used in the computer-user interface to provide information and feedback. The focus of the article is on the structure of earcons and the design principles they share with icons.

Blauert, J. Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA: MIT Press, 1983.

The author provides a thorough overview of the psychophysical research on spatial hearing in Europe (particularly Germany) and the United States prior to 1983. Classic text on sound localization.

Blauert, J. "Sound Localization in the Median Plane." Acustica 22 (1969): 205--213.

The author describes a series of experiments that demonstrated the role of linear distortions caused by pinna filtering in localizing sound in the median plane. He demonstrates that the "duplex theory," which postulated interaural time and intensity differences as the cues for localization, was not sufficient to explain all localization phenomena.

Bly, S. "Sound and Computer Information Presentation." Unpublished doctoral dissertation, University of California, Davis, 1982.

Bly evaluates auditory displays for three classes of data: multivariate, logarithmic, and time-varying. A series of formal experiments on multivariate data displays was conducted, demonstrating that in many cases auditory displays elicited human performance equal to or greater than that elicited by conventional visual displays.

Bly, S., S. P. Frysinger, D. Lunney, D. L. Mansur, J. J. Mezrich, and R. C. Morrison. "Communication with Sound." In Readings in Human-Computer Interaction: A Multidisciplinary Approach, edited by R. Baecker and W. A. S. Buxton, 420--424. Los Altos: Morgan Kaufmann, 1987.

Contributors discussed their approaches to communicating data via sound at CHI '85, and this chapter is a result of that presentation. It was the first national conference session dedicated exclusively to the general use of nonspeech audio for data representation.

Boff, K. R., L. Kaufman, and J. P. Thomas. Handbook of Perception and Human Performance. Sensory Processes and Perception, Vol. 1. New York: John Wiley & Sons, 1986.

Various sound parameters are delineated and discussed, including their interpretation by individuals having auditory pathologies. An excellent first source for the definition of sound parameters and inquiry into the complexities of sonic phenomena.

Boff, K. R., and J. E. Lincoln, eds. Engineering Data Compendium: Human Perception and Performance. Ohio: Armstrong Aerospace Medical Research Laboratory, Wright-Patterson Air Force Base, 1988.

This three-volume compendium distills information from the research literature about human perception and performance that is of potential value to systems designers. Plans include putting the compendium on CD. A separate user's guide is also available.

Borin, G., G. De Poli, and A. Sarti. "Algorithms and Structures for Synthesis Using Physical Models." Comp. Music J. 16(4) (1993).

This is the introductory article to two special issues of this excellent journal on physical modeling for sound synthesis; it reviews the techniques.

Bregman, A. S. "Auditory Scene Analysis." In Proceedings of the 7th International Conference on Pattern Recognition, held in Montreal, 168--175, 1984.

Classic paper in which the concept of a positive assignment of components of a complex acoustic signal into multiple perceptual streams was first introduced.

Bregman, A. S., and Y. Tougas. "Propagation of Constraints in Auditory Organization." Percep. & Psycho. 46(4) (1989): 395--396.

The authors present psychoacoustic evidence that grouping occurs on the basis of all evidence in the acoustic signal. This is not consistent with grouping as a consequence of the output from particular filters.

Bregman, A. S. Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.

Bregman provides a comprehensive theoretical discussion of the principal factors involved in the perceptual organization of auditory stimuli, especially Gestalt principles of organization in auditory stream segregation.

Brewster, S. A., P. C. Wright, and A. D. N. Edwards. "A Detailed Investigation into the Effectiveness of Earcons." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

The authors carried out experiments on structured audio messages, earcons, to see if they were an effective means of communicating in sound. An initial experiment showed earcons to be better than unstructured sounds and that musical timbres were more effective than simple tones. A second experiment was carried out to develop ideas from the first. A set of guidelines is presented.

Bronkhorst, A. W., and R. Plomp. "The Effect of Head-Induced Interaural Time and Level Differences on Speech Intelligibility in Noise." J. Acoust. Soc. Am. 83 (1988): 1508--1516.

Spoken sentences in Dutch with noise were recorded in an anechoic room with a KEMAR manikin, and the roles of interaural time delay and head shadowing in intelligibility were studied.

Brown, M. H. "An Introduction to Zeus: Audiovisualization of Some Elementary Sequential and Parallel Sorting Algorithms." In CHI '92 Proceedings, 663--664. Reading, MA: ACM Press/Addison-Wesley, 1992.

Visualization and sonification of parallel programs demonstrate that sound can reinforce, supplant, and expand the visual channel.

Brown, M. L., S. L. Newsome, and E. P. Glinert. "An Experiment into the Use of Auditory Cues to Reduce Visual Workload." In CHI '89 Proceedings, 339--346. Reading, MA: ACM Press/Addison-Wesley, 1989.

Sound is presented as a means to reduce visual overload. However, subject testing revealed a doubled reaction time for sound cueing vs. visual cueing. The authors recommend aural training for effective implementation.

Buford, J. K. Multimedia Systems. Reading, MA: ACM Press/Addison-Wesley, 1993.

Provides a technical overview of multimedia systems, including information on sound and video recording, signal processing, system architectures, and user interfaces. Covers fundamental principles, applications and current research in multimedia, as well as operating systems, database management systems, and network communication.

Burdic, W. S. Underwater Acoustic System Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1984.

This book provides a good general background in the fundamentals of sonar systems, including a historical background, basic acoustics, transducers, ocean acoustics, sonar signal processing, decision theory, beamforming, and active and passive systems. Although the presentation is sometimes mathematically technical, no specific background is assumed.

Burgess, D. "Techniques for Low Cost Spatial Audio." In UIST '92: User Interface Software and Technology. Reading, MA: ACM Press/Addison-Wesley, in press.

Burgess describes a technique for synthetic spatialization of audio and the computational requirements and performance of the technique.

Burns, E. M., and D. Ward. "Intervals, Scales, and Tuning." In The Psychology of Music, edited by Diana Deutsch. New York: Academic Press, 1982.

A discussion of sensory consonance and dissonance reviews the work of earlier researchers and the conclusions regarding pure and complex tone interpretation. A good base reference for further exploration into consonance and dissonance.

Butler, R. A., and R. A. Humanski. "Localization of Sound in the Vertical Plane With and Without High-Frequency Spectral Cues." Percep. & Psycho. 51(2) (1992): 182--186.

Noise bursts were played over seven loudspeakers spaced 15 degrees apart in the vertical plane, and subjects judged the position of the sources. The authors conclude from the results that, without pinna cues, subjects can still localize low-pass noise in the lateral vertical plane using binaural time and level differences, but that pinna cues are critical for accurate localization in the median vertical plane.

Buttenfield, B. P., and C. R. Weber. "Visualization and Hypermedia in GIS." In Human Factors in GIS, edited by H. Hearnshaw and D. Medyckyj-Scott. London: Belhaven Press, in press.

An overview of sonification types is presented for their implementation into cartographic displays and Geographic Information Systems.

Buxton, W., W. Gaver, and S. Bly. "The Use of Non-speech Audio at the Interface." Tutorial no. 10, given at CHI '89.

A good overview of the use of nonspeech audio, the psychology of everyday listening, alarms and warning systems, and pertinent issues from psychoacoustics and music perception. A number of classic papers are reproduced.

Buxton, W. "Using Our Ears: An Introduction to the Use of Nonspeech Audio Cues." In Extracting Meaning from Complex Data: Processing, Display, Interaction, edited by E. J. Farrel, Vol. 1259, 124--127. SPIE, 1990.

An overview of the classes of audio cue and their utility.

Buxton, W., and T. Moran. "EuroPARC's Integrated Interactive Intermedia Facility (iiif): Early Experiences." In Proceedings of the IFIP WG8.4 Conference on Multi-User Interfaces and Applications, held September, 1990, in Herakleion, Crete.

The authors review the design, technology, and early applications of EuroPARC's media space: a computer-controlled network of audio and video gear designed to support collaboration.

Calhoun, G. L., G. Valencia, and T. A. Furness. "Three-Dimensional Auditory Cue Simulation for Crew Station Design/Evaluation." In Proceedings of the Human Factors Society 31st Annual Meeting, 1398--1402. Santa Monica, CA: Human Factors Society, 1987.

Researchers from Armstrong Aerospace Medical Research Laboratory at Wright-Patterson Air Force Base compared two methods of generating cues to simulate three-dimensional auditory display for cockpit simulation. The described use of mechanical means to simulate localization cues preceded DSP-based simulation.

Calhoun, G. L., W. P. Janson, and G. Valencia. "Effectiveness of Three-Dimensional Auditory Directional Cues." In Proceedings of the Human Factors Society 32nd Annual Meeting, 68--72. Santa Monica, CA: Human Factors Society, 1988.

The authors compare auditory cues for directing visual attention to peripheral targets. The cues tested were visual symbol, coded aural cue, speech cue, three-dimensional nonspeech audio (spatially cued white noise), and three-dimensional speech. Ordering these cues by mean reaction times, from fastest to slowest, they found: visual bar, three-dimensional tone, three-dimensional speech, speech, and coded tone. The superiority of spatial audio display to coded tone for this task is noted.

Cazden, N. "Sensory Theories of Musical Consonance." J. Aesthetics & Art Crit. 20 (1962): 301--319.

Cazden addresses the cultural and historical preconceptions which underlie listeners' consonance interpretation of sound. He notes that the approach to the problem of consonance given by sensory theories entails a fundamental error in its isolation of sound from cultural and historical context.

Chambers, J. M., M. V. Mathews, and F. R. Moore. "Auditory Data Inspection." Technical Memorandum 74-1214-20, AT&T Bell Laboratories, 1974.

The authors investigate the use of sound to represent quantitative data using multiple parameters of sound to encode those dimensions of multidimensional data which were not displayed on a conventional scatter plot. Their auditory display was based on three parameters: frequency, spectral content, and amplitude modulation. Without formal experimentation, they found that their auditorily enhanced scatter plot display system promoted the classification of multivariate data.

Cherry, E. C. "Some Experiments on the Recognition of Speech with One and with Two Ears." J. Acoust. Soc. Am. 25 (1953): 975--979.

A classic paper on the cocktail-party effect, demonstrating the role of attention in the ability to track one voice from a crowd.

Chowning, J. "The Synthesis of Complex Audio Spectra by Means of Frequency Modulation." J. Audio Eng. Soc. 21 (1973): 526--534.

The article that introduced frequency modulation as an efficient and powerful way to synthesize complex spectra. Provides a detailed explanation of the mathematics of frequency modulation.

Clynes, Manfred, ed. Music, Mind, and Brain: The Neuropsychology of Music. ISBN 0-306-40908-9. New York: Plenum, 1982.

A collection of papers based on the conference on the Physical and Neuropsychological Foundations of Music, held in Ossiach, Austria, in 1980. It covers topics such as the nature of the language of music, how the brain organizes musical experience, perception of sound and rhythm, and how computers can help contribute to a better understanding of musical processes.

Cohen, J. "Monitoring Background Activities." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

This paper describes a design-and-test approach to the question of what kinds of notifications are appropriate for users to monitor background activities on a computer. Users' reactions to sound effects, text-to-speech, and graphical notifications of background "file sharing" events suggested ways to improve the notifications. A second study confirmed the utility of some of these changes, but pointed out other areas needing improvement.

Computer-Human Interface: The Proceedings of ACM-SIGCHI (series). Reading, MA: Addison-Wesley.

The CHI conference proceedings are a primary source for the best, most recent papers describing work on human-computer interaction (including the design of nonspeech audio interfaces).

Computer Music Association. International Computer Music Conference Proceedings (series). San Francisco, CA: ICMA.

Published annually (since 1974), this collection of papers from researchers working in computer music addresses any topic, within the scope of these conferences, in which a computer and music might intersect (even remotely). Included are papers on synthesis, real-time systems, analysis, composition, alternate controllers, acoustics, and other topics.

Computer Music Journal. Cambridge, MA: MIT Press.

CMJ is the most important source for information on new sound synthesis algorithms, computer music composition techniques, computer-assisted music analysis programs, and a host of other issues.

Crawford, C. "Lessons from Computer Game Design." In The Art of Human-Computer Interface Design, edited by B. Laurel. Reading, MA: Addison-Wesley, 1990.

Viewer expectation of sound has primarily been a result of the popularity of television for which the sound track is an essential part of the viewing experience, and through which most of the information is carried. This can be illustrated by watching a program with the sound off, and then listening to a similar program with the picture brightness turned all the way down. The program is usually more comprehensible without picture than without sound.

Crispien, K., and H. Petrie. "Providing Access to GUI's for Blind People Using a Multimedia System--Based on Spatial Audio Presentation." Audio Engineering Society 95th Convention Preprints, October 1993. New York: AES Press, in press.

This paper deals with the acoustic aspects addressed in the GUIB project (Textual and Graphical User Interfaces for Blind people). General aspects and strategies for the use of audio in the development of a nonvisual, acoustic display are introduced and discussed. Technical acoustic developments for a spatial representation of the auditory display, namely a multiple-loudspeaker device and initial experiments with an HRTF-based headphone system, are described. The results of an initial acoustic evaluation study with 18 blind and sighted subjects are presented. Finally, an outlook on future research activities is given.

Dannenbring, G. L., and A. S. Bregman. "Streaming vs. Fusion of Sinusoidal Components of Complex Tones." Percep. & Psycho. 24(4) (1978): 369--376.

The authors describe perceptual consequences of the interaction between relative intensity and onset/offset asynchrony of partials of complex tones.

Das, S., and R. Bargar. "Sound for Virtual Immersive Environments." Notes for Course 23, Applied Virtual Reality, Chapter 4, SIGGRAPH '93, 1993.

The authors discuss issues involved in implementing and using sound in virtual reality, as well as providing a description of the CAVE audio system, from both hardware and software standpoints.

Davies, J. B. The Psychology of Music. Stanford, CA: Stanford University Press, 1978.

Davies provides a clear and decipherable account of the state of research up until the publication date, covering all of the fundamental areas: physics of sound, early psychophysical studies, melody perception, musical aptitude, as well as the basic musical parameters (pitch, loudness, timbre, duration) and their physical correlates. Particularly interesting are Davies' human perspective, e.g., "music exists in the ear of the listener, and nowhere else," and his final chapter on specific musical instrument families and character traits of the individuals who play them.

Davis, D. Computer Applications in Music: A Bibliography. ISBN 0-89579-225-7. A-R Editions, 1988.

A collection of references to other papers in the computer/music domain, over 500 pages of them. Covers aesthetics, composition, music in education, digital audio and signal processing, MIDI, programming languages, synthesis, and many others.

Deutsch, D. "Music Recognition." Psych. Rev. 76(3) (1969): 300--307.

Knowing what we perceive about harmonic intervals is dependent upon how we perceive them. This causality is important if they are to be utilized as words of a natural language for data displays. Harmonic intervals are basic operatives of musical abstraction, and the question arises as to whether or not their recognition is innate or learned.

Deutsch, D. "Organizational Processes in Music." In Music, Mind and Brain: The Neuropsychology of Music, edited by M. Clynes. New York: Plenum, 1982.

The elements of music may be isolated through decomposition but, in practice, are dependent upon each other. They are multicollinear in their perception by the listener.

Deutsch, Diana, ed. The Psychology of Music. ISBN 0-12-213562-8. New York: Academic Press, 1982.

A well-known book covering perception, analysis of timbre, rhythm and tempo, timing, melodic processes, and other topics.

Deutsch, D. "The Tritone Paradox: An Influence of Language on Music Perception." Music Percep. 8 (1991): 335--347.

Of particular interest, as the author presents evidence that individuals not only perceive the same musical intervals between complex tones differently but also that the perception of each individual is related to his or her own customary speech patterns.

DiGiano, Christopher J., and Ronald M. Baecker. "Program Auralization: Sound Enhancements to the Programming Environment." In Proceedings of the Graphics Interface '92, 44--52, 1992.

The authors identify classes of program information suitable for mapping to sound and suggest how to add auralization capabilities to programming environments. They describe LogoMedia, a sound-enhanced programming system that illustrates these concepts.

DiGiano, Christopher J. "Visualizing Program Behavior Using Non-speech Audio." M.Sc. Thesis, University of Toronto, 1992.

DiGiano addresses the use of sound for software visualization and considers it in concert with the other modalities. The potential of sound to elucidate a program's behavior is investigated. A programming environment is presented which supports the ability to trace control and data flow during program execution using audio.

Doll, T. J., and D. J. Folds. "Auditory Signals in Military Aircraft: Ergonomic Principles Versus Practice." Appl. Ergo. 17 (1986): 257--264.

The authors studied and compared the auditory signals used in a variety of aircraft and found no standardization. They also found that a relatively large number of signals were in use, making it difficult for the crew to recall the meaning of the messages.

Doll, T. J., and T. E. Hanna. "Enhanced Detection with Bimodal Sonar Displays." Human Factors 31 (1989): 539--550.

This paper examines visually and aurally enhanced sonar displays. Signal uncertainty was found to cause a significantly greater decrement in detection performance for visual displays than for auditory displays.

Doll, T. J., T. E. Hanna, and J. S. Russotti. "Masking in Three-Dimensional Auditory Displays." Human Factors 34(3) (1992): 255--265.

The authors study masking in a three-dimensional display for a simulated sonar task. They found that detectability of a tonal signal is greater when background noise is uncorrelated. Head coupling of the three-dimensional display had no significant effect, given that the task was simple signal detection rather than localization, classification, or tracking.

Doughty, J., and W. Garner. "Pitch Characteristics of Short Tones II: Pitch as a Function of Duration." J. Exp. Psych. 38 (1948): 478--494.

One of the earliest issues in the psychology of hearing was how long a tone must be in order to have an identifiable pitch. The authors show that when a tone is long enough to have a perceptible pitch, the actual pitch has little or no dependence on duration.

Dowling, W. J., and D. L. Harwood. Music Cognition. San Diego: Academic Press, 1986.

A general text providing an abundance of information concerning the physical characteristics of musical sound and the processes involved in its perception. Topics covered include basic acoustics, physiology of hearing, music perception (e.g., timbre, consonance/dissonance, etc.), melodic organization, temporal organization, emotion and meaning, and cultural context of musical experience; abundant references to research in each of these areas are provided for further reading.

Draper, S., K. Waite, and P. Gray. "Alternative Bases for Comprehensibility and Competition for Expression in an Icon Generation Tool." In Proceedings of Interact '90, held August 27--31, 1990, in Cambridge, UK. Amsterdam: North Holland, 1991.

The authors describe a system for systematically generating families of icons. Notable for suggesting the possibilities of parameterizing visual icons.

Durlach, N. I., and L. D. Braida. "Intensity Perception I: Preliminary Theory of Intensity Resolution." J. Acoust. Soc. Am. 46(2) (1969): 372--383.

Durlach, Braida, and their colleagues in a series of papers have proposed a general model of acoustic intensity resolution which incorporates the noise of sensory and memory processes. The model addresses factors that affect memory noise, such as the stimulus range, timing of experimental events, and the type of task.

Durlach, N. I., and X. D. Pang. "Interaural Magnification." J. Acoust. Soc. Am. 80 (1986): 1849--1850.

A brief examination of the issues involved in super-localization display, i.e., enhancing the normal cues used in localization. Problems with the use of an "enlarged head" (with greater distance between the ears) are addressed, and a signal-processing scheme for interaural magnification is described.

Durlach, N. I., A. Rigopulos, X. D. Pang, W. S. Woods, A. Kulkarni, H. S. Colburn, and E. M. Wenzel. "On the Externalization of Auditory Images." Presence 1(2) (1992): 251--257.

The authors discuss some of the important factors involved in synthesizing virtual acoustic sources beyond the simulation of pinna cues.

Durlach, N. I. "Auditory Localization in Teleoperator and Virtual Environment Systems: Ideas, Issues, and Problems." Perception 20 (1991): 543--554.

The author discusses the use of auditory localization cues for virtual environments and teleoperations, with special attention to the potential for superlocalization (i.e., providing enhanced cues). Schemes for encoding position are described and their difficulties are discussed. This paper is a review of the literature and a position statement, rather than a presentation of empirical results.

Edwards, A. D. N. "Adapting User Interfaces for Visually Disabled Users." Ph.D. Thesis, The Open University, July 1987. (Available on microfiche from the British Library, Shelf number DX 80409.)

Edwards describes how a graphical user interface can be adapted to be accessible to blind people through the use of speech and nonspeech sounds.

Edwards, A. D. N. "Modeling Blind Users' Interactions with an Auditory Computer Interface." Intl. J. Man-Mach. Stud. 30(5) (1989): 575--589.

Edwards describes a model of the interaction between blind users and a mouse-based interface using nonspeech sounds.

Edwards, A. D. N. "Soundtrack: An Auditory Interface for Blind Users." Hum.-Comp. Inter. 4(1) (1989): 45--66.

Edwards describes how a graphical user interface can be adapted to be accessible to blind people through the use of speech and nonspeech sounds.

Edwards, A. D. N. "Graphical User Interfaces and Blind People." In Proceedings 3rd International Conference on Computers for Handicapped Persons, held July 1992 in Vienna, 114--119.

A summary of developments in making GUIs accessible to blind people by the addition of an auditory channel.

Edwards, A. D. N. "Evaluation of Outspoken Software for Blind Users." Technical Report YCS150, Department of Computer Science, University of York, 1991.

The evaluation of a commercial product that makes the Macintosh accessible to blind users through the addition of speech and nonspeech sounds.

Edwards, A. D. N., and S. Holland, eds. Multimedia Interface Design in Education. NATO ASI Series F: Computer and Systems Sciences, Vol. 76. Berlin: Springer-Verlag, 1992.

This is a collection of papers from a workshop. As the title suggests, the emphasis is on the use of multimedia in education, though many of the conclusions have broader significance. The authors come from a wide variety of backgrounds, both theoretical and practical, and there is an emphasis on the use of multiple media within the human-computer interface, as well as on the use of computers to control multimedia displays. Several chapters address nonspeech sounds, including earcons and music.

Edworthy, J., S. Loxley, and I. Dennis. "Improving Auditory Warning Design: Relationship Between Warning Sound Parameters and Perceived Urgency." Human Factors 33 (1991): 205--232.

The authors examine the role of both spectral and temporal parameters in conveying urgency. They identify nine parameters that contribute to perceived urgency and show how selected combinations of these parameters could convey varied levels of urgency. The parameters include spectral and envelope properties of sound bursts as well as temporal and melodic patterns across several bursts which are joined to form an urgency alarm.

Edworthy, J., and R. D. Patterson. "Ergonomic Factors in Auditory Systems." In Proceedings of Ergonomics International '85, edited by I. D. Brown. Taylor and Francis, 1985.

An important paper on the design of speech and nonspeech sounds for use in aircraft cockpits.

Evans, B. "Correlating Sonic and Graphic Materials in Scientific Visualization." In Extracting Meaning from Complex Data: Processing, Display, Interaction, edited by E. J. Farrel, Vol. 1259, 154--162. SPIE, 1990.

Evans generated variable-pitch domain sonifications from a mathematical abstraction similar to that which generates fractal Julia Sets. He notes, in an analogy to cartographic color selection, that selection of a pitch domain may affect the aesthetic and informative quality of a "sonic map." He sonifies the mathematical model with quarter-tone (24 pitches per octave), chromatic (12 pitches per octave), diatonic (7 pitches per octave), and hexatonic (6 pitches per octave) scales. Informal results reveal that the quarter-tone scale better reflects the actual event, though the diatonic and hexatonic scales, which were more pleasing to listeners, sonified the process with less detail.

Fisher, P. "Hearing the Error in Classified Remotely Sensed Images." Unpublished manuscript in review, University of Leicester, 1993.

Fisher reports using auditory data representations for error detection in classified, remotely sensed images. One of the few applications of sonification to cartography and GIS.

Fisher, S. S., E. M. Wenzel, C. Coler, and M. W. McGreevy. "Virtual Interface Environment Workstations." In Proceedings of the 32nd Annual Meeting of the Human Factors Society, held in Anaheim, CA, 91--95, 1988.

This paper on NASA-Ames Research Center's Virtual Interface Environment Workstation includes an early description of the center's work using binaural auditory display, synthesis of three-dimensional sound cues, speech synthesis and recognition, and associating "sound signatures" with objects or types of information display in a virtual environment.

Fitch, T., and G. Kramer. "Sonifying the Body Electric: Superiority of an Auditory over a Visual Display in a Complex, Multivariate System." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

The authors present an eight-variable auditory interface for anesthesiologists which uses self-labeling streams with data variables "piggy-backed" upon that stream by manipulation of selected acoustic variables. Subjects responded faster and more accurately with the auditory display than with either the visual or the combined auditory/visual display.

Fletcher, H., and W. A. Munson. "Loudness: Its Definition, Measurement and Calculation." J. Acoust. Soc. Am. 5 (1933): 82--88.

The authors use "dynamic" to denote the perceived loudness of a passage of music. This perception of amplitude is discussed in detail.

Forbes, T. W. "Auditory Signals for Instrument Flying." J. Aeronautical Soc. May (1946): 255--258.

After finding that combinations of tones created a confusing display that was difficult to use, the author turned to one signal in which multiple data variables were represented by multiple auditory variables. He found that pilots were able to use the display as well as a visual display after only an hour of training. Four key design points were suggested: (1) Pilots have certain habitual methods of thinking about the airplane, and the signals must be designed to fit these habits of thought. (2) Because most fliers are accustomed to using visual indicators, the auditory indicators must be as simple and self-explanatory as possible. (3) When multiple signals were used, there was a tendency for one signal to "capture" the attention of the pilot, to the exclusion of the other signals. This phenomenon should be avoided. (4) The display should be designed to fit the capabilities of the average pilot and should be subjected to unbiased psychological testing.

Francioni, J. F., L. Albright, and J. A. Jackson. "Debugging Parallel Programs Using Sound." In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, 68--73. Reading, MA: ACM Press/Addison-Wesley, 1991.

These two articles describe the same research: the mapping of parallel processor activity to sound parameters. By building structures such as jazz-like chords whose notes' pitch, attack, and crescendo describe the activity of various processors, the authors are able to analyze processor loads, flow of processor control, and processor communication.

Francioni, J. F., J. A. Jackson, and L. Albright. "The Sounds of Parallel Programs." In Proceedings of the Sixth Distributed Memory Computing Conference, held in Portland, OR, 570--577, 1991a.

This paper introduces auralization techniques as a means for studying the run-time behavior of parallel programs. Examples are described of simple sound mappings that directly map run-time events of parallel programs to MIDI sound events. Although the sound playbacks discussed in this paper are not synchronized with any graphical representations, the basic feasibility of the auralization idea is demonstrated.

Frantii, G. E., and L. A. Leverault. "Auditory Discrimination of Seismic Signals from Earthquakes and Explosions." Bull. Seis. Soc. Am. 55(1) (1965): 1--26.

Twenty-one observers classified 200 time-compressed, audibly displayed seismic events as either earthquakes or explosions correctly 2/3 of the time (where 1/2 corresponds to chance performance). Experiments were done to determine the receiver operating characteristics of listeners, the effect of training on performance, the effect of epicentral distance, and the effect of dual (horizontal and vertical) component playback. Among the significant conclusions were that observers reached plateau performance within the 1500 decisions and that performance could be improved by using multiple component (stereo) playbacks.

Freed, D. J., and W. L. Martens. "Deriving Psychophysical Relations for Timbre." In Proceedings of the International Computer Music Conference, held October 20--24, 1986, in The Hague, The Netherlands, 1986.

The authors present acoustic analyses and experiments on the auditory perception of mallet hardness; one of the few examples of studies of everyday listening.

Frysinger, S. P. "Pattern Recognition in Auditory Data Representation." Unpublished Thesis, Stevens Institute of Technology, Hoboken, 1988.

Frysinger, S. P. "Applied Research in Auditory Data Representation." In Extracting Meaning From Complex Data--Proceedings of the SPIE/SPSE Symposium on Electronic Imaging, held February 1990, edited by E. J. Farrell. Springfield, VA: SPIE, 1990.

These two papers include an investigation of auditory/visual representations of multivariate time-series data. Two forced-choice experiments were conducted in which subjects determined which of two data sets was correlated. Subjects' data interpretation performance was found to depend upon detection task. For correlation detection, time-series dimensionality was a significant variable in display performance, and the combined auditory/visual display proved superior to the auditory-only display, while for trained pattern detection, dimensionality was not a factor, and the performance of the auditory/visual display was essentially the same as the auditory-only display.

Gaver, W. W. "Everyday Listening and Auditory Icons." Ph.D. Dissertation, University of California, San Diego, 1988.

A two-part dissertation: Part One explores the basic psychology of everyday listening (a.k.a. auditory event perception) via physical analyses, protocol studies, and a study of hearing the length and material of struck bars. Part Two consists of a collection of papers about auditory icons.

Gaver, W. W. "The SonicFinder: An Interface that Uses Auditory Icons." Hum.-Comp. Inter. 4(1) (1989).

Gaver describes the SonicFinder, a modification of the Macintosh Finder (the most commonly used Macintosh program) that incorporates auditory icons. The paper also contains a discussion of mapping sounds to underlying computer events.

Gaver, W. W. "Sound Support for Collaboration." In Proceedings of the Second European Conference on Computer-Supported Collaborative Work, held September 24--27, 1991, in Amsterdam. Dordrecht: Kluwer, 1991.

Gaver suggests that sound offers a new dimension for awareness in collaborative systems. He uses as examples the ARKola study (which is also described in Gaver et al., 1991) and the EAR system.

Gaver, W. W. "Using and Creating Auditory Icons." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

This paper reviews the author's work on auditory icons from 1988 to his present research in parameterizing the icons by a variety of synthesis techniques. A number of systems are described which illustrate the functions that auditory icons can perform.

Gaver, W. W. "Synthesizing Auditory Icons." In Proceedings of INTERCHI '93, held April 24--29, 1993, in Amsterdam. Reading, MA: ACM Press/Addison-Wesley, 1993.

Gaver describes a series of algorithms for synthesizing everyday sounds specified in terms of their causal event; he suggests that these algorithms might be useful in creating parameterized auditory icons.

Gaver, W. W. "What in the World Do We Hear? An Ecological Approach to Auditory Source Perception." Ecol. Psych. 5(1) (1993).

Gaver suggests everyday listening as a field of study and develops a framework for describing everyday sounds via physical analyses and protocol studies.

Gaver, W. W., and R. B. Smith. "Auditory Icons in Large-Scale Collaborative Environments." In Proceedings of Human-Computer Interaction: Interact '90, held August 27--31, 1990, in Cambridge, UK, 735--740. Amsterdam: North Holland, 1990.

The authors describe SoundShark, an auditory interface to a large-scale collaborative system designed for distance education. They give examples of sounds used to confirm user actions, to convey information about ongoing processes and modes, to aid navigation, and to support collaboration.

Gaver, W. W., T. Moran, A. MacLean, L. Lövstrand, P. Dourish, K. Carter, and W. Buxton. "Realizing a Video Environment: EuroPARC's RAVE System." In Proceedings of CHI '92, held May 3--7, 1992, in Monterey, CA. Reading, MA: ACM Press/Addison-Wesley, 1992.

A review of EuroPARC's RAVE system, a computer-controlled audio-video network that supports remote collaboration (see also Buxton and Moran, 1990). The authors discuss the nature of collaboration, the emergent functionality of the system, and issues concerning privacy; they also describe related systems for collaboration.

Gaver, W. W., R. B. Smith, and T. O'Shea. "Effective Sounds in Complex Systems: The ARKola Simulation." In Proceedings of CHI '91, held April 28--May 2, 1991, in New Orleans. Reading, MA: ACM Press/Addison-Wesley, 1991.

The ARKola bottling factory was a software simulation designed for testing auditory icons in a complex, cooperative task. Observations of participants running the plant with and without sound suggested that auditory icons affected their perception of the plant and their collaboration (see also Gaver & Smith, 1990).

Getty, D. J., and J. H. Howard, Jr. Auditory and Visual Pattern Recognition. Hillsdale, NJ: Erlbaum, 1981.

This edited volume includes five chapters on the perception of complex auditory patterns, two chapters on theoretical approaches to pattern recognition, and three chapters on multidimensional approaches to pattern perception.

Gibson, J. J. The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin, 1966.

Gibson is the founder of the ecological approach to perception. In this book he argues that the senses should be considered systems, extending beyond the primary sensory mechanisms, for picking up information in the environment. This is Gibson's only major work that discusses audition.

Gibson, J. J. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin, 1979.

Gibson's last book includes a classic description of the information available for visual perception, the concept of affordances, and experiments showing how people pick up and use information in the world. As with all of Gibson's work, this book is extremely well written and a pleasure to read.

Glavin, S. "Creating Sound Symbols from Digital Terrain Models: An Exploration of Cartographic Communication Forms." Unpublished Master's Thesis, Carleton University, 1987.

In the cartographic application realm, Glavin has sonified three-dimensional landscapes by creating sound symbols from Digital Terrain Models. This is the earliest citation of sonification in cartography.

Grantham, D. W. "Detection and Discrimination of Simulated Motion of Auditory Targets in the Horizontal Plane." J. Acoust. Soc. Am. 79 (1986): 1939--1949.

Using a technique, described in some depth in the appendix, for simulating auditory motion in the horizontal plane with two fixed loudspeakers, Grantham tested subjects' ability to detect and discriminate motion of 500-Hz tones. He concludes that, over the range of velocities simulated, subjects used spatial change rather than velocity per se in these detection and discrimination tasks.

Green, D. M. "Audition: Psychophysics and Perception." In Stevens' Handbook of Experimental Psychology, edited by R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce, 377--408. New York: Wiley, 1988.

Covers psychophysical performance in detection and discrimination of intensity and frequency, sound localization, and perception of loudness and pitch.

Grinstein, G., and S. Smith. "The Perceptualization of Scientific Data." In Proceedings of the SPIE/SPSE Conference on Electronic Imaging, Vol. 1259, 190--199. Santa Clara, CA: SPIE, 1990.

This paper was the first to fully describe "Exvis," the integrated visualization and sonification system developed by the University of Massachusetts Lowell group. The accompanying video shows a style of interaction pioneered in Exvis and provides several examples of typical sounds generated by Exvis.

Haddad, Richard A., and Thomas W. Parsons. Digital Signal Processing: Theory, Applications, and Hardware. ISBN 0-7167-8206-5. Computer Science Press, 1991.

An excellent (but sometimes difficult) book covering a wide variety of topics including numerical operations (convolution, Fourier transform), digital representation of speech, filters (FIR and IIR), FFTs, DSP algorithms, applications (speech synthesis, recognition, image processing), and some descriptions of DSP chips (mainly from Texas Instruments).

Handel, S. Listening: An Introduction to the Perception of Auditory Events. Cambridge, MA: MIT Press, 1989.

This book gathers into one source broad and detailed coverage of auditory topics including sound production (especially by musical instruments and by voice), propagation, modelling, and the physiology of the auditory system. Parallels between speech and music are drawn throughout.

Hawkins, H. L., and J. C. Presson. "Auditory Information Processing." In Handbook of Perception and Human Performance, edited by K. R. Boff, L. Kaufman, and J. P. Thomas, Chap. 26. New York: Wiley, 1986.

The authors focus on topics related to the capacity to process auditory information including attention and memory, and factors that mediate processing capacity such as noise and aging.

Hayward, Chris. "Listening to the Earth Sing." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Techniques for auditory monitoring and analysis of seismic data are described. Previously published work is limited to two papers now nearly 30 years old. This paper broadens the applications from the problem of discriminating earthquakes from nuclear explosions to training, quality control, free oscillation display, data discovery, large data set display, event recognition, education, model matching, signal detection, and onset timing. Simple processing techniques including interpolation, time compression, automatic gain control, frequency doubling, audio annotation and markers, looping, and stereo are used to create seven audio data sets.

Heine, W-D., and R. Guski. "Listening, the Perception of Auditory Events? An Essay Review of Listening: An Introduction to the Perception of Auditory Events by Stephen Handel." Ecol. Psych. 3(3) (1991): 263--275.

This short essay criticises Handel's book from an ecological perspective and offers suggestions of what an ecological approach to audition might consider.

Helmholtz, H. von. Selected Writings of Hermann von Helmholtz, edited by Russell Kahl. Middletown, CT: Wesleyan University Press, 1971.

A classic text collecting the nineteenth-century scientist's work on audiology and sonic phenomena; a necessity for historical continuity of research.

Hirsh, I. J. "Auditory Perception and Speech." In Stevens' Handbook of Experimental Psychology, edited by R. C. Atkinson, R. J. Herrnstein, G. Lindzey, and R. D. Luce, 377--408. New York: Wiley, 1988.

Hirsh organizes his coverage of audition into single sounds, sound sequences, and speech, covering the important perceptual attributes of each type of sound.

Howard, J. H., Jr., and J. A. Ballas. "Syntactic and Semantic Factors in the Classification of Nonspeech Transient Patterns." Percep. & Psycho. 28 (1980): 431--439.

The authors present the results of three experiments conducted to assess the role of syntactic (i.e., temporal) and semantic (i.e., knowledge) factors in the classification of sequences of brief sounds. Their results indicate that both factors and their interaction are important. Previous research has shown these to be important in speech and language perception. These studies demonstrate their importance in nonspeech sound perception.

Howell, P., R. West, and I. Cross, eds. Representing Musical Structure. London: Academic Press, 1991.

The editors present a range of studies of musical structure from a perceptual approach.

Human Factors Journal. Human Factors and Ergonomics Society Publications Division, Box 1369, Santa Monica, CA 90406-1369.

This journal publishes original studies about people in relation to machines and environments. Auditory display studies are published only occasionally but usually address the effectiveness of the auditory display within the context of a human-machine system.

Hutchins, E. L., J. D. Hollan, and D. A. Norman. "Direct Manipulation Interfaces." In User-Centered System Design: New Perspectives on Human-Computer Interaction, edited by D. A. Norman and S. W. Draper, 87--124. Hillsdale, NJ: Lawrence Erlbaum, 1986.

This paper argues that direct manipulation systems (now often referred to as "GUI") are valuable because they minimize semantic and articulatory distances between humans and computers.

International MIDI Association. MIDI 1.0 Detailed Specification, Version 4.2. Los Angeles, CA: MIDI, 1993.

The official specification for the industry-standard Musical Instrument Digital Interface.
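At its core, the specification defines a stream of short status-plus-data byte messages. As a rough illustration, the sketch below builds channel-voice messages following the standard's byte layout; the helper functions themselves are our own, not part of the specification:

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a three-byte MIDI Note On message.

    The status byte is 0x90 ORed with the channel number (0--15);
    the two data bytes (note number and velocity) must be 0--127,
    i.e., have their high bit clear.
    """
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Build a three-byte MIDI Note Off message (status nibble 0x8)."""
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    return bytes([0x80 | channel, note, velocity])


# Middle C (note number 60) on channel 0 at velocity 100
# yields the bytes 0x90, 0x3C, 0x64.
msg = note_on(0, 60, 100)
```

Sonification systems of this era typically drove commercial synthesizers by emitting exactly such messages through a MIDI interface.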

Jaffe, D., and J. Smith. "Extensions of the Karplus-Strong Plucked String Algorithm." Comp. Music J. 7(2) (1983).

The authors explore and extend the Karplus-Strong algorithm, an extremely efficient approximate model of the physics of plucked strings.
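The basic algorithm that Jaffe and Smith extend can be sketched in a few lines; the Python below is a minimal rendering of the standard Karplus-Strong technique (not code from the paper): a delay line seeded with noise is recirculated through a two-point averaging filter.

```python
import random

def karplus_strong(frequency=440.0, sample_rate=44100, duration=1.0):
    """Minimal Karplus-Strong plucked-string synthesis.

    A delay line whose length sets the pitch is seeded with white
    noise (the "pluck"); each sample read out is replaced by the
    average of the first two samples, a low-pass feedback loop that
    progressively damps high frequencies like a decaying string.
    """
    n = max(2, int(sample_rate / frequency))   # delay-line length sets the pitch
    buf = [random.uniform(-1.0, 1.0) for _ in range(n)]
    out = []
    for _ in range(int(sample_rate * duration)):
        out.append(buf[0])
        buf.append(0.5 * (buf[0] + buf[1]))    # averaging (damping) filter
        del buf[0]                             # recirculate the delay line
    return out
```

The efficiency the annotation mentions is visible here: one addition and one multiply per output sample, with no explicit physical model.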

Jameson, D. "Sonnet: Audio-Enhanced Monitoring and Debugging." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Jameson describes a visual programming language for attaching run-time actions to running programs. The run-time actions allow highly controllable sounds to be attached both to programs and to data. Examples include differentiating among different sorting algorithms by their auditory characteristics as well as tracking trends in variables over time.

Jenkins, J. J. "Acoustic Information for Objects, Places, and Events." In Persistence and Change: Proceedings of the First International Conference on Event Perception, edited by W. H. Warren and R. E. Shaw. Hillsdale, NJ: Lawrence Erlbaum, 1985.

An exploration of audition from an ecological point of view. Jenkins summarizes the benefits of acoustic information over visual information, particularly in natural settings. The advantages include unobtrusive monitoring, no requirement for an external energy source if natural events are producing the sound, provision of information about the cause of the sound and its source in space, and interrupt capability because sound does not require oriented receptors for effective delivery of the information.

Jones, S. D., and S. M. Furner. "The Construction of Audio Icons and Information Cues for Human-Computer Dialogues." In Contemporary Ergonomics: Proceedings of the Ergonomics Society's 1989 Annual Conference, edited by T. Megaw. Reading, MA: Addison-Wesley, 1989.

Some early experiments into the effectiveness of earcons and auditory icons.

Kanizsa, G. Organization in Vision: Essays on Gestalt Perception. New York: Praeger, 1979.

In this classic gestalt text, Kanizsa gives examples of the effects of gestalt processes.

Karsenty, S., J. A. Landay, and C. Weikart. "Inferring Graphical Constraints with Rockit." In Human-Computer Interaction: Proceedings of HCI '92, held in 1992 at the University of York, UK. Also available as Research Report 17, Digital Equipment Corporation, Paris Research Laboratory.

This paper describes a graphical tool that helps in creating constrained graphical objects. Constraints are inferred by the system and shown to the user both graphically and sonically.

Kistler, D. K., and F. L. Wightman. "A Model of Head-Related Transfer Functions Based on Principal Components Analysis and Minimum-Phase Reconstruction." J. Acoust. Soc. Am. 91 (1992): 1637--1647.

One of the first papers to thoroughly describe a systematic approach to simplifying the head-related transfer function.

Koffka, K. Principles of Gestalt Psychology. London: Kegan Paul, 1936.

In this classic gestalt text, Koffka identifies applicable major grouping principles, particularly in vision.

Kramer, G., ed. Auditory Display: Sonification, Audification, and Auditory Interfaces. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison-Wesley, 1994.

A collection of 21 papers, defining the state of auditory display research at the time of its publication. It includes an extensive introductory chapter, foreword by A. Bregman, annotated bibliography, audio CD of sound examples, resources appendix, and informal comments by several ICAD participants.

Kramer, G. "An Introduction to Auditory Display." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Kramer provides an overview of the field, including its history, other uses of nonspeech audio, advantages and difficulties of auditory display, its relationship to music, and possible applications. He introduces the symbolic/analogic continuum as a means of comparing and analyzing display techniques.

Kramer, G. "Some Organizing Principles for Representing Data with Sound." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Kramer provides a broad set of techniques for sonification display design. He describes parameter nesting and introduces beacons and dynamic beacons, affective and metaphorical association, and data type/data family association.

Kramer, G. "Sound and Communication in Virtual Reality." In Communication in the Age of Virtual Reality, edited by F. Biocca and M. Levy. Hillsdale, NJ: Lawrence Erlbaum, 1994.

Kramer discusses the use of sound in virtual environments from the standpoint of Biocca's Communications Design Matrix. He provides an overview of auditory implementations in virtual reality and suggests a number of extensions of these techniques. He introduces the concept of audible objects as a factor in VR displays and describes work on sonification displays for enriching the formation of mental models.

Kramer, G. "Sonification of Financial Data: An Overview of Spreadsheet and Database Sonification." In The Proceedings of Virtual Reality Systems '93, SIG Advanced Applications, held 1993 in New York, NY. New York: SIG, 1993.

A brief description of sonification, along with a case study of the sonification of five- and ten-dimensional financial data.

Kramer, G. "Sonification and Virtual Reality I: An Introduction." In VR Becomes a Business, the Proceedings of Virtual Reality '92. Westport, CT: Meckler, 1992.

An introductory paper on sonification, relating auditory data representation to immersive interfaces.

Kramer, G. "Audification: Using Sound to Understand Complex Systems and Navigate Large Data Sets." Proceedings of the Santa Fe Institute Science Board, Santa Fe Institute, 1990.

Kramer describes his auditory display concepts and research from 1989--1990 and relates them to comprehending complexity and navigating large data sets.

Kramer, G. "Audification of the ACOT Predator/Prey Model." Unpublished research report prepared for Apple Computer's Advanced Technology Group, Apple Classrooms of Tomorrow, 1990.

Kramer describes his work with Apple Computer to bring sonification techniques to a predator-prey model; the integrated hardware/software system for producing the sonifications is described. Realistic and abstract sonifications for the same data set are presented, and their possible impact on the formation of mental models and on students with different learning styles is discussed.

Kramer, G., and S. Ellison. "Audification: The Use of Sound to Display Multivariate Data." In The Proceedings of the International Computer Music Conference, 214--221. San Francisco, CA: ICMA, 1991.

The authors introduce parameter nesting, a technique for developing high-dimensional displays, and the Clarity Sonification Toolkit, an object-oriented research and development tool for developing and testing sonification techniques. They provide an in-depth description of how to use these tools to sonify a nine-dimensional Lorenz equation.

Krumhansl, C. L. Cognitive Foundations of Musical Pitch. Oxford Psychology Series, Vol. 17. Oxford: Oxford University Press, 1990.

Study of musical pitch from a perceptual perspective. Krumhansl presents a range of models of pitch phenomena and considers how these might be encoded and remembered.

Lakoff, G., and M. Johnson. Metaphors We Live By. Chicago: University of Chicago Press, 1980.

The authors provide a powerful description of how we use metaphors in everyday language without even knowing that we are doing so. They investigate how these metaphors are not only a reflection of our thinking processes, but how they shape those very processes.

Lamb, H. The Dynamical Theory of Sound, 2nd ed. New York: Dover, 1960.

A classic book on the physics of sound and sound-producing events.

Laurel, B. "Interface as Mimesis." In User-Centered System Design: New Perspectives on Human-Computer Interface, edited by D. A. Norman and S. W. Draper. Hillsdale, NJ: Lawrence Erlbaum, 1986.

The author introduces the analogy between using a computer and attending a drama; she suggests that users' engagement with a system is an important dimension to be considered in design.

Loomis, J. M., C. Hebert, and J. G. Cicinelli. "Active Localization of Virtual Sounds." J. Acous. Soc. Am. 88 (1990): 1757--1764.

As part of a larger project to produce the user interface for a personal navigation system, the authors developed a low-cost, computer-controlled, analog, virtual sound display system that does not use direction-dependent pinna cues. Using only interaural time difference and interaural intensity difference coupled with head orientation, they found that subjects could home to virtual sound sources quite well, and found some indications that the virtual sounds were perceived as being externalized. While the use of HRTFs may produce more realistic displays, this study shows the potential of simple virtual auditory displays, e.g., for navigation tasks.

Lovstrand, L. "Being Selectively Aware with the Khronika System." In Proceedings of ECSCW '91, held September 25--27, 1991, in Amsterdam, The Netherlands.

The author describes an event database that uses nonspeech audio cues amongst its techniques for notifying users about events.

Lunney, D., and R. C. Morrison. "High-Technology Laboratory Aids for Visually Handicapped Chemistry Students." J. Chem. Ed. 58(3) (1981): 228--231.

The authors presented analytical chemistry data from infrared spectra to visually impaired students. The pitch of a tone was made proportional to the frequency location of the infrared peak it represented. In informal tests, subjects were able to accurately identify a range of learned compounds.

Madhyastha, T. M., and D. A. Reed. "A Framework for Sonification Design." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

The authors describe Porsonify, a toolkit that provides a uniform network interface to sound devices through table-driven sound servers. All device-specific functions are encapsulated in control files, so that user interfaces to configure sound devices and sonifications can be generated independently of the underlying hardware. Creation of some example sonifications using this toolkit is discussed.

Mansur, D. L. "Graphs in Sound: A Numerical Data Analysis Method for the Blind." Unpublished Thesis, University of California, Davis, 1984.

The author tested the ability of subjects to make certain judgements about x-y "plots" using continuously varying pitch to represent the dependent variable (y) and time to represent the independent variable (x). He was primarily concerned with the development of displays to make exploratory data analysis possible for visually impaired analysts. He found that with limited training, subjects were able to recognize key features of the data, such as linearity, monotonicity, and symmetry, for between 79 and 95 percent of the trials.
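The pitch-for-y, time-for-x mapping that Mansur studied can be sketched in a few lines. This is only an illustration of the idea, not Mansur's implementation; the frequency range, duration, and sample rate are assumptions:

```python
import math

def sound_graph(ys, f_lo=220.0, f_hi=880.0, dur=3.0, sr=8000):
    """Render a data series as a continuously varying pitch:
    y (dependent variable) -> frequency, x (index) -> time."""
    y_min, y_max = min(ys), max(ys)
    span = (y_max - y_min) or 1.0
    n = int(dur * sr)
    samples, phase = [], 0.0
    for i in range(n):
        # Linearly interpolate the data value at this instant.
        pos = i * (len(ys) - 1) / (n - 1)
        j = min(int(pos), len(ys) - 2)
        frac = pos - j
        y = ys[j] + frac * (ys[j + 1] - ys[j])
        # Map y linearly onto the chosen frequency range.
        freq = f_lo + (y - y_min) / span * (f_hi - f_lo)
        phase += 2 * math.pi * freq / sr
        samples.append(math.sin(phase))
    return samples
```

A rising series produces a rising glide; symmetry or monotonicity in the data is heard directly as symmetry or monotonicity in the pitch contour.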

Mansur, D. L., M. M. Blattner, and K. I. Joy. "Sound-Graphs: A Numerical Data Analysis Method for the Blind." In Proceedings of the 18th Annual Hawaiian International Conference on System Science, held January 1985 in Honolulu, Hawaii. Los Alamitos, CA: IEEE Computer Society Press, 1984. Also in J. Med. Sys. 9 (1985): 163--174.

Sound-Graphs are composed of three-second periods of continuously varying pitch. They were developed and used to provide the blind with a rapid and intuitive understanding of numerical data (x--y graphs). This work is primarily from the M.S. thesis (with the same name) by Douglass L. Mansur, University of California, Davis, 1984; also published as Lawrence Livermore Technical Report UCRL-53548.

Mansur, D. L., M. M. Blattner, and K. I. Joy. "The Representation of Line Graphs Through Audio Images." Technical Report UCRL-91586, Lawrence Livermore National Laboratory, Livermore, CA, September 1984.

Holistic sounds and graphical images bear certain resemblances in the way we manipulate them. This article examines tools that manipulate line graphs both graphically and sonically.

Mansur, D. L., M. M. Blattner, and K. I. Joy. "Sound-Graphs: A Numerical Data Analysis Method for the Blind." J. Med. Sys. 9 (1985): 163--174.

The authors describe how simple line graphs can be translated into nonspeech sounds for presentation to blind people.

Matlin, M. W. Sensation and Perception, 2nd ed. Massachusetts: Allyn and Bacon, 1988.

A good introductory text to perception that distinguishes between the physical responses to stimuli and the perceptual effects.

Mayer-Kress, G., R. Bargar, and I. Choi. "Musical Structures in Data From Chaotic Attractors." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

The authors exhibit parallels between structures of chaotic dynamical systems and music and indicate the possibility of using this connection to enhance the perception of recurrent features in complex signals. They describe three auditory representations of chaotic systems.

McAdams, S. "Spectral Fusion and the Creation of Auditory Images." In Music, Mind and Brain, Chap. XV. New York: Plenum, 1982.

McAdams discusses aspects of musical perception beyond the boundaries of acoustic analysis: the roles that familiarity, learning and context, synthetic and analytic listening, and the primitive grouping processes of harmonicity and coordinated modulation play in evoking an auditory image from an acoustic signal.

McCabe, R. K., and A. A. Rangwalla. "Auditory Display of Computational Fluid Dynamics Data." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Direct simulation and parameter mapping techniques are discussed in the context of how they can be used to enhance the understanding of data from computational fluid dynamics simulations. Two case studies are presented. The first case describes how parameter mapping techniques were used to help analyze the results from a simulation of the Penn State artificial heart. The second case shows how direct simulation was used to better understand the tonal acoustics of rotor-stator interactions inside a jet turbine.

McIntyre, M. E., R. T. Schumacher, and J. Woodhouse. "On the Oscillations of Instruments." JASA 74 (1983): S52.

An account of temporally based physical modeling techniques with examples; an excellent work.

Meijer, P. B. L. "An Experimental System for Auditory Image Representations." IEEE Trans. Biomed. Eng. 39(2) (Feb. 1992): 112--121.

Meijer presents an experimental system for the conversion of arbitrary images into sound patterns, possibly as a step towards the development of a vision substitution device for the blind. The soundscapes generated by the system provide a resolution of up to 64 x 64 pixels with 16 grey-tones per pixel. The actual resolution obtainable with human perception of these soundscapes remains to be evaluated. Spectrographic reconstructions were made to prove that much of the image content is indeed preserved in the soundscapes.

Meyer, L. B. Emotion and Meaning in Music. Chicago: University of Chicago Press, 1956.

Forming the basis of Meyer's past 35 years of research and theoretical writing, this text is doubtlessly a classic in the field of music psychology. Proceeding from John Dewey's (1894) "conflict theory of emotion," the author provides results of experimental investigations and musical examples to support his premise that "emotion or affect is aroused when a tendency to respond is arrested or inhibited."

Mezrich, J. J., S. P. Frysinger, and R. Slivjanovski. "Dynamic Representation of Multivariate Time-Series Data." J. Am. Stat. Assoc. 79 (1984): 34--40.

The authors describe a dynamic data representation employing both auditory and visual components for multivariate time-series displays, such as for economic indicators. In their scheme, the analyst is confronted at any moment with one multivariate sample from the time series rather than the whole data set, with samples displayed in succession rather like frames in a movie. The results of their experiment indicate that the dynamic auditory/visual display outperforms static visual displays in most cases for the correlation detection task.

Monk, A. "Mode Errors: A User-Centered Analysis and Some Preventative Measures Using Keying-Contingent Sound." IJMMS 24 (1986): 313--327.

Monk uses sound to reduce the number of mode errors in an interface.

Moore, F. Richard. Elements of Computer Music. ISBN 0-13-252552-6. Prentice Hall, 1990.

Moore covers how to analyze, process, and synthesize musical sound, with substantial coverage of digital signal processing, composition techniques (random numbers, Markov processes, etc.), and musical applications.

Mulligan, B. E., D. K. McBride, and L. S. Goodman. "A Design Guide for Nonspeech Auditory Displays." Pensacola, FL: Naval Aerospace Medical Research Laboratory, 1987.

The authors provide algorithms that assist the designer in designing auditory signals, especially in ways to enhance detectability of signals in noise and to increase loudness without increasing signal length.

Mynatt, E. "Auditory Presentation of Graphical User Interfaces." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Mynatt presents work in designing interactive, auditory interfaces that provide access to graphical user interfaces for people who are blind. She discusses a prototype system called Mercator which explores conveying symbolic information and supporting navigation in the auditory interface.

National Research Council. Classification of Complex Nonspeech Sounds. Washington, DC: National Academy Press, 1989.

This report was prepared by the Committee on Hearing, Bioacoustics, and Biomechanics to review and evaluate the literature on the classification of complex nonspeech, nonmusic, transient sounds. Literature on perception, signal processing, auditory object perception, limits of auditory processing, acoustic transients, sonar, and multidimensional analysis is also reviewed.

Oldfield, S. R., and S. P. A. Parker. "Acuity of Sound Localization: A Topography of Auditory Space. I. Normal Hearing Conditions." Perception 13 (1984a): 581--600.

Oldfield, S. R., and S. P. A. Parker. "Acuity of Sound Localization: A Topography of Auditory Space. II. Pinna Cues Absent." Perception 13 (1984b): 601--617.

Oldfield, S. R., and S. P. A. Parker. "Acuity of Sound Localization: A Topography of Auditory Space. III. Monaural Hearing Conditions." Perception 15 (1986): 67--81.

These three studies examine the ability of subjects under different sets of conditions to localize white noise played through a speaker that varied in position. The blindfolded subjects pointed a special gun at the perceived source of sound played in an anechoic chamber over a boom-mounted speaker. In the first study subjects listened normally; in the second study pinna cues were removed by inserting individually cast pinnae molds with access holes to the auditory canal into subjects' ears; and in the third study monaural conditions were created by inserting "ear defenders" into subjects' right ears and covering their right ears with fitted earmuffs. These studies demonstrate the importance of pinna cues for determining elevation and reducing front/back reversals, and show that elevation discrimination was good under monaural conditions, but that azimuth discrimination was reduced.

O'Leary, A., and G. Rhodes. "Cross-Modal Effects on Visual and Auditory Object Perception." Percep. & Psycho. 35 (1984): 565--569.

Using a display that combined a stimulus for auditory stream segregation with its visually apparent movement analog, these Stanford University researchers demonstrated cross-modal influences between vision and audition on perceptual organization. Subjects hearing the same auditory sequence perceived it as two tones if a concurrent visual sequence was presented that was perceived as two moving dots, and one tone if a concurrent visual sequence perceived as a single object was presented.

Oppenheim, Alan V., ed. Applications of Digital Signal Processing. ISBN 0-13-039115-8. Prentice Hall, 1978.

A collection of papers on DSP applications. A couple of chapters are dedicated to processing audio signals and speech; the rest covers more exotic areas (radar, sonar, geophysics), with one chapter on digital image processing.

Parncutt, R. Harmony: A Psychoacoustical Approach. Berlin: Springer-Verlag, 1989.

Parncutt develops a model of Western tonal music on the basis of psychoacoustics and psychomusicology and applies the model to the identification of the specific effects of musical conditioning and to the analysis of musical compositions. Included are interesting studies of analytic vs. synthetic listening preferences for both musicians and nonmusicians (categorizing listeners according to their listening styles).

Patterson, R. D. "Guidelines for Auditory Warning Systems on Civil Aircraft." Paper No. 82017, Civil Aviation Authority, London, 1982.

One of the first and best papers to discuss the proper structure of auditory warning signals based on psychoacoustical principles.

Peacock, K. "Synesthetic Perception: Alexander Scriabin's Color Hearing." Music Percep. 2(4) (1985): 483--506.

A curious phenomenon which has surfaced repeatedly since the late Baroque era has come to be known as synaesthesia. It was used by the Romanticists of the nineteenth century as an effective means to enrich their accounts of sensuous impressions. Other names for the phenomenon include chromesthesia, photothesia, synopsia, color hearing, and color audition. People who have this characteristic experience a crossover between one or more sensory modes: they might be blessed with the ability to hear colors or odors, or to see sounds. People who habitually perceive stimuli in this manner are often surprised when told that not everyone shares this faculty. Color hearing, though only one form of synaesthesia, is probably the commonest.

Perrott, D. R., K. Saberi, K. Brown, and T. Z. Strybel. "Auditory Psychomotor Coordination and Visual Search Performance." Percep. & Psycho. 48 (1990): 214--226.

The authors postulate that the primary function of the auditory spatial system is to direct the eyes. They studied the visual search time for targets presented with and without associated spatial audio cues and found that presenting a 10-Hz click train from the same location as the visual target substantially reduced the time for visual search.

Perrott, D. R., T. Sadralodabai, K. Saberi, and T. Z. Strybel. "Aurally Aided Visual Search in the Central Visual Field: Effects of Visual Load and Visual Enhancement of the Target." Human Factors 33 (1991): 389--400.

Visual search performance in displays with distracters was studied with and without spatially correlated audio cues. The value of audio in directing visual search was found to be particularly great when there were a large number (63) of distracter images. Potential in applications such as enhanced cockpit displays is noted.

Pitt, I. J., and A. D. N. Edwards. "Navigating the Interface by Sound for Blind Users." In People and Computers VI: Proceedings of the CHI '91 Conference, edited by D. Diaper and N. Hammond, 373--383. Cambridge: Cambridge University Press, 1991.

A description of experiments on using sounds to guide navigation in graphical user interfaces.

Plomp, R., and W. J. M. Levelt. "Tonal Consonance and Critical Bandwidth." J. Acous. Soc. Am. 38 (1965): 548--560.

Musicians rank harmonic intervals in terms of their consonance and dissonance differently than naive subjects do. For nonmusicians, consonance and "pleasantness" are one and the same. This is not true for trained musicians, who often consider dissonances more pleasant for reasons of harmony or aesthetics.

Plomp, R. "Acoustical Aspects of Cocktail Parties." Acustica 38 (1977): 186--191.

Plomp considers cocktail parties in terms of signal-to-noise ratio and speech intelligibility. He discovers the importance of room acoustics--the height of the hall and the sound absorption characteristics of the ceiling and walls.

Pollack, I., and L. Ficks. "Information of Elementary Multidimensional Auditory Displays." J. Acous. Soc. Am. 26 (1954): 155--158.

The authors consider two different mappings of multidimensional data onto the parameters of sound. Using these two display types, they measure the information transmitted to subjects as the sum of the number of bits in each correctly identified dimensional level. Their results indicate that multidimensional displays, in general, outperformed unidimensional displays measured elsewhere, and that subdivision of display dimensions into finer levels does not improve information transmission as much as increasing the number of display dimensions does.
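The scoring rule described above, summing the bits contributed by each correctly identified dimension, can be sketched briefly. This is my own illustration of that rule, not the authors' code, and the example dimensions and level counts are hypothetical:

```python
import math

def bits_transmitted(levels_per_dim, correct):
    """Score one trial of a multidimensional auditory display:
    each correctly identified dimension contributes log2(levels) bits."""
    return sum(math.log2(n) for n, ok in zip(levels_per_dim, correct) if ok)

# Hypothetical display: frequency (8 levels), loudness (4), rate (2).
# The listener identified frequency and rate correctly, missed loudness.
print(bits_transmitted([8, 4, 2], [True, False, True]))  # 3 + 1 = 4.0 bits
```

The rule makes the Pollack and Ficks finding concrete: doubling the number of dimensions adds whole new log terms to the sum, whereas subdividing one dimension into finer levels only grows its single log term slowly.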

Pohlmann, Ken C., ed. Advanced Digital Audio. ISBN 0-672-22768-1. SAMS, 1991.

An excellent book covering a variety of advanced topics in digital signal processing for audio, including optical disk technology, digital audio systems for video and film, data compression, signal processing for audio, and DSP architecture.

Pratt, C. C. "Music as the Language of Emotion." Lecture delivered in the Whittall Pavilion of the Library of Congress, December 21, 1950. Washington, DC: US Government Printing Office, 1952.

In a study involving 227 college students, the author finds a strong consensus as to the mood conveyed in four distinct pieces of music that cannot be accounted for by the status of music as a language of emotions.

Rabenhorst, D. A., E. J. Farrell, D. H. Jameson, T. D. Linton, and J. A. Mandelman. "Complementary Visualization and Sonification of Multidimensional Data." In Extracting Meaning from Complex Data: Processing, Display, Interaction, edited by E. J. Farrel, Vol. 1259, 147--153. SPIE, 1990.

Data enhancement is only the first level of auditory data representation. This paper represents the next level: utilization of the auditory channel to present otherwise unseen data. The goal of these researchers was to use sound in a manner that "is intuitive enough and readily learnable enough to be effective as an additional sensory input to a mental model."

Rakowski, A. "Intonation Variants of Musical Intervals in Isolation and in Musical Contexts." Psych. Music 18 (1990): 60--72.

Rakowski presents experimental evidence to demonstrate that different musical intervals have different perceptual salience for musicians, some having more distant category boundaries and some more accurate recognition (strong and weak intervals). Further evidence is presented that musicians tune intervals to match perceptual categorizations. Three factors--acoustic (tuning to beats arising from an interval's relationship with the preceding note), psychological (perceptual), and aesthetic (accentuation of intervals)--all participate in this phenomenon.

Rasch, R. A., and R. Plomp. "The Perception of Musical Tones." In The Psychology of Music, edited by Diana Deutsch. New York: Academic Press, 1982.

Musicians rank dyads in terms of their consonance and dissonance differently than naive subjects do.

Richards, W. Natural Computation. Cambridge, MA: MIT Press, 1988.

This book includes seven chapters on sound interpretation, including representing acoustic information, models of binaural localization and separation, schematizing spectrograms, the acoustics of singing, recovering material properties from sound, the perception of breaking and bouncing events, and the perception of melodies.

Risset, J. C., and D. L. Wessel. "Exploration of Timbre by Analysis and Synthesis." In The Psychology of Music, edited by D. Deutsch. New York: Academic Press, 1982.

The authors describe analysis and synthesis as a technique for data reduction in computer music. Seminal work by founders of the field.

Roads, Curtis, and John Strawn, eds. Foundations of Computer Music. ISBN 0-262-68051-3. Cambridge, MA: MIT Press, 1985.

A survey of sound computation and other computer music issues.

Roads, Curtis, ed. The Music Machine. ISBN 0-262-18131-2. Cambridge, MA: MIT Press, 1989.

A collection of papers from Computer Music Journal.

Roederer, J. G. Introduction to the Physics and Psychophysics of Music. New York: Springer-Verlag, 1973.

A good introduction for the "intelligent layman."

Rothstein, Joseph. MIDI: A Comprehensive Introduction. Computer Music and Digital Audio Series, Vol. 7. John Strawn, Series Editor. Madison, WI: A-R Editions, 1992.

A thorough discussion of the basic principles of MIDI. Rothstein describes categories of MIDI instruments, accessories, and computer software, and tells how to get it all to work together.

Scaletti, C. "Sound Synthesis Algorithms for Auditory Data Representation." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Scaletti provides a working definition of sonification. Models of data as continuous streams and as discrete events are illustrated using the author's sound specification language Kyma. Several sound synthesis algorithms are outlined, each with an example application, and there is a summary of which synthesis algorithms are best applied to which kinds of data. The paper concludes with an enumeration of some of the open questions and future research directions in the field of auditory display.

Scaletti, C., and A. B. Craig. "Using Sound to Extract Meaning from Complex Data." In Extracting Meaning from Complex Data: Processing, Display, Interaction, edited by E. J. Farrel, Vol. 1259, 147--153. SPIE, 1990.

Because sound is an inherently time-variable phenomenon, Scaletti and Craig concentrated their ADR work on the representation of time-varying data mapped to animated graphics and sound. Examples discussed include sonifications of forest fire data, Los Angeles pollution levels, and swinging pendula.

Schafer, R. M. The Tuning of the World. New York: Knopf, 1977.

Schafer describes studies of the acoustic environment undertaken in the World Soundscape Project including historical changes in the acoustic environment, cross-cultural studies of listening preferences and sound interpretation, and studies of references to sound in literature.

Scharf, B., and S. Buus. "Audition I: Stimulus, Physiology, Thresholds." In Handbook of Perception and Human Performance, edited by K. R. Boff, L. Kaufman, and J. P. Thomas, Vol. 1, 14.1--14.71. New York: Wiley, 1986.

Standard reference to have on your bookshelf.

Scharf, B., and A. J. M. Houtsma. "Audition II: Pitch, Localization, Aural Distortion, Pathology." In Handbook of Perception and Human Performance, edited by K. R. Boff, L. Kaufman, and J. P. Thomas, Chap. 26. New York: Wiley, 1986.

This chapter covers psychophysical performance in detection and discrimination of intensity and frequency, sound localization, and perception of loudness and pitch.

Scherer, Klaus R., and James S. Oshinsky. "Cue Utilization in Emotion Attribution from Auditory Stimuli." Motivation & Emotion 1(4) (1977).

The authors describe a study using a Moog synthesizer in which seven two-level factors (amplitude and pitch level, pitch contour, pitch variability, tempo, envelope, and filtration), along with other more complicated stimuli, were systematically manipulated and then rated for emotional impact. Inter-judge (naive students) agreement was generally good, some emotions having more reliable cues than others, as might be expected.

Schmandt, C., B. Arons, and C. Simmons. "Voice Interaction in an Integrated Office and Telecommunications Environment." In Proceedings of 1985 Conference. American Voice I/O Society, 1985.

The Conversational Desktop is a conversational office assistant that manages personal communications (phone calls, voice mail messages, scheduling, reminders, etc.). The system engages the user in a conversation to resolve ambiguous speech recognition input.

Schmandt, C., and B. Arons. Conversational Desktop (videotape). ACM SIGGRAPH Video Rev. 27 (1987).

A four-minute videotape demonstrating many features of the Conversational Desktop.

Schmandt, C., and B. Arons. "Getting the Word." UNIX Rev. 7 (Oct. 1989): 54--62.

An overview of "Desktop Audio," including the systems and interface requirements for the use of speech and audio in the personal workstation. It includes a summary of the VOX Audio Server, a system for managing and controlling the audio resources in a networked personal workstation.

Schroeder, M. R. "Digital Simulation of Sound Transmission in Reverberant Spaces." J. Acous. Soc. Am. 47 (1970): 424--431.

One of the core papers discussing techniques for the simulation of reverberant acoustic environments.

Sloboda, J. A. "Music Structure and Emotional Response: Some Empirical Findings." Psych. Music 19 (1991): 110--120.

The author presents an analysis of experimental results on emotive response to music extracts, relating the responses to the musical structure of the compositions.

Smith, R. B. "A Prototype Futuristic Technology for Distance Education." In Proceedings of the NATO Advanced Workshop on New Directions in Educational Technology, held November 10--13, 1988, in Cranfield, UK.

Smith describes SharedARK, a collaborative system that was the basis of SoundShark (Gaver & Smith, 1990) and ARKola (Gaver et al., 1991).

Smith, S. "An Auditory Display for Exploratory Visualization of Multidimensional Data." In Workstations for Experiment, edited by G. Grinstein and J. Encarnacao. Berlin: Springer-Verlag, 1991.

Although it was published in 1991, this is actually the earliest paper about the University of Massachusetts Lowell work in sonification. It shows what the authors' thinking was as they embarked on their investigations in 1988 and is now mostly of historical interest.

Smith, S., R. D. Bergeron, and G. Grinstein. "Stereophonic and Surface Sound Generation for Exploratory Data Analysis." In Multimedia and Multimodal Interface Design, edited by M. Blattner and R. Dannenberg. Reading, MA: ACM Press/Addison-Wesley, 1992.

Smith, S., R. D. Bergeron, and G. Grinstein. "Stereophonic and Surface Sound Generation for Exploratory Data Analysis." In Proceedings of CHI '90, held 1990, in Seattle, WA. ACM Press, 1990.

This paper, published in two places, describes the authors' attempt to introduce spatial aspects of sound into sonification. This direction was not pursued further.

Smith, S., G. Grinstein, and R. M. Pickett. "Global Geometric, Sound, and Color Controls for the Visualization of Scientific Data." In Proceedings of the SPIE/SPSE Conference on Electronic Imaging, Vol. 1459, 192--206. San Jose, CA: SPIE, 1991.

The authors argue that users should be able to fine-tune visual and auditory data displays to achieve the optimal presentation of their data. They give examples of how this can be done with the "iconographic" display techniques they developed. The accompanying video gives one brief sound example.

Smith, S., R. M. Pickett, and M. G. Williams. "Environments for Exploring Auditory Representations of Multidimensional Data." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

The authors outline a starting approach to sonification and argue for psychometric testing as part of the sonification design process.

Sorkin, R. D. "Design of Auditory and Tactile Displays." In Handbook of Human Factors, edited by G. Salvendy, 549--576. New York: Wiley & Sons, 1987.

In this chapter Sorkin addresses factors that must be considered in establishing the level, pitch, duration, shape, and temporal pattern of a sound. In addition, he covers the design of binaural sounds and complex coding for sounds.

Sorkin, R. D., F. L. Wightman, D. S. Kistler, and G. C. Elvers. "An Exploratory Study on the Use of Movement-Correlated Cues in an Auditory Head-Up Display." Human Factors 31 (1989): 161--166.

A sequence of three signals incorporating HRTF cues for auditory localization was played to subjects over headphones, and subjects had to indicate the location of the source via keypress on a computer. This study focused on the importance of head movement in localization, and three conditions were presented: (1) source fixed in physical space, head movement allowed; (2) no head movement allowed; and (3) source fixed in position relative to the subject's head. Azimuthal localization was found to be considerably better in the first case (source fixed in physical space, head movement allowed), demonstrating the importance to auditory localization of correlating cues to self-initiated movements of the listener's head.

Sorkin, R. D., D. E. Robinson, and B. G. Berg. "A Detection Theory Method for Evaluating Visual and Auditory Displays." In Proceedings of the Human Factors Society, Vol. 2, 1184--1188, 1987.

This paper describes a signal detection method for evaluating different display codes and formats. The method can be used to assess the relative importance of different elements of the display. The paper briefly summarizes data from different types of auditory and visual displays.

Sorkin, R. D., and D. D. Woods. "Systems with Human Monitors: A Signal Detection Analysis." Hum.-Comp. Inter. 1 (1985): 49--75.

This paper analyses the general system composed of a human operator plus an automated alarm subsystem. The combined human-machine system is modeled as a two-stage detection system in which the operator and alarm subsystem monitor partially correlated noisy channels. System performance is shown to be highly sensitive to the decision bias (response criterion) of the alarm. The customary practice of using a "liberal" bias setting for the alarm (yielding a moderately high false-alarm rate) is shown to produce poor overall system performance.

Sorkin, R. D., B. H. Kantowitz, and S. C. Kantowitz. "Likelihood Alarm Displays." Human Factors 30 (1988): 445--459.

This study describes a type of multilevel or graded alarm display in which the likelihood of the alarmed condition is encoded within the display. For example, the levels of an auditory alarm could vary by repetition rate or voice quality; and the levels of a visual display could vary by color. Several dual-task (tracking and alarm monitoring) experiments demonstrate the feasibility of Likelihood Alarm Displays.

Sorkin, R. D. "Why are People Turning Off Our Alarms?" J. Acoust. Soc. Am. 84 (1988): 1107--1108. Reprinted in Human Factors Soc. Bull. 32 (1989): 3--4.

In this short paper Sorkin describes several tragic accidents in which auditory alarms had been disabled or ignored. The author argues that two culprits are high false-alarm rates and excessive sound levels.

Sorkin, R. D. "Perception of Temporal Patterns Defined by Tonal Sequences." J. Acoust. Soc. Am. 87 (1990): 1695--1701.

In this study Sorkin describes a general model (the temporal correlation model) for predicting a listener's ability to discriminate between two auditory tone sequences that differ in their temporal pattern. According to the model, the listener abstracts the relative times of occurrence of the tones in each pattern and then computes the correlation between the two lists of relative times.
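The model's core computation, as described in the annotation, lends itself to a short sketch. The following is a hypothetical reconstruction, not Sorkin's own formulation: each sequence is reduced to onset times relative to its start, and the two lists of relative times are then correlated.

```python
def relative_onsets(onsets):
    """Reduce a tone sequence to onset times relative to its first tone."""
    t0 = onsets[0]
    return [t - t0 for t in onsets]

def temporal_correlation(onsets_a, onsets_b):
    """Pearson correlation between two equal-length lists of relative onset times."""
    a = relative_onsets(onsets_a)
    b = relative_onsets(onsets_b)
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Two identical rhythms, offset in absolute time, correlate perfectly;
# a jittered version of the rhythm correlates less than perfectly.
same = temporal_correlation([0.0, 0.2, 0.4, 0.8], [1.0, 1.2, 1.4, 1.8])
jittered = temporal_correlation([0.0, 0.2, 0.4, 0.8], [0.0, 0.3, 0.4, 0.8])
# same == 1.0; jittered < 1.0
```

On this account, discriminability of two temporal patterns falls as their relative-onset correlation rises toward 1.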

Speeth, S. D. "Seismometer Sounds." J. Acoust. Soc. Am. 33 (1961): 909--916.

The author audified seismic data (speeding up the playback of seismometer recordings to place the resulting frequencies in the audible range) and then set human subjects, after an appropriate training program, to the task of determining whether each stimulus was a bomb blast or an earthquake. Subjects correctly classified the seismic records on over 90% of the trials. Furthermore, because of the time compression required to bring the seismic signals into the audible range, an analyst could review 24 hours' worth of data in about 5 minutes.
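The time compression behind this result is simple arithmetic: speeding playback by a factor k multiplies every signal frequency by k and divides listening time by k. A minimal illustration (the seismic frequency here is a round placeholder, not Speeth's figure):

```python
def audify(signal_hz, seconds_of_data, speedup):
    """Audification by playback speed-up: a factor-k speed-up multiplies
    every frequency by k and divides listening time by k.
    Returns (resulting frequency in Hz, listening time in seconds)."""
    return signal_hz * speedup, seconds_of_data / speedup

# A 1 Hz seismic component sped up 288x lands at 288 Hz, well inside the
# audible range, and 24 hours of data plays back in exactly 5 minutes.
freq, listen = audify(1.0, 24 * 3600, 288)
# freq == 288.0, listen == 300.0
```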

Stifelman, L. J., B. Arons, C. Schmandt, and E. A. Hulteen. "VoiceNotes: A Speech Interface for a Hand-Held Voice Notetaker." In Proceedings of INTERCHI '93, 179--186. Reading, MA: ACM Press/Addison-Wesley, 1993.

VoiceNotes is an application for a voice-controlled hand-held computer that allows the creation, management, and retrieval of user-authored "voice notes"--small segments of digitized speech containing thoughts, ideas, reminders, or things to do. VoiceNotes explores the problem of capturing and retrieving spontaneous ideas, the use of speech as data, and the use of speech input and output in the user interface for a hand-held computer without a visual display.

Stratton, V. N., and A. H. Zalanowski. "The Effects of Music and Cognition on Mood." Psych. Music 19 (1991): 121--127.

The authors present evidence that music selected for its affect-inducing qualities influenced the mood of subjects performing a concurrent cognitive task (storytelling about a picture), but that the effect disappeared when specific mood instructions were given along with the storytelling instructions. There was also evidence that familiarity with the music, and subjects' individual preferences for the selections, affected the extent to which their mood was influenced by the music.

Strothotte, T., K. Fellbaum, K. Crispien, M. Krause, and M. Kurze. "Multimedia Interfaces for Blind Computer Users." In Rehabilitation Technology--Proceedings of the 1st TIDE Congress, held April 6--7, 1993, in Brussels. ISSN: 0926-9630. IOS Press, 1993.

This paper deals with selected aspects of blind people's access to GUI computer systems, as addressed by the GUIB project (Textual and Graphical User Interfaces for Blind people). A new loudspeaker-based device for two-dimensional sound output, enabling users to locate the position of screen objects, is described. In a prototypical application, blind people are given access to a class of computer-generated graphics using the new device in an interactive process of exploration.

Strybel, T. Z., A. M. Witty, and D. R. Perrott. "Auditory Apparent Motion in the Free Field: The Effects of Stimulus Duration and Intensity." Percep. & Psycho. 52(2) (1992): 139--143.

The authors find that a minimum duration of 10--50 msec is required for the perception of auditory apparent motion, with the exact time varying from listener to listener.

Stuart, R. "Virtual Auditory Worlds: An Overview." In VR Becomes a Business: Proceedings of Virtual Reality '92, held September 1992, in San Jose, CA, 144--166. Westport, CT: Meckler, 1992.

An overview of issues concerning virtual auditory environments and applications that have been proposed or on which work is proceeding. It includes an extensive bibliography.

Sumikawa, D. A., M. M. Blattner, K. I. Joy, and R. M. Greenberg. "Guidelines for the Syntactic Design of Audio Cues in Computer Interfaces." In Nineteenth Annual Hawaii International Conference on System Sciences. Los Alamitos, CA: IEEE Computer Society Press, 1986.

The material for this article is drawn from an M.S. thesis with the same name by Denise A. Sumikawa, University of California, Davis, also published as Lawrence Livermore National Laboratory Technical Report, UCRL-53656, June 1985. The material was later extended and became: "Earcons and Icons: Their Structure and Common Design Principles."

Tenney, J., and L. Polansky. "Temporal Gestalt Perception in Music." J. Music Theory 24(2) (1980): 205--241.

The authors describe a simple computational system for identifying hierarchical structure in melodies based on frequency proximity and temporal distance, and assuming categorical perception. The model is based to some extent on Tenney's 1961 book, Meta Hodos.
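A loose sketch of the kind of segmentation rule such a system uses, under assumed equal weights for the pitch and time dimensions (an illustration of the general idea, not the authors' actual algorithm): a grouping boundary falls where the weighted distance between successive notes is a local maximum.

```python
def gestalt_boundaries(notes, w_time=1.0, w_pitch=1.0):
    """Mark a grouping boundary before note i when the weighted pitch+time
    distance from note i-1 to note i is a local maximum.
    notes: list of (onset_seconds, pitch_in_semitones)."""
    dist = [w_time * abs(t2 - t1) + w_pitch * abs(p2 - p1)
            for (t1, p1), (t2, p2) in zip(notes, notes[1:])]
    return [i + 1 for i in range(1, len(dist) - 1)
            if dist[i] > dist[i - 1] and dist[i] > dist[i + 1]]

# A melody with a long gap and an upward leap after its third note:
melody = [(0.0, 60), (0.5, 62), (1.0, 64), (2.5, 72), (3.0, 74), (3.5, 76)]
# The distance peak puts a single boundary before index 3.
```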

Terenzi, F. "Design and Realization of an Integrated System for the Composition of Musical Scores and for the Numerical Synthesis of Sound (Special Application for Translation of Radiation from Galaxies into Sound Using Computer Music Procedures)." Unpublished manuscript, Physics Department, University of Milan, 1988.

Terenzi describes audification of radio astronomy data and musical use of the results.

Terhardt, E. "Pitch, Consonance, and Harmony." J. Acoust. Soc. Am. 55(5) (1974): 1061--1069.

In his perception study, Stumpf ranked harmonic intervals by listeners' judgments of their "fusion"; Malmberg used "smoothness" instead. The discrepancies between their results stem from the uncontrolled overtone series of the tones used and from the bias introduced by the adjective chosen to describe the harmonic intervals. Psychoacoustic studies of harmonic intervals continued through the first half of the twentieth century; this article reviews their contributions.

Terhardt, E. "Toward Understanding Pitch Perception: Problems, Concepts and Solutions." In Psychophysical, Physiological and Behavioural Studies in Hearing, edited by G. van den Brink and F. A. Bilsen. Delft University Press, 1980.

Terhardt presents a broad overview of pitch perception concepts and terminology.

Terhardt, E. "Pitch of Pure Tones: Its Relation to Intensity." In Facts and Models in Hearing, edited by E. Zwicker and E. Terhardt. New York: Springer-Verlag, 1974.

The author finds that for some individuals the change in pitch with intensity can be as large as that reported by Stevens; however, when these changes are averaged over many subjects, the changes are insignificant.

Terhardt, E. "Gestalt Principles and Music Perception." In Auditory Processing of Complex Sounds, edited by W. A. Yost and C. S. Watson. Hillsdale, NJ: Lawrence Erlbaum, 1987.

Terhardt identifies contours and categorized representations of sensory objects as the focus of gestalt perception. The theory presented, Hierarchical Processing of Categories (HPC), organizes perception in hierarchical layers, each of which is concerned only with categorizing or recategorizing. The most peripheral level corresponds to analytic listening and spectral pitch, analogous to primary visual contours, whereas synthetic forms, including virtual pitch, are secondary. The multiple levels of perception associated with Western tonal music are considered within the context of HPC.

Terwogt, M. M., and F. van Grinsven. "Musical Expression of Moodstates." Psych. Music 19 (1991): 99--109.

The authors present an experimental study of emotional response to short extracts from eight pieces of classical music.

Treisman, A. "Properties, Parts, and Objects." In Handbook of Perception and Human Performance, edited by K. R. Boff, L. Kaufman, and J. P. Thomas, Chap. 35. New York: Wiley, 1986.

Treisman explores the information-processing mechanisms that identify the objects and events of subjective experience from physical stimuli. Includes coverage of similarity judgment, perceptual analysis of dimensions/features/parts, and integration of parts and properties. Although most of the coverage and examples are based on visual studies, auditory studies are covered where appropriate.

Truax, B. Acoustic Communication. Norwood, NJ: Ablex, 1984.

Broad coverage of sound including speech, music, natural sound and sounds of modern life. Truax presents a communication model that places sound in a mediating role between listeners and the environment. He presents results of the World Soundscape Project.

Tzelgov, J., R. Srebro, A. Henik, and A. Kushelevsky. "Radiation Detection by Ear and by Eye." Human Factors 29(1) (1987): 87--98.

Tzelgov et al. found that in a search task the auditory signal was better than the visual display or the dual-mode system. In a detection task, there were no differences between the single modes, nor between single modes and the dual mode. Their interpretation of the results invokes a visual bias effect in which the operator's attention is directed away from other aspects of the monitoring task.

Urdang, E. G., and R. Stuart. "Orientation Enhancement Through Integrated Virtual Reality and Geographic Information Systems." In Proceedings of CSUN's Seventh Annual International Conference on Technology and Persons with Disabilities, held March 1993 in Northridge, CA, 55--61.

The authors propose combining a virtual acoustic display with geographic information systems to assist visually impaired users in navigating a city.

Vanderveer, N. J. Ecological Acoustics: Human Perception of Environmental Sounds. Dissertation Abstracts International. 40/09B, 4543. University Microfilms No. 8004002.

Probably the earliest example of an ecological approach to audition. Vanderveer describes the result of a protocol study in which people identified events from the sounds they made.

Vicario, G. B. "Some Observations in the Auditory Field." In Organization and Representation in Perception, edited by J. Beck. Hillsdale, NJ: Lawrence Erlbaum, 1982.

Vicario identifies dependency relations, effects of context (embedded figures), the concept of an auditory field, and completion phenomena in the auditory mode of perception.

Walker, R. "The Effects of Culture, Environment, Age, and Musical Training on Choices of Visual Metaphors for Sound." Percep. & Psycho. 42 (1987): 491--502.

The author reports on studies of choices of visual metaphors for sound parameters. In the sound domain he looked at frequency, waveform, amplitude, and duration.

Ward, W. D. "Subjective Musical Pitch." J. Acoust. Soc. Am. 26(3): 369--380.

Ward presents empirical evidence that the pitch of pure tones is a subjective judgment of the listener, not very consistent across subjects, and whose rate of change is independent of the frequency level of the acoustic source.

Wallach, H., E. B. Newman, and M. R. Rosenzweig. "The Precedence Effect in Sound Localization." Am. J. Psych. 62 (1949): 315--336.

In a reverberant room, two similar sounds reach a subject's ears from different directions, one following the other after a short delay; yet the subject fuses them into a single sound and localizes it based on the source of the first sound to reach the ears. The authors study this perceptual phenomenon, which they term the "precedence effect."

Warren, D. H., R. B. Welch, and T. J. McCarthy. "The Role of Visual-Auditory `Compellingness' in the Ventriloquism Effect: Implications for Transitivity Among the Spatial Senses." Percep. & Psycho. 30 (1981): 557--564.

The authors study intersensory interactions and find that sufficiently compelling visual cues can bias the perceived location of an auditory source, as in the ventriloquism effect.

Warren, W. H., and R. R. Verbrugge. "Auditory Perception of Breaking and Bouncing Events: A Case Study in Ecological Acoustics." J. Exp. Psych. 10 (1984): 704--712.

A seminal study of everyday listening which used analysis and synthesis of events to link acoustical information to event perception.

Weber, C. R. "Sonic Enhancement of Map Information: Experiments Using Harmonic Intervals." Unpublished dissertation, Department of Geography, State University of New York at Buffalo, 1993.

The author has found a relationship among aural variables that is analogous to the hierarchy of visual variables presented by Bertin (1983): pitch supersedes both texture (consonance) and color (scale position).

Weber, C. R., and M. A. Yuan. "Statistical Analysis of Various Adjectives Predicting Consonance/Dissonance and Intertonal Distance in Harmonic Intervals." Technical Papers, ACSM/ASPRS Annual Convention, New Orleans, Vol. 1, 391--400.

The authors report successful delineation of relative consonance and intertonal distance selection by subjects associating dyads with various continua of cartographic adjectives. These results seem to hold only when the dyads are presented in isolation.

Welch, R. B. Perceptual Modification: Adapting to Altered Sensory Environments. New York: Academic Press, 1978.

Excellent classic source on adaptation and response to altered sensory cues. Welch also considers intersensory interactions, treated at greater length by Welch and Warren (1986).

Welch, R. B., and D. H. Warren. "Intersensory Interactions." In Handbook of Perception and Human Performance, edited by K. R. Boff, L. Kaufman, and J. P. Thomas, chap. 25. New York: Wiley, 1986.

The authors review and evaluate research on intersensory bias, particularly interactions between vision and audition on detection, spatial localization, and perception of temporal events. They take the view that sensory modalities vary in their appropriateness for the perception of various events.

Wenzel, E. M., F. L. Wightman, and S. H. Foster. "Development of a Three-Dimensional Auditory Display System." SIGCHI Bull. 20 (1988): 52--57.

An early description of the three-dimensional auditory display system created at NASA-Ames that would become the Convolvotron. The authors describe measurement and testing of HRTFs.

Wenzel, E. M., and S. H. Foster. "Real-Time Digital Synthesis of Virtual Acoustic Environments." Comp. Graphics 24(2) (1990): 139--140.

Wenzel, E. M., F. L. Wightman, and D. J. Kistler. "Localization with Non-individualized Virtual Acoustic Display Cues." In CHI '91 Proceedings, 351--359. Reading, MA: ACM Press/Addison-Wesley, 1991.

Virtual interface research is represented in the work of Wenzel et al. (1988, 1990, 1991), who have developed three-dimensional auditory cues delivered over user-worn headphones. The authors have found that even simple auditory cues--such as a sound signaling direction, distance, and, finally, contact with a virtual object--can aid the user in manipulating the virtual world.

Wenzel, E. M. "Localization in Virtual Acoustic Displays." Presence: Teleop. & Virtual Environ. 1 (1992): 80--107.

Wenzel provides an overview of the acoustical, psychoacoustical, and technological bases for the synthesis of spatial sound in virtual displays, with an emphasis on the work conducted at NASA-Ames Research Center.

Wenzel, E. M. "Spatial Sound and Sonification." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Wenzel provides a brief description of three-dimensional sound synthesis and describes the performance advantages that can be expected when these techniques are applied to sound streams in sonification displays. Specific examples, and the lessons learned from each, are discussed for applications in telerobotic control, aeronautical displays, and shuttle launch communications.

"What's That Noise." Home Mechanix (May 1986): 81--107.

This article includes descriptions of the sounds that can be useful in diagnosing problems with automobiles.

Wiener, F. M., and D. A. Ross. "The Pressure Distribution in the Auditory Canal in a Progressive Sound Field." J. Acoust. Soc. Am. 18 (1946): 401.

The authors took sound measurements using probe microphones. According to Blauert (1969), they were the first to measure the linear distortions caused by pinna, head, and ear canal.

Wightman, F. L., and D. J. Kistler. "Headphone Simulation of Free-Field Listening I: Stimulus Synthesis." J. Acoust. Soc. Am. 85 (1989): 858--867.

The authors describe a technique for measuring head-related transfer functions and synthesizing static virtual sound sources, which forms the basis of current approaches to spatial sound displays.
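The synthesis side of such a technique reduces to convolving a monaural signal with the left- and right-ear head-related impulse responses measured for the desired source direction. A minimal sketch; the impulse responses below are two-tap placeholders standing in for measured HRIRs, not real data:

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Render a static virtual source by convolving a mono signal with the
    left/right head-related impulse responses for one direction."""
    return np.convolve(mono, hrir_left), np.convolve(mono, hrir_right)

# Placeholder HRIRs: the right ear receives a slightly delayed, attenuated
# copy, crudely mimicking interaural time and level differences.
hrir_l = np.array([1.0, 0.3])
hrir_r = np.array([0.0, 0.6, 0.2])
left, right = spatialize(np.array([1.0, 0.0, -1.0]), hrir_l, hrir_r)
```

Real displays measure the HRIRs per listener and per direction; the convolution step itself is unchanged.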

Wildes, R., and W. Richards. "Recovering Material Properties from Sound." In Natural Computation, edited by W. Richards. Cambridge, MA: MIT Press, 1988.

Using analytical physics, the authors suggest that auditory identification of material involves judging damping and partial bandwidth, which together specify the internal friction that characterizes a material.

Williams, M. G., S. Smith, and G. Pecelli. "Experimentally Driven Visual Language Design: Texture Perception Experiments for Iconographic Displays." In Proceedings of the IEEE 1989 Visual Languages Workshop, held in Rome, Italy, 62--67. Rome: IEEE, 1989.

In this report the authors describe the only formal experiment conducted to date by the University of Massachusetts' Lowell group to evaluate their "iconographic" approach to both visualization and sonification. Like many similar experiments conducted during the 1980s, it showed that subjects' performance on a data analysis task improved modestly when the subjects used a combined visual-auditory data display rather than just a visual data display.

Williams, M. G., S. Smith, and G. Pecelli. "Computer-Human Interface Issues in the Design of an Intelligent Workstation for Scientific Visualization." SIGCHI Bull. 21 (4) (1990): 44--49.

The Exploratory Visualization project presents sonification of anatomic map data through an iconic technique. See annotations to S. Smith and G. Grinstein.

Williams, S. M. "STREAMER: A Prototype Tool for Computational Modelling of Auditory Grouping Effects." Research Report No: CS-89-31, Department of Computer Science, University of Sheffield, 1989.

Williams presents a simple gestalt-based computational model of auditory streaming, together with a proposed framework for auditory gestalt analysis.

Williams, S. M. "Perceptual Principles in Sound Grouping." In Auditory Display: Sonification, Audification, and Auditory Interfaces, edited by G. Kramer. Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII. Reading, MA: Addison Wesley, 1994.

Overview of auditory perception from a gestalt viewpoint, presenting examples of phenomena which may influence the interpretation of Auditory Displays.

Witten, M. "Increasing Our Understanding of Biological Models Through Visual and Sonic Representations: A Cortical Case Study." Intl. J. Supercomp. Appl. 6(3) (Fall 1992): 257--280.

Witten describes the use of integrated sonification and visualization in representation of digitized image data.

Yeung, E. S. "Pattern Recognition by Audio Representation of Multivariate Analytical Data." Anal. Chem. 52 (1980): 1120--1123.

Yeung describes an audible display for experimental data from analytical chemistry, designed around auditory parameters that scale continuously and are relatively independent of one another. The display presented data vectors, one per sample, each dimension of which corresponded to the detected level of a particular metal in that sample. The analysis task involved classifying a given vector as belonging to one of four sets, after training with vectors from those sets. Although Yeung did not compare subjects' performance on the auditory display against any other display, he noted that all of his subjects achieved a 98% correct classification rate after at most two training sessions.
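The mapping style described, one independent auditory parameter per data dimension, can be sketched as a linear rescaling of each measured level into its parameter's range. The metal names, value ranges, and parameter assignments below are hypothetical, chosen only for illustration:

```python
def vector_to_sound(sample, mapping):
    """Map one analytical-data vector to independent auditory parameters,
    one parameter per dimension, each scaled linearly into its own range.
    sample: dict of dimension -> measured level
    mapping: dimension -> ((lo_value, hi_value), (lo_param, hi_param))"""
    params = {}
    for dim, level in sample.items():
        (lo_v, hi_v), (lo_p, hi_p) = mapping[dim]
        frac = (level - lo_v) / (hi_v - lo_v)
        params[dim] = lo_p + frac * (hi_p - lo_p)
    return params

# Hypothetical assignment: zinc level -> pitch (Hz), iron level -> level (dB).
mapping = {"Zn": ((0.0, 10.0), (220.0, 880.0)),
           "Fe": ((0.0, 5.0), (40.0, 70.0))}
tone = vector_to_sound({"Zn": 5.0, "Fe": 2.5}, mapping)
# tone == {"Zn": 550.0, "Fe": 55.0}
```

Keeping the parameters perceptually independent (pitch vs. loudness vs. timbre) is what lets a listener attend to each dimension separately.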

Yost, W. A., and C. S. Watson, eds. Auditory Processing of Complex Sounds. Proceedings of a workshop in April, 1986. Hillsdale, NJ: Lawrence Erlbaum, 1987.

This book includes a wide range of psychoacoustic and physiological studies on the processing of complex sounds.

Zwislocki, J. "Temporal Summation of Loudness." J. Acoust. Soc. Am. 46 (1969): 413--441.

In this study, among several others, Zwislocki shows that the ear averages sound energy over periods up to about 200 milliseconds, so that, within this window, a tenfold increase in a tone's duration raises its effective level by 10 dB.
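The 10 dB figure follows directly from energy summation: if the ear integrates energy over the tone's duration t (below the ~200 ms limit), the effective level grows as 10 log10 of the duration ratio. As a quick check:

```python
import math

def level_gain_db(t_short, t_long):
    """Level gain from energy summation when a tone's duration grows from
    t_short to t_long seconds (both within the ~200 ms integration window)."""
    return 10 * math.log10(t_long / t_short)

gain = level_gain_db(0.02, 0.2)  # 20 ms -> 200 ms, a tenfold increase
# gain == 10.0 dB
```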