Scaletti provides a working definition of sonification. Models of data
as continuous streams and as discrete events are illustrated using the
author's sound specification language Kyma. Several sound synthesis algorithms
are outlined, each with an example application, and there is a summary
of which synthesis algorithms are best applied to which kinds of data.
The paper concludes with an enumeration of some of the open questions and
future research directions in the field of auditory display.
Scaletti, C., and A. B. Craig. "Using Sound to Extract Meaning from
Complex Data." In Extracting Meaning from Complex Data: Processing,
Display, Interaction, edited by E. J. Farrel, Vol. 1259, 147--153.
Because sound is an inherently time-variable phenomenon, Scaletti and Craig
concentrated their ADR work on the representation of time-varying data
mapped to animated graphics and sound. Examples discussed include sonifications
of forest fire data, Los Angeles pollution levels, and swinging pendula.
Schafer, R. M. The Tuning of the World. New York: Knopf, 1977.
Schafer describes studies of the acoustic environment undertaken in the
World Soundscape Project including historical changes in the acoustic environment,
cross-cultural studies of listening preferences and sound interpretation,
and studies of references to sound in literature.
Scharf, B., and S. Buus. "Audition I: Stimulus, Physiology, Thresholds."
In Handbook of Perception and Human Performance, edited by K. R.
Boff, L. Kaufman, and J. P. Thomas, Vol. 1, 14.1--14.71. New York: Wiley,
Standard reference to have on your bookshelf.
Scharf, B., and A. J. M. Houtsma. "Audition II: Pitch, Localization,
Aural Distortion, Pathology." In Handbook of Perception and Human Performance,
edited by K. R. Boff, L. Kaufman, and J. P. Thomas, Chap. 26. New York:
This book covers psychophysical performance in detection and discrimination
of intensity and frequency, sound localization, and perception of loudness.
Scherer, Klaus R., and James S. Oshinsky. "Cue Utilization in Emotion
Attribution from Auditory Stimuli." Motivation & Emotion 1(4)
The authors describe a study using a Moog synthesizer in which seven two-level
factors (amplitude, pitch level, pitch contour, pitch variability, tempo,
envelope, and filtration), along with other more complicated stimuli, were
systematically manipulated and then rated for emotional impact. Inter-judge
agreement (among naive student raters) was generally good, with some emotions
having more reliable cues than others, as might be expected.
Schmandt, C., B. Arons, and C. Simmons. "Voice Interaction in an Integrated
Office and Telecommunications Environment." In Proceedings of 1985 Conference.
American Voice I/O Society, 1985.
The Conversational Desktop is a conversational office assistant that manages
personal communications (phone calls, voice mail messages, scheduling,
reminders, etc.). The system engages the user in a conversation to resolve
ambiguous speech recognition input.
Schmandt, C., and B. Arons. Conversational Desktop (videotape).
ACM SIGGRAPH Video Rev. 27 (1987).
A four-minute videotape demonstrating many features of the Conversational Desktop.
Schmandt, C., and B. Arons. "Getting the Word." UNIX Rev. 7
(Oct. 1989): 54--62.
An overview of "Desktop Audio" including the systems and interface requirements
for the use of speech and audio in the personal workstation. It includes
a summary of the VOX Audio Server, a system for managing and controlling
the audio resources in a networked personal workstation.
Schroeder, M. R. "Digital Simulation of Sound Transmission in Reverberant
Spaces." J. Acous. Soc. Am. 47 (1970): 424--431.
One of the core papers discussing techniques for the simulation of reverberant spaces.
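The recirculating comb filter is one of the basic building blocks in this line of work. A minimal sketch of the idea (the delay and gain values are illustrative choices, not parameters from the paper):

```python
# A recirculating comb filter of the kind used in Schroeder-style digital
# reverberators: each output sample is the input plus a delayed, attenuated
# copy of earlier output, so an impulse returns as a decaying echo train.
def comb_filter(signal, delay=1051, gain=0.7):
    out = list(signal)
    for n in range(delay, len(out)):
        out[n] += gain * out[n - delay]
    return out

# Feeding in a unit impulse yields echoes spaced `delay` samples apart,
# each attenuated by `gain` relative to the previous one.
impulse = [1.0] + [0.0] * 5000
echoes = comb_filter(impulse)
```

Practical reverberators combine several such comb filters in parallel with allpass stages in series to thicken the echo density.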
Sloboda, J. A. "Music Structure and Emotional Response: Some Empirical
Findings." Psych. Music 19 (1991): 110--120.
The author presents analysis of experimental results of emotive response
to music extracts related to the musical structure of the compositions.
Smith, R. B. "A Prototype Futuristic Technology for Distance Education."
In Proceedings of the NATO Advanced Workshop on New Directions in Educational
Technology, held November 10--13, 1988, in Cranfield, UK.
Smith describes SharedARK, a collaborative system that was the basis of
SoundShark (Gaver & Smith, 1990) and ARKola (Gaver et al., 1991).
Smith, S. "An Auditory Display for Exploratory Visualization of Multidimensional
Data." In Workstations for Experiment, edited by G. Grinstein and
J. Encarnacao. Berlin: Springer-Verlag, 1991.
Although it was published in 1991, this is actually the earliest paper
on the University of Massachusetts Lowell work in sonification. It
shows what the authors' thinking was as they embarked on their investigations
in 1988, and is now mostly of historical interest.
Smith, S., R. D. Bergeron, and G. Grinstein. "Stereophonic and Surface
Sound Generation for Exploratory Data Analysis." In Multimedia and Multimodal
Interface Design, edited by M. Blattner and R. Dannenberg. Reading,
MA: ACM Press/Addison-Wesley, 1992.
Smith, S., R. D. Bergeron, and G. Grinstein. "Stereophonic and Surface
Sound Generation for Exploratory Data Analysis." In Proceedings of CHI
'90, held 1990, in Seattle, WA. ACM Press, 1990.
This paper, published in two places, describes the authors' attempt to
introduce spatial aspects of sound into sonification. This direction was
not pursued further.
Smith, S., G. Grinstein, and R. M. Pickett. "Global Geometric, Sound,
and Color Controls for the Visualization of Scientific Data." In Proceedings
of the SPIE/SPSE Conference on Electronic Imaging, Vol. 1459, 192--206.
San Jose, CA: SPIE, 1991.
The authors argue that users should be able to fine-tune visual and auditory
data displays to achieve the optimal presentation of their data. They give
examples of how this can be done with the "iconographic" display techniques
they developed. The accompanying video gives one brief sound example.
Smith, S., R. M. Pickett, and M. G. Williams. "Environments for Exploring
Auditory Representations of Multidimensional Data." In Auditory Display:
Sonification, Audification, and Auditory Interfaces, edited by G. Kramer.
Santa Fe Institute Studies in the Sciences of Complexity, Proc. Vol. XVIII.
Reading, MA: Addison Wesley, 1994.
The authors outline a starting approach to sonification and argue for psychometric
testing as part of the sonification design process.
Sorkin, R. D. "Design of Auditory and Tactile Displays." In Handbook
of Human Factors, edited by G. Salvendy, 549--576. New York: Wiley
& Sons, 1987.
In this chapter Sorkin addresses factors that must be considered in establishing
the level, pitch, duration, shape, and temporal pattern of a sound. In
addition, he covers the design of binaural sounds and complex coding for auditory displays.
Sorkin, R. D., F. L. Wightman, D. S. Kistler, and G. C. Elvers. "An
Exploratory Study on the Use of Movement-Correlated Cues in an Auditory
Head-Up Display." Human Factors 31 (1989): 161--166.
A sequence of three signals incorporating HRTF cues for auditory localization
was played to subjects over headphones, and subjects had to indicate the
location of the source via keypress on a computer. This study focused on
the importance of head movement in localization, and there were three conditions
presented: (1) source fixed in physical space, head movement allowed; (2)
no head movement allowed; and (3) source fixed in position relative to
the subject's head. Azimuthal localization was found to be considerably
better in the first case (source fixed in physical space/head movement
allowed), demonstrating the importance to auditory localization of correlating
cues to self-initiated movements of the listener's head.
Sorkin, R. D., D. E. Robinson, and B. G. Berg. "A Detection Theory Method
for Evaluating Visual and Auditory Displays." In Proceedings of the
Human Factors Society, Vol. 2, 1184--1188, 1987.
This paper describes a signal detection method for evaluating different
display codes and formats. The method can be used to assess the relative
importance of different elements of the display. The paper briefly summarizes
data from different types of auditory and visual displays.
Sorkin, R. D., and D. D. Woods. "Systems with Human Monitors: A Signal
Detection Analysis." Hum.-Comp. Inter. 1 (1985): 49--75.
This paper analyzes the general system composed of a human operator plus
an automated alarm subsystem. The combined human-machine system is modeled
as a two-stage detection system in which the operator and alarm subsystem
monitor partially correlated noisy channels. System performance is shown
to be highly sensitive to the decision bias (response criterion) of the
alarm. The customary practice of using a "liberal" bias setting for the
alarm (yielding a moderately high false alarm rate) is shown to produce
poor overall system performance.
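The sensitivity to decision bias can be illustrated with a toy equal-variance signal-detection calculation; the d' and criterion values below are illustrative assumptions, not figures from the paper:

```python
# Hit and false-alarm probabilities for an equal-variance Gaussian
# signal-detection model: a "liberal" (low) criterion buys extra hits
# at the cost of a much higher false-alarm rate.
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def rates(d_prime, criterion):
    """Hit and false-alarm probabilities for given sensitivity and bias."""
    hit = 1.0 - norm_cdf(criterion - d_prime)
    false_alarm = 1.0 - norm_cdf(criterion)
    return hit, false_alarm

liberal = rates(1.5, 0.25)        # low criterion: many hits, many false alarms
conservative = rates(1.5, 1.25)   # higher criterion: fewer of both
```

In Sorkin and Woods' two-stage setting, the false alarms generated by a liberally biased alarm degrade the human operator's own decision performance downstream.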
Sorkin, R. D., B. H. Kantowitz, and S. C. Kantowitz. "Likelihood Alarm
Displays." Human Factors 30 (1988): 445--459.
This study describes a type of multilevel or graded alarm display in which
the likelihood of the alarmed condition is encoded within the display.
For example, the levels of an auditory alarm could vary by repetition rate
or voice quality; and the levels of a visual display could vary by color.
Several dual-task (tracking and alarm monitoring) experiments demonstrate
the feasibility of Likelihood Alarm Displays.
Sorkin, R. D. "Why are People Turning Off Our Alarms?" J. Acous.
Soc. Am. 84 (1988): 1107--1108. Reprinted in Human Factors
Soc. Bull. 32 (1989): 3--4.
In this short paper Sorkin describes several tragic accidents in which
auditory alarms had been disabled or ignored. The author argues that two
culprits are high false-alarm rates and excessive sound levels.
Sorkin, R. D. "Perception of Temporal Patterns Defined by Tonal Sequences."
J. Acous. Soc. Am. 87 (1990): 1695--1701.
In this study Sorkin describes a general model (the temporal correlation
model) for predicting a listener's ability to discriminate between two
auditory tone sequences that differ in their temporal pattern. According
to the model, the listener abstracts the relative times of occurrence of
the tones in each pattern and then computes the correlation between the
two lists of relative times.
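The correlation computation the model describes can be sketched as follows; the helper function and example onset times are illustrative, not taken from the paper:

```python
# Pearson correlation between the relative onset times of two equal-length
# tone sequences, in the spirit of the temporal correlation model: each
# pattern is reduced to the times of occurrence of its tones relative to
# the sequence start, and the two lists are then correlated.
from math import sqrt

def temporal_correlation(times_a, times_b):
    rel_a = [t - times_a[0] for t in times_a]
    rel_b = [t - times_b[0] for t in times_b]
    mean_a = sum(rel_a) / len(rel_a)
    mean_b = sum(rel_b) / len(rel_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(rel_a, rel_b))
    var_a = sum((a - mean_a) ** 2 for a in rel_a)
    var_b = sum((b - mean_b) ** 2 for b in rel_b)
    return cov / sqrt(var_a * var_b)

# Identical rhythms at different absolute start times correlate perfectly;
# displacing one tone lowers the correlation, predicting discriminability.
same = temporal_correlation([0, 100, 200, 300], [50, 150, 250, 350])
shifted = temporal_correlation([0, 100, 200, 300], [0, 50, 250, 300])
```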
Speeth, S. D. "Seismometer Sounds." J. Acous. Soc. Am. 33
The author audified seismic data (sped up the playback of data recorded
by seismometers to place the resultant frequencies in the audible range),
and then set human subjects to the task of determining whether the stimulus
was a bomb blast or an earthquake (after an appropriate training program).
In this experiment, subjects were able to correctly classify seismic records
as either bomb blasts or earthquakes on over 90% of the trials. Furthermore,
because of the time compression required to bring the seismic signals into
the audible range, an analyst could review 24 hours' worth of data in only a small fraction of that time.
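The time-compression idea can be sketched in code; the sample rates, speed-up factor, and synthetic 1 Hz signal below are illustrative assumptions, not values from Speeth's study:

```python
# Audification by accelerated playback: low-frequency data is written out
# at a much higher sample rate, multiplying every frequency in the record
# by the speed-up factor and shifting it into the audible range.
import io
import math
import struct
import wave

DATA_RATE = 100                   # seismometer samples per second (assumed)
SPEEDUP = 441                     # playback speed-up factor (illustrative)
PLAY_RATE = DATA_RATE * SPEEDUP   # data replayed at 44100 Hz

# Stand-in "seismic" record: a 1 Hz oscillation, far below audibility at
# its true rate, which the faster playback shifts up to 441 Hz.
samples = [math.sin(2 * math.pi * 1.0 * n / DATA_RATE)
           for n in range(10 * DATA_RATE)]

buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)             # 16-bit PCM
    w.setframerate(PLAY_RATE)     # declaring the high rate is the audification
    w.writeframes(b"".join(struct.pack("<h", int(32767 * s))
                           for s in samples))
```

Here ten seconds of data plays back in well under a second, which is the same compression that let Speeth's analysts audit long seismic records quickly.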
Stifelman, L. J., B. Arons, C. Schmandt, and E. A. Hulteen. "VoiceNotes:
A Speech Interface for a Hand-Held Voice Notetaker." In Proceedings
of INTERCHI '93, 179--186. Reading, MA: ACM Press/Addison-Wesley, 1993.
VoiceNotes is an application for a voice-controlled hand-held computer
that allows the creation, management, and retrieval of user-authored "voice
notes"--small segments of digitized speech containing thoughts, ideas,
reminders, or things to do. VoiceNotes explores the problem of capturing
and retrieving spontaneous ideas, the use of speech as data, and the use
of speech input and output in the user interface for a hand-held computer
without a visual display.
Stratton, V. N., and A. H. Zalanowski. "The Effects of Music and Cognition
on Mood." Psych. Music 19 (1991): 121--127.
The authors present evidence that although the expected responses to pieces
of music, selected for their affect-inducing properties, apparently influenced
the mood state of subjects performing a concurrent cognitive task (storytelling
about a picture), the effect disappeared when specific mood instructions
were given along with the storytelling instructions. There was also evidence
that familiarity with the music and subjects' individual preferences for
the selected music affected the extent to which their mood was influenced
by it.
Strothotte, T., K. Fellbaum, K. Crispien, M. Krause, and M. Kurze. "Multimedia
Interfaces for Blind Computer Users." In Rehabilitation Technology--Proceedings
of the 1st TIDE Congress, held April 6--7, 1993, in Brussels. ISSN:
0926-9630. IOS Press, 1993.
This paper deals with selected aspects of blind people's access to GUI
computer systems, which are addressed by the GUIB project (Textual and Graphical
User Interfaces for Blind people). A new loudspeaker-based device for two-dimensional
sound output to enable users to locate the position of screen objects is
described. In a prototypical application, blind people are given access
to a class of computer-generated graphics using the new device in an interactive
process of exploration.
Strybel, T. Z., A. M. Witty, and D. R. Perrott. "Auditory Apparent Motion
in the Free Field: The Effects of Stimulus Duration and Intensity." Percep.
& Psycho. 52(2) (1992): 139--143.
The authors find that a minimum duration of 10--50 msec is required for
the perception of auditory apparent motion, with the exact time varying
from listener to listener.
Stuart, R. "Virtual Auditory Worlds: An Overview." In VR Becomes
a Business: Proceedings of Virtual Reality '92, held September 1992,
in San Jose, CA, 144--166. Westport, CT: Meckler, 1992.
An overview of issues concerning virtual auditory environments and of applications
that have been proposed or on which work is proceeding.
Sumikawa, D. A., M. M. Blattner, K. I. Joy, and R. M. Greenberg. "Guidelines
for the Syntactic Design of Audio Cues in Computer Interfaces." In Nineteenth
Annual Hawaii International Conference on System Sciences. Los Alamitos,
CA: IEEE Computer Society Press, 1986.
The material for this article is drawn from an M.S. thesis with the
same name by Denise A. Sumikawa, University of California, Davis, also
published as Lawrence Livermore National Laboratory Technical Report, UCRL-53656,
June 1985. The material was later extended and became: "Earcons and Icons:
Their Structure and Common Design Principles."