The two-sides rule in teaching listening and pronunciation

by Richard Cauldwell
3.5 The Pronunciation layer

This section leads on to the second layer in the materials, Pronunciation, which, like its Listening predecessor, consists of two sections. At the core of the first Pronunciation section (section 4 in the chapter) is a table presenting a set of either vowels or consonants, together with a pair of speech-units, extracted from the original recording, in which the target vowel or consonant is spoken in a prominent syllable (see Figure 3).

Figure 3 Table of short vowels and speech unit models

Note: the symbols in the left-hand column are those for the short vowels of English; the central column contains the sample speech units from the original recording, with the target sound shown in the syllable in bold upper-case letters; the right-hand column shows the speed of the speech unit in words per minute. Syllables in upper case are prominent syllables.

The student’s task is to listen to the original (by clicking on the relevant speech unit), practise speaking it, record their version, and compare their version to the original. They are then asked to evaluate their own performance. In piloting the materials, students had difficulty getting up to speed, and so each chapter now has specific guidance on how to progress from citation-form speech to the speed of the original recording.

The second section of the Pronunciation layer (section 5 in the chapter) presents the student with an extended extract of speech for listening, imitation, recording and comparing. In such longer extracts, students practise the variability of speech, which is more easily seen over a number of speech units (see Figure 4).

Figure 4 An extended extract

The first eight chapters follow the same pattern, covering between them all the vowels and consonants of English.

4 Chapter 9 – a choice of speaker to model pronunciation

Chapter 9 ‘Segments Workshop’ allows students to choose one of six speakers (three female, three male) to act as models for the full inventory of vowels and consonants. (See Figure 5.)

Figure 5

So if you like Corony’s voice, you can use her voice as a model for pronunciation; if you prefer Philip’s voice, you can use him as a model for all the vowels and consonants of English.

5 Chapter 10 – the patterns of normal speech

Chapter 10 is designed primarily for teachers: it provides, in one place, intensive training in recognising the patterns of normal speech which were identified and taught, in a pedagogically staged manner, in the ‘filling’ (Discourse Feature) sections of each of Chapters 1-8.

6 The target audience

Streaming Speech is for those who aspire to be advanced users of the spoken language: those who want to handle fast speech both in listening and in their own vocal production. Streaming Speech is appropriate for three groups of people: those studying for high-level English examinations, those preparing to study in an English-speaking country, and non-native speaker teachers of English, either in training, or already at work.

7 Conclusion

The exciting thing about developing Streaming Speech has been the extent to which some of the tenets of phonology, as they are taught on TEFL courses, are challenged by the evidence of spontaneous speech. Normal speech is stream-like and, within its limits, infinitely variable, and appears not to obey many of the rules suggested by textbooks. Speed varies moment by moment; rhythmic patterns vary moment by moment (instances of ‘stress-timing’ are extremely rare); and most yes/no questions have falling tone, and are not rude.

Using spontaneous speech also reveals native speaker strategies and short-cuts that make the task of speaking easier, both in the sense of strategic planning and in the degree of accuracy to be aimed at. An example of a strategy: native speakers use a wide variety of pause phenomena, including dwelling or pausing on content words (with level tone), to give themselves time to plan what to say next. This is a strategy worth teaching. As far as accuracy is concerned, consonant clusters are often simplified to simple consonants by native speakers, particularly when they occur early in a speech-unit. We need to allow our students to avail themselves of the short-cuts that native speakers take. For too long our segment-focused, citation-form-focused work on the spoken language has been based on views about how speech ‘ought to be’, rather than ‘how it is’. We can improve our students’ ability to handle fast speech, both in fluent production and in understanding the stream of speech, if we attend to the evidence of normal speech and the implications of the two-sides rule.


Born in Dublin, educated in England, Richard has taught English in France, Hong Kong, and Japan. Between May 1990 and September 2001, he worked at The University of Birmingham's (UK) English for International Students Unit (EISU). He now works freelance, continuing his research and applying the results of this research in teacher training and classroom materials. His research and teaching centre on spontaneous speech, which he attempts to analyse on its own terms, in all its continually varying, stream-like, real-time, contextual glory.

Richard's web site, which contains his research and articles, can be found at:

And he can be contacted at:

