This media package is designed to encourage future research. We have deliberately made the transcripts and sound easy to download and analyse. It should be usable in a wide variety of text analysis software; one good set is the CLAN and CHAT coding system and analytic software available via

Over the past 20 years, Erbaugh's original Mandarin recordings and transcriptions have circulated 'underground'. Erbaugh frequently attends conferences and hears talks on the Pears in which the speaker is innocently but annoyingly unaware of their source. All we ask is that you cite Erbaugh's Chinese Pear Stories in any research or writing which uses this media package, developed with generous support from the Research Grants Committee of the Hong Kong Special Administrative Region.

Comparisons with other dialects are an obvious area for investigation.
We encourage our colleagues to use the Pears film and to add other languages to the collection. Many people ask why we do not include Beijing Mandarin or Meixian Hakka or the extremely rich tones of Fuzhou dialect. The reasons are purely practical. In the 1970s, the U.S. government did not permit U.S. citizens to do research in the People's Republic. Taiwan welcomed U.S. researchers with tape recorders. For the other dialects, the original Hong Kong grant covered data collection only in Hong Kong, so the Hakka speakers are all Hong Kong residents. The original grant proposed taping the other dialects among mainland émigrés to Hong Kong. Finally, we were permitted to vary the contract to allow collaboration with mainland colleagues.

Lateral comparisons across the dialects. Most previous work on dialect, from Yang Xiong to the present, matches phonemes or word lists. Most has also, of necessity, compared Mandarin to dialect X (but see Cheng 1996). The Pear Stories allow precise lateral discourse comparisons across dialects. They also reveal subtle dimensions for both phonological and grammatical distance and variation. For phonological comparisons, tone patterns and tone sandhi are especially well documented. The Cantonese sample, for example, documents loss of the high falling tone and replacement of initial /n/ by /l/ among young Hong Kong speakers.

Lexical variation is strongest in informal, connected speech. The degree of compounding is an especially sensitive indicator. Grammatical indicators, including word order, classifiers, time/aspect morphology, and mood particles, also help document dialect relations, particularly the complex and controversial relationships between Hakka and Gan, and among Hakka, Min, and Cantonese (Cheng 1996, Huang 1987, Norman 1988).

Intelligibility across dialects is a crucial but understudied issue. Most previous pioneering studies carefully probe single words. Discourse studies, weighted for redundancy, are also needed. The Pear Stories make it possible to vary systematically the type and amount of context provided, and the amount of speech required for varying degrees of comprehension. Listeners can also be varied systematically according to literacy, level of education, number of dialects known, and familiarity or non-familiarity with a 'bridge' dialect such as Mandarin or Cantonese. Foreign speakers of Mandarin can also be tested for intelligibility. Those whose mother tongue is a tone language will likely do much better.

Relations with non-Chinese substrata, including Tai and Malay, should be probed. The greater use of classifiers in Cantonese may, for example, reflect a Tai language substratum.

Comparisons of language distance with non-Chinese language families can also be pursued. Lexical, phonological, and grammatical distance within European families such as Germanic, Romance, or Slavic is well documented.
How great are such differences between Shanghai and Changsha, or Gan and Southern Min?
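One simple way such distances are sometimes approximated is the mean normalized edit distance over aligned word lists. The sketch below is only an illustration of that general technique, not a method used in this project. The function names are hypothetical, and the two word lists are invented placeholder strings, not real dialect data; a real study would use aligned phonemic transcriptions of the same concepts.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution or match
        prev = curr
    return prev[-1]


def mean_normalized_distance(words_a, words_b):
    """Average edit distance per word pair, each scaled by the longer word."""
    if len(words_a) != len(words_b):
        raise ValueError("word lists must be aligned item by item")
    if not words_a:
        return 0.0
    total = 0.0
    for wa, wb in zip(words_a, words_b):
        longest = max(len(wa), len(wb))
        total += (edit_distance(wa, wb) / longest) if longest else 0.0
    return total / len(words_a)


# Invented placeholder forms (assumption): NOT real transcriptions.
variety_x = ["pata", "kimo", "suli"]
variety_y = ["pata", "kimu", "sul"]

distance = mean_normalized_distance(variety_x, variety_y)
```

A value of 0 means the aligned forms are identical and 1 means they share nothing position by position. Character-level edit distance ignores tone and phonetic similarity, so it is at best a rough first pass.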

Psycholinguistic issues particular to Chinese include the psychological processing of tones. Tone neutralization, sandhi, and error corrections can all be monitored. Are dialects such as Southern Min, which have many tones with complex sandhi, more prone to tone error? What is the distribution among the tones, and what is the relation between contour and level tones? Other errors, pauses, self-corrections, and re-statements are also highly revealing of the basic units of mental processing.

Child language and narrative are also important. Seven-year-old Hong Kong children have described the Pears film. Though the task is rather difficult for them, they still use four times more specific classifiers than the Mandarin-speaking adults (Erbaugh 2001), demonstrating that higher classifier use does not indicate superior cognitive development, but simply a grammatical difference in Cantonese.
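Classifier comparisons of this kind could be quantified along the following lines. This is a minimal sketch under stated assumptions: the transcript is already segmented into word tokens, and every classifier token really functions as a classifier. The function name, the two classifier sets, and the toy token lists are hypothetical placeholders, not material drawn from the Pear Stories transcripts.

```python
from collections import Counter

# Hypothetical classifier inventories, for illustration only; a real study
# would use a vetted list for each dialect.
GENERAL_CLASSIFIERS = {"個"}
SPECIFIC_CLASSIFIERS = {"隻", "條", "架", "棵"}


def specific_classifier_share(tokens):
    """Return the fraction of classifier tokens that are specific classifiers.

    Returns 0.0 when the token list contains no classifiers at all.
    """
    counts = Counter(tokens)
    general = sum(counts[c] for c in GENERAL_CLASSIFIERS)
    specific = sum(counts[c] for c in SPECIFIC_CLASSIFIERS)
    total = general + specific
    return specific / total if total else 0.0


# Toy, invented token lists (assumption): NOT corpus data.
sample_a = ["一", "個", "人", "一", "架", "單車", "三", "個", "梨"]
sample_b = ["一", "個", "人", "一", "個", "車", "三", "個", "梨"]

share_a = specific_classifier_share(sample_a)
share_b = specific_classifier_share(sample_b)
```

Comparing the share across speaker groups, rather than raw counts, keeps longer narratives from dominating the result.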

Individual variation in speech rate, fluency, range of vocabulary, and types and frequency of error is an important but little understood area. Yet it must be probed across normal speakers before we can set norms for foreign students.

At the sociolinguistic level, most samples show considerable code mixing with Mandarin or Cantonese (and a little English). Code mixing is most obvious at the lexical level, but appears more subtly in pronunciation and grammar. Cultural issues, such as the evaluative comments and proverbs, have not been compared across the dialects. Language attitudes can also be surveyed using these neutral-content samples. Taiwan Mandarin, Cantonese, Southern Min, and Shanghai Wu are more likely to produce strong evaluative reactions, since more non-locals come into contact with them.

Future studies might also systematically compare male and female speakers, and speakers of different social backgrounds.

Foreign language student comparisons are also important. Some aspects of language are universal, carried over to even beginning students. Basic rhetorical organization and highlighting of main actors and events seem available to all adults. Word order seems more easily mastered by both infants and adults. Other areas remain difficult even for advanced speakers. Tones are notoriously difficult for speakers of non-tone languages. Verb complements and mood particles are also difficult for Europeans. Students of English typically have persistent difficulty with definite and indefinite articles. But teachers should not expect an elevated written style, or idealize native speakers. The choppy and repetitive English Pear Stories make possible more nuanced and realistic comparisons.

Written versus Spoken Language is a vital distinction which is just now being explored systematically. Linguists have documented that speakers of unwritten languages are just as gifted verbally as those whose languages have long written traditions. The issues for Chinese are even more complex because of 2,000 years of a unified written standard. Mandarin is much closer to written style. It is possible, however, to measure systematically the distance from the written standard to the main Chinese dialects.

Issues of dialect writing are so complex that we did not attempt transcription in IPA. However, the errors in writing characters systematically reveal the writer's underlying sound system and rules for generalization. Asking native speakers to attempt dialect writing in characters (now very popular in the Hong Kong press) can also reveal many regularities in the underlying phonetic analysis.

FUTURE CLINICAL APPLICATIONS for testing and treatment of language disorders in language-delayed children, dyslexic readers, and disabled speakers such as stroke or dementia victims require a baseline of normal speech. The need is especially urgent for baseline materials for dialect speakers. The Pear Stories are a first step toward this goal.