What Americans Need to Know About Chinese “Tones”

I always had trouble with the spoken language when I first started learning Mandarin. A friend in my dorm, Ann, gave me the most exasperated looks when she offered to help, then found herself helplessly lost in an endless loop with me.

Her: “Now, repeat after me. Bu.”

Me: “BOO.”

Her: “Bu.”

Me: “BOOooU?”

Her: “Bu!”

Me: “Hmm.”

Obviously, something wasn’t translating. In a normal Chinese class these days, you’ll get taught that there are four or five tones that you can use when pronouncing a syllable. When you get taught, your teacher will most likely speak verrry slowwwly, and overemphasize his or her pronunciation, with sharp changes in pitch.

When I was a student, my professor, Zhuang lao-shi, taught us through this method. We repeated after her: “BOO”, “booOOO”, “BooOO”, “BOoo”, singing awkwardly and uncomfortably through the lesson. To Americans, the word “tone” makes us think of tonal pitch, and the do-re-mi-fa-so-la-ti-do training we endured as children – so, when I was hearing this stuff, all I could tell was that each tone was supposed to change in pitch. First tone – high pitch. Second tone – low to high. Third tone – kinda starting in the middle, getting low, then going back up. Fourth tone – starts high, then ends low. Fifth tone – well, let’s just say that fifth tone was inscrutable, as it’s supposed to be “toneless.” But, if you’re anything like me, you might have wondered how anything could lack a pitch, the same way a young chemistry student might wonder how a solid could have a pH acidity reading if you can’t dip your pH paper into it.

So, in spite of my confusion, I continued onwards, speaking slowly and in a way that caused Chinese people to ask why I didn’t speak with my real voice. I got through three years of Chinese training in college, but unfortunately Chinese departments are so deathly afraid of losing all their students that they’ll give you A’s if you can just read and write. I sounded nasal and weird, wasn’t sure precisely what I was doing wrong, and left with the misconception that Chinese people have some unnatural ear for pitch that Americans don’t.

Then, years later, in the middle of a board game in which I was fumbling through my Chinese, I heard someone pronounce a syllable in an interesting way. “HuuuUUUUU,” went the word. It’s what you say when you switch out a piece in Mah Jong, and in this context it was spoken as a monotone that increased slowly and confidently in intensity, ending with an emphasis. It sounded like a parent giving a stern warning and leaning on the second syllable of a name to leave an ominous note at the end, i.e. “gordoNNNN… don’t you dare touch that cookie!”

It gave me pause, because the slowness of the pronunciation made it obvious to me that it was the fabled Second Tone… but the monotone threw me for a loop. If the pitch could be held constant while the emphasis changed within a single syllable, was that what I had been missing out on all those years in class?

Could it be that the word “tone” just planted an immediate, lasting, and fundamentally incorrect assumption about spoken Chinese in my mind, one that led me to block out other variations in the pronunciation of words, such as emphasis and intensity?

Experience implied that an inconsistency so striking was worth a look. So, for several months, I attempted to change the way I listened to Chinese, listening for emphasis instead of pitch. Slowly, the importance of emphasis began to unfold before me, and I realized that steady strong emphasis was the telltale sign of first tone. An emphasis that slowly built like a crescendo was a sign of second. Fourth tone, with its sharp initial emphasis and quick drop-off, became a completely different beast to me – pronounced in this way, it sounds very harsh to Western ears, and can make the speaker sound agitated or angry in excited contexts. Third was like fourth, with a quick return to emphasis that hangs in the air, more enunciated the slower the word is pronounced. Fifth tone finally made sense – the lack of any emphasis at all, or a “soft” word.

Word after word, phrase after phrase, I started hearing things in a completely different light.

“BAA-ba”, pronounced with a constant strong emphasis in the first syllable, with a soft or no-emphasis second word following, is the way you say “father,” so you can imagine the difference between that, and nasally pronouncing the first syllable with a high pitch and then searching for an inexplicable no-pitch sound in the second.

Changing the way I understood tones made it much easier to listen to the flow of spoken Chinese. So much of real-world Chinese is relatively pitch-less but emphasized very dramatically, that it can be completely overwhelming and foreign to Americans who mistook tones for being only pitches.

In reality, much of American spoken English contains within it embedded meaning based on the emphasis within words and within sentences. I doubt that this is commonly taught to foreign speakers of English, but it’s there nonetheless. Americans definitely can understand how the emphasis put on a word can change its meaning significantly. Not only that, but there are far more than five tones in English (how do you classify “Shiiieeeet”?)! So why don’t Americans pick up on this more quickly? I have a theory.

The first reason is that horrible translation of “Si Sheng”, “Four Tones”. Tone connotes pitch too strongly in English, and I fear that it sends people down the wrong cognitive path, as it did for me. Secondly, Chinese are so certain that Americans just can’t speak Chinese, that they teach it by speaking extremely slowly, carefully enunciating with wide variations in pitch. When we attempt to mimic that, I believe that our brains pick up on the pitch changes first, and then as the words speed up, the teachers move to naturally using emphasis while the students are left stumbling over pitch changes that are completely foreign to Americans. Many completely give up, and just end up speaking all words as emphasis-less monotone, expecting all native Chinese speakers to be as patient and encouraging as their poor teachers.

The third one is a bit more confrontational to discuss. If you take a look at the Wikipedia entry on pinyin, the generally-accepted Romanization of Chinese words, you’ll see that it explicitly states that tones are changes in pitch. The graph you see on the right-hand side is pretty much the same thing that was given to me in handout form when I started learning. It’s the establishment way of teaching people, but it’s not enough. Sure, if you tune into a government official speaking to the public, they will sound almost musical in nature, shifting pitch in a slow, deliberate manner, and in this way, they express the official-ness of their words. But, if you pay attention, you’ll notice what’s present in every bit of conversational Mandarin – the sharp contrasts of emphasis that stand out with every syllable.

How would you fix it, and make it easier for Americans to learn Chinese? I’d definitely just abandon the concept of tone-as-pitch. Just throw it overboard. It’s too confusing for Americans to focus on pitch, and I think we’d learn fewer bad habits if we were given some mental symbolism that didn’t send us down the wrong path. Perhaps emphasis is the right word, perhaps not. At the least, contrasting the pronunciation of a long, drawn-out second tone with the quick, spitting emphasis of fourth tone could go a long way towards demonstrating the difference. Having both male and female teachers during lessons about spoken Chinese might also help us get rid of our habit of attempting to mimic pitch only. In addition, learning several common Chinese sayings and then breaking them down might help. Once I learned how to pronounce “Gong Xi Fa Cai” (“Happy Chinese New Year”) perfectly from a friend, it was easy – but if I’d started by reading the pinyin out loud, I’d have sounded quite ridiculous.

Nowadays, I’ve improved slightly in pronunciation and a great deal in comprehension, but unfortunately am past the days when I have a ton of time to practice. In reality, there’s pitch, emphasis, and more going on inside the rhythm and meter of everyday Mandarin, and even knowing what I know now, it’s still a difficult mountain to climb. If I could have had this revelation earlier, I think it could have helped out, but it’s my hope that challenging the status quo on this one might do some good even if I’m hopeless. 🙂

10 thoughts on “What Americans Need to Know About Chinese “Tones””

  1. Wouldn’t it just be easier to teach everyone in China English?

    And by “easier,” I mean “easier for me.” As usual.

  2. Pinyin is a way to represent Chinese characters and express the sounds in the Chinese language using the alphabet. There are other systems to express Mandarin, but Pinyin is the most accepted and widely used. Once you learn Chinese Pinyin you will know how to pronounce any word in Mandarin using a Chinese dictionary. Pinyin is also the most common way to input Chinese characters into a computer. Although Pinyin and English both use the Roman alphabet, many letters are not expressed with the same sounds that English uses.

  3. @Chinesetomorrow:

    Yes, I know Pinyin, as it’s the system that was taught to me. Pinyin is a reasonable Romanization of Chinese pronunciation, but I am saying that the way the pronunciation is taught in the Pinyin-based schools is not enough to teach English speakers how to speak naturally. The article I linked to on Wikipedia is actually the article for Pinyin, and it tries to explain tones as pitch variations, the same way it is taught here. If you read my post carefully, this is what I am objecting to.

  4. getluky, I think most of us are not quite clear on, and thus find it hard to differentiate, the definitions of tone, pitch, notes, octaves, and frequencies in what we hear. So, I used a program called TS-AudioToMIDI to test the four tones by saying ma1, ma2, ma3, ma4 into the application through a microphone (used monosensor instead of polysensor mode). As I pronounced these four tones of Chinese characters, I noticed that the tones, as shown on the graphical keyboard, have different notes. In the music context, different notes mean different frequencies and thus different pitches; and of course they also have different intensities, stress, or emphases, as you like to put it. So, the four tones differ in notes and pitches as well as in emphasis, intensity, loudness, and stress. I certainly do not agree that they all have the same note with only changes in the emphasis or stress.

    I have put some effort into understanding what “tone” (using the four Chinese tones as reference) an English syllable has, and I discovered that it has only two tones or notes: the 3rd and 4th tone. Try these two words for yourself: “Begin” and “Enter”. “Begin” has two syllables: the 1st syllable, “Be”, has a 3rd tone and the 2nd syllable, “gin”, has the 4th tone. The word “Enter” also has two syllables, but with the 4th tone followed by the 3rd tone. If you reverse the tones of the two syllables of each word, you will notice that they sound funny!

  5. I’d also like to add that I completely agree with your hope to abandon the concept of pitches. As someone who considers both English and Chinese (Mandarin and Shanghainese) to be my native tongues, I grew up hearing Chinese, but never took any formal or academic classes. As a result, the Chinese “tones” are very second-nature to me. However, if you were to utter a word and ask me to assign it with a tone, I’d be clueless. I hope there will evolve a Chinese speech-training method that builds cognitive INSTINCT, which is what I’ve always relied on.

  6. Thanks for the comments, everyone. @Tony, doing an actual tonal analysis is interesting, but I am concerned that you separated and enunciated the tones very strongly as a test, rather than analyzing common colloquial speech, which tends to emphasize tones less and other aspects more. I don’t mean to say that there are no tone variations, but rather, that there is more to the Chinese tones than just pitch. This is in direct opposition to how Chinese is taught to Americans.

    @Wendy: yes, Greg is just kidding. He’s a friend that just stops by to leave snarky comments. 🙂 I think that the academic methods do need to develop students’ instincts better by incorporating repetition of complex phrases without the emphasis being just on pitch.

  7. Nice article (though I’m English and still benefited despite not being an American as your title part advises I should be)

  8. i feel you’re missing something in your new understanding of tones. commenter Tony and Wikipedia are right: Mandarin syllables are distinguished by clear differences in pitch (with pitch variations even more profound in Cantonese!). while it’s true that fast speech or informal speech often smooths out hills and valleys somewhat, as a musician & Mandarin student the tones really stand out to me.

    respectfully, it sounds like you may be slightly tone-deaf. reading your blog article, your discovery that you should concentrate on emphasis more than pitches makes me think you’ve discovered a cue you can use to infer the pitch of spoken Mandarin.

  9. Chris, I respect your opinion and thank you for your comment, but I believe you are basically conceding my point. Fast and informal speech is the norm for spoken Chinese, not the slow, measured pace of television newscasters or government speeches. I’m definitely someone who has trouble with pitch distinctions, and I don’t think I’m alone among Americans. It doesn’t mean I’m tone-deaf, it just means that the academic decision to teach ONLY tones has left a lot of people like me without much to go on. My whole point is that there is an extremely helpful way to teach listening comprehension which is either being ignored or just remains unknown by the academic establishment, and I’d have been much better off if someone had just mentioned it to me from the beginning.
