Use a Language Corpus to Improve Your Vocabulary
By VOA
22 July 2023

A language corpus is a huge collection of texts taken from print, video, audio and other sources. It is a powerful tool that can help you learn more about English words and phrases.

In today's article, we look at the most widely used corpus: the Corpus of Contemporary American English, or COCA.

COCA contains more than one billion words from 1990 to 2019. And it has many tools to help learners find out more about how words are used.

Screenshot of Corpus of Contemporary American English, captured on July 20, 2023.

Take the adjective profound as an example. In a dictionary, you will find that one meaning of profound is deep. So, you might think English speakers would say that a body of water is profound.

But English speakers never say that.

A corpus like COCA shows that the most common words that follow profound are effect or ability. And the most common words that follow the adjective deep are breath or water. For example, we will say:

Her teaching has a profound effect on students' learning.

You need to be a strong swimmer to swim in deep water.

COCA makes it easy to read and hear examples of how to use vocabulary correctly.

How to use COCA

First, you will need to register with COCA to use the collection. You can register for two kinds of accounts. A free account limits you to 20 searches for words within any 24-hour period. A paid account will give you 200 searches within any 24-hour period.

Here are the steps for using COCA to find examples of how a word is used. We will use the word profound to show this.

COCA also has a tool that lets you see many useful things about a word on one page. We will show this with the word deep.

You will see that in these examples, the noun breath occurs the most after the adjective deep.

You may wonder, how can we know for sure that the most common words that follow deep are breath or water?

Scroll back to the top of the "Word homepage" for deep. On the right, you will see the word COLLOCATES in blue. (A collocation is a word or phrase that is often used with another word or phrase.) Clicking on COLLOCATES will show you lists of words that follow deep. And you will see that breath and water are the first two words on the list.
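If you know a little programming, you can see the idea behind a collocates list with a short sketch. The Python code below counts the words that come right after deep in a few sample sentences. The sentences are made up for this example and are not taken from COCA. The function name following_words is also our own. COCA's real tools search far more text and do much more, so treat this only as a simple picture of the idea.

```python
from collections import Counter
import re


def following_words(text, target):
    """Count the words that come immediately after `target` in `text`.

    Hypothetical helper for illustration; this is not how COCA works inside.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    # Look at each pair of neighboring words.
    for current, nxt in zip(words, words[1:]):
        if current == target:
            counts[nxt] += 1
    return counts


# Made-up sample sentences (an assumption for this example, not COCA data).
sample = (
    "Take a deep breath before you dive. The deep water was cold. "
    "She took another deep breath. He fell into a deep sleep. "
    "After one more deep breath, they swam out into deep water."
)

counts = following_words(sample, "deep")
for word, n in counts.most_common():
    print(word, n)
```

In this small sample, breath appears after deep more often than water or sleep does. A real corpus does the same kind of counting, but across a very large number of texts.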

Why the corpus is helpful

Knowing what words sound the best together can be hard for people learning English. A corpus can be helpful because it shows you ways that words are used together.

See if you can learn to use COCA by searching for the adjectives utter and diverse. Utter is both a verb and an adjective, so you must tell COCA to only search for adjectives. Here is how to do that:

Now follow the steps shown above for profound and deep, but this time, use the words utter and diverse.

If you find on the "Word homepage" for utter that COCA has searched for the verb form, you can change it to adjective. To do this, click on ADJ in blue just below the word utter on the top left area of the page.

When you perform a search, what do you notice about the nouns that follow utter? About the nouns that follow diverse?

Write to us and let us know what you found.

Learning how to use a corpus can take some time. COCA has a guide showing you the many ways to use it. You can also find tutorials on YouTube.

Just remember that a corpus is a tool that can help you deepen your understanding of vocabulary and the ways to use it.

And in another Education Tips, we will explain more about the importance of collocations and how a corpus can help you learn them.

I'm Andrew Smith. And I'm Jill Robbins.

Andrew Smith wrote this article for VOA Learning English.

___

Words in This Story

corpus -n. a large collection of spoken or written texts that are used for language research.

pronunciation -n. the way a word or phrase is pronounced.

translation -n. the meaning of a word in a different language.

synonym -n. a word that means the same as, or is very close in meaning to, another word.
