Tuesday, April 21, 2015

Collocation Research

Here is an interesting study of collocations in the English language:
https://www.academia.edu/8563631/The_high_frequency_collocations_of_spoken_and_written_English

There are two significant findings by the researchers:
  1. The differences between the spoken and written sets. First, each has a very distinct set of collocations. Second, collocations occur far more frequently in spoken language than in written language.
  2. The large number of collocations meeting the criteria, and the large number of these that would qualify for inclusion in the most frequent 1,000 items in English if no distinction were made between single words and collocations.
The here-and-now nature of spoken language is reflected in items like this morning, at the moment, last night, and over there, and the personal and interactional nature is reflected in items like thank you, thank you very much, you know, I think, and come in.

Personally, I find the most significant finding to be the enormous difference in the frequency of the items. Although the total number of different items meeting the various criteria was virtually the same in both corpora (2,261 in the spoken corpus and 2,266 in the written corpus), the top 50 spoken collocations occurred 147,217 times, while the top 50 written collocations occurred only 48,782 times. That is, the top 50 spoken collocations occurred about three times as often as the top 50 written collocations. Spoken language makes much more frequent use of its common collocations than written language does. These results show that collocations play a more important role in spoken language than in written language, so spoken collocations particularly deserve attention in language teaching.
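
To make the comparison concrete, here is a small sketch of the arithmetic behind that ratio. The two totals are the figures quoted above; the variable names and the per-item averages are my own additions for illustration, not something taken from the study.

```python
# Total occurrences of the top 50 collocations in each corpus,
# as quoted from the study above.
top50_spoken = 147_217
top50_written = 48_782

# How many times more often the top 50 spoken items occur.
ratio = top50_spoken / top50_written

# Average occurrences per collocation among the top 50 (my own derived figure).
avg_spoken = top50_spoken / 50
avg_written = top50_written / 50

print(f"spoken/written ratio: {ratio:.2f}")
print(f"average per item: spoken {avg_spoken:.2f}, written {avg_written:.2f}")
```

The ratio comes out at roughly 3.02, so "about three times as often" is a fair summary.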


Each corpus yielded approximately 2,300 collocations.

All of the top 50 spoken collocations fall within the frequency cut-off for the first 1,000 single word types, so they would qualify for entry into the most frequent 1,000 words of spoken English. By comparison, 14 written collocations would make the top 1,000 of written English.

There are 162 collocations in the spoken corpus which would get into the top 2,000 words of spoken English, and 56 of these would be in the first 1,000. There are 41 collocations which would get into the top 2,000 words of written English, and 14 of these would be in the first 1,000. There are thus a large number of collocations that are of very high frequency.
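
The cut-off idea can be sketched in a few lines of Python. This is only my reading of the approach, assuming a collocation "qualifies" when its frequency is at least that of the n-th most frequent single word. The function name and the toy frequencies below are made up for illustration and are not data from the study.

```python
def qualifying_collocations(word_freqs, colloc_freqs, n):
    """Return the collocations whose frequency reaches the cut-off
    set by the n-th most frequent single word."""
    ranked = sorted(word_freqs.values(), reverse=True)
    if len(ranked) < n:
        raise ValueError("fewer than n single words in the list")
    cutoff = ranked[n - 1]  # frequency of the n-th ranked single word
    return [c for c, freq in colloc_freqs.items() if freq >= cutoff]


# Toy frequencies, invented purely for this example.
words = {"the": 1000, "you": 900, "know": 400, "think": 300, "morning": 50}
collocations = {"you know": 500, "I think": 350, "this morning": 40}

top3 = qualifying_collocations(words, collocations, 3)
top4 = qualifying_collocations(words, collocations, 4)
print(top3)
print(top4)
```

With real corpus counts, n would be 1,000 or 2,000 rather than 3 or 4.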

Here is a list of the top 50 spoken collocations
