This spreadsheet is designed to calculate popular association measures for collocations.
I have put this together both to help researchers who want to calculate large numbers of association measures at the same time and to make the calculations more transparent. Because all of the calculations needed are set out within the spreadsheet, researchers can explore these calculations for themselves to get a better understanding of how they work.
Using the spreadsheet
To calculate the measures, you need to know:
- the total size of your corpus in words;
- the frequency in your corpus of each word in the collocation;
- the frequency in your corpus of the collocation.
To calculate the scores, go to the appropriate sheet, then:
- enter the corpus size (in words) in the column titled ‘corpus size’;
- enter the frequencies of each word in the columns titled W1, W2, W3, etc.;
- enter the frequency of the collocation in the column titled ‘collocation count’.
The spreadsheet comes with two lines filled in as examples. These can be deleted. If you require more blank lines, just copy and paste existing lines to create more.
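For readers who would like to see how the three inputs above combine, here is a minimal sketch in Python of two widely used measures, mutual information (MI) and the t-score. It assumes the common textbook definitions, with the expected frequency calculated as if the words occurred independently of each other. The function names and the example figures are invented for illustration, and the formulas in the spreadsheet itself may differ in detail.

```python
import math


def expected_frequency(corpus_size, word_freqs):
    """Expected collocation count if the words were independent.

    Assumption: E = N * (f1/N) * (f2/N) * ..., which for two words
    reduces to f1 * f2 / N.
    """
    expected = float(corpus_size)
    for freq in word_freqs:
        expected *= freq / corpus_size
    return expected


def mutual_information(corpus_size, word_freqs, collocation_count):
    """MI = log2(observed / expected)."""
    expected = expected_frequency(corpus_size, word_freqs)
    return math.log2(collocation_count / expected)


def t_score(corpus_size, word_freqs, collocation_count):
    """t = (observed - expected) / sqrt(observed)."""
    expected = expected_frequency(corpus_size, word_freqs)
    return (collocation_count - expected) / math.sqrt(collocation_count)


# Hypothetical row: corpus size, frequencies of W1 and W2, collocation count.
corpus_size = 1_000_000
word_freqs = [1_000, 500]
collocation_count = 50

print(f"expected: {expected_frequency(corpus_size, word_freqs):.2f}")
print(f"MI:       {mutual_information(corpus_size, word_freqs, collocation_count):.2f}")
print(f"t-score:  {t_score(corpus_size, word_freqs, collocation_count):.2f}")
```

With these invented figures the expected frequency is 0.5, so MI is log2(50 / 0.5) = log2(100), or roughly 6.64. Passing three or more word frequencies extends the same independence assumption to longer collocations.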
For details on how these figures are calculated, check one or more of the following:
- Gries, S. & Durrant, P. (forthcoming). Analyzing co-occurrence data. In S. Gries & M. Paquot (Eds.), A practical handbook of corpus linguistics. New York: Springer.
- Gries, S. (2013). 50-something years of work on collocations: What is or should be next…. International Journal of Corpus Linguistics, 18(1), 136-165.
- Durrant, P. (2008). High frequency collocations and second language learning. PhD thesis, University of Nottingham. Section 4.3.