vocabularyprofiles - tPMHighlighter - a tool to help you gain insights into text

Vocabulary Profile
What are vocabulary profiles?
It is well known that some words occur very frequently, some occur fairly often, and many are very rare. Vocabulary profiles take either word families or specific word forms (types) and show where these items rank in a larger reference corpus. tPMHighlighter uses two lists: Nation’s (2017) BNC+COCA wordlist, a general all-purpose list of English words grouped into word families, and the wordlist of types from the readymade reference corpus selected when your corpus is built.

How does tPMHighlighter work with vocabulary profiles under the hood?
When the corpus is built, the wordlist from your corpus is sent to The Prime Machine server, where each item is checked against both the BNC+COCA wordlist and the reference corpus wordlist. Items that match the BNC+COCA wordlist are given that list's level score, ranging from 1 (very high frequency word families) to 34 (rare word families). Items that match the readymade reference corpus are given the log of the type's rank in that corpus, typically ranging from 0 to 15. These log ranks are divided into 10 levels for display in tPMHighlighter.
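As a rough illustration of how a log rank might be turned into one of the 10 display levels, here is a minimal Python sketch. The page does not state the log base or where the level boundaries fall, so the natural log, the equal-width bins over 0 to 15, and the names log_rank and display_level are all assumptions made for this example, not the tool's actual implementation.

```python
import math

# Assumptions for illustration only: natural log, equal-width bins over 0..15.
MAX_LOG_RANK = 15.0  # upper end of the "typically 0 to 15" range
NUM_LEVELS = 10      # number of display levels


def log_rank(rank: int) -> float:
    """Return the natural log of a 1-based frequency rank (rank 1 gives 0.0)."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return math.log(rank)


def display_level(log_rank_value: float) -> int:
    """Map a log rank onto a level from 1 to NUM_LEVELS using equal-width bins."""
    width = MAX_LOG_RANK / NUM_LEVELS
    level = int(log_rank_value // width) + 1
    # Clamp so that values outside the typical range still get a valid level.
    return max(1, min(NUM_LEVELS, level))


if __name__ == "__main__":
    for rank in (1, 10, 1000, 100000):
        print(rank, display_level(log_rank(rank)))
```

Under these assumptions the most frequent type in the reference corpus falls in level 1, and each later level covers a progressively wider band of ranks.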
When a text is displayed, each word can be given a colour to represent its level. Words in the text which match the wordlist are highlighted if their vocabulary level is lower than or equal to the slider setting. The multicoloured button works through the list of highlighting colours, highlighting the words at each level in a different colour.
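The slider rule can be sketched in a few lines of Python. This is a simplified illustration, not tPMHighlighter's code: the level table, the highlight function, and the bracketed [level:word] marker (standing in for a colour) are hypothetical.

```python
import re
from typing import Dict

# Hypothetical level table: word type -> vocabulary level (1 = most frequent).
LEVELS: Dict[str, int] = {"the": 1, "text": 2, "shows": 3, "snapshots": 8}


def highlight(text: str, levels: Dict[str, int], slider: int) -> str:
    """Mark every word whose level is lower than or equal to the slider setting."""

    def mark(match: "re.Match[str]") -> str:
        word = match.group(0)
        level = levels.get(word.lower())
        # Words missing from the wordlist, or above the slider, stay unmarked.
        if level is not None and level <= slider:
            return f"[{level}:{word}]"
        return word

    return re.sub(r"[A-Za-z]+", mark, text)


if __name__ == "__main__":
    print(highlight("The text shows snapshots", LEVELS, slider=3))
```

Raising the slider value in this sketch marks more words, since rarer (higher-level) items then fall under the threshold.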

In the screenshot, the text has been highlighted with colours that reflect the frequency of each word (type) in the reference corpus, in this case BNC: Academic. Words such as freedoms and snapshots are relatively infrequent in this reference corpus. The purple words are high-frequency words, and each subsequent colour marks words that are progressively less frequent in the reference corpus. Using the dropdown menu next to the colour box, you can switch between multi-colour highlighting and different shades of the same colour.
On this screen, you can also highlight words from the Academic Word List using the B-AWL button in the top-right.

The Academic Word List is based on Coxhead (2000).
