tPMHighlighter
tPMHighlighter is a corpus linguistics tool designed to highlight patterns of language use in English language texts you provide. While corpus tools mainly focus on providing evidence and examples for language patterning by taking large collections of text and presenting concordance lines or summary data, tPMHighlighter draws attention to words and word combinations in your own running texts by checking for tendencies that match those in larger corpus collections.
tPMHighlighter is free to download and free to use.
Get tPMHighlighter
- Import English language texts from TXT, DOCX and PDF files.
- Generate lists of n-grams (2-5 words in length) within each text and display information about any matches in The Prime Machine’s readymade online corpora. N-grams are classified into matching Lexical Bundles (cf. Biber & Barbieri, 2007), other matching n-grams (that are frequent in the reference corpus) and non-matching n-grams (strings of words which are repeated within one of your texts, but are not frequent in the readymade reference corpus).
- Analyse combinations of each word in each of your texts and display information about the collocational strength of these combinations in The Prime Machine’s readymade online corpora.
- Check each word in each of your texts against Nation’s BNC+COCA word family wordlists (2017) and The Prime Machine’s readymade online corpora and display information about the frequency of matching words in these corpora.
- Check each word in each of your texts against a variety of readymade wordlists, including Coxhead’s Academic Word List (2000).
- Use Keyword and Key Keyword analysis for each of your texts, using one of The Prime Machine’s readymade online corpora as a reference corpus, and highlight or list the keywords and key keywords in each text.
- Check each word in your texts against the semantic labels of words in one of The Prime Machine’s readymade online corpora, used as a reference corpus, and show information about the top three matches.
- Find links between sentences in each of your texts that are formed through repetition of word families, and calculate and display information about simple bonding within and across texts (Hoey, 1991).
- Display statistics about each of your texts, including Type-Token Ratio, average word length, average sentence length and the percentage of words and sentences of different lengths.
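To make two of the measures above more concrete, here is a minimal Python sketch of how repeated n-grams (2-5 words) and basic text statistics such as Type-Token Ratio could be computed for a single text. It is an illustration only and does not show how tPMHighlighter itself works: the function names (`tokenize`, `repeated_ngrams`, `basic_stats`), the simple regular-expression tokenizer and the sentence-splitting rule are all assumptions made for this example, and no comparison against a reference corpus is attempted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def repeated_ngrams(tokens, n_min=2, n_max=5, min_count=2):
    """Count every n-gram of n_min..n_max words and keep those that repeat.

    Simplification: n-grams are allowed to run across sentence boundaries.
    """
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def basic_stats(text):
    """Return token/type counts, Type-Token Ratio and average lengths."""
    tokens = tokenize(text)
    # Assumed rule: a sentence ends at one or more of . ! ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not tokens or not sentences:
        return {"tokens": 0, "types": 0, "type_token_ratio": 0.0,
                "avg_word_length": 0.0, "avg_sentence_length": 0.0}
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
    }


if __name__ == "__main__":
    sample = "On the other hand, the cat sat. On the other hand, the dog ran."
    print(repeated_ngrams(tokenize(sample)))
    print(basic_stats(sample))
```

A tool of this kind would then go one step further and look up each repeated n-gram in a reference corpus to decide whether it is a frequent match or a string that is only repeated within your own text.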
When might I want to use tPMHighlighter?
tPMHighlighter helps you notice language choices in texts by comparing them to patterns in larger corpora. This helps you become more aware of the ways that words and combinations of words either stand out from or blend into the genre or domain. This can be a starting point to help you learn more about patterns of language use, or to help you think about ways to adjust a text to make it more or less conventional.
As an English language student:
- See how you repeat words and combinations of words in your own writing.
- Compare features of your own writing with models or AI-generated text (by creating a second corpus of the model writing).
- Gain insights into how different wordings in your own sentences can affect the degree to which your writing compares and contrasts with other samples.
As an English language teacher:
- Help students gain insights into the ways repetition of word forms and word families can aid and hinder good writing.
- Help students see useful collocations and phrases for a particular topic.
- Explore features of human and AI-generated texts.
As a linguistics or literature student/researcher:
- Get a quick overview of how a range of corpus measures work, by showing you examples of patterns within your own texts.
- Get insights into how an author’s style is similar to or different from a reference corpus.
- Explore features of one text in comparison to other related texts.