variouswordlists - tPMHighlighter - a tool to help you gain insights into text

tPMHighlighter
Various Wordlists
What are the various wordlists?
The Prime Machine server has wordlists for personal pronouns (e.g. I, you, they), modals (e.g. could, should), reporting verbs (e.g. said, argues, suggests), academic words (e.g. research, evidence), positive words (e.g. important, community) and negative words (e.g. problem, difficult).
It is important to note that matching ignores capitalisation, part of speech and word sense, so not all matches will be reliable. For example, in the results for modals, may will match “It may rain” (modal), “His birthday is in May” (not a modal) and “Mrs. May” (not a modal).
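The behaviour described above can be illustrated with a minimal sketch. It is not the server's actual code: the `MODALS` set and the `find_matches` function are hypothetical stand-ins, and the tokenisation rule is an assumption made for the example.

```python
import re

# Hypothetical stand-in for the server's modal list (assumption, not the real list).
MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}

def find_matches(text, wordlist):
    """Return every token whose lower-cased form appears in the wordlist.

    No part-of-speech tagging or sense disambiguation is applied,
    so "May" (the month or the surname) is treated like "may" (the modal).
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    return [tok for tok in tokens if tok.lower() in wordlist]

sentences = ["It may rain", "His birthday is in May", "Mrs. May"]
hits = [find_matches(s, MODALS) for s in sentences]
print(hits)
```

All three sentences produce a hit, although only the first contains a modal. This is why results for lists such as modals should be read with some caution.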
Personal pronouns and modals are short, simple lists. The list used on the server for reporting verbs draws from Semino & Short (2004). The Academic Word List is based on Coxhead (2000). Positive and negative wordlists draw on three sources: Mohammad & Turney (2012), Bodnaruk et al. (2015) and the General Inquirer (2000).
In the screenshot, two texts from the second corpus are shown side by side, with a different colour for each of three of the available wordlists. There are several hits for the Academic Word List (Coxhead, 2000) in both essays, while personal pronouns and modals are infrequent in the left-hand text and have no visible hits in the right-hand text.
How does tPMHighlighter work with the various wordlists under the hood?
When the corpus is built, the list of words found in your corpus is sent to The Prime Machine server, where each item is checked against each of the wordlists.
When a text is displayed, each word that matches the selected wordlist can be given a colour.  
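The two steps above might be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's implementation: the miniature `WORDLISTS`, the function names `classify_vocabulary` and `highlight`, and the bracketed colour markers are all hypothetical.

```python
import re

# Hypothetical miniature wordlists; the real lists are held on the server.
WORDLISTS = {
    "pronouns": {"i", "you", "they"},
    "modals": {"could", "should", "may"},
    "academic": {"research", "evidence"},
}

def classify_vocabulary(corpus_words, wordlists):
    """Build step: check each corpus word against each wordlist."""
    membership = {}
    for word in corpus_words:
        key = word.lower()
        lists = [name for name, items in wordlists.items() if key in items]
        if lists:
            membership[key] = lists
    return membership

def highlight(text, membership, selected, colour):
    """Display step: mark each word that belongs to the selected wordlist."""
    def mark(match):
        word = match.group(0)
        if selected in membership.get(word.lower(), []):
            return f"[{colour}]{word}[/{colour}]"
        return word
    return re.sub(r"[A-Za-z']+", mark, text)

text = "They should review the evidence"
membership = classify_vocabulary(re.findall(r"[A-Za-z']+", text), WORDLISTS)
marked = highlight(text, membership, "modals", "blue")
print(marked)
```

In this sketch the lookup against the wordlists happens once, at build time, so displaying a text only requires checking each word against the stored results for the selected wordlist.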
