Write for everybody: What to do instead of chasing grade levels
Most people misunderstand what computer-calculated readability scores are meant for.
Teachers and textbook publishers assign grade levels so that students are using materials that improve their language skills. The material is meant to be challenging. But when selecting learning materials for other subjects, like science or math, teachers seek material at the next lower reading level, to stay in the reading comfort zone.
When we are delivering important information for the public, we want to stay in the reader’s comfort zone for language. We need to use familiar words with obvious sentence structures. We should aim for an engaging style and clear layout and design.
How can we know if the words are truly in common use? Katherine Barber, former editor of the Canadian Oxford Dictionary, says:
Speaking as a former dictionary editor, I can tell you two things:
1) People will often say “but that’s quite common” when they have no statistical evidence to prove it, and it may in fact be a word that is used only in their town or even their own family.
I had no idea until I started doing massive amounts of research for the Canadian Oxford Dictionary that only Winnipeggers called jam-filled doughnuts “jambusters” or that only my mother’s family call spatulas egg lifters.
2) You can find ANYTHING if you look for it on Google. Again, it is no indicator of how common or widespread it is.
The information age has brought us new resources for checking how common a word really is.
Word Frequency Resources Now Available
Lately, I’ve been working with word corpora to measure whether a word is in frequent use. While creating some material for high school students recently, I checked for word usage from 1990 to 2008.
My current method:
- Check the dictionary and thesaurus to make sure the word is apt and the most obvious choice for the particular use—and reflects the meaning in context.
- Check a databank to see if the word falls within the 5,000 most frequently used words.
- Check the Google Books Ngram Viewer to compare that word to other candidates for use.
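The second and third checks can be sketched in a few lines of Python. This is only an illustration, not a tool from any of the resources named here: the function names are hypothetical, the sample list is made up and is not real frequency data, and it assumes you already have a word list ordered from most to least frequent.

```python
def build_rank_index(words_by_frequency):
    """Map each word to its 1-based rank; the first occurrence wins."""
    ranks = {}
    position = 0
    for raw in words_by_frequency:
        word = raw.strip().lower()
        if not word:
            continue  # skip blank entries
        position += 1
        ranks.setdefault(word, position)
    return ranks


def is_common(word, ranks, cutoff=5000):
    """Return True if the word ranks within the cutoff."""
    rank = ranks.get(word.strip().lower())
    return rank is not None and rank <= cutoff


def most_common_candidate(candidates, ranks):
    """Return the best-ranked candidate, or None if none is listed."""
    listed = [w for w in candidates if w.strip().lower() in ranks]
    if not listed:
        return None
    return min(listed, key=lambda w: ranks[w.strip().lower()])


# Tiny made-up sample for illustration only, NOT real frequency data.
SAMPLE_WORDS = ["the", "of", "and", "use", "help", "buy", "utilize", "purchase"]
ranks = build_rank_index(SAMPLE_WORDS)

print(is_common("use", ranks, cutoff=5))
print(most_common_candidate(["utilize", "use"], ranks))
```

A word that is missing from the list is treated as not common. That is a cautious default, but a short list will miss many everyday words, so the first check against the dictionary and thesaurus still matters.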
Here are a few useful resources.
Word Frequency Lists
These lists are in PDF files, but you can use your PDF reader’s search function to check whether a word is in the top 5,000 or top 10,000 words according to frequency of use.
The News on the Web corpus, called the NOW corpus, has collected 14 billion words equally divided among spoken, fiction, popular magazines, newspapers, and academic texts since 2010. A new feature, virtual corpora, lets you create interest-area collections of texts.
(PDF overview at https://corpus.byu.edu/iweb/help/iweb_overview.pdf)
Some ways to use it:
- Browse a frequency list of the top 60,000 words in the corpus.
- Search by individual word. Some types of search are unique to this site.
- Search for phrases and word strings.
Google Books Ngram Viewer
The Google Books Ngram Viewer charts how often a word or phrase appears in a body of published books and magazines. In its graphs, you can compare several words or phrases to see which is used more frequently. You can also search for the usual substitutions for a word within a phrase.
Both a project hosted on GitHub and the Oxford Dictionary have worked with Ngram data to produce their own resources. The GitHub list is described as “the 10,000 most common English words in order of frequency, as determined by Ngram frequency analysis of the Google’s Trillion Word Corpus”: https://github.com/first20hours/google-10000-english
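If you save a copy of a list like this to your computer, a short script can look up where a word falls. The sketch below assumes the file is plain text with one word per line, most frequent first. The function names and the sample file are hypothetical, and the demonstration writes its own tiny made-up file rather than using real data.

```python
import tempfile
from pathlib import Path


def load_word_list(path):
    """Read a one-word-per-line file and return the words in file order."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def rank_of(word, words):
    """Return the 1-based position of the word, or None if it is absent."""
    target = word.strip().lower()
    try:
        return words.index(target) + 1
    except ValueError:
        return None


# Demonstration with a tiny made-up file, NOT the real list.
with tempfile.TemporaryDirectory() as folder:
    sample_path = Path(folder) / "sample-word-list.txt"
    sample_path.write_text("the\nof\nand\n\nuse\n", encoding="utf-8")
    words = load_word_list(sample_path)

print(rank_of("use", words))
print(rank_of("jambuster", words))
```

To use it on a real list, pass the path of your saved file to load_word_list in place of the sample file.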
See this example of an Ngram comparison.
Canadian Newsstand™ has the full text of nearly 300 Canadian newspapers, including their articles, columns, editorials, and features, as far back as the 1970s. While a subscription is required, this may be available through your library. It is also available through the ProQuest® web interface.