What has changed is that the /words/{word}/frequency endpoint now returns more information than just a single score. For example, the frequency results for "apartment" now look like this:
{
  "word": "apartment",
  "frequency": {
    "zipf": 4.82,
    "perMillion": 65.76,
    "diversity": 0.17
  }
}
Here's what each of those means:
zipf
This is the same number that gets returned for frequency at the main words endpoint. It's a log10 scale representation of the number of times the word appeared in our corpus. It ranges from 1 to 7, where a higher number means a word that was seen more frequently. For more information, you may want to see this paper.
perMillion
In any given corpus of one million English words, this is the number of times you can expect to see the word. It's a common frequency measurement that academic papers use.
diversity
In a document that represents part of a corpus, this is the odds that the given word will appear at least once. It ranges from 0 to 1.
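To see how the three fields relate, here is a minimal Python sketch that parses the sample response shown earlier and compares zipf with perMillion. It assumes the commonly used Zipf-scale definition, log10 of occurrences per billion words, which works out to log10(perMillion) + 3. The post does not spell out the exact formula, so treat that conversion as an assumption. The names Frequency, parse_frequency, and zipf_from_per_million are made up for this illustration and are not part of the API.

```python
import json
import math
from dataclasses import dataclass

# Sample response body, copied from the "apartment" example in this post.
SAMPLE = """
{
  "word": "apartment",
  "frequency": {
    "zipf": 4.82,
    "perMillion": 65.76,
    "diversity": 0.17
  }
}
"""


@dataclass
class Frequency:
    """Hypothetical container for the three frequency fields."""
    zipf: float
    per_million: float
    diversity: float


def parse_frequency(body: str) -> Frequency:
    """Pull the three frequency fields out of a JSON response body."""
    freq = json.loads(body)["frequency"]
    return Frequency(freq["zipf"], freq["perMillion"], freq["diversity"])


def zipf_from_per_million(per_million: float) -> float:
    """Assumed conversion: Zipf = log10(occurrences per billion words)."""
    return math.log10(per_million) + 3


freq = parse_frequency(SAMPLE)
estimated = zipf_from_per_million(freq.per_million)
print(f"reported zipf: {freq.zipf}, estimated from perMillion: {estimated:.2f}")
```

For the sample values, the estimate rounds to the reported zipf of 4.82, which suggests the two fields are two views of the same underlying count. The diversity value is a separate, per-document measure and cannot be derived from the other two.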
We hope these new ways of looking at word frequency are helpful! If you have any questions, please let us know: support@wordsapi.com