It is an empirical law which states that any document containing words, in any language, tends to follow this pattern:
The frequency of occurrence of words or other items starts high and tapers off quickly with rank. Thus, a few items occur very often while many others occur rarely.
Formal Definition: P_n ∼ 1/n^a, where P_n is the frequency of occurrence of the n-th ranked item and the exponent a is close to 1.
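The definition above can be checked on any text with a few lines of Python. The following is a minimal sketch, not the project's own script: the function names `rank_frequencies` and `estimate_exponent` are hypothetical, tokenization is a simple letters-only regex, and the exponent a is estimated with an ordinary least-squares fit of log(frequency) against log(rank), which is only one of several possible estimators.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts ordered from the most to the least common word."""
    # Assumption: a "word" is any run of Unicode letters, case-insensitive.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [count for _, count in Counter(words).most_common()]


def estimate_exponent(frequencies):
    """Estimate a in P_n ~ 1/n^a from rank-ordered frequencies.

    Fits a straight line to log(frequency) versus log(rank) by least
    squares and returns the negated slope.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two ranked items to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var
```

Calling `estimate_exponent(rank_frequencies(text))` on a long text would be expected to give a value in the neighborhood of 1 if the text follows Zipf's law; short texts give noisy estimates.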
More information can be found here.
To quickly visualize Zipf's law across many different genres and languages.
Scripts are written in Python 3.9.7.
This project is under active development.