It is an empirical law (an observed regularity, not a proven theorem) stating that word frequencies in documents of almost any language approximately follow this pattern:
-
The probability of occurrence of words or other items starts high and tapers off. Thus, a few occur very often while many others occur rarely.
-
Formal Definition: P_n ∼ 1/n^a, where P_n is the frequency of occurrence of the nth ranked item and the exponent a is close to 1.
More information can be found here.
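
As a rough illustration of the definition above, the following minimal sketch counts word frequencies, ranks them, and estimates the exponent a from a least-squares fit of log(frequency) against log(rank). It is not taken from this project's scripts: the function names `rank_frequencies` and `estimate_exponent` are hypothetical, and the regex-based tokenizer is a simplifying assumption that only suits languages that separate words with spaces.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts sorted from most to least frequent.

    Assumption: a word is any run of Unicode letters; digits and
    punctuation are treated as separators.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [count for _, count in Counter(words).most_common()]


def estimate_exponent(frequencies):
    """Estimate a in P_n ~ 1/n^a from frequencies ordered by rank.

    Fits a straight line to log(frequency) versus log(rank) by
    ordinary least squares and returns the negated slope.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two ranked items to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


sample = "the cat and the dog and the bird"
print(rank_frequencies(sample))  # [3, 2, 1, 1, 1]
```

On a real corpus, passing the result of `rank_frequencies` to `estimate_exponent` should give a value in the neighborhood of 1 if the text follows Zipf's law, although a plain least-squares fit on log-log data is only a crude estimator and is sensitive to the long tail of rare words.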
The goal of this project is to quickly visualize Zipf's law across many different genres and languages.
-
Scripts are written in Python 3.9.7.
-
This project is under active development.