Updated 10-lexical-dispersion-plot.md #11

Open
wants to merge 1 commit into
base: gh-pages
6 changes: 3 additions & 3 deletions _episodes/10-lexical-dispersion-plot.md
@@ -4,7 +4,7 @@ teaching: 0
exercises: 0
questions:
- "How can I measure how frequently a word appears across the parts of a corpus?"
-- "How can I plot the occurrences of a word and how many words from the beginning of the corpus it appears?"
+- "How can I plot the occurrences of a word and find out after how many words from the beginning of the corpus, does this word appear?"
objectives:
- "Learn how to plot the occurrences of specific words as they appear across a document or a corpus."
- "We will use the US Presidential Inaugural Addresses, which are provided with NLTK."
@@ -15,7 +15,7 @@ keypoints:

## Lexical Dispersion Plot

-We can plot lexical dispersion of particular tokens. Lexical dispersion is a measure of how frequently a word appears across the parts of a corpus. This plot notes the occurrences of a word and how many words from the beginning of the corpus it appears (word offsets). This is particularly useful for a corpus that covers a longer time period and for which you want to analyse how specific terms were used more or less frequently over time.
+We can plot lexical dispersion of particular tokens. Lexical dispersion is a measure of how frequently a word appears across the parts of a corpus. This plot notes the occurrences of a word and after how many words from the beginning of the corpus, this word appears (word offsets). This is particularly useful for a corpus that covers a longer time period and for which you want to analyse how specific terms were used more or less frequently over time.
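
To make the idea of word offsets concrete, here is a minimal sketch in plain Python. It is not part of the lesson or of NLTK: the helper name `word_offsets` and the toy token list are assumptions for illustration only, and positions are counted from zero here.

```python
from collections import defaultdict


def word_offsets(tokens, targets, ignore_case=True):
    """Map each target word to the token positions where it occurs.

    Hypothetical helper for illustration; positions are zero-based.
    """
    wanted = {t.lower() if ignore_case else t for t in targets}
    offsets = defaultdict(list)
    for position, token in enumerate(tokens):
        key = token.lower() if ignore_case else token
        if key in wanted:
            offsets[key].append(position)
    return dict(offsets)


# Toy corpus (an assumption); the lesson itself uses the NLTK inaugural corpus.
tokens = "We face great change and Great work brings good change".split()
offsets = word_offsets(tokens, ["great", "change", "tax"])
print(offsets)
```

Each list of positions corresponds to the marks that a dispersion plot would draw along one row; a target that never occurs, such as `tax` in this toy example, simply has no entry.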

To create a lexical dispersion plot, you will first load and import a different corpus, the inaugural corpus, which contains all US Presidential Inaugural Addresses and is provided with NLTK.

@@ -44,4 +44,4 @@ targets=['great','good','tax','work','change']
dispersion_plot(inaugural_texts, targets, ignore_case=True, title='Lexical Dispersion Plot')
```

-<img src="../fig/output_9_0.png" width="700">
+<img src="../fig/output_9_0.png" width="700">