Issues for nltk_guidelines: NLTK Methods - End #2

rachelrakov · 2018-05-23T19:52:23Z

In section "NLTK Methods with the NLTK Corpus", when importing matplotlib, mention what it is - define it.
In section "Searching for Words":
- Typo: "What about Sense and Sensibility?" - add question mark
- Define "phatic function", as compared to a semantic one; it may not be known by all participants.
In section "Positioning Words":
- Define what a pipeline is
In section "Types vs Tokens":
- Typo: "we would need a count of the 'lol's." is missing an end parenthesis
- You bring in list comprehensions here with the code text1_tokens = [t.lower() for t in text1 if t.isalpha()]. I am not certain participants will have seen a list comprehension before (not sure if Patrick covers it in the Python workshop), so take a minute to explain them.
- Might be worth mentioning that .isalpha() ignores punctuation - that is not necessarily intuitive
- Restore non-code after the above example back to words in markdown format - for a while it's all in code format, which makes it very difficult to read
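
For the list comprehension and .isalpha() points above, a side-by-side like the following might help participants. This is only a sketch: sample_tokens is a made-up stand-in for text1, and the variable names are my own, not from the notebook.

```python
# Hypothetical stand-in for text1 (the workshop itself uses the NLTK book texts).
sample_tokens = ["Call", "me", "Ishmael", ".", "Some", "years", "ago", "--", "never", "mind", "1851"]

# The familiar way: build the list with a for loop and an if statement.
tokens_loop = []
for t in sample_tokens:
    if t.isalpha():  # True only when every character in the token is a letter
        tokens_loop.append(t.lower())

# The same thing written as a list comprehension, on one line.
tokens_comp = [t.lower() for t in sample_tokens if t.isalpha()]

print(tokens_comp)

# .isalpha() drops punctuation and numbers, and also anything containing
# an apostrophe or hyphen, which is easy to miss.
print("don't".isalpha())
```

Both versions give the same list, so participants can read the comprehension as shorthand for the loop. The last line prints False, which shows that .isalpha() filters out more than plain punctuation marks.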
In section "Cleaning the Corpus":
- Might be worth explaining how len(set(text1_tokens)) shows how many unique words there are - unless you have predefined set, which I did not see.
- It is important, if you are billing list comprehensions as a computationally faster way of completing what can be done in lists, to define them and mention that they are computationally faster as early on as you can.
- Typo: "but just for illustration, we can try it out with a instead (Porter is the most common)." Need a word between a and instead.
- I am not certain that Patrick will have taught dictionaries in his lesson - check with him before you mention dictionaries in trying to describe Frequency Distributions.
- I am also not certain if Patrick will have taught pass during his workshop - might need to explain what that does as you are going through the code.
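
For the len(set(...)) and pass points above, a tiny example along these lines could do the explaining. It is a sketch with made-up tokens; the names and the skipped word are my own assumptions, not what the notebook does.

```python
# Hypothetical tokens, standing in for text1_tokens.
tokens = ["the", "whale", "the", "sea", "whale", "the"]

# set() keeps one copy of each distinct item, so its length is the
# number of unique words (types), while len(tokens) counts every word.
unique_tokens = set(tokens)
num_types = len(unique_tokens)
num_tokens = len(tokens)
print(num_tokens, num_types)

# pass is a placeholder statement that does nothing; here it makes the
# "skip this word" branch explicit.
kept = []
for t in tokens:
    if t == "the":
        pass  # do nothing for this word
    else:
        kept.append(t)
print(kept)
```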
In section "Make Your Own Corpus":
- Once again, make the text that isn't code here back into a markdown format, for ease of reading (might just be a matter of making the cell a markdown cell, if you wrote this in jupyter notebook)
In section "Part of Speech Tagging":
- Typo - "I'll make an epty dictionary to hold my results" - should be "empty"
- Once again, you may need to explain the syntax of creating a dictionary here, if that is not covered in the Python tutorial, as well as what a dictionary is and how it works. Check with Patrick.
- Fix the formatting at the end - once again, it's all in code format, making it hard to read. Everything that isn't code should be in markdown cells.
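
If dictionaries turn out not to be covered in the Python workshop, a short walk-through like this might be enough to introduce them. It is a sketch only: the (word, tag) pairs are made up to resemble tagger output, and tag_counts is a name I chose for illustration.

```python
# Hypothetical (word, tag) pairs, shaped like the output of a POS tagger.
tagged = [("whale", "NN"), ("swims", "VBZ"), ("sea", "NN")]

# An empty dictionary, written with curly braces. A dictionary maps
# keys (here, tags) to values (here, counts).
tag_counts = {}

for word, tag in tagged:
    if tag in tag_counts:
        tag_counts[tag] += 1  # key already present: add one to its count
    else:
        tag_counts[tag] = 1   # first time we see this key: start at one

print(tag_counts)
print(tag_counts["NN"])  # look up a value by its key
```

The same key-to-count idea is what a frequency distribution is built on, so explaining it once here could cover both sections.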
Are you going to add what is at the end of your jupyter notebook for this session into your nltk_guidelines markdown? There is a discrepancy there; not sure if that is deliberate or not.
