nlpgroup

Introduction

This is the repo for the group learning about Natural Language Processing.

Description

This project aims to capture the major trends of each decade in New York Times article titles. This will be accomplished by pulling article titles from the NYT Archive API and performing text analysis on them. The goal is to find words that are important for a given decade relative to all other decades. The important concepts from each decade will be visualized using word clouds. This project will be built in Python with the nltk and wordcloud libraries.
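One way to score "important for a given decade relative to all other decades" is TF-IDF with each decade treated as a single document. The following is a minimal standard-library sketch of that idea, not the project's actual code: the function names, the regex tokenizer, and the sample titles are assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(title):
    """Lowercase a title and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", title.lower())


def decade_of(year):
    """Map a year such as 1987 to the first year of its decade, 1980."""
    return (year // 10) * 10


def tfidf_by_decade(titles):
    """Score words per decade; titles is an iterable of (year, title) pairs.

    Each decade is treated as one document.
    Returns {decade: {word: tf-idf score}}.
    """
    counts = defaultdict(Counter)
    for year, title in titles:
        counts[decade_of(year)].update(tokenize(title))

    n_decades = len(counts)
    # Document frequency: in how many decades does each word appear?
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())

    scores = {}
    for decade, counter in counts.items():
        total = sum(counter.values())
        scores[decade] = {
            word: (count / total) * math.log(n_decades / doc_freq[word])
            for word, count in counter.items()
        }
    return scores


# Toy titles invented for illustration; they are not real NYT data.
sample = [
    (1969, "Men Walk on the Moon"),
    (1965, "The Moon Race Heats Up"),
    (1989, "The Berlin Wall Falls"),
    (1987, "The Market Falls Sharply"),
]
scores = tfidf_by_decade(sample)
top_1960s = max(scores[1960], key=scores[1960].get)
print(top_1960s)
```

A word that appears in every decade (here "the") gets a score of zero, so common words drop out without a separate stop-word list.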

Analysis

Recessions vs. expansions: which words are significant for each across all 150 years?
Recessions through the years: what makes each 50-year period of recessions unique?
Expansions through the years: what makes each 50-year period of expansions unique?
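The three questions above all depend on first grouping titles by economic phase and by 50-year period. The sketch below shows one possible grouping step. The recession ranges are placeholders and not real recession dates, and the starting year of 1850 and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Placeholder ranges for illustration only; these are NOT real recession dates.
HYPOTHETICAL_RECESSIONS = [(1900, 1902), (1950, 1951), (2000, 2001)]


def phase_of(year, recessions=HYPOTHETICAL_RECESSIONS):
    """Return "recession" if year falls in any inclusive range, else "expansion"."""
    for start, end in recessions:
        if start <= year <= end:
            return "recession"
    return "expansion"


def period_of(year, first_year=1850, span=50):
    """Return the first year of the span-year period that contains year."""
    return first_year + ((year - first_year) // span) * span


def group_titles(titles):
    """Group (year, title) pairs under (phase, period start) keys."""
    groups = defaultdict(list)
    for year, title in titles:
        groups[(phase_of(year), period_of(year))].append(title)
    return dict(groups)


# Toy titles invented for illustration.
sample = [
    (1901, "Banks Close"),
    (1910, "Trade Grows"),
    (1951, "Prices Slump"),
]
groups = group_titles(sample)
```

Each group can then be treated as one document when scoring words, so that a word's weight reflects how specific it is to that phase and period.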

Presentation

introduction - research questions
explain data (nytapi)
explain methods (python api, tfidf, wordcloud)
show results 1
show results 2
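The word cloud step in the methods can be sketched as follows, assuming per-word scores are already available as a dictionary. The helper names, the sample scores, and the output file name are hypothetical. `WordCloud.generate_from_frequencies` and `to_file` come from the wordcloud library, which is imported lazily so the selection helper runs without it.

```python
def top_weights(scores, n=50):
    """Keep the n highest-scoring words that have a positive weight."""
    positive = {word: score for word, score in scores.items() if score > 0}
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(positive.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:n])


def render_cloud(weights, path):
    """Draw a word cloud from {word: weight} and save it as an image file."""
    # Imported here so top_weights works without the wordcloud package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(weights)
    cloud.to_file(path)


# Hypothetical scores for one decade, invented for illustration.
decade_scores = {"moon": 0.14, "race": 0.07, "the": 0.0, "apollo": 0.07}
weights = top_weights(decade_scores, n=2)
# render_cloud(weights, "cloud_1960s.png")  # requires: pip install wordcloud
```

Dropping zero-weight words before rendering keeps terms that appear in every period out of the picture.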

Resources

Wheel Repository

Use this site to install prebuilt library binaries as wheel files: http://www.lfd.uci.edu/~gohlke/pythonlibs/

nyt api

We use the Archive API from the NYT APIs: https://developer.nytimes.com/archive_api.json
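A minimal sketch of pulling one month of titles from the Archive API using only the standard library. It assumes the `/svc/archive/v1/{year}/{month}.json` endpoint with an `api-key` query parameter and a response shaped as `response.docs[].headline.main`. Check both against the current NYT documentation. The function names are hypothetical and are not taken from this repo's scripts.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint layout; verify against the NYT developer documentation.
ARCHIVE_URL = "https://api.nytimes.com/svc/archive/v1/{year}/{month}.json"


def archive_url(year, month, api_key):
    """Build the request URL for one month of archive data."""
    base = ARCHIVE_URL.format(year=year, month=month)
    return base + "?" + urlencode({"api-key": api_key})


def extract_titles(payload):
    """Collect the main headline of each article in a parsed response."""
    docs = payload.get("response", {}).get("docs", [])
    titles = []
    for doc in docs:
        main = (doc.get("headline") or {}).get("main")
        if main:
            titles.append(main)
    return titles


def fetch_titles(year, month, api_key):
    """Download one month of the archive and return its article titles."""
    with urlopen(archive_url(year, month, api_key)) as resp:
        return extract_titles(json.load(resp))


# Offline example with a hand-written payload in the assumed shape.
sample_payload = {
    "response": {
        "docs": [
            {"headline": {"main": "Example Headline"}},
            {"headline": None},
            {"headline": {"main": ""}},
        ]
    }
}
sample_titles = extract_titles(sample_payload)
```

Calling `fetch_titles(1969, 7, "YOUR_KEY")` would request July 1969. It needs a valid key and network access, so it is left out of the example run.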

nltk

We can start learning NLP by going through this tutorial. http://www.nltk.org/book/

gensim

gensim is a Python library for topic modeling and word embeddings. Its word2vec documentation is here: https://radimrehurek.com/gensim/models/word2vec.html


UCSB-dataScience-ProjectGroup/nlpgroup
