Joseph Lai edited this page Feb 25, 2021 · 21 revisions

Welcome to URS - a comprehensive Reddit scraping command-line tool written in Python.

This wiki serves as a repository statistics page and an archive of all iterations of URS. It exists so I can see how my programming skills have evolved, and for anyone else who is curious about how this repository has changed since its inception.

Repository Statistics

I found this dope statistics tool called Star Chart and wanted to display it somewhere in this repository. It plots the repository's stars over time, which is such a cool feature and definitely something I am very interested in seeing.

[Image: URS stargazers over time]

I will also display the hit count. Maybe one day this repository will blow up again because of Reddit events such as the r/wallstreetbets fiasco that occurred in late January 2021.

[Image: URS hitcount]

I would love to see these statistics if something like that happens again, so consider the media above as future-proofing this wiki.

Major Iterations

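|                        | v1.0.0            | v2.0.0           | v3.0.0                              |
|------------------------|-------------------|------------------|-------------------------------------|
| CLI?                   | No                | Yes              | Yes                                 |
| What Does It Scrape?   | Subreddits Only   | Subreddits Only  | Subreddits, Redditors, Post Comments |
| Export Options         | CSV               | CSV              | CSV, JSON                           |
| READMEs                | README            | README           | README                              |
| Scraper                | reddit_scraper.py | scraper.py       | scraper.py                          |
| Requirements Text File | N/A               | requirements.txt | requirements.txt                    |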

v3.0.0+ Iterations

  New Features
v3.1.0
  • Major code refactor. Applied OOP concepts to existing code and rewrote methods in an attempt to improve readability, maintainability, and scalability.
  • New in 3.1.0:
    • Scrapes will now be exported to the scrapes/ directory within a subdirectory corresponding to the date of the scrape. These directories are automatically created for you when you run URS.
    • Added log decorators that record what is happening during each scrape, which scrapes were run, and any errors that arise during runtime in the log file scrapes.log. The log is stored in the same subdirectory corresponding to the date of the scrape.
    • Replaced bulky titles with minimalist titles for a cleaner look.
    • Added color to terminal output.
  • Improved naming convention for scripts.
  • Integrated Travis CI and Codecov.
  • Updated community documents located in the .github/ directory: BUG_REPORT, CONTRIBUTING, FEATURE_REQUEST, PULL_REQUEST_TEMPLATE, and STYLE_GUIDE.
  • Numerous changes to the README. The most significant change was splitting and storing walkthroughs in docs/.
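
The dated export directory and the log decorators described for v3.1.0 can be pictured with a minimal sketch like the one below. This is not the code URS ships. The function names (make_scrape_dir, get_scrape_logger, log_scrape), the YYYY-MM-DD directory format, and the log message layout are all assumptions made for illustration; only the scrapes/ directory and the scrapes.log file name come from the notes above.

```python
import functools
import logging
from datetime import date
from pathlib import Path


def make_scrape_dir(base="scrapes", day=None):
    """Create (if missing) and return base/<date>/ for the given day."""
    day = day or date.today()
    # Assumption: the subdirectory is named YYYY-MM-DD.
    path = Path(base) / day.strftime("%Y-%m-%d")
    path.mkdir(parents=True, exist_ok=True)
    return path


def get_scrape_logger(scrape_dir):
    """Return a logger that writes to scrapes.log inside scrape_dir."""
    logger = logging.getLogger(f"urs_sketch:{scrape_dir}")
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.FileHandler(Path(scrape_dir) / "scrapes.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep the sketch's output out of the root logger
    return logger


def log_scrape(logger):
    """Decorator factory: log when a scrape starts, finishes, or fails."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("Running %s", func.__name__)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Record the traceback, then let the error propagate.
                logger.exception("%s failed", func.__name__)
                raise
            logger.info("Finished %s", func.__name__)
            return result
        return wrapper
    return decorator
```

In this sketch, a scraping function decorated with @log_scrape(get_scrape_logger(make_scrape_dir())) would write its start, finish, and any exception to scrapes.log in that day's subdirectory.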
v3.1.1
  • Added time filters for Subreddit categories (Controversial, Search, Top).
  • Updated README to reflect new changes.
  • Updated style guide. Made minor formatting changes to scripts to reflect new rules.
  • Performed DRY code review.
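
As a rough illustration of the v3.1.1 time filters, the sketch below checks that a time filter is only applied to the Controversial, Search, and Top categories. The helper name resolve_time_filter is hypothetical, and the set of accepted values ("all", "day", "hour", "month", "week", "year", defaulting to "all") mirrors what PRAW's listing methods accept; that URS validates its input in this way is an assumption.

```python
# Categories that accept a time filter, per the v3.1.1 notes above.
TIME_FILTERED_CATEGORIES = ("controversial", "search", "top")

# Assumption: same values PRAW accepts for its time_filter argument.
VALID_TIME_FILTERS = ("all", "day", "hour", "month", "week", "year")


def resolve_time_filter(category, time_filter=None):
    """Return the time filter to use for a category, or None if not applicable.

    Raises ValueError for an unknown filter, or for a filter supplied to a
    category that does not support one.
    """
    category = category.lower()
    if category not in TIME_FILTERED_CATEGORIES:
        if time_filter is not None:
            raise ValueError(f"{category!r} does not accept a time filter")
        return None
    if time_filter is None:
        return "all"  # assumption: default to no time restriction
    time_filter = time_filter.lower()
    if time_filter not in VALID_TIME_FILTERS:
        raise ValueError(f"unknown time filter: {time_filter!r}")
    return time_filter
```

With PRAW, the resolved value could then be passed along as the time_filter argument, for example subreddit.top(time_filter=resolved).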