Merge pull request #11 from jakeharding/dev
Merging dev into master.
Benjamin Parish authored Feb 13, 2017
2 parents 25a2ca6 + c97cd2e commit 678129a
Showing 96 changed files with 2,229 additions and 2 deletions.
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
*.sqlite*
*.db
*.pyc
.DS_Store
local_settings.py
node_modules
.vscode
__pycache__
10 changes: 10 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,10 @@
####No external contributions are currently being accepted because this is a project for a university course.

####Guidelines
- Always check out a branch for development. Never push to master or dev.
- Merging into master or dev requires a pull request, code review, and approval by a contributor.
- After a pull request is approved, the creator of the pull request will complete the merge unless otherwise noted.

####Contributors by Github username:
- bparish628
- jakeharding
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -1,4 +1,4 @@
-Copyright 2017 Contributors to this repository.
+Copyright 2017 Contributors to this repository are listed in CONTRIBUTING.md and hold the copyright of this software.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

106 changes: 105 additions & 1 deletion README.md
@@ -1 +1,105 @@
-##Repository Health Project for CSCI 4900
+# Repository Health Project for CSCI 4900

This repository holds the proof of concept for the repository health and sustainability project for CSCI 4900 at the University of Nebraska at Omaha. It contains the backend and frontend source code that extracts data from GitHub and GHTorrent and provides statistics about a selected repository. Descriptions of the backend and frontend source follow.

## Backend
The Django web framework is used in the project to leverage quick development and the third party packages available. Python requirements are kept in the requirements.txt file, and this file is generated using `pip freeze > requirements.txt`.

Python 3.5 is recommended for this project. It can be downloaded from [python.org](https://www.python.org).

## Frontend
The frontend source is written using Angular and Angular UI-Router.

## Development Environment
A Unix dev environment is recommended because the setup instructions provided are known to work in these environments using a bash terminal. The instructions may work in a Windows bash terminal but have not been tested.

## Production Environment
A Linux production environment is recommended, and Ubuntu 12.04 or later is preferred. A database is needed, and its configuration must be provided. See below for database configuration.

## Database configuration
Both production and development environments use MySQL as the DBMS and require a database configuration specific to the individual environment.

#### Development
For development, a subset of data provided by GHTorrent [here](http://ghtorrent.org/msr14.html) is used for testing. Use this link to download the database and, after a successful download, unpack the archive to a location of your choice. Take note of the location. The supplied data needs a database to hold it, so we create one using MySQL. Installing MySQL varies with the package manager being used, so it is assumed that MySQL is already installed and running.

Open the MySQL interpreter with root access using `mysql -u root -p`. If the root MySQL user has a password, you will be prompted to enter it. Create a database to hold the data by running `create database msr14;`. This names the database `msr14`, but you can name it however you like. Create a user for the database by running `create user 'msr14'@'localhost' identified by '<password>';`, replacing `<password>` with a password of your choosing. The password will need to be added to the local_settings.py file for the database connection. Please do not commit this password, or any password, to the repository.

The user needs access to the database. Run `grant all on msr14.* to 'msr14'@'localhost';`. After the commands have run successfully, run `quit;` to exit the interpreter. We can now load the data into the database by running `mysql -u root -p msr14 < path_to_extracted_data_file`. The `path_to_extracted_data_file` is the absolute path noted earlier, and `msr14` is the name of the database created. Once this succeeds, you can enter the connection information into local_settings.py.

If running unit tests, this process will need to be repeated with a test database named `test_msr14`.

Located in the `repo_health` directory of this repository is a `local_settings.py.example` file. Save a copy of this file as `local_settings.py` in the same directory. Please do not remove the example file from the repo. If you created the database and user as `msr14`, you only need to insert the user's password in the `PASSWORD` placeholder. Otherwise, enter the required information in the correct places and save the file.
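
The contents of `local_settings.py.example` are not reproduced in this README, so the following is only a sketch of what a filled-in `local_settings.py` might look like, assuming the file defines Django's standard `DATABASES` setting for the MySQL backend. The `HOST`, `PORT`, and `TEST` entries are assumptions; the `msr14` and `test_msr14` names come from the steps above.

```python
# local_settings.py (hypothetical sketch; the example file in the repo is authoritative)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "msr14",         # database created in the MySQL steps above
        "USER": "msr14",         # user created in the MySQL steps above
        "PASSWORD": "PASSWORD",  # replace locally; never commit the real value
        "HOST": "localhost",     # assumption: MySQL runs on the same machine
        "PORT": "3306",          # assumption: MySQL's default port
        "TEST": {"NAME": "test_msr14"},  # test database named in this README
    }
}
```

Because `local_settings.py` is listed in `.gitignore`, a file like this stays out of version control.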

#### Production
The production database is configured in a similar fashion, but the project is not ready for a production setup, so this will be revisited.

## Development setup

### Backend Configuration
Developers work on many projects, and each project has its own dependencies, or deps.
For this reason, a virtual environment is created on the developer's machine to isolate Python dependencies between projects.
Virtualenv and virtualenvwrapper are used to create and maintain the environment and need to be installed in the base Python using the pip command line tool.
An explanation of how to install pip is [here](https://pip.pypa.io/en/stable/installing/), but if you can run `which pip` and see a result, you have pip installed.
To install virtualenv and virtualenvwrapper use:
`pip install virtualenv virtualenvwrapper`
and wait for a success message. If unsuccessful, try the same command using `pip3` in place of `pip`.
Some environment variables need to be set, as explained [here](http://virtualenvwrapper.readthedocs.io/en/latest/install.html), but a bare-bones setup is:
- Open up your .bash_profile or .bashrc
- Add the following lines:
`export WORKON_HOME=$HOME/.envs
source $(which virtualenvwrapper.sh)
`
- Run `source ~/.bash_profile` or `source ~/.bashrc`, depending on which file you edited, and verify that no errors appear. If an error appears, run `which virtualenvwrapper.sh` from the terminal and copy the output. Replace `$(which virtualenvwrapper.sh)` in the file you edited with the copied output.
Once this is configured, you can run commands such as `mkvirtualenv` and `workon` without errors (you may see a usage message).
To create a virtualenv for our project:
- Run `which python3` and copy the output
- Run `mkvirtualenv <virtualenv name> -p <paste output of previous command>`. The name of the virtualenv is of your choosing. The name of the project works well.
If this is successful, you will see the command prompt change to begin with `(<virtualenv name>)`. You are now working inside your virtualenv, and running `which python` should print a path inside your `~/.envs` folder.
To stop using the virtualenv, run `deactivate`.
To start using the virtualenv, run `workon <virtualenv name>`.
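
If the prompt is not a reliable indicator, the interpreter itself can report whether it belongs to a virtual environment. The helper below is a minimal sketch and not part of the project; `in_virtualenv` is a hypothetical name. It relies on `sys.real_prefix`, which older virtualenv releases set, and on `sys.base_prefix`, which differs from `sys.prefix` inside environments created by newer tools.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer releases leave
    # sys.base_prefix pointing at the base installation instead.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


print("virtualenv active:", in_virtualenv())
print("interpreter:", sys.executable)
```

Run it with `python` after `workon <virtualenv name>`; the interpreter path should point inside `~/.envs`.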

From inside the virtualenv and in the project root folder, run `pip install -r requirements.txt` to install the deps from the file.
Note for Linux users: `sudo apt-get install libmysqlclient-dev` may need to be run from the terminal so that MySQL can be used with the mysqlclient dependency. A `mysql_config not found` error when running `pip install -r requirements.txt` indicates that this is needed.

Once this succeeds, the database is ready to be migrated. Since we are using an existing database and structure, we need to fake the initial migrations that would normally create the tables and columns for each Django app whose models map the GitHub tables. These apps are the packages beginning with `gh_`. Run this command for every package in the `repo_health` folder: `python manage.py migrate <package_name> --fake-initial`. After success, run `python manage.py migrate` to run any other migrations.
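
Typing the migrate command once per package is error-prone, so the list of commands can be generated. This is a sketch only: `fake_initial_commands` is a hypothetical helper, and filtering on the `gh_` prefix reflects one reading of the paragraph above (pass `prefix=""` to include every package). It only builds the commands; it does not run them.

```python
import os


def fake_initial_commands(apps_dir="repo_health", prefix="gh_"):
    """Build one `migrate --fake-initial` command per matching package directory."""
    packages = sorted(
        name for name in os.listdir(apps_dir)
        if name.startswith(prefix) and os.path.isdir(os.path.join(apps_dir, name))
    )
    commands = [
        ["python", "manage.py", "migrate", package, "--fake-initial"]
        for package in packages
    ]
    # Finish with a plain migrate to apply any remaining migrations.
    commands.append(["python", "manage.py", "migrate"])
    return commands
```

From the project root, `for command in fake_initial_commands(): print(" ".join(command))` prints the commands for review; each list can also be passed to `subprocess.check_call` from inside the virtualenv.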

Run `python manage.py runserver` to start the built in dev server, and navigate a browser to [http://localhost:8000](http://localhost:8000) to view the index page.

An admin is available at [http://localhost:8000/admin/](http://localhost:8000/admin/) when the server is running. To log in to the admin, a user must first be created. Run `python manage.py createsuperuser` and follow the instructions.

Run `python manage.py test --keepdb` to run the tests. The `--keepdb` is important so Django doesn't try to destroy the test database.

### Frontend configuration
From project root, `cd repo_health/index/static`. Run `npm install`. After successful install, [http://localhost:8000](http://localhost:8000) should display the Hello World page.

#### Some common commands to help determine what python is being used
- `which python` or `which python3`
  - Shows the path of the Python executable that the command resolves to.
  - `python3` should report version 3.5 when development is configured correctly.
- `python --version`
  - Shows the version of Python that the `python` command runs.
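
The same check can be made from Python itself. The snippet below is a small sketch; `meets_recommended_version` is a hypothetical helper, and treating versions newer than 3.5 as acceptable is an assumption, since this README only recommends 3.5.

```python
import sys

RECOMMENDED = (3, 5)  # version recommended by this README


def meets_recommended_version(version_info=sys.version_info, required=RECOMMENDED):
    """Return True when the interpreter is at least the recommended version."""
    return tuple(version_info[:2]) >= required


print("Python {}.{}".format(*sys.version_info[:2]))
print("meets recommendation:", meets_recommended_version())
```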

### Production setup
The app is not ready for production yet so this part is incomplete.

Assumptions made about production:
- Operating system: Ubuntu 12.04 or greater
- Application will be running in its own isolated environment, and a virtualenv is not needed.
- Apache httpd server is used to serve app using mod_wsgi.
- All static files are served from Apache using a redirect from url `/static/` to a static documents folder. This folder is created using `python manage.py collectstatic`. This is not necessary in development.


## Contributing
External contributions are not being accepted at this time. For existing contributors, please use the following header documentation at the top of each file:

```
fileName.ext - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
Your name or github username
Brief description of the file.
```
Binary file added docs/Client - Server Workflow.pdf
Binary file not shown.
Binary file added docs/Data Flow Diagram.pdf
Binary file not shown.
11 changes: 11 additions & 0 deletions docs/README.md
@@ -0,0 +1,11 @@
#REST API README

##Purpose
GitHub maintains a REST API so why do we need another?
- Hosting and supplying the data independently of GitHub enables us to structure the data in a fashion suitable for our purpose.

This presents a challenge in updating our database to stay current. We recognize the need for data synchronization as an external entity of our system and outside the scope of our purpose. The documentation here provides structure and details for the repo_health REST API, which works with the msr14 database downloaded from GhTorrent [here](http://ghtorrent.org/msr14.html). This database is a subset of data and appears to be outdated. This means many structure changes have likely occurred and will need to be accounted for when updating the database.

The repo_health REST API accepts and returns JSON.

Thank you to GhTorrent for providing the needed data and structure to get us started.
15 changes: 15 additions & 0 deletions docs/metrics/PullRequests.md
@@ -0,0 +1,15 @@
#Metrics for Pull Requests of Repo


###From slides:
- total number of prs
- total number commits
- number of pr with no comment from maintainer
- number of prs waiting for a response
- average pr lifetime

###More ideas
- organizational prs
- number of prs from outside maintainers
- total number of comments
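
Of the metrics above, average pr lifetime is the one that most needs a definition. One possible reading, offered as an assumption rather than an agreed formula, is the mean time between a pull request being opened and being closed, ignoring pull requests that are still open. The input shape below (pairs of open and close timestamps) is hypothetical and is not the msr14 schema.

```python
from datetime import datetime, timedelta


def average_pr_lifetime(pull_requests):
    """Mean open-to-close duration over closed pull requests, or None if there are none.

    `pull_requests` is a hypothetical iterable of (opened_at, closed_at) pairs;
    closed_at is None while a pull request is still open.
    """
    lifetimes = [closed - opened for opened, closed in pull_requests if closed is not None]
    if not lifetimes:
        return None
    return sum(lifetimes, timedelta()) / len(lifetimes)
```

Open pull requests are skipped here because they have no end time; they are counted separately by the "waiting for a response" metric.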

29 changes: 29 additions & 0 deletions docs/metrics/Repos.md
@@ -0,0 +1,29 @@
#Some overall stats on the project

msr14 limitations
- has no recognition of releases, tags, stars, downloads
- has an empty table: RepoMilestone


###From slides
- Release frequency - Unavailable in msr14
- Number of releases - Unavailable in msr14
- Number of downloads - Unavailable in msr14
- number of stars - Unavailable in msr14
- number of forks

###More ideas
- Semantic versioning? Major version released? - Unavailable
- Date time and version of latest release - Unavailable
- travis ci stuff - parse readme of repo
- number of contributors
- number of watchers
- number of watchers who aren't contributors
- organizations of contributors
- project doesn't have an organization directly
- number of labels
- is a fork
- number of commits
- number of milestones
- age
- latest commit
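
Most of these are plain counts, but "number of watchers who aren't contributors" is a set difference. A minimal sketch, assuming the watcher and contributor user ids have already been fetched into Python iterables (the function name and inputs are hypothetical):

```python
def watchers_not_contributors(watcher_ids, contributor_ids):
    """Count users who watch the repo but have not contributed to it."""
    # Sets remove duplicate ids before the difference is taken.
    return len(set(watcher_ids) - set(contributor_ids))
```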
31 changes: 31 additions & 0 deletions docs/rest_api/Repos.md
@@ -0,0 +1,31 @@
#Repo Retrieve

A repo is retrieved by sending a GET request to the URL below, with the owner's login and the repo's name as query parameters.

URL:
- `/api/v1/gh-projects?owner__login=<login>&name=<name>`

The response will be a 404 if no project is found.

Response:
```
{
"id": int,
"name": str,
"description": str,
"language": str,
"created_at": date str,
"ext_ref_id": str,
"deleted": int,
"owner": int,
"contribs_count": int,
"watchers_count": int,
"watch_not_contribs_counts": int,
"orgs_of_contribs_count": int,
"labels_count": int,
"is_fork": bool,
"commits_count": int,
"milestones_count": int,
"age_of_latest_commit": date str
}
```
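
A client call might look like the following sketch, which uses only the Python standard library. The base URL assumes the local dev server at `http://localhost:8000`, and `repo_url` and `fetch_repo` are hypothetical helper names, not part of the API. The 404 case described above is mapped to `None`.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumption: local dev server


def repo_url(login, name, base=BASE_URL):
    """Build the repo retrieve URL from the owner's login and the repo's name."""
    query = urlencode({"owner__login": login, "name": name})
    return "{}/api/v1/gh-projects?{}".format(base, query)


def fetch_repo(login, name, base=BASE_URL):
    """Return the decoded JSON response, or None when the API answers 404."""
    try:
        with urlopen(repo_url(login, name, base)) as response:
            return json.loads(response.read().decode("utf-8"))
    except HTTPError as error:
        if error.code == 404:
            return None
        raise
```

With the dev server running, `fetch_repo("<login>", "<name>")` returns a dict with the keys listed in the response above.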
22 changes: 22 additions & 0 deletions manage.py
@@ -0,0 +1,22 @@
#!/usr/bin/env python
import os
import sys

if __name__ == "__main__":
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "repo_health.settings")
    try:
        from django.core.management import execute_from_command_line
    except ImportError:
        # The above import may fail for some other reason. Ensure that the
        # issue is really that Django is missing to avoid masking other
        # exceptions on Python 2.
        try:
            import django
        except ImportError:
            raise ImportError(
                "Couldn't import Django. Are you sure it's installed and "
                "available on your PYTHONPATH environment variable? Did you "
                "forget to activate a virtual environment?"
            )
        raise
    execute_from_command_line(sys.argv)
Empty file added repo_health/__init__.py
Empty file.
13 changes: 13 additions & 0 deletions repo_health/gh_commits/__init__.py
@@ -0,0 +1,13 @@
"""
__init__.py - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
J. Harding
Configure app.
"""

default_app_config = 'repo_health.gh_commits.apps.GhCommitsConfig'
18 changes: 18 additions & 0 deletions repo_health/gh_commits/admin.py
@@ -0,0 +1,18 @@
"""
admin.py - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
J. Harding
Register models in admin.
"""

from django.contrib import admin as a
from repo_health.index.admin import ReadOnlyAdmin
from .models import *

a.site.register(GhCommit, ReadOnlyAdmin)
a.site.register(GhCommitComment, ReadOnlyAdmin)
18 changes: 18 additions & 0 deletions repo_health/gh_commits/apps.py
@@ -0,0 +1,18 @@
"""
apps.py - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
J. Harding
Setup app config for Django.
"""

from django.apps import AppConfig


class GhCommitsConfig(AppConfig):
    name = 'repo_health.gh_commits'
    verbose_name = 'Github Commits'
57 changes: 57 additions & 0 deletions repo_health/gh_commits/migrations/0001_initial.py
@@ -0,0 +1,57 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.10.5 on 2017-02-04 19:51
from __future__ import unicode_literals

from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='GhCommit',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('sha', models.CharField(blank=True, max_length=40, null=True, unique=True)),
                ('created_at', models.DateTimeField()),
                ('ext_ref_id', models.CharField(max_length=24)),
            ],
            options={
                'managed': False,
                'verbose_name': 'GitHub Commit',
                'db_table': 'commits',
            },
        ),
        migrations.CreateModel(
            name='GhCommitComment',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('body', models.CharField(blank=True, max_length=256, null=True)),
                ('line', models.IntegerField(blank=True, null=True)),
                ('position', models.IntegerField(blank=True, null=True)),
                ('comment_id', models.IntegerField(unique=True)),
                ('ext_ref_id', models.CharField(max_length=24)),
                ('created_at', models.DateTimeField()),
            ],
            options={
                'managed': False,
                'verbose_name': 'GitHub Commit Comment',
                'db_table': 'commit_comments',
            },
        ),
        migrations.CreateModel(
            name='GhCommitParent',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
            ],
            options={
                'managed': False,
                'db_table': 'commit_parents',
            },
        ),
    ]