Commit

Merge pull request #41 from jakeharding/dev
Milestone2
jakeharding authored Mar 30, 2017
2 parents 678129a + fe0aa02 commit 31ac90f
Showing 111 changed files with 2,928 additions and 215 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -1,8 +1,12 @@
*.sqlite*
*.db
*.pyc
*.log
*~
.DS_Store
local_settings.py
node_modules
dist
.vscode
.idea
__pycache__
8 changes: 6 additions & 2 deletions CONTRIBUTING.md
@@ -1,8 +1,12 @@
#### No external contributions are currently being accepted because this is a project for a university course.

#### Guidelines
- Always checkout a branch for developing. Never push to master or dev
- Merging into master or dev requires a pull request, code review, and approval by a contributor
- Always checkout a branch for developing. Never push to master or dev.
- Write tests for all code written.
- Use the appropriate coding standard for the code you are writing.
- Backend is written in Python - use [PEP8](https://www.python.org/dev/peps/pep-0008/)
- Frontend is written in ES6 - use [AngularJS styleguide](https://github.com/toddmotto/angular-styleguide)
- Merging into master or dev requires a pull request, code review, and approval by a contributor.
- After pull request is approved, the creator of the pull request will complete the merge unless otherwise noted.

#### Contributors by Github username:
26 changes: 26 additions & 0 deletions Dockerfile
@@ -0,0 +1,26 @@
FROM python:3.6
ENV PYTHONUNBUFFERED 1
LABEL Description="This image supplies the server needed for repo-health."

# installing dependencies such as curl, node, and netcat
RUN apt-get update && apt-get install -y curl netcat
RUN curl -sL https://deb.nodesource.com/setup_7.x | bash -
RUN apt-get install -y nodejs
RUN mkdir /www
WORKDIR /www

COPY requirements.txt /www/
RUN pip install -r requirements.txt
COPY . /www

# Building the UI
WORKDIR repo_health/index/static
RUN npm install && npm run dist
WORKDIR /www

EXPOSE 8000

# Adding start script
COPY docker/runserver.sh /runserver.sh
CMD ["/runserver.sh"]

34 changes: 30 additions & 4 deletions README.md
@@ -1,14 +1,16 @@
# Repository Health Project for CSCI 4900

The repository hold the proof of concept for the repository health and sustainability project for CSCI 4900 at the Univeristy of Nebraska at Omaha. This repository will hold the backend and frontend source code to extract data from Github and ghtorrent and prodvide statistics about a selected repository. Description of the backend and frontend source are provided.
This repository holds the proof of concept for the repository health and sustainability project for CSCI 4900 at the University of Nebraska at Omaha. This repository will hold the backend and frontend source code to extract data from Github and ghtorrent and provide statistics about a selected repository. Descriptions of the backend and frontend source are provided below.

This project is not meant to be in production. Charts for the user interface are not implemented yet, so arrays of integers are displayed where charts will be placed in the future.

## Backend
The Django web framework is used in the project to leverage quick development and the third party packages available. Python requirements are kept in the requirements.txt file, and this file is generated using `pip freeze > requirements.txt`.

Python 3.5 is recommended for this project. It can be downloaded from [python.org](https://www.python.org).

## Frontend
The frontend source is written using Angular and Angular UI-Router.
The frontend source is written in ES2015 using [Angular](https://angularjs.org/). We are using [Bootstrap](http://getbootstrap.com/) for our styles.

## Development Environment
A Unix dev environment is recommended because the setup instructions provided are known to work in these environments using a bash terminal. The instructions may work in a Windows bash terminal but have not been tested.
@@ -18,6 +20,16 @@ A Linux production environment is recommmended, and Ubuntu version 12 and greate

## Database configuration
Both production and development environments use MySql for a DBMS and require a database configuration specific to the individual environment.
If using MySQL 5.7 or greater, a warning may be seen because strict mode is enabled by default. Turn off strict mode by adding a file named:

`/usr/local/bin/etc/my.cnf`.

To this file add:
```
[mysqld]
sql_mode=NO_ENGINE_SUBSTITUTION
```

#### Development
For development, a subset of data is used for testing and provided by ghtorrent [here](http://ghtorrent.org/msr14.html). Use this link to download the database, and after a successful download, unpack the archive to your desired location. Take note of the location. The supplied data needs a database to hold it, so we need to create one using MySql. The installation of MySql varies with the package manager being used, so it is assumed MySql is installed and running.
@@ -35,6 +47,12 @@ Production database is configured in a similar fashion, but the project is not r

## Development setup

### Using Docker
This was tested using v1.12.6 of [Docker](https://docs.docker.com/engine/installation/linux/ubuntu/) and v1.11.2 of [Docker Compose](https://docs.docker.com/compose/install/). Versions below these are untested.
To deploy this application in a docker instance go to the root of this repository and run `docker-compose up`. This will run through the configurations
of our project and create a working copy. This process takes a couple of minutes. Once it is complete, just head to [localhost:8000](http://localhost:8000) to view
this application.
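
Because the build takes a couple of minutes, it can help to poll the port instead of refreshing the browser. The sketch below is illustrative only and is not part of the repository: `wait_for_port` is a hypothetical helper, and its timeout and interval defaults are arbitrary assumptions. It only checks that something accepts TCP connections on the port, not that the application has finished starting.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds within `timeout` seconds.
    Hypothetical helper: the default values are assumptions, not project settings.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For example, `wait_for_port("localhost", 8000)` returns `True` once the web container accepts connections.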

### Backend Configuration
Developers work on many projects, and each project has its own dependencies, or deps.
For this reason, a virtual environment is created on the developer's machine in order to isolate Python dependencies between projects.
@@ -61,7 +79,7 @@ To start using the virtualenv, run `workon <virtualenv name>`.
From inside the virtualenv and in the project root folder, run `pip install -r requirements.txt` to install the deps from the file.
Note for Linux users: `sudo apt-get install libmysqlclient-dev` may need to be run from the terminal in order for MySQL to be used with the mysqlclient dependency. A `mysql_config not found` error when running `pip install -r requirements.txt` indicates that this package is missing.

Once successful, the database is ready to be migrated. Since we are using an existing database and structure we will need fake the initial migrations that normally created the tables and columns for each Django app using models that map the Github tables. These are noted by packages beginning with `gh_`. Run this command for every package in the `repo_health` folder: `python manage.py migrate <package_name> --fake-initial`. After success run `python manage.py migrate` to run any other migrations.
Once successful, the database is ready to be migrated. Since we are using an existing database and structure we will need to fake the initial migrations that normally created the tables and columns for each Django app using models that map the Github tables. These are noted by packages beginning with `gh_`. Run this command for every package in the `repo_health` folder: `python manage.py migrate <package_name> --fake-initial`. After success run `python manage.py migrate` to run any other migrations.
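
The per-package step can be scripted. The following is a minimal sketch, not part of the project. It assumes the packages that need faking are the directories directly under `repo_health` whose names begin with `gh_`, and that each directory name is also the Django app label. The helper names `find_gh_packages` and `build_migrate_commands` are hypothetical. The code only builds the command lines; running them is left to the caller.

```python
from pathlib import Path


def find_gh_packages(repo_health_dir):
    """Return the sorted names of sub-directories that start with 'gh_'."""
    root = Path(repo_health_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith("gh_")
    )


def build_migrate_commands(packages):
    """Build one --fake-initial command per package, then a final plain migrate."""
    commands = [
        ["python", "manage.py", "migrate", name, "--fake-initial"]
        for name in packages
    ]
    # The plain migrate runs any remaining migrations after the faked ones.
    commands.append(["python", "manage.py", "migrate"])
    return commands
```

From the project root, `build_migrate_commands(find_gh_packages("repo_health"))` yields argument lists that could be passed one at a time to `subprocess.run`.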

Run `python manage.py runserver` to start the built in dev server, and navigate a browser to [http://localhost:8000](http://localhost:8000) to view the index page.

@@ -70,7 +88,11 @@ An admin is available at [http://localhost:8000/admin/](http://localhost:8000/ad
Run `python manage.py test --keepdb` to run the tests. The `--keepdb` is important so Django doesn't try to destroy the test database.

### Frontend configuration
From project root, `cd repo_health/index/static`. Run `npm install`. After successful install, [http://localhost:8000](http://localhost:8000) should display the Hello World page.
From project root, `cd repo_health/index/static`.

Run `npm start` and go to [http://localhost:3000](http://localhost:3000) to view the application.

To run ui tests, run `npm test` in the `repo_health/index/static` folder.

#### Some common commands to help determine what python is being used
- `which python` or `which python3`
@@ -88,6 +110,10 @@ Assumptions made about production:
- Apache httpd server is used to serve app using mod_wsgi.
- All static files are served from Apache using a redirect from url `/static/` to a static documents folder. This folder is created using `python manage.py collectstatic`. This is not necessary in development.

## License and Copyright
All source code is covered by the MIT license. This license is located in the LICENSE.txt file at the root of the project.

All other material, such as documentation, is covered by the Creative Commons - Attribution, or the CC BY license.

## Contributing
External contributions are not being accepted at this time. For existing contributors, please use the following header documentation at the top of each file:
27 changes: 27 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,27 @@
version: '2'
services:
  database:
    build: ./docker/database
    restart: always
    command: mysqld --user=root --verbose
    volumes:
      - /var/lib/mysql
    ports:
      - "3306:3306"
    environment:
      MYSQL_BASE: "msr14"
      MYSQL_USER: "msr14"
      MYSQL_PASSWORD: "msr14"
      MYSQL_ROOT_PASSWORD: "password"
      MYSQL_ALLOW_EMPTY_PASSWORD: "yes"

  web:
    build: .
    volumes:
      - .:/www
    ports:
      - "8000:8000"
    links:
      - database
    depends_on:
      - database
8 changes: 8 additions & 0 deletions docker/clean.sh
@@ -0,0 +1,8 @@
#!/bin/bash
# Stops and removes all docker containers and images

docker stop $(docker ps -a -q)
sleep 1
docker rm $(docker ps -a -q)
sleep 1
docker rmi $(docker images -a -q)
14 changes: 14 additions & 0 deletions docker/database/Dockerfile
@@ -0,0 +1,14 @@
FROM mysql:5.5

RUN mkdir /data
WORKDIR /data

# Install MSR14 database and add a custom sql file to be run on initialization.
RUN apt-get update && apt-get install -y curl sed
RUN curl -O https://ghtstorage.blob.core.windows.net/downloads/msr14-mysql.gz
RUN gzip -d msr14-mysql.gz
RUN echo 'USE msr14;' | cat - msr14-mysql > temp && mv temp msr14-mysql
RUN mv msr14-mysql /docker-entrypoint-initdb.d/msr-mysql.sql
COPY create_db.sql /docker-entrypoint-initdb.d/

CMD ["/usr/bin/mysqld"]
9 changes: 9 additions & 0 deletions docker/database/create_db.sql
@@ -0,0 +1,9 @@
create database msr14;

GRANT SELECT, INSERT, DELETE, UPDATE ON msr14.* TO 'msr14'@'%';
GRANT SELECT, INSERT, DELETE, UPDATE ON msr14.* TO 'msr14'@'localhost';

GRANT ALL ON *.* to msr14@localhost IDENTIFIED BY 'msr14';
GRANT ALL ON *.* to msr14@'%' IDENTIFIED BY 'msr14';

FLUSH PRIVILEGES;
23 changes: 23 additions & 0 deletions docker/runserver.sh
@@ -0,0 +1,23 @@
#!/bin/bash

# Create the local_settings.py to connect to the docker mysql
echo "DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'msr14',
        'USER': 'msr14',
        'PASSWORD': 'msr14',
        'HOST': 'database',
        'PORT': '3306'}}" > repo_health/local_settings.py

# Check if the database is done running its init scripts
until nc -z -v -w30 'database' 3306
do
    echo "Waiting for connection to database..."
    # wait for 7 seconds before checking if DB is up
    sleep 7
done

# Migrate and start the server
python manage.py migrate
python manage.py runserver 0.0.0.0:8000
Binary file modified docs/Client - Server Workflow.pdf
Binary file not shown.
Binary file modified docs/Data Flow Diagram.pdf
Binary file not shown.
44 changes: 44 additions & 0 deletions docs/Milestone1.md
@@ -0,0 +1,44 @@
# Milestone 1

## System Description
The information system being built provides an unbiased evaluation of a GitHub repository using data collected by the GHTorrent project. The system is a web based interface and retrieves data about a GitHub repository using a URL specified by the user. The source code for the system is hosted on GitHub at: https://github.com/jakeharding/repo-health.

The intent is to provide the user with usable information about a GitHub repository so the user can make an informed decision about the health and sustainability of the repository and the community supporting it. The system is not meant to assign a value judgment to the statistics.

The system has many parts. The backend provides a connection to the database, serialization of the data, and serving the data to the client in JSON format. The web client uses the JSON data to render the visualization of the data. Both parts are included in the source code repository for convenience, but the system is built so the separation of the frontend and backend can be easily achieved.

The system is currently a proof of concept and is not ready for a production environment. The proof of concept supplies basic statistics about a repository and leaves out complex data manipulation.

## Development Environment
Our development environment consists of several different tools. Our development operating system is a Linux/Unix (Mac) environment. Our database management system is MySQL. Our server is running on Python 3.5+ and is using the Django web framework and dev server. Python’s `virtualenv` is used to keep every developer's Python environment the same. Our front end is using the node package manager (npm) to manage and keep our dependencies in sync.


## Data Flow Diagram
Our Data Flow Diagram can be found [here](https://github.com/jakeharding/repo-health/blob/master/docs/Data%20Flow%20Diagram.pdf).

## Database Schema
We are using ghtorrent's schema. It can be found [here](http://ghtorrent.org/files/schema.pdf).

## Copyright and License
Licenses used for this project cover two areas: documentation and software. The software is covered under the MIT license and the documentation is covered under CC BY.

Each file has the following documentation:

```
fileName.ext - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
Your name or github username
Brief description of the file.
```
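
Filling in this template by hand for every file is easy to get wrong, so a small helper can render it. This is only a sketch under assumptions: `render_file_header` is a hypothetical function that does not exist in the repository, and it returns the plain header text without wrapping it in any language-specific comment syntax.

```python
def render_file_header(file_name, authors, description, year=2017):
    """Fill in the header template for one file and return it as a string.

    Hypothetical helper: `authors` is a list of names or GitHub usernames.
    """
    lines = [
        f"{file_name} - (C) Copyright - {year}",
        "This software is copyrighted to contributors listed in CONTRIBUTIONS.md.",
        "SPDX-License-Identifier: MIT",
        "Author(s) of this file:",
        *authors,  # one line per author
        description,
    ]
    return "\n".join(lines)
```

The returned text still needs to be placed inside a comment block appropriate to the file type.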

## Contributions
##### Jake:
Jake is responsible for implementation of connecting to the database, the web server for the REST API, and all related documentation.

##### Benji:
Benji is responsible for the frontend implementation. This includes UI/UX Design and the display of information from the server on the UI. All documentation for the frontend will be written by Benji.
52 changes: 52 additions & 0 deletions docs/Milestone2.md
@@ -0,0 +1,52 @@
# Milestone 2

## System Description

The system is a web based user interface and retrieves data about a GitHub repository using a URL specified by the user. The source code for the system is hosted on GitHub at: https://github.com/jakeharding/repo-health.

The intent is to provide the user with usable information about a GitHub repository so the user can make an informed decision about the health and sustainability of the repository and the community supporting it.

The system has many parts. The backend provides a connection to the database, serialization of the data, and serving the data to the client in JSON format. The web client uses the JSON data to render the visualization of the data. Both parts are included in the source code repository for convenience, but the system is built so the separation of the frontend and backend can be easily achieved.

The system is currently a proof of concept and is not ready for a production environment. The proof of concept supplies basic statistics about a repository and leaves out complex data manipulation.

Rendering of charts has been left out in this milestone but the data is still present in the user interface as a list of numbers.

## Development Environment
Our development operating system is a Linux/Unix (Mac) environment and our database management system is MySQL.

Our server is running on Python 3.5+ and is using the Django web framework and dev server. Python’s `virtualenv` is used to keep every developer's Python environment the same.

Our front end is using the node package manager (npm) to manage and keep our dependencies in sync.

Docker is used to create an instance of the system as fast as possible.

## Data Flow Diagram
Our Data Flow Diagram can be found [here](https://github.com/jakeharding/repo-health/blob/master/docs/Data%20Flow%20Diagram.pdf).

## Database Schema
We are using ghtorrent's schema. It can be found [here](http://ghtorrent.org/files/schema.pdf).

## Copyright and License
Licenses used for this project cover two areas: documentation and software. The software is covered under the MIT license and the documentation is covered under CC BY.

Each file has the following documentation:

```
fileName.ext - (C) Copyright - 2017
This software is copyrighted to contributors listed in CONTRIBUTIONS.md.
SPDX-License-Identifier: MIT
Author(s) of this file:
Your name or github username
Brief description of the file.
```

## Contributions
##### Jake:
Jake is responsible for implementation of connecting to the database, the web server for the REST API, and all related documentation.

##### Benji:
Benji is responsible for the frontend implementation. This includes UI/UX Design and the display of information from the server on the UI. All documentation for the frontend will be written by Benji.
4 changes: 2 additions & 2 deletions docs/README.md
@@ -1,6 +1,6 @@
#REST API README
# REST API README

##Purpose
## Purpose
GitHub maintains a REST API so why do we need another?
- Hosting and supplying the data independently of GitHub enables us to structure the data in a fashion suitable for our purpose.

22 changes: 22 additions & 0 deletions docs/Use Case.md
@@ -0,0 +1,22 @@
# Use Case

**Title:** Determining the health of a Github repository

**Primary Actor:** Project Leader

**Goal in Context:** The project lead is able to assess whether an open source project is healthy enough by their company's standards to incorporate into their product

**Stakeholders:**
* Project Lead: To understand if software is appropriate to incorporate into their own product
* Developer: To clearly gauge if a piece of software will need to be maintained by them or if it's reliable
* Manager: To assess the longevity of this project with regards to dependencies

**Preconditions:**
* Metrics for a healthy repository are already defined
* The repositories to be compared are already picked out

**Main Success Scenario:** Project lead understands the metrics presented and makes an accurate decision on whether the software should be incorporated

**Failed End Conditions:** Project lead receives little to no information on the health of the chosen repository

**Trigger:** Project lead is assigned to a brand new project and needs to assess what libraries and dependencies should be used
9 changes: 9 additions & 0 deletions docs/metrics/Issues.md
@@ -0,0 +1,9 @@
# Metrics for Issues of Repo

### Issue ideas
- Number of Issues vs last completed issue
- Number of Issues closed over a period of time
- Average of how often an issue is created
- Age of Issues
- How often a contributor comments on an issue
- Average of how often issues get commented on
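
A few of these ideas reduce to simple date arithmetic. The sketch below is one possible reading of three of them, not an agreed definition. The record layout (dictionaries with `created_at` and `closed_at` keys) is an assumption for illustration and is not the ghtorrent schema, and the sample issues are made-up data.

```python
from datetime import datetime


def count_closed_between(issues, start, end):
    """Count issues whose closed_at falls in the half-open range [start, end)."""
    return sum(
        1
        for issue in issues
        if issue["closed_at"] is not None and start <= issue["closed_at"] < end
    )


def average_creation_interval(issues):
    """Mean time between consecutive issue creations, or None with fewer than two issues."""
    created = sorted(issue["created_at"] for issue in issues)
    if len(created) < 2:
        return None
    return (created[-1] - created[0]) / (len(created) - 1)


def open_issue_ages(issues, now):
    """Age of every issue that is still open at `now`."""
    return [now - issue["created_at"] for issue in issues if issue["closed_at"] is None]


# Illustrative sample data only.
issues = [
    {"created_at": datetime(2017, 1, 1), "closed_at": datetime(2017, 1, 10)},
    {"created_at": datetime(2017, 1, 5), "closed_at": None},
    {"created_at": datetime(2017, 1, 9), "closed_at": datetime(2017, 2, 1)},
]
print(count_closed_between(issues, datetime(2017, 1, 1), datetime(2017, 2, 1)))  # 1
print(average_creation_interval(issues))  # 4 days, 0:00:00
```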
10 changes: 8 additions & 2 deletions docs/metrics/PullRequests.md
@@ -2,7 +2,7 @@


### From slides:
- total numer of pr
- total number of pr
- total number commits
- number of pr with no comment from maintainer
- number of prs waiting for a response
@@ -11,5 +11,11 @@
### More ideas
- organizational prs
- number of prs from outside maintainers
- total number of comments
- total number of comments on prs
- total number of comments from maintainers
- most contributing user (user with most accepted prs)
- avg size of pr?
- not maintainer pr past year?
- not org member pr? past year?
- avg comments per pr
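
Two of these ideas, average comments per pr and the most contributing user, can be sketched as follows. This is only an illustration under assumptions: the record fields `author`, `merged`, and `comment_count` are hypothetical and are not taken from the ghtorrent schema, and "accepted" is read here as "merged".

```python
from collections import Counter


def average_comments_per_pr(pull_requests):
    """Mean number of comments per pull request, or 0.0 for an empty list."""
    if not pull_requests:
        return 0.0
    return sum(pr["comment_count"] for pr in pull_requests) / len(pull_requests)


def most_contributing_user(pull_requests):
    """User with the most merged pull requests, or None if nothing was merged.

    Ties resolve to the user whose merged pull request appears first in the list.
    """
    merged = Counter(pr["author"] for pr in pull_requests if pr["merged"])
    if not merged:
        return None
    return merged.most_common(1)[0][0]
```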
