-
Notifications
You must be signed in to change notification settings - Fork 31
Database Schema
Alvaro del Castillo edited this page Oct 16, 2013
·
3 revisions
This schema is a bit out of date, but this is the best we have so far. This figure was obtained from the deliverable 3.1 in the FLOSSMetrics project.
- scmlog This table contains general information about the commits. Every commit in the repository is represented by a record in the scmlog table
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Commit unique identifier | PK |
repository id | INTEGER | Repository identifier | FK |
author id | INTEGER | Author identifier. Some source control management systems,differentiate the person who did the commit from the person who actually made the changes. When not supported by the repository, this field will be NULL | FK |
committer id | INTEGER | Committer identifier. It is the identifier in the database of the person who did the commit It’s the revision identifier in the repository. It’s always unique in every repository. | FK |
rev | MEDIUMTEXT | Revision number assigned by the SCM (CVS, SVN,GIT,...) | |
date | DATETIME | Date and time of the commit | |
message | LONGTEXT | General comment about the commit | |
composed rev | BOOL | It’s a boolean to indicate whether the rev field is composed or not. This is needed because the rev field must be unique in every repository which is not possible in CVS since it uses revision numbers per file. The combination of a file path and its revision is what make a commit unique in a system like CVS. For this particular case the rev field is represented by the concatenation of the revision number, the pipe character and the file path. |
- file types This table contains a register for each kind of file that may be found in the repository, such as documentation source code, images, etc.
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | File type unique identifier | PK |
file_id | INTEGER | File identifier | FK |
type_id | MEDIUMTEXT | File type are: code:source code files (C/C++,Python, Java, etc.), build:files used to build and configure the source code (Make-file, configure, cvsignore, etc.), ui:files containing graphical user interface definitions (glade, gtkbuilder, ui files, etc.), i18n: translation files (.po, .mo, etc.), documentation:documentation files devel doc:documentation for developers (HACK-ING, ChangeLog, etc.), package:package files (.tar.gz, .deb, .rmp, etc.), image:icons and files (.png, .jpeg, etc.), multimedia: audio and video files (.ogg, .avi, .mp3, etc.), unknown: files with an unknown type, generally files that don’t have extension |
- actions This table contains the different actions performed in a commit. In sys- tems like CVS, where the commit is limited to a single file, there will be only one record in the actions table for every commit. However, most of the version control systems support atomic commits, where several actions are carried out on several files@footnote
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Action unique identifier | PK |
commit_id | INTEGER | Commit identifier where the action was performed | FK |
file_id | INTEGER | File identifier | FK |
branch_id | INTEGER | Branch identifier. | FK |
type | VARCHAR(1) | Action type (A:Added, M:Modified, D:Deleted, V:Renamed, C:copied, R:Replaced, ). Not all of the action types are always supported, for example, for CVS repositories only A, M and D actions are supported. |
- branches This table contains the distinct branches of a repository.
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Branches unique identifier | PK |
name | VARCHAR(255) | Branches name | FK |
- metrics This table contains distinct metrics obtained from a file. This extension provides simple source code metrics for every revision of every single file found in the repository. Since this extension is about source code, it uses the FileTypes extension to get only source code files. Note: a full descrip- tion of this metrics can be found in the appendixes of this document
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Metric unique identifier | PK |
file_id | INTEGER | File identifier | FK |
commit_id | INTEGER | Commit identifier | FK |
lang | TINYTEXT | Programming language | |
sloc | INTEGER | Number lines of code | |
loc | INTEGER | NUmber lines of all the file | |
ncomment | INTEGER | Number of comments | |
lcomment | INTEGER | Number lines of the comments | |
lblank | INTEGER | Number of blank lines | |
mccabe_min | INTEGER | Minimun Mccabe complexity of the functions that exists in the file | |
nfunctions | INTEGER | Number of functions | |
mccabe_max | INTEGER | Maximum Mccabe complexity of the functions that exists in the file | |
mccabe_sum | INTEGER | Sum Mccabe complexity of the functions that exists in the file | |
mccabe_mean | INTEGER | Mean Mccabe complexity of the functions that exists in the file | |
mccabe_median | INTEGER | Median Mccabe complexity of the functions that exists in the file | |
halstead_length | INTEGER | Halstead length in the file | |
halstead_vol | INTEGER | Halstead volumen in the file | |
halstead_level | DOUBLE | Halstead level in the file | |
halstead_md | INTEGER | Halstead mental discrimination |
- people This table contains registers about people have worked in the repository. Contains the name and email (when available) of the people involved in the repository.
Name | Data Type | Description | Key |
---|---|---|---|
people_id | INTEGER | People unique identifier | PK |
name | VARCHAR(255) | People name | |
VARCHAR(255) | People mail |
- repositories This table contains uris to the analysed repositories
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Repository unique identifier | PK |
uri | VARCHAR(255) | URI of the repository | |
name | VARCHAR(255) | Repository name | |
type_2 | VARCHAR(30) | Repository type (cvs, svn, git) |
- commits lines This table contains information about the line’s number added and re- moved by commit.
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Commit line unique identifier | PK |
commit_id | INTEGER | Commit identifier | FK |
added_name | INTEGER | Number lines added | |
removed_name | INTEGER | Number lines removed |
- file copies This table contains general information about the file copies. It is used to store additional information about actions that involve more than one file. Copies, moves, renames and replacements are actions performed over two or more files. The file id field of the actions table refers always to the file that is the object of the action.
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | File copies unique identifier | PK |
from_id | INTEGER | Source file identifier. Identifier of the file that is the source of the action. In a move or copy operation this is the file id of the file from which the move or copy is done. It’s a foreign key that references the ’id’ field of the ’files’ table | FK |
from_commit_id | INTEGER | Commit source identifier. The source file contents are taken from the revision associated to this commit identifier. It’s a foreign key that references the ’id’ field of the ’scmlog’ table | FK |
to_id | INTEGER | Target file identifier. Identifier of the file that is the destination of the action. In a move or copy operation, this field is the same than the 'file id’ in the ’actions’ table. However, when a file is replaced, the ’file id’ stored in the ’actions’ table is the existing file being replaced, and this field contains the ’file id’ of the new file that replaces the existing one. It’s a foreign key that references the ’id’ field of the ’files’ table | FK |
action_id | INTEGER | Action identifier | FK |
new_file_name | MEDIUMTEXT | Contains the new name of the file for rename actions or ’NULL’ for other actions |
- files This table contains general information about the files found in the repos- itory
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | File unique identifier | PK |
repository_id | INTEGER | Repository identifier | FK |
file_name | VARCHAR(255) | FIle or directory name |
- files links This table contains general information about the topology between files. The relationship between two files is always parent - child.
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | File links unique identifier | PK |
file_id | INTEGER | File identifier | FK |
parent_id | INTEGER | Parent file identifier or -1 if the file is in the root of the repository. It’s a foreign key that references the id field of the files table | FK |
commit_id | INTEGER | Commit identifier | FK |
- tag revisions This table contains general information about the list of revisions pointing to every tag
Name | Data Type | Description | Key |
---|---|---|---|
id | INTEGER | Tag revision unique identifier | PK |
datasource_id | INTEGER | Datasource identifier | FK |
commit_id | INTEGER | Commit identifier | FK |
tag_id | INTEGER | Tag identifier | FK |
- tags This table contains general information about the name of the tags.
Name | Data Type | Description | Key |
---|---|---|---|
people_id | INTEGER | Tag unique identifier | PK |
name | VARCHAR(255) | Tag name |