Add multi-edges to the author and artifact networks #115

Merged
merged 27 commits into from
Apr 30, 2018
Changes from 23 commits
2f1b4d9
Add implemention of multi-edge-networks for author and artifact networks
ecklbarb Apr 17, 2018
b55d3e8
Adapt plot functions to multi-edge networks
ecklbarb Apr 17, 2018
f190ca1
Use color palette 'viridis' for plotting for better flexibility
ecklbarb Apr 17, 2018
7c628fb
Introduce the vertex attribute 'kind'
ecklbarb Apr 17, 2018
021ac8b
Change the 'simplify.network' function to handle multi-edge networks
ecklbarb Apr 18, 2018
784c417
Include 'relation' and 'kind' in the expected data of the test suite
ecklbarb Apr 18, 2018
7ad49c4
Remove vertex attribute 'id' and add vertex attribute 'artifact.type'
ecklbarb Apr 19, 2018
be6ee8c
Add tests for networks with multiple relations
ecklbarb Apr 19, 2018
2941c22
Set Copyright in test suite
ecklbarb Apr 20, 2018
cd4645f
Add description of edge and vertex attributes in 'README'
ecklbarb Apr 20, 2018
3e286ac
Bug fix: Set attribute 'artifact.type' correctly
ecklbarb Apr 23, 2018
c2e92c7
Bug fix: Plotting multi networks
ecklbarb Apr 23, 2018
e95afd0
Update changelog
ecklbarb Apr 20, 2018
ef094eb
Minor fixes from review in PR #115
ecklbarb Apr 26, 2018
d791df8
Change plot function: edge width depends on edge weight
ecklbarb Apr 26, 2018
86f39c5
Remove unneeded browser() statements
clhunsen Apr 26, 2018
84516fc
Streamline and improve vertex and edge attributes
clhunsen Apr 26, 2018
4dead55
Adjust resolution of vertex attribute 'kind'
clhunsen Apr 26, 2018
a9605c6
Rename artifact-vertex kind for mail relation to 'MailThread'
clhunsen Apr 26, 2018
78c7c87
Rename artifact type for issue relation to 'IssueComment'
clhunsen Apr 26, 2018
26f0c59
Fix plot legends
clhunsen Apr 26, 2018
0817dd8
Add additional parameters to network simplification functions
bockthom Apr 27, 2018
74bd090
Rename artifact type for issue relation to 'IssueEvent'
ecklbarb Apr 27, 2018
8833ae8
Minor fixes from review in PR #115
ecklbarb Apr 30, 2018
9e68293
Add comment
ecklbarb Apr 30, 2018
b5e03ac
Simplify resolution of vertex kind for artifact networks
clhunsen Apr 30, 2018
bc3c8c1
Adjust return value of 'construct.edge.list.from.key.value.list'
clhunsen Apr 30, 2018
42 changes: 42 additions & 0 deletions NEWS.md
@@ -1,5 +1,47 @@
# codeface-extraction-r – Changelog

## unversioned
Review comment (Collaborator):

@bockthom: Just as a reminder: When packing the next release, we need to combine the duplicate headings named `## unversioned`.


### Add relations to authors and artifacts (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- add for new relation types for each edge
- accept vector with more than one relation for `author.relation` and `artifact.relation` in util-conf.R

### Changes in util-networks (#98, 2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)

#### Build Networks (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- edit function `get.author.network` to handle more than one relation
- edit function `get.artifact.network` to handle more than one artifact relation
- handle more than one relation type and merge the resulting vertex lists and edge lists in `get.bipartite.network`
- enable for different relations for `authors.to.artifacts` in `get.multi.network`, add information about the relation
and merge vertex sets
- add new vertex attribute `kind`
(7c628fb93eb21f280c7d9da66680f817e107fa24, 7ad49c4ad937c9a6c7398a45179e25d5d5c03faa)
- remove vertex attribute `id` in artifact vertices (7ad49c4ad937c9a6c7398a45179e25d5d5c03faa)

#### Network and Edge Construction (2f1b4d9b0d6a629163a6dd3111b20930e15fcc13)
- split function `construct.network.from.list` into two functions, `construct.edge.list.from.key.value.list` and `construct.network.from.edge.list`
- add functions `merge.network.data` and `merge.networks`
- handle more than one relation in function `add.edges.for.bipartite.relation` and set the edge attribute `relation`
- add function `create.empty.edge.list`
- allow different relations in `get.bipartite.relation` and save the type of the relation in an attribute `relation`

#### Simplify (021ac8b88e9a181364a51e89807df55cb741ed44)
- iterate over the different types of relations and simplify the subnetworks relating to the `relation` attribute
- the `EDGE.ATTR.HANDLING` of the attribute `relation` is "first"

### Changes in test suite (784c417c50eb1de5d0143908a390ead6ba22dbbf, 7ad49c4ad937c9a6c7398a45179e25d5d5c03faa, be6ee8cd48dc7692e02b7f1c512870591300fa8a)
- add relation attribute `relation` to result data frames
- remove vertex attribute `id`
- write new tests for networks with more than one relation type

### Changes in util-plot (b55d3e84a5f9b122dacd0ee52784d930f22d1f4b, f190ca130a15a82e5eed836e9ffc53b8a34aac20)
- colors of the edges depend on the relation type
- line shape depends on the edge type (inter or intra)
- change colors of edges
- remove colors from `plot.fix.type.attributes`
- use palette 'viridis'
- different colors for artifacts


## unversioned
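
The per-relation simplification described in the "Simplify" changelog entry above can be illustrated with plain `igraph` calls. This is only a minimal sketch and not the project's `simplify.network` implementation: the helper name `simplify.by.relation`, the toy edge list, and the choice to sum `weight` are assumptions made for illustration. Only the idea of handling each `relation` separately and keeping the first `relation` value comes from the changelog.

```r
library(igraph)

## toy multi-edge network (hypothetical data): two "mail" edges and one
## "cochange" edge between A and B, plus one "cochange" edge between B and C
edges = data.frame(
    from = c("A", "A", "A", "B"),
    to = c("B", "B", "B", "C"),
    weight = 1,
    relation = c("mail", "mail", "cochange", "cochange"),
    stringsAsFactors = FALSE
)
network = igraph::graph_from_data_frame(edges, directed = FALSE)

## hypothetical helper: simplify each relation's subnetwork on its own,
## so that parallel edges of different relations are never merged
simplify.by.relation = function(network) {
    relations = unique(igraph::E(network)$relation)

    edge.lists = lapply(relations, function(rel) {
        ## keep all vertices, drop the edges of every other relation
        other.edges = which(igraph::E(network)$relation != rel)
        sub = igraph::delete_edges(network, other.edges)
        ## merge parallel edges: sum the weights (assumption), keep the first relation
        sub = igraph::simplify(sub, remove.multiple = TRUE, remove.loops = FALSE,
                               edge.attr.comb = list(weight = "sum", relation = "first"))
        igraph::as_data_frame(sub, what = "edges")
    })

    ## merge the simplified edge lists back into one network
    merged = do.call(rbind, edge.lists)
    vertices = igraph::as_data_frame(network, what = "vertices")
    igraph::graph_from_data_frame(merged, directed = FALSE, vertices = vertices)
}

simplified = simplify.by.relation(network)
## A and B stay connected by one "mail" edge and one "cochange" edge
print(igraph::as_data_frame(simplified, what = "edges"))
```

Simplifying the whole network in one pass would collapse the "mail" and "cochange" edges between A and B into a single edge, which is why the sketch splits by `relation` first.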

24 changes: 20 additions & 4 deletions README.md
@@ -43,6 +43,7 @@ When selecting a version to work with, you should consider the following points:
- `ggraph`: For plotting of networks (needs `udunits2` system library, e.g., `libudunits2-dev` on Ubuntu!)
- `markovchain`: For core/peripheral transition probabilities
- `lubridate`: For convenient date conversion and parsing
- `viridis`: For plotting of networks with nice colours


## How-To
@@ -214,7 +215,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
**Note**: Default values are shown in *italics*.

- `author.relation`
* The relation among authors, encoded as edges in an author network
* The relation(s) among authors, encoded as edges in an author network
* **Note**: The author--artifact relation in bipartite and multi networks is configured by `artifact.relation`!
* possible values: [*`"mail"`*, `"cochange"`, `"issue"`]
- `author.directed`
@@ -228,7 +229,7 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
* Remove all authors from an author network (including bipartite and multi networks) who are not present in an author network constructed with `artifact.relation` as relation, i.e., all authors that have no bipartite relations in a bipartite/multi network are removed.
* [`TRUE`, *`FALSE`*]
- `artifact.relation`
* The relation among artifacts, encoded as edges in an artifact network
* The relation(s) among artifacts, encoded as edges in an artifact network
* **Note**: This relation configures also the author--artifact relation in bipartite and multi networks!
* possible values: [*`"cochange"`*, `"callgraph"`, `"mail"`, `"issue"`]
- `artifact.directed`
@@ -239,13 +240,14 @@ Updates to the parameters can be done by calling `NetworkConf$update.variables(.
* The list of edge-attribute names and information
* a subset of the following as a single vector:
- timestamp information: *`"date"`*, `"date.offset"`
- general information: *`"artifact.type"`*
- author information: `"author.name"`, `"author.email"`
- committer information: `"committer.date"`, `"committer.name"`, `"committer.email"`
- e-mail information: *`"message.id"`*, *`"thread"`*, `"subject"`
- commit information: *`"hash"`*, *`"file"`*, *`"artifact.type"`*, *`"artifact"`*, `"changed.files"`, `"added.lines"`, `"deleted.lines"`, `"diff.size"`, `"artifact.diff.size"`, `"synchronicity"`
- commit information: *`"hash"`*, *`"file"`*, *`"artifact"`*, `"changed.files"`, `"added.lines"`, `"deleted.lines"`, `"diff.size"`, `"artifact.diff.size"`, `"synchronicity"`
- PaStA information: `"pasta"`,
- issue information: *`"issue.id"`*, *`"event.name"`*, `"issue.state"`, `"creation.date"`, `"closing.date"`, `"is.pull.request"`
* **Note**: `"date"` is always included as this information is needed for several parts of the library, e.g., time-based splitting.
* **Note**: `"date"` and `"artifact.type"` are always included as this information is needed for several parts of the library, e.g., time-based splitting.
* **Note**: For each type of network that can be built, only the applicable part of the given vector of names is respected.
* **Note**: For the edge attributes `"pasta"` and `"synchronicity"`, the project configuration's parameters `pasta` and `synchronicity` need to be set to `TRUE`, respectively (see below).
- `simplify`
@@ -267,6 +269,20 @@ You can also update the `NetworkConf` object at any time by calling `NetworkBuil
For more examples, please look in the file `showcase.R`.


### Network properties

- Mandatory vertex attributes
* *`"type"`*: [`"Author"`, `"Artifact"`]
* *`"kind"`*: [`"Author"`,`"File"`, `"Feature"`, `"Function"`, `"Mail"`, `"Issue"`,`"FeatureExpression"`]
* *`"name"`*

- Mandatory edge attributes
* *`"type"`*: [`Unipartite`, `Bipartite`]
* *`"artifact.type"`*: [`"Author"`,`"File"`, `"Feature"`, `"Function"`, `"Mail"`, `"Issue"`,`"FeatureExpression"`]
Review comment (Collaborator):

We need to update the values here and for `kind` above:

  • remove artifact type "Author",
  • change artifact type "Issue" to "IssueEvent", and
  • change vertex kind "Mail" to "MailThread".

* *`"relation"`*: [`mail`, `cochange`, `issue`, `callgraph`] (from `artifact.relation` and `author.relation` attributes in the `NetworkConf` class)
* *`"date"`*


## File/Module overview

- `util-init.R`
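
The mandatory vertex and edge attributes listed under "Network properties" above can be made concrete with a small hand-built network. This is a sketch under stated assumptions, not output of the library: the vertex names, the timestamps, and the literal strings used for `type` are made up for illustration (the library uses its own constants such as `TYPE.ARTIFACT`), and the attribute values follow the README text as shown in this diff, which the review comment proposes to rename in part.

```r
library(igraph)

## hypothetical vertices carrying the mandatory attributes "name", "kind", and "type"
vertices = data.frame(
    name = c("Alice", "Bob", "Base_Feature"),
    kind = c("Author", "Author", "Feature"),
    type = c("Author", "Author", "Artifact"),
    stringsAsFactors = FALSE
)

## hypothetical edges carrying the mandatory attributes
## "type", "artifact.type", "relation", and "date"
edges = data.frame(
    from = c("Alice", "Alice", "Bob"),
    to = c("Bob", "Base_Feature", "Base_Feature"),
    date = as.POSIXct(c("2016-07-12 15:58:59", "2016-07-12 16:00:45",
                        "2016-07-12 16:06:32"), tz = "UTC"),
    artifact.type = c("Mail", "Feature", "Feature"),
    type = c("Unipartite", "Bipartite", "Bipartite"),
    relation = c("mail", "cochange", "cochange"),
    stringsAsFactors = FALSE
)

network = igraph::graph_from_data_frame(edges, directed = FALSE, vertices = vertices)

## check that no mandatory attribute is missing
mandatory.vertex.attributes = c("name", "kind", "type")
mandatory.edge.attributes = c("type", "artifact.type", "relation", "date")

missing.vertex.attributes = setdiff(mandatory.vertex.attributes,
                                    igraph::vertex_attr_names(network))
missing.edge.attributes = setdiff(mandatory.edge.attributes,
                                  igraph::edge_attr_names(network))
```

Both `missing.*` vectors are empty for this network. A check of this kind can serve as a quick sanity test on any network before relying on `kind` or `relation` downstream.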
4 changes: 3 additions & 1 deletion tests/test-data-cut.R
@@ -14,6 +14,7 @@
## Copyright 2017 by Christian Hechtl <[email protected]>
## Copyright 2017 by Felix Prasse <[email protected]>
## Copyright 2018 by Claus Hunsen <[email protected]>
## Copyright 2018 by Barbara Eckl <[email protected]>
## All Rights Reserved.


@@ -68,7 +69,8 @@ test_that("Cut commit and mail data to same date range.", {
date = get.date.from.string("2016-07-12 16:04:40"),
date.offset = as.integer(c(100)),
subject = c("Re: Fw: busybox 2 tab"),
thread = sprintf("<thread-%s>", c(9)))
thread = sprintf("<thread-%s>", c(9)),
artifact.type = "Mail")

commit.data = x.data$get.data.cut.to.same.date(data.sources = data.sources)$get.commits()
rownames(commit.data) = 1:nrow(commit.data)
28 changes: 16 additions & 12 deletions tests/test-networks-artifact.R
@@ -13,6 +13,7 @@
##
## Copyright 2017-2018 by Christian Hechtl <[email protected]>
## Copyright 2017 by Claus Hunsen <[email protected]>
## Copyright 2018 by Barbara Eckl <[email protected]>
## All Rights Reserved.


@@ -47,22 +48,25 @@ test_that("Network construction of the undirected artifact-cochange network", {
network.built = network.builder$get.artifact.network()

## vertex attributes
vertices = c("Base_Feature", "foo", "A")
vertices = data.frame(name = c("Base_Feature", "foo", "A"),
kind = "Feature",
type = TYPE.ARTIFACT)

data = data.frame(from = c("Base_Feature", "Base_Feature"),
to = c("foo", "foo"),
date = get.date.from.string(c("2016-07-12 16:06:32", "2016-07-12 16:06:32")),
hash = c("0a1a5c523d835459c42f33e863623138555e2526", "0a1a5c523d835459c42f33e863623138555e2526"),
file = c("test2.c", "test2.c"),
artifact.type = c("Feature", "Feature"),
artifact = c("Base_Feature", "foo"),
weight = c(1, 1),
type = TYPE.EDGES.INTRA)
data = data.frame(
from = c("Base_Feature", "Base_Feature"),
to = c("foo", "foo"),
date = get.date.from.string(c("2016-07-12 16:06:32", "2016-07-12 16:06:32")),
artifact.type = c("Feature", "Feature"),
hash = c("0a1a5c523d835459c42f33e863623138555e2526", "0a1a5c523d835459c42f33e863623138555e2526"),
file = c("test2.c", "test2.c"),
artifact = c("Base_Feature", "foo"),
weight = 1,
type = TYPE.EDGES.INTRA,
relation = "cochange"
)

## build expected network
network.expected = igraph::graph.data.frame(data, directed = FALSE, vertices = vertices)
network.expected = igraph::set.vertex.attribute(network.expected, "id", value = igraph::get.vertex.attribute(network.expected, "name"))
igraph::V(network.expected)$type = TYPE.ARTIFACT

expect_true(igraph::identical_graphs(network.built, network.expected))
})