Version v3.2 #126

Merged
merged 54 commits into from
May 17, 2018


clhunsen
Collaborator

3.2

Changes in detail

v3.1.2...510863e

Added

Changed/Improved

  • Remove the mandatory vertex attribute artifact.type due to inconsistent use ()
  • Remove the mandatory vertex attribute id from artifact vertices due to inconsistent use (7ad49c4)
  • Streamline edge attribute artifact.type for uniformity ()
  • Use color palette 'viridis' for plotting for better flexibility (f190ca1)
  • Edge width in network plots now depends on edge weight, i.e., width = 0.3 + 0.5 * log(weight) (d791df8)
  • Split function construct.network.from.list into the two functions construct.edge.list.from.key.value.list and construct.network.from.edge.list (2f1b4d9)
  • Handle data for more than one relation in function add.edges.for.bipartite.relation (2f1b4d9)
  • Retain one edge for each available value of edge attribute relation during network simplification (021ac8b)
  • Also read lines from the PaStA data whose message.id is not mapped to a commit.hash (992ddf8)
  • Add column revision.set.id to PaStA data to indicate which e-mails are concerned with the same patch (992ddf8)
  • Add PaStA data to the unfiltered commit data if configured (70d9b8b)
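The edge-width rule from the list above (width = 0.3 + 0.5 * log(weight)) can be illustrated with a small Python sketch. It is not the project's plotting code: the function name `edge_width` is hypothetical, and the natural logarithm is an assumption, since the changelog does not state the base.

```python
import math


def edge_width(weight: float) -> float:
    """Return the plot width for an edge of the given weight.

    Implements width = 0.3 + 0.5 * log(weight) as quoted in the changelog.
    Assumption: 'log' means the natural logarithm.
    """
    if weight <= 0:
        raise ValueError("edge weight must be positive")
    return 0.3 + 0.5 * math.log(weight)


# Widths grow slowly with the weight, so heavy edges do not dominate the plot.
widths = {weight: round(edge_width(weight), 3) for weight in (1, 2, 10, 100)}
```

Under this rule an edge of weight 1 gets the base width of 0.3, and a tenfold increase in weight adds only about 1.15 to the width.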

Fixed

  • Check whether the object passed to the ProjectConf setter in the ProjectData class really is an object of type ProjectConf (ab00c96)
  • The method for eigenvector centrality now properly considers whether the network is directed or not (c0277c3)
  • Fix a bug that caused errors when the core classification within a core-periphery classification is empty (c0277c3)

bockthom and others added 30 commits April 18, 2018 19:11
When constructing ranges, we sometimes need to avoid a last range that is
no longer a perfect range, i.e., one that is shorter than the specified time
period. Therefore, add a parameter 'imperfect.range.ratio' which quantifies
the allowed imperfectness (e.g., at least 70% of the specified time period
is needed to keep an imperfect range; otherwise, it is combined with the
previous range).

This fixes #104.

In addition, add a parameter 'include.end.date' to specify whether to
add 1 second to the end (to include the end date, as the end of ranges is
exclusive) or not. (The default should be to add 1 second to the end, as
hitherto. However, we came across situations where we explicitly don't want
that behavior, which is the reason for adding the additional parameter.)

Also add some additional tests for the newly introduced behavior.
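
The behavior described in this commit message can be sketched in Python as follows. This is an illustration under assumptions, not the project's R implementation: the function name `construct_ranges`, its signature, and the choice of consecutive (non-overlapping) ranges are all hypothetical; only the two parameter ideas ('imperfect.range.ratio' and 'include.end.date') come from the message.

```python
from datetime import datetime, timedelta


def construct_ranges(start, end, period, imperfect_range_ratio=0.0,
                     include_end_date=True):
    """Split [start, end) into consecutive ranges of length `period`.

    If the last range is shorter than imperfect_range_ratio * period,
    it is combined with the previous range.
    """
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if include_end_date:
        # Range ends are exclusive, so shift the end by one second
        # to include the end date itself.
        end = end + timedelta(seconds=1)

    # Collect the range boundaries; the last step is capped at `end`.
    bounds = [start]
    while bounds[-1] < end:
        bounds.append(min(bounds[-1] + period, end))

    # Correct the ranges after construction: drop the second-to-last
    # boundary if the last range is too short. A single range that is
    # already shorter than the period is left untouched.
    if len(bounds) > 2:
        last_length = bounds[-1] - bounds[-2]
        if last_length < imperfect_range_ratio * period:
            del bounds[-2]

    return list(zip(bounds[:-1], bounds[1:]))


ranges = construct_ranges(datetime(2018, 1, 1), datetime(2018, 1, 25),
                          timedelta(days=10), imperfect_range_ratio=0.7,
                          include_end_date=False)
```

In this example, the trailing 4-day range is below 70% of the 10-day period, so it is merged into the previous range and two ranges remain.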

Signed-off-by: Thomas Bock <[email protected]>
Signed-off-by: Thomas Bock <[email protected]>
To improve the current implementation of the function
'construct.overlapping.ranges' regarding the parameter
'imperfect.range.ratio', we now use one single compact if-block for
correcting the ranges *after* their construction.

The test suite is adapted to catch the corner case where we obtain a
single range which is already shorter than the chosen time period
while also passing 'imperfect.range.ratio' in a manner that would cause
the adaptation of the last range.

Signed-off-by: Claus Hunsen <[email protected]>
Reviewed-by: Thomas Bock <[email protected]>
 Enable contraction of imperfect ranges in the end

Reviewed-by: Claus Hunsen <[email protected]>
Reviewed-by: Thomas Bock <[email protected]>
Now, it is possible to construct networks which include more than one
artifact and author relation.

To provide this functionality, the allowed number of the
'author.relation' and the 'artifact.relation'
is changed to Inf instead of 1.
A further edge attribute 'relation' is added to describe the relation
type. The functions 'get.artifact.network' and 'get.author.network' are
changed to handle more than one relation. Therefore, the functions
'get.bipartite.network' and 'get.multi.network' are adjusted, too. The
functions use the 'add.edges.for.bipartite.relations' function which can
handle more than one relation now.
To merge the networks of all relations, we add the functions
'merge.network.data' and 'merge.networks'. The function
'construct.network.from.list' is split into the two functions
'construct.edge.list.from.key.value.list' and
'construct.network.from.edge.list'.

Solves the second part of #98.

Signed-off-by: Barbara Eckl <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Now, the colors of the edges depend on the edge attribute 'relation'.
The colors of the edges are no longer saved in an edge attribute,
because that edge attribute was never used.

Signed-off-by: Barbara Eckl <[email protected]>
To get rid of the global variables in the plot module, we now use the
color palette 'viridis' to colorize vertices and edges.

The setting of the 'relation' attribute is changed accordingly.

Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Barbara Eckl <[email protected]>
Now, the different kinds of artifacts (Mail, Issue, File, Function,
Feature, Featureexpression, Author) can be plotted with different
colors. For this functionality, a new vertex attribute 'kind' is
necessary, which includes the information of the attribute
'artifact.type' or 'Author' (for author vertices).

The new vertex attribute is respected as vertex color during plotting.

Signed-off-by: Barbara Eckl <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
We want to respect the edge attribute 'relation' while simplifying a
network. Therefore, the new algorithm for simplification is the
following:
First, the networks are split depending on the relation type. Then, they
are simplified and, in a last step, the networks are merged into one
simplified network, which has parallel edges only where these have different
relation types.

Also, this works on networks without the 'relation' attribute!
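
The three steps above (split by relation, simplify, merge) can be sketched in Python on a plain edge list. This is only an illustration: the function name `simplify_by_relation`, the dictionary-based edge representation, the undirected treatment of edges, and summing the weights of combined edges are all assumptions, not the project's igraph-based R code.

```python
from collections import defaultdict


def simplify_by_relation(edges):
    """Simplify an undirected edge list separately for each relation.

    Each edge is a dict with keys 'from' and 'to' and, optionally,
    'relation' and 'weight'. Edges without a 'relation' all end up in
    one group (relation None), so the function also works on networks
    lacking that attribute.
    """
    # Step 1: split the edges by relation type.
    groups = defaultdict(list)
    for edge in edges:
        groups[edge.get("relation")].append(edge)

    simplified = []
    for relation, group in groups.items():
        # Step 2: simplify each group, i.e., drop loops and combine
        # parallel edges (assumption: weights are summed).
        weights = {}
        for edge in group:
            if edge["from"] == edge["to"]:
                continue
            key = tuple(sorted((edge["from"], edge["to"])))
            weights[key] = weights.get(key, 0) + edge.get("weight", 1)
        # Step 3: merge the simplified groups into one edge list.
        for (u, v), weight in weights.items():
            simplified.append(
                {"from": u, "to": v, "relation": relation, "weight": weight}
            )
    return simplified


example = simplify_by_relation([
    {"from": "A", "to": "B", "relation": "mail"},
    {"from": "B", "to": "A", "relation": "mail"},
    {"from": "A", "to": "B", "relation": "issue"},
    {"from": "A", "to": "A", "relation": "issue"},
])
```

In the example, the two 'mail' edges between A and B collapse into one, the 'issue' edge between the same vertices survives as a parallel edge, and the loop is dropped.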

Signed-off-by: Barbara Eckl <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
Remove vertex attribute 'id' of artifact networks with relation 'mail'
and add vertex attribute 'artifact.type' in artifact networks of
relations 'issues', 'mail' and 'callgraph'.

Then these changes of the vertex attributes in artifact networks are
included in the test suite.

Signed-off-by: Barbara Eckl <[email protected]>
Signed-off-by: Barbara Eckl <[email protected]>
The colors and line types of edges were sometimes wrong. Deleting loops
from the network before plotting solves the problem.

Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Barbara Eckl <[email protected]>
Signed-off-by: Barbara Eckl <[email protected]>
Adding some more comments. Include a new section in 'README.md' for
attributes.

Signed-off-by: Barbara Eckl <[email protected]>
In this patch, we improve the recently introduced vertex and edge
attributes. Details are listed below.

Changes regarding vertex attributes:
- The attribute 'type' is either TYPE.AUTHOR or TYPE.ARTIFACT.
- The attribute 'kind' now contains solely artifact types (such as
"Feature" or "Function") and "Author" as values.
- The attribute 'artifact.type' is removed completely as it has been
partly redundant to the 'kind' attribute.

Changes regarding edge attributes:
- The attribute 'type' is either TYPE.EDGES.INTRA or TYPE.EDGES.INTER.
- The attribute 'artifact.type' is sourced solely from the column
'artifact.type', present in all data sources.
- The attribute 'relation' is filled by the NetworkConf values for
'author.relation' or 'artifact.relation', respectively.

Additionally, some TODO items are added or adapted.

Signed-off-by: Claus Hunsen <[email protected]>
Signed-off-by: Barbara Eckl <[email protected]>
To reliably resolve the correct vertex attribute 'kind' for the current
artifact relation, we now use a mapping method.

This way, we are open to rename any 'kind' value in later steps.

Signed-off-by: Claus Hunsen <[email protected]>
As the vertices that we produce in the artifact relation 'mail'
represent mail threads, the corresponding vertex kind is renamed to
'MailThread'.

Signed-off-by: Claus Hunsen <[email protected]>
As the edges that we produce for the relation 'issue' are initiated by
an issue comment (and not by the complete issue), the corresponding
artifact type is renamed to 'IssueComment'.

Signed-off-by: Claus Hunsen <[email protected]>
For consistent and comprehensible legends for network plots, the legend
keys for shapes/linetypes and colors are properly distinguished now.
Additionally, the legend keys for vertex colors are reduced in size to
properly match those of the vertex shapes.

Signed-off-by: Claus Hunsen <[email protected]>
Add the parameters 'remove.multiple' and 'remove.loops' to the functions
'simplify.network' and 'simplify.networks' and pass them to the
'igraph::simplify' function.

Signed-off-by: Thomas Bock <[email protected]>
As the edges that we produce for the relation 'issue' are not only
initiated by an issue comment, but also by other events, for example
opening or closing the issue, the corresponding artifact type is renamed
to 'IssueEvent'.

Signed-off-by: Barbara Eckl <[email protected]>
ecklbarb and others added 24 commits April 30, 2018 14:14
Signed-off-by: Barbara Eckl <[email protected]>
As discussed regarding PR #115 and commit
4dead55, the resolution of the vertex
kind for artifact networks is now solved through using the available
relation variable. This enables us to remove the graph attributes we
added earlier.

Thanks to @bockthom for pointing this out.

Signed-off-by: Claus Hunsen <[email protected]>
To enhance intercompatibility with internal functionality from the
package 'igraph', especially with the function
'igraph::as.data.frame(..., what = "both")', the function
'construct.edge.list.from.key.value.list' now returns a data.frame for
the vertices, not a plain vector.

Signed-off-by: Claus Hunsen <[email protected]>
Add multi-edges to the author and artifact networks

Reviewed-by: Thomas Bock <[email protected]>
Reviewed-by: Claus Hunsen <[email protected]>
Add revision set id to pasta items

Signed-off-by: Christian Hechtl <[email protected]>
Fix bug that throws an error when the core classification is empty

Signed-off-by: Christian Hechtl <[email protected]>
Change logging to improve performance

Signed-off-by: Christian Hechtl <[email protected]>
Signed-off-by: Christian Hechtl <[email protected]>
Signed-off-by: Christian Hechtl <[email protected]>
Therefore, add a method that returns the names of the cached data objects,
so that only the cached data sources need to be compared instead of all
data sources. This fixes #116.
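
The idea of comparing only cached data sources can be sketched in Python as follows. The class `LazyData` and its methods are hypothetical stand-ins, not the project's `ProjectData` API, and treating differing sets of cached sources as "not equal" is an assumption of this sketch.

```python
class LazyData:
    """Holds data sources that are loaded lazily and cached on first access."""

    def __init__(self, loaders):
        self._loaders = dict(loaders)  # name -> zero-argument loader function
        self._cache = {}

    def get(self, name):
        # Calling a getter loads (and caches) the data source as a side effect.
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]

    def cached_names(self):
        """Return the names of the data sources that are already cached."""
        return sorted(self._cache)

    def equals(self, other):
        # Only objects of the same type can be compared.
        if not isinstance(other, LazyData):
            return False
        # Compare the cached sources only; going through get() would load
        # further sources and thereby change the objects being compared.
        names = self.cached_names()
        if names != other.cached_names():
            return False
        return all(self._cache[n] == other._cache[n] for n in names)


loaders = {"commits": lambda: [1, 2], "mails": lambda: [3]}
a, b = LazyData(loaders), LazyData(loaders)
a.get("commits")
b.get("commits")
same = a.equals(b)
```

After the comparison, `a.cached_names()` still contains only "commits", i.e., the equality check did not trigger any loading.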

Signed-off-by: Christian Hechtl <[email protected]>
As the changes in PR #115 added the 'viridis' package as a required
package, also update install.R accordingly.

In addition, fix the misspelling of a variable name in install.R.

Signed-off-by: Thomas Bock <[email protected]>
This patch contains the following changes:
Return `FALSE` when no second `ProjectData` object is given to the
equals method

Add parentheses to the function calls in the equals method of
`ProjectData`

Change the revision set id in the pasta reading method from a global
parameter to a `seq_along` list and iterate with `mcmapply`

Signed-off-by: Christian Hechtl <[email protected]>
Change the revision.set.id from an integer to a string of the form
<revision-set-ID>

Signed-off-by: Christian Hechtl <[email protected]>
This is because the timestamps are implicitly tested anyway and they
cause problems, as their getter reads all non-cached data sources
and thereby changes the data object that is tested.

Signed-off-by: Christian Hechtl <[email protected]>
To do so, fix the check for the object type in the equals method of the
ProjectData class to also allow RangeData objects. Call this method from
the equals method in RangeData and then compare the remaining two
data sources (`range` and `revision.callgraph`).
Also add a test for this and fix the faulty documentation of
`get.data.cut.to.same.date()`
This fixes #116

Signed-off-by: Christian Hechtl <[email protected]>
Add check that one can not compare a ProjectData object to a RangeData
object
Expand test to that effect

Signed-off-by: Christian Hechtl <[email protected]>
Core-Periphery additions and ProjectData object comparison

Reviewed-by: Thomas Bock <[email protected]>
Reviewed-by: Claus Hunsen <[email protected]>
Signed-off-by: Claus Hunsen <[email protected]>
@clhunsen clhunsen added this to the v3.2 milestone May 17, 2018
@clhunsen
Collaborator Author

As we reviewed everything before, I will merge right away.

@clhunsen clhunsen merged commit cfadd76 into master May 17, 2018
3 participants