Related Literature
Carlos Paradis edited this page Aug 10, 2023 · 13 revisions
- The promises and perils of mining git
- The promises and perils of mining GitHub
- Detecting and Characterizing Bots that Commit Code
- Findings from GitHub: methods, datasets and limitations
- Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data
- Sampling Projects in GitHub for MSR Studies
- Replicating MSR: A study of the potential replicability of papers published in the Mining Software Repositories proceedings
- Small patches get in!
  - Classification of Patch Developers vs Committers
- Detecting Patch Submission and Acceptance in OSS Projects
  - Patch Parser from MBOX files
- Will my patch make it? And how fast? Case study on the Linux kernel
  - Simplified Diagram of Linux Patch Development Process
- Socialization in an Open Source Software Community: A Socio-Technical Analysis
  - Concept Diagram from Core Developers to Users in Circles
- Mining CVS Repositories to Understand Open-Source Project Developer Roles
  - A more detailed Concept Diagram from Core Developers to Users in Circles
- The Secret Life of Patches: A Firefox Case Study
- An exploratory study of the pull-based software development model
- An insight into the pull requests of GitHub
- Let's talk about it: evaluating contributions through discussion in GitHub
  - Pull Request discussion analysis and interviews with GitHub developers.
- On the Shoulders of Giants: A New Dataset for Pull-based Development Research
- Bug Lifecycle
  - Bugzilla Conceptual Diagram
- The missing links: bugs and bug-fix commits
- Filling the Gaps of Development Logs and Bug Issue Data
- ReLink: recovering links between bugs and changes
- Fair and balanced?: bias in bug-fix datasets
- Discovering Loners and Phantoms in Commit and Issue Data
- Traceability in the wild: automatically augmenting incomplete trace links - Distinguished Paper ICSE'18
- Social science theories in software engineering research
- Socio-technical developer networks: should we trust our measurements?
- Validity of network analyses in Open Source Projects
- Mining email social networks
- An empirical study on the risks of using off-the-shelf techniques for processing mailing list data
- Putting It All Together: Using Socio-technical Networks to Predict Failures
- An Empirical Study of Multiple Names and Email Addresses in OSS Version Control Repositories
- Communication in open source software development mailing lists
  - "Our investigation reveals that implementation details are discussed only in about 35% of the threads, and that a range of other topics is discussed. Moreover, core developers participate in less than 75% of the threads. We observed that the development mailing list is not the main player in OSS project communication, as it also includes other channels such as the issue repository."
- Episodic volunteering in open source communities
  - "Episodic volunteers, who prefer short term engagement to habitual contributions, are present in Free/Libre and Open Source Software (FLOSS) communities."
- Who is the expert? combining intention and knowledge of online discussants in collaborative RE tasks
  - "address the problem of expert finding in mailing-list discussions"
- Inter-Package Dependency Networks in Open-Source Software
- Assessing Code Authorship: The Case of the Linux Kernel
- User and developer mediation in an Open Source Software community: Boundary spanning through cross participation in online discussions
  - "several key participants act as boundary spanners between the user and the developer communities. This emerging role is characterized by cross-participation in parallel same-topic discussions in both mailing-lists, cohesion between cross-participants, the occupation of a central position in the social network linking users and developers, as well as active, distinctive and adapted contributions"
- Joining Free/Open Source Software Communities: An Analysis of Newbies' First Interactions on Project Mailing Lists
  - "We found that nearly 80% of newbie posts received replies, and that receiving timely responses, especially within 48 hours, was positively correlated with future participation. We also found that while the majority of interactions were positive, 1.5% of responses were rude or hostile."
- Newcomer integration and learning in technical support communities for open source software
  - "We found that one third of newcomers' transition into a role of help givers in the community and demonstrate evidence of learning."
- Understanding the process of participating in open source communities
  - "the number of active developers does not change significantly when the total number of committers increases for the selected OSS projects."
- Core-Periphery Communication and the Success of Free/Libre Open Source Software Projects
  - "An innovation of the paper is that use of inclusive pronouns is measured using natural language processing techniques. We find that core and peripheral members differ in their volume of contribution and in their use of inclusive pronouns, and that volume of communication is related to project success."
- Candoia: A Platform for Building and Sharing Mining Software Repositories Tools as Apps
- git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories
- TNM: A Tool for Mining of Socio-Technical Data from Git Repositories
- PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code
- Network Motifs: Simple Building Blocks of Complex Networks
- Network motifs in computational graphs: A case study in software architecture
- Detect Related Bugs from Source Code Using Bug Information
- Supplementary Bug Fixes vs. Re-opened Bugs
- On measuring affects of github issues' commenters
- An Empirical Study on the Structural Complexity Introduced by Core and Peripheral Developers in Free Software Projects
- Characterizing the Roles of Contributors in Open-Source Scientific Software Projects
- Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineering Tools
- Exploring the patterns of social behavior in GitHub
- Commit Bubbles
- Who Cares About My Feature Request?
- The Impact of a Low Level of Agreement Among Reviewers in a Code Review Process
- Release Early, Release Often and Release on Time. An Empirical Case Study of Release Management
- Understanding source code evolution using abstract syntax tree matching
- Issue ownership activity in two large software projects
- Scitools Understand Metrics
- Studying the Chaos of Code Development
  - Richard Holt File Entropy
- Visualization of Methods Changeability Based on VCS Data
  - Method co-change
- Automated Software Vulnerability Assessment with Concept Drift
- https://dl.acm.org/doi/10.1145/3379597.3387465
- Standing on Shoulders or Feet? The Usage of the MSR Data Papers
- STRESS: A Semi-Automated, Fully Replicable Approach for Project Selection
- Predicting Vulnerable Components: Software Metrics vs Text Mining
- Software Metrics and Security Vulnerabilities: Dataset and Exploratory Study
- Fixing of Security Vulnerabilities in Open Source Projects: A Case Study of Apache HTTP Server and Apache Tomcat
  - Case Study for Apache Tomcat and Apache HTTP Server. Suggests CVE data may be obtained from these projects.
- When a Patch Goes Bad: Exploring the Properties of Vulnerability-Contributing Commits
  - Another study on Apache HTTP Server.
- A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software
  - Dataset: https://snyk.io/vuln
- VulinOSS: a dataset of security vulnerabilities in open-source systems
- A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries
- Security and Emotion: Sentiment Analysis of Security Discussions on GitHub
  - List of security keywords to identify issues as software vulnerabilities for other analysis.
  - An Empirical Study of Security Issues Posted in Open Source Projects uses this list of keywords.
- On development of a framework for massive source code analysis using static code analyzers
- Boa Meets Python: A Boa Dataset of Data Science Software in Python Language
- The SmartSHARK Ecosystem for Software Repository Mining - ICSE 2020 Demo Track
- Replicating Data Pipelines with GrimoireLab
- https://devguide.ropensci.org/
- A Large-Scale Study About Quality and Reproducibility of Jupyter Notebooks
- I'm leaving you, Travis: a continuous integration breakup story
- Studying the impact of adopting continuous integration on the delivery time of pull requests
- Machine Learning Technical Debt
- Creating Evolving Project Data Sets in Software Engineering
- Building the Collaboration Graph of Open-Source Software Ecosystem
- World of Code: An Infrastructure for Mining the Universe of Open Source VCS Data
- Public git archive: a big code dataset for all
- The Maven Dependency Graph: A Temporal Graph-Based Representation of Maven Central
- SOTorrent: Studying the Origin, Evolution, and Usage of Stack Overflow Code Snippets
- A Dataset of Non-Functional Bugs
- SeSaMe: A Data Set of Semantically Similar Java Methods
- CROP: linking code reviews to source code changes
- Mining DEV for social and technical insights about software development
- Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study
- A historical dataset of software engineering conferences
- Using docker containers to improve reproducibility in software engineering research
- Analyzing software engineering experiments: everything you always wanted to know but were afraid to ask
- Writing Good Software Engineering Research Papers: Revisited
- Synthesizing qualitative research in software engineering: a critical review
- Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data -- Bad Practices in Machine Learning Model Setup in SE - "MIT Paper"
- The art and practice of data science pipelines: A comprehensive study of data science pipelines in theory, in-the-small, and in-the-large
- Confessions of a Worldly Software Miner
- Failure is a four-letter word: a parody in empirical research
- Ethical Mining: A Case Study on MSR Mining Challenges