Oregon State University is well represented at ICSE, the premier software engineering conference, which will be held in Hyderabad, India this year. Professors Danny Dig, Alex Groce, and Carlos Jensen, and Ph.D. students Caius Brindescu, Mihai Codoban, Rahul Gopinath, and Sergii Shmarkatiuk, have a combined total of four papers accepted to the prestigious research track, on topics such as asynchronous programming, test suite evaluation, Distributed Version Control Systems (DVCS), and repetitive code changes. The Beaver invasion of India will happen this May. See their abstracts below.
A Study and Toolkit for Asynchronous Programming in C#
Semih Okur, David L. Hartveld, Danny Dig, and Arie van Deursen
University of Illinois at Urbana-Champaign, USA; Delft University of Technology, Netherlands; Oregon State University, USA
Asynchronous programming is in demand today because responsiveness is increasingly important on all modern devices. Yet we know little about how developers use asynchronous programming in practice. Without such knowledge, developers, researchers, language and library designers, and tool vendors can make wrong assumptions. We present the first large-scale study of how asynchronous programming is used in practice. We analyzed 1378 open source Windows Phone (WP) apps, comprising 12M SLOC, produced by 3376 developers. Using this data, we answer two research questions about the use and misuse of asynchronous constructs. Inspired by these findings, we developed (i) Asyncifier, an automated refactoring tool that converts callback-based asynchronous code to the new async/await constructs; and (ii) Corrector, a tool that finds and corrects common misuses of async/await. Our empirical evaluation shows that these tools are (i) applicable and (ii) efficient. Developers accepted 313 patches generated by our tools.
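Asyncifier targets C#, but the callback-to-async/await transformation it automates is general. As a hedged illustration only, here is a minimal Python sketch of the before/after shapes (the function names and the "payload" value are invented for this example, not taken from the paper):

```python
import asyncio

# Callback style: the caller passes a continuation, so control flow is inverted.
def fetch_data(callback):
    callback("payload")  # in real code, invoked when the I/O completes

def use_callback(results):
    fetch_data(lambda data: results.append(data))

# After the async/await transformation: the same logic reads top-down,
# and errors propagate as ordinary exceptions instead of error callbacks.
async def fetch_data_async():
    await asyncio.sleep(0)  # stand-in for the underlying asynchronous I/O
    return "payload"

async def use_async():
    return await fetch_data_async()

print(asyncio.run(use_async()))  # prints "payload"
```

The readability gain is the point: the awaiting version keeps sequential structure, which is what makes the automated refactoring attractive to developers.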
Code Coverage for Suite Evaluation by Developers
Rahul Gopinath, Carlos Jensen, and Alex Groce
Oregon State University, USA
One of the key concerns of developers testing code is how to determine a test suite's ability to find faults. The most common approach is to use code coverage as a measure of test suite quality, with diminishing returns in coverage, or high absolute coverage, as a stopping rule. In testing research, suite quality is often evaluated by measuring its ability to kill mutants: artificially seeded code changes. Mutation testing is effective but expensive, and thus seldom used by practitioners. Determining which coverage criteria best predict mutation kills is therefore critical to practical estimation of suite quality. Previous work has used only small sets of programs, and usually compares multiple suites for a single program. Practitioners, however, seldom compare suites --- they evaluate one suite. Using suites (both manual and automatically generated) from a large set of real-world open-source projects, we show that results for single-suite evaluation differ from those for suite comparison: statement coverage (not block, branch, or path coverage) predicts mutation kills best.
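For readers unfamiliar with mutation testing, here is a toy Python sketch of the core idea (the function and test suite are hypothetical, not from the paper): a mutant is a small seeded change to the code, and a suite "kills" the mutant if at least one test fails on it.

```python
def max_of(a, b):          # original implementation under test
    return a if a > b else b

def max_of_mutant(a, b):   # seeded mutant: comparison operator flipped (> became <)
    return a if a < b else b

def suite(fn):
    """Return True when every test in this tiny suite passes for fn."""
    return fn(3, 5) == 5 and fn(5, 3) == 5

# The suite passes on the original but fails on the mutant, so the mutant is killed.
killed = suite(max_of) and not suite(max_of_mutant)
print("mutant killed:", killed)  # prints "mutant killed: True"
```

The kill ratio over many such mutants is the quality measure that the paper's coverage criteria are evaluated against.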
How Do Centralized and Distributed Version Control Systems Impact Software Changes?
Caius Brindescu, Mihai Codoban, Sergii Shmarkatiuk, and Danny Dig
Oregon State University, USA
Distributed Version Control Systems (DVCS) have seen an increase in popularity relative to traditional Centralized Version Control Systems (CVCS). Yet we know little about whether developers are benefiting from the extra power of DVCS. Without such knowledge, researchers, developers, tool builders, and team managers are in danger of making wrong assumptions. In this paper we present the first in-depth, large-scale empirical study of the influence of DVCS on the practice of splitting, grouping, and committing changes. We recruited 820 participants for a survey that sheds light on the practice of using DVCS. We also analyzed 409M lines of code changed by 358300 commits, made by 5890 developers, in 132 repositories containing a total of 73M LOC. Using this data, we uncovered some interesting facts. For example, (i) commits made in distributed repositories were 32% smaller than centralized ones, (ii) developers split commits more often in DVCS, and (iii) DVCS commits are more likely to reference issue tracking labels.
Mining Fine-Grained Code Changes to Detect Unknown Change Patterns
Stas Negara, Mihai Codoban, Danny Dig, and Ralph E. Johnson
University of Illinois at Urbana-Champaign, USA; Oregon State University, USA
Identifying repetitive code changes benefits developers, tool builders, and researchers. Tool builders can automate popular code changes, improving developer productivity. Researchers can better understand the practice of code evolution, advancing existing code assistance tools and benefiting developers even further. Unfortunately, existing research either predominantly uses coarse-grained Version Control System (VCS) snapshots as the primary source of code evolution data, or considers only a small subset of program transformations of a single kind: refactorings. We present the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes. Our novel algorithm effectively handles the challenges that distinguish continuous code change pattern mining from existing data mining techniques. We evaluated our algorithm on 1,520 hours of code development collected from 23 developers, and showed that it is effective, useful, and scales to large amounts of data. We analyzed some of the mined code change patterns and discovered ten popular kinds of high-level program transformations. More than half of our 420 survey participants acknowledged that eight out of the ten transformations are relevant to their programming activities.
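To give a flavor of what mining change patterns from a fine-grained edit sequence means, here is a deliberately simplified Python sketch (the edit labels and the fixed-window counting are invented for illustration; the paper's algorithm handles the much harder continuous-mining setting):

```python
from collections import Counter

# A hypothetical fine-grained edit stream, labeled by kind of change.
edits = ["add-param", "update-call", "add-param", "update-call",
         "rename", "add-param", "update-call"]

def frequent_patterns(seq, size, min_support):
    """Count contiguous patterns of a fixed size; keep those seen often enough."""
    counts = Counter(tuple(seq[i:i + size]) for i in range(len(seq) - size + 1))
    return {p: c for p, c in counts.items() if c >= min_support}

# "add a parameter, then update the call site" recurs three times in the stream.
print(frequent_patterns(edits, 2, 2))
```

A recurring pair like this is the kind of repetitive transformation that, once discovered, a tool builder could automate.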