Eli Sennesh, M.Sc. Thesis Seminar
With the advent of easier-to-parse languages such as Java, and the availability
on the Internet of open-source software repositories complete with version
histories, empirical studies of software engineering metrics at scale have
become feasible.
We take up the questions of whether and how "structured GOTO" statements impact
defect proneness, and of which concept of size yields a
superior metric for defect prediction. We view the topic through the lens of
evidence-based language design, following the drive ignited by Markstrum and others.
Both the GOTO keyword and large methods are traditionally "considered harmful",
so much so that programmers are advised to avoid them in all cases. Despite this
traditional view, modern languages still contain constructs for branching to
nonadjacent syntax-tree nodes, which we term unstructured jumps. We count
these GOTO-like unstructured jumps, alongside method size and compressed method size,
as software engineering metrics, and examine the evolution of 26 open-source
code corpora in relation to those metrics. We employ five different measures
of defectiveness and development effort. We measure the statistical quality of
our metrics as predictors of our defect measurements.
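
The three per-method metrics named above can be illustrated with a minimal sketch. It rests on assumptions that do not come from the thesis: size is taken as the UTF-8 byte length of the method source, compressed size as the length of its zlib output, and an unstructured jump as any occurrence of a keyword in the hypothetical set `JUMP_KEYWORDS`. The thesis may define all three differently, and a real tool would count jumps on a parsed syntax tree rather than with a regular expression.

```python
import re
import zlib

# Hypothetical keyword set: the thesis's exact definition of an
# "unstructured jump" may differ from this illustrative choice.
JUMP_KEYWORDS = ("break", "continue", "return", "throw")
_JUMP_RE = re.compile(r"\b(?:" + "|".join(JUMP_KEYWORDS) + r")\b")


def method_metrics(source: str) -> dict:
    """Return size, compressed size, and jump count for one method's source."""
    raw = source.encode("utf-8")
    return {
        # Assumption: size measured in bytes of source text.
        "size": len(raw),
        # Assumption: zlib at maximum compression stands in for the compressor.
        "compressed_size": len(zlib.compress(raw, 9)),
        # Naive token match; it does not skip comments or string literals.
        "jumps": len(_JUMP_RE.findall(source)),
    }


example = """
int find(int[] xs, int key) {
    for (int i = 0; i < xs.length; i++) {
        if (xs[i] < 0) continue;
        if (xs[i] == key) return i;
    }
    return -1;
}
"""

metrics = method_metrics(example)
print(metrics)
```

Compressed size is meant to discount repetitive code: a method made of many near-identical lines has a large raw size but a comparatively small compressed size.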
We show that the number of unstructured jumps is a predictor of defects, routine
maintenance and two other metrics of software development effort. The correlation
between unstructured jumps and development effort is positive, and it remains
so even after accounting for the effect of code size. We also show that between
uncompressed and compressed code size, compressed size is the superior predictor
of defect proneness, maintenance, version increase, and code churn, while
uncompressed size predicts better only when measuring accumulated defects.
The number of unstructured jumps is superior to code size, both compressed and
uncompressed, as a predictor of accumulated defects. Compressed size,
however, provides the best predictor for churn and routine maintenance.
Uncompressed size provides the best predictor for the density of defects
throughout methods of fixed size.
We also find that size metrics do not predict defects as a linear function of
method size. Defect density, the quantity of defects per unit of method size,
is nonuniform across method lengths, and displays a statistically significant
negative correlation with method length overall. When relative method size is
considered instead of absolute method size, we find that defects cluster densely
in the smallest and largest methods, with very low defect densities in between.
Contrary to expectations, attempts to transform a size metric into a new
metric with constant defect density yielded strictly worse predictors than
the original size metrics.
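
The notion of defect density used above, and what a negative correlation with method length looks like, can be sketched as follows. The numbers in `toy_methods` are invented purely for illustration and are not data from the study. The helper names are hypothetical, and Pearson's coefficient is used only as a simple stand-in for whatever statistic the thesis actually applies.

```python
from math import sqrt


def defect_density(defects: int, size: int) -> float:
    """Defects per unit of method size."""
    if size <= 0:
        raise ValueError("size must be positive")
    return defects / size


def pearson(xs, ys) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy (size, defects) pairs, made up for this sketch: defects grow with
# size, but more slowly than size does, so density falls as methods grow.
toy_methods = [(10, 2), (20, 3), (40, 4), (80, 5)]

sizes = [size for size, _ in toy_methods]
densities = [defect_density(defects, size) for size, defects in toy_methods]
r = pearson(sizes, densities)
print(densities, r)
```

If defects were a linear function of size through the origin, every density would be identical. A falling density, as in the toy data, is one way the linear assumption can fail.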