The Scan site, together with Coverity Prevent, provided an objective measure of static analysis defect density in popular open source projects. The analysis builds on earlier research from Microsoft indicating that static analysis can be an accurate predictor of defect density (Source: Static Analysis Tools as Early Indicators of Pre-Release Defect Density, Microsoft Research).
As part of this research, Coverity Prevent™, the industry-leading static analysis tool, was made available to qualified open source software projects via the Scan website (scan.coverity.com). Through the Scan site, open source developers can retrieve the defects identified by Prevent analyses through a portal accessible only to qualified project developers.
By comparing the number of defects identified in the first analysis of each open source project to the number of defects found in the most recent analysis, Coverity measures the overall progress of participating open source projects at the Scan site.
Change in Defect Density Across All Open Source Projects
Based on the Scan 2006 Benchmark, the initial static analysis defect density averaged across participating projects is 0.30 defects per thousand lines of code, or roughly one defect per 3,333 lines of code.
The current average number of individual defects per project, based on the Scan 2006 Benchmark (as of March 2008) is 283.49. Based on the consolidated results of the most recent analysis for each project, the current static analysis defect density averaged across all the participating projects is 0.25, or roughly one defect per 4,000 lines of code.
These findings represent an overall reduction of static analysis defect density across 250 open source projects of a total of 23,068 individual defects, lowering the average static analysis defect density in these open source projects by 16%.
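The arithmetic behind these figures can be checked with a short sketch. It assumes the densities are expressed in defects per thousand lines of code, which is what the "one defect per 3,333 lines" conversion implies; the function names are illustrative and not part of any Coverity tool. With the rounded densities the relative reduction works out to about 16.7%, so the reported 16% was presumably computed from unrounded values.

```python
def lines_per_defect(density_per_kloc: float) -> float:
    """Convert a density in defects per 1,000 lines into lines of code per defect."""
    return 1000.0 / density_per_kloc


def relative_reduction(initial: float, current: float) -> float:
    """Fractional drop from the initial density to the current density."""
    return (initial - current) / initial


# Rounded densities quoted in the text (defects per thousand lines of code).
initial_density = 0.30
current_density = 0.25

print(round(lines_per_defect(initial_density)))   # about 3,333 lines per defect
print(round(lines_per_defect(current_density)))   # 4,000 lines per defect
print(f"{relative_reduction(initial_density, current_density):.1%}")
```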
Frequency of Individual Code Defect Types
To provide insight into general trends regarding the frequency of specific defect types, consolidated totals across all open source projects are presented in the list below:
- NULL Pointer Dereference: Number of Defects: 6,448 Percentage: 27.95%
- Resource Leak: Number of Defects: 5,852 Percentage: 25.37%
- Unintentional Ignored Expressions: Number of Defects: 2,252 Percentage: 9.76%
- Use Before Test (NULL): Number of Defects: 1,867 Percentage: 8.09%
- Buffer Overrun (statically allocated): Number of Defects: 1,417 Percentage: 6.14%
- Use After Free: Number of Defects: 1,491 Percentage: 6.46%
- Unsafe use of Returned NULL: Number of Defects: 1,349 Percentage: 5.85%
- Uninitialized Values Read: Number of Defects: 1,268 Percentage: 5.50%
- Unsafe use of Returned Negative: Number of Defects: 859 Percentage: 3.72%
- Type and Allocation Size Mismatch: Number of Defects: 144 Percentage: 0.62%
- Buffer Overrun (dynamically allocated): Number of Defects: 72 Percentage: 0.31%
- Use Before Test (negative): Number of Defects: 49 Percentage: 0.21%
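As a cross-check on the list above, the share of each defect type can be recomputed from the raw counts. The sketch below uses only the counts quoted in the list; the variable names are illustrative. Because the shares are recomputed, a printed percentage that disagrees with the output points to a transcription error in the list rather than in the counts, which sum to the 23,068 total reported earlier.

```python
# Defect counts by type, copied from the list above.
defect_counts = {
    "NULL Pointer Dereference": 6448,
    "Resource Leak": 5852,
    "Unintentional Ignored Expressions": 2252,
    "Use Before Test (NULL)": 1867,
    "Buffer Overrun (statically allocated)": 1417,
    "Use After Free": 1491,
    "Unsafe use of Returned NULL": 1349,
    "Uninitialized Values Read": 1268,
    "Unsafe use of Returned Negative": 859,
    "Type and Allocation Size Mismatch": 144,
    "Buffer Overrun (dynamically allocated)": 72,
    "Use Before Test (negative)": 49,
}

total = sum(defect_counts.values())

# Percentage share of each defect type relative to the total.
shares = {name: 100.0 * count / total for name, count in defect_counts.items()}

print(f"Total defects: {total:,}")
for name, count in sorted(defect_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count:,} ({shares[name]:.2f}%)")
```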
Projects with Exceptionally Low Defect Density
The site divides open source projects into rungs based on the progress each project makes in resolving defects. Projects at higher rungs receive access to additional analysis capabilities and configuration options. Projects are promoted as they resolve the majority of defects identified at their current rung.
The first rung is rung 0. At rung 0, a project has been built and analyzed by Coverity’s Scan infrastructure, but no representatives of the open source project have come forward for access to the analysis results. Projects progress to the next rung by selecting a set of official contacts to represent the project to Coverity.
Currently there are 173 projects at Rung 0 – http://scan.coverity.com/rung0.html
The next rung is rung 1. At rung 1 and above, Coverity supplies a mailing list for developers to discuss analysis results, and to facilitate communication from Coverity about questions from the project or additional functionality being made available. Projects progress to Rung 2 by reaching a reasonably low defect count in the basic issue types, appropriate for the size of the project code base.
Currently, there are 86 projects at Rung 1 – http://scan.coverity.com/rung1.html
Projects with exceptionally low defect density have advanced to Rung 2 of the Scan ladder.
For the list of these projects and details on their defect density, see http://scan.coverity.com/rung2.html
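The promotion rules described for rungs 0 through 2 can be sketched as a small function. This is only an illustration of the logic as stated in the text: the `Project` fields and the numeric cutoff are hypothetical, since the report says only that a project needs a "reasonably low defect count" for its size and gives no threshold.

```python
from dataclasses import dataclass

# Hypothetical cutoff in defects per thousand lines of code.
# The report does not state the actual threshold for reaching Rung 2.
ASSUMED_MAX_DENSITY_PER_KLOC = 0.05


@dataclass
class Project:
    name: str
    lines_of_code: int          # assumed to be greater than zero
    open_defects: int
    has_official_contacts: bool


def rung(project: Project) -> int:
    """Return the rung (0, 1 or 2) a project would sit on under the assumed rules."""
    # Rung 0: analyzed, but nobody from the project has requested the results.
    if not project.has_official_contacts:
        return 0
    # Rung 2: defect count is low relative to code base size (assumed threshold).
    density = project.open_defects / (project.lines_of_code / 1000)
    return 2 if density <= ASSUMED_MAX_DENSITY_PER_KLOC else 1
```

For example, under these assumptions a 100,000-line project with official contacts and 30 open defects (0.30 per thousand lines) would sit at rung 1, and would move to rung 2 once its open defects dropped to 5 or fewer.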
All of these projects eliminated multiple classes of potential security vulnerabilities and quality defects from their code on the Coverity Scan site. Because of their efforts to proactively ensure software integrity and security, organizations and consumers can now select these open source applications with even greater confidence.
Amanda’s developers fixed over 40% of the defects detected by Scan after a single reading of the Scan analysis for that issue. In the chart below, the red defects were RESOURCE LEAKs.
Over 75% of the defects Scan identified in Samba were fixed within two reviews of the Scan analysis. In the chart below, the blue defects were NULL DEREFERENCEs.
Findings are based on analysis of over 55 million lines of code on a recurring basis from more than 250 open source projects, representing 14,238 individual project analysis runs for a total of nearly 10 billion lines of code analyzed:
• The overall quality and security of open source software is improving – Researchers at the Scan site observed a 16% reduction in static analysis defect density over the past two years
• Prevalence of individual defect types - There is a clear distinction between common and uncommon defect types across open source projects
• Code base size and static analysis defect count - Research found a strong, linear relationship between these two variables
• Function length and static analysis defect density - Research indicates static analysis defect density and function length are statistically uncorrelated
• Cyclomatic complexity and Halstead effort – Research indicates these two measures of code complexity are significantly correlated with codebase size
• False positive results - To date, the rate of false positives identified in the Scan databases averages below 14%