Code Coverage Design (V2)

Author: Matt Albrecht

Overview

The goal of the codecoverage package is to capture the lines of code that were executed in a Java VM. Traditional Java-based code coverage tools fall into four categories:

Type | Speed | Needs Source | Special In-Process Setup | Examples
Monitor VM | Slow | No | VM must have monitoring enabled | GroboCodeCoverage V1
ClassLoader Dynamic Class Analysis | Moderate | No | Running VM must use a specific classloader, and code can't load covered classes in a new classloader | Quilt, Hansel
Source Recompilation | Fast | Yes | No (just extra classes in classpath) | Clover
Bytecode Recompilation | Fast | No | No (just extra classes in classpath) | Gretel

The V2 design extends the standard ClassLoader analysis tools and moves the bytecode analysis and recompilation to a step outside the actual runtime analysis. The disadvantages include:

there's an extra step in the coverage analysis;
there are two copies of the class files (distinguishable from the outside only by their size), which can lead to confusion;
the pre-runtime recompilation generates extra files which must be used with the exact class files generated;
additional system properties must be specified in the runtime engine, along with an additional class library in the classpath.

With an automated build tool such as Ant and a structured build process, the first three disadvantages become inconsequential; the additional disk space is the only worrisome factor, and it matters less and less. The fourth disadvantage requires adding information to the runtime setup scripts.

The perceived advantages far outweigh these manageable deficiencies:

the run-time processing is sped up, since the analysis of the class files and bytecode restructuring is performed outside the execution;
there is no longer any need for a specially set up execution environment to ensure the proper class loader is used, and no special care is needed to ensure that the executed code doesn't use its own class loader;
the additional classes needed to accompany the altered classes are minimal, since they do the equivalent of logging.
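
To illustrate the last point, here is a minimal sketch of what a recompiled method is roughly equivalent to at the source level. The names CoverageLog, cover, and InstrumentedExample, as well as the channel, method, and mark index parameters, are hypothetical and are not taken from the actual package; the real tool would insert the equivalent calls at the bytecode level.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the small runtime logging library (not the real API).
final class CoverageLog {
    private static final List<String> RECORDS = new ArrayList<>();

    private CoverageLog() {}

    // Records that one mark in one method of the given class was reached.
    static synchronized void cover(String classSignature, int channel,
                                   int methodIndex, int markIndex) {
        RECORDS.add(classSignature + "," + channel + "," + methodIndex + "," + markIndex);
    }

    // Returns a copy of everything logged so far.
    static synchronized List<String> records() {
        return new ArrayList<>(RECORDS);
    }
}

// Source-level picture of a method after post-compilation: each marked
// instruction is preceded by one cheap logging call.
final class InstrumentedExample {
    static final String SIG = "InstrumentedExample-0"; // placeholder class signature

    private InstrumentedExample() {}

    static int abs(int x) {
        CoverageLog.cover(SIG, 0, 0, 0);
        if (x < 0) {
            CoverageLog.cover(SIG, 0, 0, 1);
            return -x;
        }
        CoverageLog.cover(SIG, 0, 0, 2);
        return x;
    }
}
```

Because the inserted calls only append a record, the classes that must ship with the altered class files can stay small, which is the property the list above relies on.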

The reason why the author decided to create his own code coverage tool, rather than use Gretel, is that Gretel requires a Gretel-compiled "main" method to be executed first. This is rarely possible in application servers or other such environments.

Architecture

This package breaks the general code-coverage problem into five parts:

post-compilation engine: performs post-source code compilation on the bytecode (a "post-compiler", if you will), and dynamically allows the analysis modules to pre-process the class files.
analysis modules: parse the bytecode methods for all classes, mark bytecode instructions as under analysis, and generate data necessary for creating a report.
channel logger: the generated bytecode adds logging statements for each analysis module. The logging is directed towards one "channel" per module during the runtime execution. These loggers must be minimal in runtime overhead, and must be separately jarred for easy inclusion into the runtime.
report engine: parses the post-compilation data along with the runtime channel logs to generate reports for each analysis module. If the modules and post-compilation engine were constructed correctly, this part does not need to interact with the analysis modules directly.
data store: organizes the post-compilation meta-data and channel logs. Allows for easy and efficient query of per-analysis module and per-class data.
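
The way the report engine can work without calling back into the analysis modules can be sketched as follows. This is only an illustration under stated assumptions: the class CoverageReport, the method coveredFraction, and the string format of a mark identifier are hypothetical, and the real data store layout is not described here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical report step: combines post-compilation meta-data (the set of
// all marks a module created) with the runtime channel log for that module.
final class CoverageReport {
    private CoverageReport() {}

    // allMarks:   identifiers of every mark recorded at post-compilation time
    // channelLog: identifiers logged at runtime; may contain repeats and
    //             entries for classes that are not part of this report
    static double coveredFraction(Set<String> allMarks, List<String> channelLog) {
        if (allMarks.isEmpty()) {
            return 1.0; // nothing to cover; treating this as fully covered is an arbitrary choice
        }
        Set<String> hit = new HashSet<>(channelLog); // collapses repeated log entries
        hit.retainAll(allMarks);                     // ignores unknown marks
        return (double) hit.size() / allMarks.size();
    }
}
```

For example, with four known marks and a log that hits two of them (one of them twice), the sketch reports a covered fraction of 0.5.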

This package does not need to be classloader aware. If the recompiled classes log the class name with a checksum, then each part can uniquely identify each class file. Technically, under the JVM spec, two class files loaded by different class loaders are different classes, but the coverage numbers are not split along this distinction.
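
A class name combined with a checksum could be computed along the following lines. This is a sketch, not the package's actual scheme: the class ClassSignature, the method of, the "-" separator, and the choice of CRC-32 (via the standard java.util.zip.CRC32) are all assumptions made for illustration.

```java
import java.util.zip.CRC32;

// Hypothetical helper that builds an identifier from the class name and a
// checksum of the class file bytes.
final class ClassSignature {
    private ClassSignature() {}

    static String of(String className, byte[] classFileBytes) {
        CRC32 crc = new CRC32();
        crc.update(classFileBytes, 0, classFileBytes.length);
        // Same name and same bytes always give the same signature,
        // regardless of which class loader loads the class at runtime.
        return className + "-" + crc.getValue();
    }
}
```

Since the signature depends only on the name and the bytes, the post-compilation data, the channel logs, and the reports can all refer to the same class file without knowing anything about class loaders.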

Details of Analysis Modules

Analysis modules provide the actual value of the coverage tool. They can report the number of bytecodes covered, the number of lines covered, or branch coverage.

A proper module knows how to decode the BCEL object format for methods, marks individual bytecode instructions as "check for coverage", and generates associated meta-data for each mark, so that the report engine can easily create a report without having to refer back to the module.

The post-compilation engine will provide objects that simplify the matter of inserting the marks. It will also generate a data store for each method for the module to record all the meta-data.
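
As a rough sketch of what a line-coverage module does, the example below marks the first instruction of every source line and records the line number as that mark's meta-data. It deliberately avoids BCEL: the types Instr, Mark, and LineCountModule are simplified, hypothetical stand-ins, and a real module would work on the BCEL method objects and on the helper objects supplied by the post-compilation engine.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for one bytecode instruction of a method.
final class Instr {
    final int offset; // position of the instruction in the method
    final int line;   // source line the instruction belongs to

    Instr(int offset, int line) {
        this.offset = offset;
        this.line = line;
    }
}

// One mark: where to insert the logging call, plus the meta-data the
// report engine needs later (here, just the source line).
final class Mark {
    final int offset;
    final int line;

    Mark(int offset, int line) {
        this.offset = offset;
        this.line = line;
    }
}

// Hypothetical line-coverage module.
final class LineCountModule {
    private LineCountModule() {}

    // Marks the first instruction encountered for each distinct source line.
    static List<Mark> analyze(List<Instr> method) {
        Set<Integer> seenLines = new HashSet<>();
        List<Mark> marks = new ArrayList<>();
        for (Instr instr : method) {
            if (seenLines.add(instr.line)) {
                marks.add(new Mark(instr.offset, instr.line));
            }
        }
        return marks;
    }
}
```

The position of a mark in the returned list can serve as its index in the channel log, so the report engine only needs the stored meta-data to map a logged index back to a source line.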


Copyright © 2002-2004 GroboUtils Project. All rights reserved.