By running a test method on its own and storing the coverage data for the run, it is possible to compare coverage data later and make suggestions about tests that could safely be deleted. If two tests cover exactly the same lines of code, then one can go. If one test covers the same lines of code as another, plus a few extra lines, then the one with fewer lines could go. Coverage is normally calculated down to the paths taken through conditionals, and that fidelity needs to be taken into account too when making the delete-or-keep decision.
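The keep-or-delete decision is essentially a set comparison of covered-line lists. As a rough sketch (a hypothetical helper, not the tooling used below), assuming each test’s coverage is stored as a sorted text file with one covered line per row, `comm` can classify the relationship:

```shell
# classify_coverage FILE_A FILE_B
# Each file: one covered source line per text line, sorted (comm needs that).
classify_coverage() {
    only_a=$(comm -23 "$1" "$2" | wc -l)   # lines only the first test covers
    only_b=$(comm -13 "$1" "$2" | wc -l)   # lines only the second test covers
    if [ "$only_a" -eq 0 ] && [ "$only_b" -eq 0 ]; then
        echo "identical coverage - either could go"
    elif [ "$only_a" -eq 0 ]; then
        echo "second is a super-set - first could go"
    elif [ "$only_b" -eq 0 ]; then
        echo "first is a super-set - second could go"
    else
        echo "overlapping but distinct - keep both"
    fi
}
```

The same three-way outcome (identical, super-set, merely overlapping) is what the comparison script later in this post reports.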

Lambdaj example

Lambdaj is a pioneering library that brought lambda-style functions to Java versions before 8. I’m using its source here for analysis.

I took a copy of it from its Subversion home on Google Code, and checked it into GitHub with its Maven script upgraded to capture JaCoCo coverage. I’ve coded some fairly hacky sed, ack, and egrep to store only the covered (and partially covered) lines for the production sources, per test. That’s all checked in to the project in a coverage_data directory, as is the list of test methods. Here’s the bash script that makes that:


rm -rf coverage_data
mkdir coverage_data

# List of unit test methods (note: possibility of inaccuracy)

ack -A1 --nogroup "@Test" src/test/ | grep public | sed 's/\.java/ /' | \
     sed 's#/#.#g' | sed 's/().*//' | sed 's/public void//' | \
     perl -pne 's/[ \t]+/#/g' | sed 's/src\.test\.//' | \
     cut -d'#' -f 1,3 > coverage_data/list_of_test_methods.txt

mvn clean

# Process tests one at a time
for testMethod in `cat coverage_data/list_of_test_methods.txt`; do

    # One Maven invocation per test (so we can get focused coverage)
    rm -rf target/coverage-reports

    mvn -Dtest=$testMethod -DfailIfNoTests=false test

    # Per-test coverage output file
    op_file_name="coverage_data/${testMethod}.txt"

    ack --noheading "id=\"L" --noignore-dir=target target | \
    egrep -i "class=\"(pc|fc)" | sed 's/\.java.html:/-/' | \
    sed 's#target/##' | sed 's#site/jacoco-ut/##' | sed 's#/#.#g' | \
    sed 's/id=.*//' | sed 's/<span class=//' | \
    sed 's/\"//g' > $op_file_name

    # Keep only non-empty coverage reports
    [ -s $op_file_name ] || rm $op_file_name

done


Comparing coverage reports to each other is next, and with every report diffed against every other it’s O(n²) lengthy. Here’s the bash again (Bash v4 is needed - the Mac has v3 by default):


mkdir coverage_diff_data

for test1 in `find coverage_data -name "*#*" | sed 's#coverage_data/##' \
        | sed 's/\.txt//'`; do
    for test2 in `find coverage_data -name "*#*" | sed 's#coverage_data/##' \
          | sed 's/\.txt//'`; do
        if [[ "$test1" < "$test2" ]]; then
            # Per-pair diff output file (naming scheme assumed)
            output="coverage_diff_data/${test1}_vs_${test2}.txt"
            diff "coverage_data/${test1}.txt" "coverage_data/${test2}.txt" >> "${output}"
            deleted=$(grep -c "^<" "${output}")
            added=$(grep -c "^>" "${output}")
            matching=$(fgrep -xcf "coverage_data/${test1}.txt" \
                "coverage_data/${test2}.txt")
            if [[ "$matching" != "0" ]]; then
                if [[ ( "$added" == "0" && "$deleted" == "0" ) ]]; then
                    echo "$test1 and $test2 are covering exactly the same lines"
                elif [[ ( "$added" == "0" && "$deleted" != "0" ) ]]; then
                    echo "$test2 covers the same lines as $test1 (and more)"
                elif [[ ( "$added" != "0" && "$deleted" == "0" ) ]]; then
                    echo "$test1 covers the same lines as $test2 (and more)"
                else
                    rm "${output}"
                fi
            else
                rm "${output}"
            fi
        fi
    done
done
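If the script’s stdout is captured to a file, tallying the two kinds of findings is a one-liner. A sketch; summarise_findings and its file argument are hypothetical names, but the grep patterns match the echo messages above:

```shell
# summarise_findings FILE
# FILE: a hypothetical capture of the comparison script's stdout.
summarise_findings() {
    exact=$(grep -c "exactly the same lines" "$1")
    superset=$(grep -c "(and more)" "$1")
    echo "$exact exact-duplicate pairs, $superset super-set pairs"
}
```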

Tests that cover each other perfectly:

  1. ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListAndUnboundVar and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListAndUnboundVar
  2. ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly
  3. ch.lambdaj.collection.LambdaListTest#test1 and ch.lambdaj.collection.LambdaListTest#test2
  4. ch.lambdaj.collection.LambdaListTest#testMap1 and ch.lambdaj.collection.LambdaListTest#testMap2
  5. ch.lambdaj.function.closure.FileParserClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserClosureTest#testReadFile
  6. ch.lambdaj.function.closure.FileParserImplicitClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserImplicitClosureTest#testReadFile
  7. (pair names lost)
  8. (pair names lost)
  9. (pair names lost)
  10. (pair names lost)
  11. ch.lambdaj.LambdaTest#testFilterArray and ch.lambdaj.LambdaTest#testFilterOnCustomMatcher
  12. ch.lambdaj.LambdaTest#testIndex and ch.lambdaj.demo.LambdaDemoTest#testIndexCarsByBrand
  13. ch.lambdaj.LambdaTest#testSelectFranceExposures and ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD

There are another 168 cases where one test covers the same lines as another, plus some additional lines. There are probably many more that are similar enough to analyze without being a clean super-set situation.

Drilling into a single comparison

I’ll drill into the second of those above (two test methods in ClosureSpecialCasesTest).

The test sources are ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly (line 59) and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly (line 70).

The coverage reports as I’ve stored them: ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly.

The diff comparison of those is empty as you’d expect.
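Mechanically, “the diff is empty” is just diff’s zero exit status, so the check automates cleanly. A small sketch with made-up file contents:

```shell
# Two hypothetical coverage reports with identical content
tmp=$(mktemp -d)
printf 'ch.lambdaj.Lambda-L10 fc\n' > "$tmp/first.txt"
printf 'ch.lambdaj.Lambda-L10 fc\n' > "$tmp/second.txt"

# diff exits 0 and prints nothing when the reports match exactly
if diff "$tmp/first.txt" "$tmp/second.txt" > /dev/null; then
    echo "coverage identical"
fi
```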

And a different case

According to the output, ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD covers the same lines as ch.lambdaj.collection.LambdaCollectionTest#test1 (and more).

The test sources are ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD (line 777) and ch.lambdaj.collection.LambdaCollectionTest#test1 (line 31).

The coverage reports as I’ve stored them: ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD and ch.lambdaj.collection.LambdaCollectionTest#test1

The diff comparison of those isn’t empty, and shows one is a super-set of the other in terms of LoC covered.
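A stricter confirmation of the super-set claim than eyeballing the diff is checking that every line of the smaller report appears verbatim in the larger one. A sketch (is_superset is a hypothetical helper, built on the same fixed-string whole-line grep idea as the fgrep -xcf call in the comparison script):

```shell
# is_superset BIG SMALL
# Succeeds if every line of SMALL appears verbatim somewhere in BIG
# (-F: fixed strings, -x: whole-line match, -v: lines NOT matched).
is_superset() {
    ! grep -Fxvf "$1" "$2" > /dev/null
}
```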


The length of the test run is what drives you to want to act on this data. Lambdaj’s tests are hardly slow, but we are seeing things covered redundantly. Some redundancy is nearly always going to be there, but when you can couple it to a single redundant test, you have an opportunity to delete that test and reduce the elapsed time of the test suite.

There are many more things that could be looked at with per-test coverage, including clues that you might want to decompose your solution a little more, and mock those new components in tests that don’t really concern them. You could do that based on the number of tests that pass through the same lines of code, or even the total lines of code covered by a single test (larger isn’t better for unit tests).
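The per-test files already collected would support that sort of analysis. For instance, counting how many tests pass through each production line is a sort | uniq -c away; a sketch, with rank_hot_lines as a hypothetical helper over a coverage_data-style layout:

```shell
# rank_hot_lines DIR
# DIR holds one coverage file per test (one covered line per text line,
# as the collection script stores them). Prints the most-covered
# production lines first, prefixed with the number of tests hitting them.
rank_hot_lines() {
    cat "$1"/*.txt | sort | uniq -c | sort -rn
}
```

Lines hit by dozens of tests are candidates for extraction behind a component that most of those tests could mock instead.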