By running a test method on its own and storing the coverage data for the run, it is possible to later compare coverage data and make suggestions about tests that could be safely deleted. If two tests cover exactly the same lines of code, then one can go. If one test covers the same lines of code as another, and has a few extra lines of coverage, then the one with fewer lines could go. Coverage is normally calculated down to the path through conditionals, and that fidelity needs to be taken into account here too when making the delete-or-keep decision.
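The decision rule above can be sketched in a few lines of shell. This is only a sketch: it assumes each test's coverage has been saved to a text file with one covered-line identifier per line, and `classify_coverage` is a hypothetical helper name, not something from the scripts in this post. If the identifiers carry a fully/partially covered marker, two tests that hit the same line with different branch coverage will not be treated as equal.

```shell
#!/bin/sh
# Hypothetical helper: report how two per-test coverage files relate.
# Assumes one covered-line identifier per line in each file.
classify_coverage() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    # comm needs sorted input; use the same collation for sort and comm
    LC_ALL=C sort -u "$1" > "$a_sorted"
    LC_ALL=C sort -u "$2" > "$b_sorted"
    only_a=$(LC_ALL=C comm -23 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    only_b=$(LC_ALL=C comm -13 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    shared=$(LC_ALL=C comm -12 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    rm -f "$a_sorted" "$b_sorted"

    if [ "$shared" -eq 0 ]; then
        echo "disjoint"
    elif [ "$only_a" -eq 0 ] && [ "$only_b" -eq 0 ]; then
        echo "identical"            # either test is a deletion candidate
    elif [ "$only_a" -eq 0 ]; then
        echo "first-is-subset"      # the first test adds no coverage
    elif [ "$only_b" -eq 0 ]; then
        echo "second-is-subset"     # the second test adds no coverage
    else
        echo "overlapping"
    fi
}
```

Calling `classify_coverage one.txt two.txt` prints a single word, which makes it easy to filter for `identical` or subset pairs.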

Lambdaj example

Lambdaj is a pioneering technology that allowed some lambda-style functions in Java versions before 8. I’m using its source here, for analysis.

I took a copy of it from its Subversion home on GoogleCode, and checked it into Github with its Maven script upgraded to capture JaCoCo coverage. I’ve coded some fairly hacky sed, ack, and egrep to store only the covered (and partially covered) lines for the production sources, per test. That’s all checked in to the project in a coverage_data directory, as is the list of test methods. Here’s the shell script that makes that:

#!/bin/sh

rm -rf coverage_data
mkdir coverage_data

# list of unit tests methods (note: possibility of inaccuracy)

ack -A1 --nogroup "@Test" src/test/ | grep public | sed 's/\.java/ /' | \
     sed 's#/#.#g' | sed 's/().*//' | sed 's/public void//' | \
     perl -pne 's/[ \t]+/#/g' | sed 's/src.test.java.//' | \
     cut -d'#' -f 1,3 > coverage_data/list_of_test_methods.txt

mvn clean

# Process tests one at a time
for testMethod in `cat coverage_data/list_of_test_methods.txt`; do

    # One maven invocation per test (so we can get focused coverage)

    rm -rf target/coverage-reports

    mvn -Dtest="$testMethod" -DfailIfNoTests=false test

    op_file_name="coverage_data/${testMethod}.txt"

    ack --noheading "id=\"L" --noignore-dir=target target | \
    egrep -i "class=\"(pc|fc)" | sed 's/\.java.html:/-/' | \
    sed 's#target/##' | sed 's#site/jacoco-ut/##' | sed 's#/#.#g' | \
    sed 's/id=.*//' | sed 's/<span class=//' | \
    sed 's/\"//g' > "$op_file_name"

    # Drop the file if the test covered no production lines
    [ -s "$op_file_name" ] || rm "$op_file_name"

done

Comparing coverage reports to each other is next, and it’s lengthy: every test is compared with every other test once, so n(n-1)/2 comparisons, which is O(n squared). Here’s the bash again (Bash v4 is needed - the Mac has v3 by default):

#!/usr/local/bin/bash

mkdir -p coverage_diff_data

for test1 in `find coverage_data -name "*#*" | sed 's#coverage_data/##' \
        | sed 's/\.txt//'`; do
    for test2 in `find coverage_data -name "*#*" | sed 's#coverage_data/##' \
          | sed 's/\.txt//'`; do
        if [[ "$test1" < "$test2" ]]; then
            output="coverage_diff_data/${test1}-${test2}.diff"
            diff "coverage_data/${test1}.txt" "coverage_data/${test2}.txt" > "${output}"
            deleted=$(grep -c "^<" "${output}")
            added=$(grep -c "^>" "${output}")
            matching=$(fgrep -xcf "coverage_data/${test1}.txt" \
                    "coverage_data/${test2}.txt")
            if [[ "$matching" != "0" ]]; then
                if [[ ( "$added" == "0" && "$deleted" == "0" ) ]]; then
                    echo "$test1 and $test2 are covering exactly the same lines"
                elif [[ ( "$added" == "0" && "$deleted" != "0" ) ]]; then
                    echo "$test2 covers the same lines as $test1 (and more)"
                elif [[ ( "$added" != "0" && "$deleted" == "0" ) ]]; then
                    echo "$test1 covers the same lines as $test2 (and more)"
                else
                    rm "${output}"
                fi
            else
                rm "${output}"
            fi
        fi
    done
done
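
To get a feel for how many diffs the nested loops above will run before starting them, the pair count can be worked out from the number of coverage files. A small sketch, where `count_pairs` is a hypothetical helper and the directory layout is assumed to match coverage_data as described above:

```shell
#!/bin/sh
# Hypothetical helper: number of pairwise comparisons implied by the
# per-test coverage files (those with '#' in the name) in a directory.
count_pairs() {
    n=$(find "$1" -type f -name '*#*' | wc -l | tr -d ' ')
    # each unordered pair is compared once: n * (n - 1) / 2
    echo $(( n * (n - 1) / 2 ))
}
```

For example, `count_pairs coverage_data` with 4 coverage files prints 6, and the number grows quadratically from there.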

Tests that cover each other perfectly:

ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListAndUnboundVar and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListAndUnboundVar
ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly
ch.lambdaj.collection.LambdaListTest#test1 and ch.lambdaj.collection.LambdaListTest#test2
ch.lambdaj.collection.LambdaListTest#testMap1 and ch.lambdaj.collection.LambdaListTest#testMap2
ch.lambdaj.function.closure.FileParserClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserClosureTest#testReadFile
ch.lambdaj.function.closure.FileParserImplicitClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserImplicitClosureTest#testReadFile
ch.lambdaj.group.GroupByTest#testGroupByCountryNameAndInsuredName and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAndCountryName
ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsCountries and ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsExposures
ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsCountries and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAsExposures
ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsExposures and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAsExposures
ch.lambdaj.LambdaTest#testFilterArray and ch.lambdaj.LambdaTest#testFilterOnCustomMatcher
ch.lambdaj.LambdaTest#testIndex and ch.lambdaj.demo.LambdaDemoTest#testIndexCarsByBrand
ch.lambdaj.LambdaTest#testSelectFranceExposures and ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD

There are another 168 cases where one test has the same lines covered, and some additional lines too. There are probably many more that are similar enough to analyze, without being a clean superset situation.
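
One way to find those "similar enough" pairs would be to score each pair by overlap instead of insisting on a clean superset. The sketch below uses the Jaccard index (shared lines divided by all lines covered by either test), which is my substitution and not what the script above does; `coverage_overlap_pct` is a hypothetical name, and any threshold for "similar enough" would be a judgment call.

```shell
#!/bin/sh
# Hypothetical helper: percentage overlap (Jaccard index, rounded down)
# between two per-test coverage files, one identifier per line.
coverage_overlap_pct() {
    a_n=$(LC_ALL=C sort -u "$1" | wc -l | tr -d ' ')
    b_n=$(LC_ALL=C sort -u "$2" | wc -l | tr -d ' ')
    union=$(LC_ALL=C sort -u "$1" "$2" | wc -l | tr -d ' ')
    # |A and B| = |A| + |B| - |A or B|
    shared=$(( a_n + b_n - union ))
    if [ "$union" -eq 0 ]; then
        echo 0
    else
        echo $(( 100 * shared / union ))
    fi
}
```

Pairs scoring 100 are the exact matches already listed; pairs scoring just under that are the ones worth a manual look.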

Drilling into a single comparison

I’ll drill into the second of those above (two test methods in ClosureSpecialCasesTest).

The test source: ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly (line 59) and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly (the method below, at line 70).

The coverage reports as I’ve stored them: ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly

The diff comparison of those is empty as you’d expect.

And a different case

According to the output, ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD covers the same lines as ch.lambdaj.collection.LambdaCollectionTest#test1 (and more).

The test source: ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD (line 777) and ch.lambdaj.collection.LambdaCollectionTest#test1 (line 31).

The coverage reports as I’ve stored them: ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD and ch.lambdaj.collection.LambdaCollectionTest#test1

The diff comparison of those isn’t empty, and shows one is a super-set of the other in terms of LoC covered.
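
When one test is a superset of another, it helps to see exactly which lines make up the difference before deciding what to delete or refactor. A sketch using `comm` rather than `diff`; `extra_lines` is a hypothetical helper, and the file format is assumed to be one identifier per line as in coverage_data:

```shell
#!/bin/sh
# Hypothetical helper: print the lines covered by the second test
# that the first test does not cover.
extra_lines() {
    first_sorted=$(mktemp)
    second_sorted=$(mktemp)
    LC_ALL=C sort -u "$1" > "$first_sorted"
    LC_ALL=C sort -u "$2" > "$second_sorted"
    # -13 suppresses lines unique to the first file and lines common to both
    LC_ALL=C comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}
```

Running it in both directions on a pair shows whether the relationship is a true superset: one direction prints nothing, the other prints the additional lines.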

Conclusion

Length of test run is what drives you to want to act on this data. Lambdaj’s tests are hardly slow, but we are seeing things covered redundantly. That’s nearly always going to be the case, but when you can tie it to a single redundant test, you have an opportunity to delete that test and reduce the elapsed time of the test suite.

There are many more things that could be looked at with per-test coverage, including clues that you might want to decompose your solution a little more, and mock those new components in tests that don’t really concern them. You could do that based on the number of tests that pass through the same lines of code, or even the total lines of code for a single test (larger isn’t better for unit tests).
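
Both of those numbers fall out of the stored per-test files with standard tools. A rough sketch, assuming a directory laid out like coverage_data (one file per test, named with a `#` between class and method); `tests_per_line` and `lines_per_test` are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helper: for each covered line, the number of tests that
# pass through it, most-shared lines first.
tests_per_line() {
    for f in "$1"/*#*.txt; do
        if [ -f "$f" ]; then
            LC_ALL=C sort -u "$f"
        fi
    done | LC_ALL=C sort | uniq -c | sort -rn
}

# Hypothetical helper: covered-line count per test, largest first.
lines_per_test() {
    for f in "$1"/*#*.txt; do
        [ -f "$f" ] || continue
        count=$(LC_ALL=C sort -u "$f" | wc -l | tr -d ' ')
        printf '%s %s\n' "$count" "$(basename "$f" .txt)"
    done | sort -rn
}
```

`tests_per_line coverage_data | head` surfaces the production lines that most tests go through, which are candidates for extracting and mocking, and `lines_per_test coverage_data | head` lists the tests with the widest reach.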


Published

January 27th, 2015

Syndicated by DZone.com

Comments formerly in Disqus, but exported and mounted statically ...

Tue, 27 Jan 2015	Jay Fields
I like the idea, but I'd go opposite when given two tests with overlapping coverage. I'd look to refactor the larger one down, removing the overlap, and only keeping the now-more-focused test.
Tue, 27 Jan 2015	paul_hammant
Yup - that'd be better.

Paul Hammant's Blog: Detecting Redundant Tests In order to delete them