Paul Hammant's Blog: Detecting Redundant Tests in Order to Delete Them
By running a test method on its own and storing the coverage data for that run, it is possible to compare coverage data later and suggest tests that could be safely deleted. If two tests cover exactly the same lines of code, then one can go. If one test covers the same lines of code as another, plus a few extra lines, then the one with fewer lines could go. Coverage is normally calculated down to the path through conditionals, and that fidelity needs to be taken into account too when making the delete or keep decision.
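A minimal sketch of that decision rule, assuming each test's coverage is stored as a text file with one covered line per row. The function name classify_coverage and the labels it prints are illustrative only, and the sketch ignores the branch-level fidelity mentioned above.

```shell
#!/bin/sh
# classify_coverage FILE_A FILE_B  (hypothetical helper)
# Prints: identical | a_superset | b_superset | neither
classify_coverage() {
    a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 1
    # comm needs sorted input; -u also drops duplicate rows
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # comm -23: rows only in A; comm -13: rows only in B
    only_a=$(comm -23 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    only_b=$(comm -13 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    rm -f "$a_sorted" "$b_sorted"
    if [ "$only_a" -eq 0 ] && [ "$only_b" -eq 0 ]; then
        echo identical      # either test is a deletion candidate
    elif [ "$only_b" -eq 0 ]; then
        echo a_superset     # B adds nothing: B is the candidate
    elif [ "$only_a" -eq 0 ]; then
        echo b_superset     # A adds nothing: A is the candidate
    else
        echo neither        # each test covers something the other does not
    fi
}
```

A result of a_superset or b_superset only nominates the smaller test; whether it is really safe to delete still depends on what it asserts.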
Lambdaj example
Lambdaj is a pioneering technology that allowed some lambda-style functions in Java versions before 8. I’m using its source here for analysis.
I took a copy of it from its Subversion home on Google Code, and checked it into GitHub with its Maven script upgraded to capture JaCoCo coverage. I’ve coded some fairly hacky sed, ack, and egrep to store only the covered (and partially covered) lines of the production sources, per test. That’s all checked in to the project in a coverage_data directory, as is the list of test methods. Here’s the shell script that makes that:
#!/bin/sh
rm -rf coverage_data
mkdir coverage_data
# list of unit tests methods (note: possibility of inaccuracy)
ack -A1 --nogroup "@Test" src/test/ | grep public | sed 's/\.java/ /' | \
sed 's#/#.#g' | sed 's/().*//' | sed 's/public void//' | \
perl -pne 's/[ \t]+/#/g' | sed 's/src.test.java.//' | \
cut -d'#' -f 1,3 > coverage_data/list_of_test_methods.txt
mvn clean
# Process tests one at a time
for testMethod in $(cat coverage_data/list_of_test_methods.txt); do
    # One maven invocation per test (so we can get focused coverage)
    rm -rf target/coverage-reports
    mvn -Dtest="$testMethod" -DfailIfNoTests=false test
    op_file_name="coverage_data/${testMethod}.txt"
    ack --noheading "id=\"L" --noignore-dir=target target | \
        egrep -i "class=\"(pc|fc)" | sed 's/\.java.html:/-/' | \
        sed 's#target/##' | sed 's#site/jacoco-ut/##' | sed 's#/#.#g' | \
        sed 's/id=.*//' | sed 's/<span class=//' | \
        sed 's/\"//g' > "$op_file_name"
    # Drop empty coverage files
    [ -s "$op_file_name" ] || rm "$op_file_name"
done
Comparing the coverage reports to each other is next, and it’s lengthy: every pair of tests is compared once, so n tests need n(n-1)/2 diffs, which is O(n²). Here’s the bash again (Bash v4 is needed; the Mac has v3 by default):
#!/usr/local/bin/bash
mkdir -p coverage_diff_data
# Test names, taken from the per-test coverage file names
tests=$(find coverage_data -name "*#*" | sed 's#coverage_data/##' \
    | sed 's/\.txt//')
for test1 in $tests; do
    for test2 in $tests; do
        # Visit each unordered pair once
        if [[ "$test1" < "$test2" ]]; then
            output="coverage_diff_data/${test1}-${test2}.diff"
            # Overwrite, so a rerun does not append to a stale diff
            diff "coverage_data/${test1}.txt" "coverage_data/${test2}.txt" > "${output}"
            # "<" lines exist only in test1's file, ">" lines only in test2's
            deleted=$(grep -c "^<" "${output}")
            added=$(grep -c "^>" "${output}")
            # Lines of test2's file that also appear in test1's file
            matching=$(grep -Fxcf "coverage_data/${test1}.txt" \
                "coverage_data/${test2}.txt")
            if [[ "$matching" != "0" ]]; then
                if [[ "$added" == "0" && "$deleted" == "0" ]]; then
                    echo "$test1 and $test2 are covering exactly the same lines"
                elif [[ "$added" == "0" && "$deleted" != "0" ]]; then
                    # Only test1 has extra lines, so test1 is the superset
                    echo "$test1 covers the same lines as $test2 (and more)"
                elif [[ "$added" != "0" && "$deleted" == "0" ]]; then
                    # Only test2 has extra lines, so test2 is the superset
                    echo "$test2 covers the same lines as $test1 (and more)"
                else
                    rm "${output}"
                fi
            else
                rm "${output}"
            fi
        fi
    done
done
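The nested loop compares each unordered pair of tests once, so the number of diffs grows quadratically with the number of tests. A small sketch of that arithmetic follows; pair_count is a hypothetical helper, not part of the script above.

```shell
#!/bin/sh
# pair_count N: unordered pairs among N tests, i.e. N * (N - 1) / 2
pair_count() {
    echo $(( $1 * ($1 - 1) / 2 ))
}

pair_count 4     # prints 6
pair_count 100   # prints 4950
```

Something like `pair_count "$(find coverage_data -name '*#*' | wc -l)"` would estimate how many diffs a run is about to make.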
Tests that cover each other perfectly:
- ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListAndUnboundVar and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListAndUnboundVar
- ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly
- ch.lambdaj.collection.LambdaListTest#test1 and ch.lambdaj.collection.LambdaListTest#test2
- ch.lambdaj.collection.LambdaListTest#testMap1 and ch.lambdaj.collection.LambdaListTest#testMap2
- ch.lambdaj.function.closure.FileParserClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserClosureTest#testReadFile
- ch.lambdaj.function.closure.FileParserImplicitClosureTest#testCountFile and ch.lambdaj.function.closure.FileParserImplicitClosureTest#testReadFile
- ch.lambdaj.group.GroupByTest#testGroupByCountryNameAndInsuredName and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAndCountryName
- ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsCountries and ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsExposures
- ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsCountries and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAsExposures
- ch.lambdaj.group.GroupByTest#testGroupByCountryNameAsExposures and ch.lambdaj.group.GroupByTest#testGroupByInsuredNameAsExposures
- ch.lambdaj.LambdaTest#testFilterArray and ch.lambdaj.LambdaTest#testFilterOnCustomMatcher
- ch.lambdaj.LambdaTest#testIndex and ch.lambdaj.demo.LambdaDemoTest#testIndexCarsByBrand
- ch.lambdaj.LambdaTest#testSelectFranceExposures and ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD
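Three of the GroupByTest methods show up as three separate pairs in the list above because all three have identical coverage. A sketch of an alternative that reports such tests as one group, by keying each coverage file on its checksum instead of diffing every pair. This is a swapped-in technique, not what the script above does: group_identical is a hypothetical name, and since checksums can collide in principle, each group is a set of candidates to confirm with diff.

```shell
#!/bin/sh
# group_identical DIR  (hypothetical helper)
# Prints one line per group of tests whose coverage files look identical.
group_identical() {
    # Only per-test files (names containing '#'), not the test list
    for f in "$1"/*'#'*.txt; do
        [ -f "$f" ] || continue
        # cksum on stdin prints "<crc> <bytes>"; join them into one key
        key=$(cksum < "$f" | tr ' ' '-')
        printf '%s %s\n' "$key" "$(basename "$f" .txt)"
    done | sort | awk '
        # Same key as the previous row: extend the current group
        $1 == prev { group = group " " $2; n++; next }
        # New key: emit the finished group if it had more than one test
        { if (n > 1) print group; prev = $1; group = $2; n = 1 }
        END { if (n > 1) print group }'
}
```

Calling `group_identical coverage_data` makes one pass over the files plus a sort, instead of one diff per pair.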
There are another 168 cases where one test covers the same lines as another, plus some additional lines. There are probably many more pairs that are similar enough to analyze without being a clean superset situation.
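One way to surface those "similar enough" pairs is a similarity score rather than a strict superset test. A minimal sketch, assuming the same one-row-per-covered-line files: it reports shared rows as a percentage of all rows either test covers (the Jaccard index). The name coverage_similarity is hypothetical, and what counts as "similar enough" is left open.

```shell
#!/bin/sh
# coverage_similarity FILE_A FILE_B  (hypothetical helper)
# Prints an integer 0-100: rows in both files / rows in either file.
coverage_similarity() {
    a_sorted=$(mktemp) && b_sorted=$(mktemp) || return 1
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # comm -12 keeps only the rows common to both sorted files
    shared=$(comm -12 "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    # sort -u over both files gives the union of their rows
    total=$(sort -u "$a_sorted" "$b_sorted" | wc -l | tr -d ' ')
    rm -f "$a_sorted" "$b_sorted"
    # Two empty files have nothing in common to measure
    [ "$total" -gt 0 ] || { echo 0; return; }
    echo $(( shared * 100 / total ))
}
```

A score of 100 matches the "exactly the same lines" case; high scores below that are the pairs worth a manual look.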
Drilling into a single comparison
I’ll drill into the second of those above (two test methods in ClosureSpecialCasesTest).
The test sources: ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly (line 59) and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly (the method below it, at line 70).
The coverage reports as I’ve stored them: ch.lambdaj.ClosureSpecialCasesTest#testWithEmptyListOnly and ch.lambdaj.ClosureSpecialCasesTest#testWithNonEmptyListOnly.
The diff comparison of those is empty, as you’d expect.
And a different case
According to the output, ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD and ch.lambdaj.collection.LambdaCollectionTest#test1 are a superset pair: one covers the same lines as the other (and more).
The test sources: ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD (line 777) and ch.lambdaj.collection.LambdaCollectionTest#test1 (line 31).
The coverage reports as I’ve stored them: ch.lambdaj.LambdaTest#testSelectStringsThatEndsWithD and ch.lambdaj.collection.LambdaCollectionTest#test1.
The diff comparison of those isn’t empty, and shows one is a superset of the other in terms of lines of code covered.
Conclusion
The length of the test run is what drives you to want to act on this data. Lambdaj’s tests are hardly slow, but we are seeing things covered redundantly. That’s nearly always going on, but when you can tie it to a single redundant test, you have an opportunity to delete that test and reduce the elapsed time of the test suite.
There are many more things that could be looked at with per-test coverage, including clues that you might want to decompose your solution a little more, and mock those new components in tests that don’t really concern them. You could do that based on the number of tests that pass through the same lines of code, or even the total lines of code covered by a single test (larger isn’t better for unit tests).
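Both of those numbers fall out of the per-test coverage files with a few standard tools. A rough sketch, assuming a directory of one-file-per-test coverage data with '#' in each file name; the function names hot_lines and lines_per_test are made up for illustration.

```shell
#!/bin/sh
# hot_lines DIR: "<number of tests> <covered row>", most-covered first
hot_lines() {
    for f in "$1"/*'#'*.txt; do
        [ -f "$f" ] || continue
        sort -u "$f"            # count each row once per test
    done | sort | uniq -c | sort -rn | sed 's/^ *//'
}

# lines_per_test DIR: "<covered rows> <test>", largest first
lines_per_test() {
    for f in "$1"/*'#'*.txt; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(wc -l < "$f" | tr -d ' ')" "$(basename "$f" .txt)"
    done | sort -rn
}
```

For example, `hot_lines coverage_data | head` would list the production lines that the most tests pass through, and `lines_per_test coverage_data | head` the tests with the widest footprint.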