Code Coverage

Code Coverage


A measure of the degree to which a test suite exercises the source code of a program.

Usually, we use a coverage analysis tool to collect coverage data when running code against a test suite. Sometimes, we use the same notion to generate test cases.

For example, if a piece of code is not yet covered by the test suite, that may be a good candidate for a new test.

We differentiate code coverage from test coverage.

  • Test coverage
    • overall amount of testing performed by a test suite
    • concerned with the overall test plan and can thus be subjective.
    • measured according to the requirements
      • how many requirements are or are not covered by test cases?
  • Code coverage tools can help achieve test coverage.

Statement Coverage

Statement Coverage


The percentage of statements that are executed. \[ 100 \cdot \left(\frac{\text{number of executed statements}}{\text{total number of statements}}\right) \]

In general, every statement should be tested. Otherwise, why was it written in the first place?

We should strive for 100% statement coverage, but this can be difficult.

  • To execute a statement that handles the execution of a corrupted file, we'd have to feed the code a corrupted file.
  • It's not always feasible to create truly exceptional conditions.

Statement coverage satisfies the reachability constraint of fault detection, but may not satisfy the necessity or propagation constraints.

double sqrt(double x) {
    if (x <= 0) {    // bug: should be x < 0
        return -1;
    } else {
      ...            // calculate root
      return root;
    }
}

Consider the following test cases:

  • sqrt(-1) exercises the if condition
  • sqrt(1) exercises the else condition.

Together, these test cases achieve 100% statement coverage! However, they don't reveal the bug. This means that statement coverage is inadequate for fault detection.

Decision Coverage and Branch Coverage

A decision is just a boolean expression evaluating to true or false.

Decision Coverage


The percentage of decision outcomes exercised by a test suite. \[ 100 \cdot \left(\frac{\text{number of decisions exercised}}{\text{total number of decisions}}\right) \]

100% decision coverage requires that every decision has taken each possible outcome at least once. In other words, every decision should be evaluated to both true and false at some point during a test suite execution.

Branch Coverage


The percentage of the branches of all control structures exercised by a test suite. \[ 100 \cdot \left(\frac{\text{number of branches exercised}}{\text{total number of branches}}\right) \]

The main differences between decision coverage and branch coverage are as follows:

  • decisions do not necessarily appear in an if statement or loop
  • branches are defined for the method's control flow graph.

Note: in addition to the true/false branches going out from the decision of an if statement or loop, the CFG also includes an unconditional branch (edge) that connects each true or else branch to the next statement.

Detailed Written Example

Consider the following function:

boolean isVowel(char letter) {
    boolean answer = true;    // Bug: should be initialized to false
    String vowels = "aeiouy";
    char ch = Character.toLowerCase(letter);
    if (vowels.indexOf(ch) >= 0) {
        answer = true;
    }
    return answer;
}

For convenience, we'll mark outcomes as subscripted variables.

  • answer == true is \(a_T\), false is \(a_F\)
  • Conditional/unconditional branches are \(C_n, U_n\) for optional index \(n \in \mathbb{N}\)

This snippet has:

  • two decisions, each with a true/false outcome:
    • \(\{ a_T, a_F \}\): answer (in return statement)
    • \(\{ v_T, v_F \}\): vowels.indexOf(ch) >= 0
  • three branches (conditional vs. unconditional/implicit)
    • \(C\): if (vowels.indexOf(ch) >= 0)
    • \(U_1\): implicit then to return answer after block exits
    • \(U_2\): implicit else to return answer if conditional is false

We'll use the table below to evaluate decision and branch coverage.

testbranchesbranch coveragedecision outcomesdecision coverage
isVowel('A')\(\{C, U_1\}\)\(\frac{2}{3} \approx 67\)%\(\{ a_T, v_T \}\)\(\frac{2}{4} = 50\)%
isVowel('C')\(\{ U_2 \}\)\(\frac{1}{3} \approx 33\)%\(\{ a_T, v_F \}\)\(\frac{2}{4} = 50\)%

To calculate full branch or decision coverage, consider the unions of the branch and decision sets.

\[ \begin{gather} \bigcup_{\text{branches}} = \{ C, U_1 \} \cup \{ U_2 \} = \{ C, U_1, U_2 \}\\ \bigcup_{\text{decisions}} = \{ a_T, v_T \} \cup \{ a_T, v_F \} = \{ a_T, v_T, v_F \} \end{gather} \]

So we see that \(\bigcup_{\text{branches}}\) covers all three branches, meaning the test suite given by the table has 100% branch coverage over isVowel().

However, \(\bigcup_{\text{decisions}}\) only covers 3 of the 4 possible decision outcomes - the code is never evaluated for \(a_F\). Thus, the test suite only has 75% decision coverage over the method.

Full Coverage Example

It's important to note that a test suite with full decision or branch coverage doesn't always show the presence of faults. Consider the following code from the previous section:

boolean foo(int x, int y) {
    int z;
    if (x > 0) {
        z = x * y;
    } else {
        z = y + y; // Should be y * y
    }
    return z > y;
}

This snippet has:

  • two decisions, each with a true/false outcome
    • \(\{ x_T, x_F \}\): x > 0
    • \(\{ z_T, z_F \}\): z > y
  • four conditions
    • \(C_1\): if (x > 0)
    • \(C_2\): else
    • \(U_1\): implicit then to return z > y after if block exits
    • \(U_2\): implicit then to return z > y after else block exits

Again, we'll make a table to evaluate our test suite.

testbranchesbranch coveragedecision outcomesdecision coverage
foo(1,1)\(\{ C_1, U_1 \}\)\(\frac{2}{4} = 50\)%\(\{ x_T, z_F \}\)\(\frac{2}{4} = 50\)%
foo(0,3)\(\{ C_2, U_2 \}\)\(\frac{2}{4} = 50\)%\(\{ x_F, z_T \}\)\(\frac{2}{4} = 50\)%

Notice that we cover all branches and decision outcomes, meaning this test suite has 100% decision and branch coverage. Still, none of these tests reveal the fault.

It's also possible for a decision expression itself to be at fault. Consider a mistaken a > 0 in place of a >= 0. The test suite \(\{ a=2, a=-2 \}\) satisfies decision and branch coverage but does not report failure. In other words, it satisfies the reachability constraint but not the necessity constraint.

Condition Coverage

Each condition in a decision is evaluated both true and false.

Example: \(C_1 \vee C_2\)

Test Suite 1
\(C_1\)\(C_2\)\(C_1 \vee C_2\)
truetruetrue
falsefalsefalse
Test Suite 2
\(C_1\)\(C_2\)\(C_1 \vee C_2\)
truefalsetrue
falsetruetrue

Is one of these better than the other? What about when \(C_1 \vee C_2\) is false?

  • Test Suite 2 doesn't cover the decision's else branch.

Condition coverage doesn't guarantee decision coverage!

Decision vs. Condition Coverage: A Fault Model

A decision may be composed of conditions alongside zero or more boolean operators.

Condition


A primitive Boolean expression that cannot be broken down into simpler Boolean expressions.

A decision without a Boolean operator reduces to a condition.

To discuss condition-based coverage criteria, we first introduce a fault model of the if-then-else code block.

if <decision-expression> {
    <then-branch>
} else {
    <else-branch>
}

Faults may occur in <decision-expression>, <then-branch>, or <else-branch>.

Decision Expression

In the decision expression, faults fall into two categories:

  • faulty boolean operators
  • faulty conditions

IMPORTANT NOTE


This section appears on p. 337-338 and may be worth taking directly from the book. I'm not sure I understand the point of this section which usually means I'm missing something important.

Consider the following pseudocode:

  if (a > 0 or b > 0)
      print(a + b)
  else
      print(a * b)

Suppose \(C_1\) and \(C_2\) denote \(a > 0\) and \(b > 0\) respectively. If the following are correct decisions, the given expression (\(C_1\) or \(C_2\)) contains a faulty boolean operator.

  • \(C_1\) and \(C_2\): the faulty operator is or vs. and
  • \(\neg{C_1}\) or \(C_2\): the faulty operator is \(\neg\) (negation)
  • \(C_1\) xor \(C_2\): the faulty operator is or vs. xor
  • \(\neg{(C_1 \text{ or } C_2)}\): the faulty operator is \(\neg\) (negation)
  • \(C_1\): the fault is extra condition and operator.

If the following are correct decisions, the given expression (\(a < 0\) or \(b < 0\)) has a faulty condition.

  • \(a \geq 0\) or \(b > 0\): the faulty condition is \(a > 0\) vs. \(a \geq 0\).
  • \(a > 0\) or \(b > 1\): the faulty condition is \(b > 0\) vs. \(b > 1\).
  • \(a > 0\) or \(b < 0\): the faulty condition is \(b > 0\) vs. \(b < 0\).

Then Branch

If the following are correct then branches, the given then branch is faulty.

  • print(a - b);
  • print(a * b);

Else Branch

If the following are correct else branches, the given else branch is faulty.

  • print(a / b);
  • print(a + b);

A test suite of full decision coverage satisfies the reachability constraint of both then and else branches.

A faulty decision expression can cause the reachability constraint to be met while not satisfying the necessity constraint. To deal with this issue, condition coverage requires that each condition in a decision is evaluated both true and false at least once.

Multi-Condition Coverage

Multi-condition coverage requires that all possible combinations of the conditions in a decision be exercised and every logic operator take on all possible values.

MCC is ideal for dealing with faulty boolean operators.

A test of \(n\) conditions requires \(2^n\) tests.

Condition/Decision Coverage

C/DC exercises both outcomes of each condition AND of each decision.

Modified Condition/Decision Coverage

MC/DC requires that each condition should independently affect the decision outcome such that the effect of each condition is tested relative to other conditions.

Consider the following table:

C1C2C1 or C2
TFT
FTT
FFF

This illustrates:

  • full decision coverage
    • \(C_1\) and \(C_2\) are both tested for T and F
  • full condition coverage
    • \(C_1 \vee C_2\) is tested for both T and F

MC/DC requires a minimum of \(n + 1\) tests:

  • \(C_1 \text{ and } C_2 \text{ and } ... \text{ and } C_n\)
    • one test for all \(C_i \in \{C_1, C_2, \dots, C_n\}\) being true
    • one test for each \(C_i\) being false while all others are true.
  • \(C_1 \text{ or } C_2 \text{ or } ... \text{ or } C_n\)
    • one test for all \(C_i \in \{C_1, C_2, \dots, C_n\}\) being false
    • one test for each \(C_i\) being true while all others are false.

The MC/DC test suite does not typically cover all combinations of the decision's conditions, meaning it doesn't always fail on an incorrect decision.

For example, \(C_1\) or \(C_2\) does not differentiate from \(C_1\) xor \(C_2\).

MC/DC may also be inadequate when one of the conditions is wrong. It's still considered a viable option for high-assurance software (NASA).

Code Coverage vs. Fault Detection: Summary

No coverage criterion can guarantee the necessity constraints of faulty conditions (e.g., \(a > 0\) vs. \(a \geq 0\)).

Faulty Boolean OperatorsFaulty ConditionsFaulty Then BranchesFaulty Else Branches
Condition CoverageSomeMaybeMaybeMaybe
Decision CoverageSomeMaybeYesYes
Condition/DecisionSomeMaybeYesYes
MC/DCMajorityMaybeYesYes
Multi-ConditionAllMaybeYesYes

Note that the satisfaction of reachability and necessity constraints of a fault doesn't necessarily reveal the fault because of the propagation constraint.

Subsumption Relationships

Subsumption Relationship


Given coverage criteria \(V_1\) and \(V_2\), \(V_1\) is said to subsume \(V_2\) if and only if every est suite that satisfies \(V_1\) also satisfies \(V_2\).

When \(V_1\) subsumes \(V_2\), \(V_1\) is typically more effective in fault detection. It's also typically more expensive.

Among the coverage criteria discussed, MC/DC is a viable option if time permits. It's more effective than decision, condition, and condition/decision coverage but less expensive than multi-condition coverage: \(O(n)\) vs. \(O(2^n)\).

These methods are necessary yet not sufficient for high assurance. Effective testing also requires the application of black-box techniques.

Code Coverage Goals

The coverage goals of a project depend on the desired quality level and available resources and the development process.

In an agile process like TDD, since unit testing is performed alongside regular development, code coverage can be retained at an almost constant level throughout the process.

In a non-agile process like waterfall where testing isn't performed until production code is ready, the code coverage landscape is very different. Coverage starts at 0% and increases over time.

In practice, we may first create tests to meet the weaker coverage criteria:

  • statement coverage to branch coverage to MC/DC

Weaker coverage may be more cost effective in terms of average number of faults detected per test

Coverage Tool: EclEmma

Very little to say here, almost entirely skipped over in class but present in the final exam subjects.

  • Based on JaCoCo (Java Code Coverage) Library
    • Coverage results are immediately summarized and highlighted in the Java source code editor
      • Green: fully covered lines
      • Yellow: partly covered lines (some statements or branches missed)
      • Red: lines that have not been executed

From here, it's suggested that we find YouTube videos to learn more. So ends CS449!