McCabe's idea was that finding "basic paths-that when taken in combination will generate every possible path". This is much less than the maximum possible number of combinations of groups of statements and is close to the minimum number of tests required for 100% path coverage. In addition, it calculates the number of flows without analyzing the conditional expressions, which can give higher complexity for some code (such as an else if that tests on the same variable as the initial if). This also means that the McCabe complexity is related to logical flow, but not data flow, threading, or arithmetic complexity.
The following examples will attempt to describe McCabe's cyclomatic complexity and show some of the limitations. Hopefully an improved measure of complexity can be found that closely indicates the number of tests that should be performed.
The paper printed in IEEE can be found here:
McCabe Complexity
Another version can be found here: Structured
Testing
Wikipedia has an article about some complexity measures:
Programming_complexity
McCabe's original paper has some graph analysis ideas, followed by a keyword approach to finding complexity. Many of the tools use a keyword approach such as:
if(c1) { a(); } else { b(); }The number of possible flows through this code is:
Execution | Condition c1 |
---|---|
Function a() is executed | true |
Function b() is executed | false |
In graph theory, the edges are the connection lines between the nodes.
With McCabe complexity, an if statement always adds a complexity of one to an existing path. This is true whether or not the conditional evaluates to a constant true or false, and the branch is always or never executed.
An if/else also only adds a complexity of one since the only difference is when the other path is taken. With McCabe complexity, an else statement is much simpler than two if statements even though the else condition path's expression is equivalent to "if(!c1)".
This shows that the conditional expressions are important when evaluating complexity, and that McCabe complexity is a model, but is not accurate even for purely logical path complexity.
if(c1) { a(); } else if(c2) { b(); }This example is similar this:
if(c1) { a(); } else { if(c2) { b(); } }The number of possible flows through this code with two independent conditions is:
Execution | Condition c1 | Condition c2 |
---|---|---|
No functions are executed | false | false |
Function a() is executed | true | true or false |
Function b() is executed | false | true |
If both conditionals were testing on the same variable, then the execution is more like a switch/case, and complexity is similar to Example 1.
if(c1) { a(); if(c2) { b(); } }The number of possible flows through this code is:
Execution | Condition c1 | Condition c2 |
---|---|---|
No functions are executed | false | true or false |
Function a() is executed | true | false |
Function a() and b() are executed | true | true |
In this example, the complexity is higher than in example 2 if the statements executed alter values that affect each other. This also means that a function or method that has side effects could be more complex.
if(c1) { a(); } if(c2) { b(); }The number of possible flows through this code is:
Execution | Condition c1 | Condition c2 |
---|---|---|
No functions are executed | false | false |
a() is executed | true | false |
b() is executed | false | true |
a() and b() are executed | true | true |
if(c1) { a(); } if(c2) { b(); } if(c3) { c(); }The number of possible flows through this code is:
Execution | Condition c1 | Condition c2 | Condition c3 |
---|---|---|---|
No functions are executed | false | false | false |
a() is executed | true | false | false |
b() is executed | false | true | false |
c() is executed | false | false | true |
a() and b() are executed | true | true | false |
b() and c() are executed | false | true | true |
a() and c() are executed | true | false | true |
a(), b(), and c() are executed | true | true | true |
This example shows that when analyzing graphs, the McCabe complexity (7) does not match the number of combinations (8). The discrepency is that the there is no extra edge indicated for the abc path compared to the ab and bc paths.
switch(c1) { case 1: a(); break; case 2: case 3: b(); break; }There are many variations of McCabe case statement complexity. I haven't found any variations that account for fall through case statements.
switch(c1) { case 1: a(); break; case 4: b(); case 5: c(); break; default: b(); break; }
if(cond1() && cond2()) a();This is equivalent to the following.
if(cond1()) { if(cond2()) a(); }C++ has short-circuit evaluation for the logical or and logical and operators. This means not all conditions (which could be statements) are executed within a conditional test.
if(c1 && c2) a();This is equivalent to the following.
if(c1) { if(c2) a(); }Simple counting of logical operators as keywords is not a very accurate measure compared to other keywords.
if(v1 == 1) { }Since this is not common, it is not required that this be analyzed differently than the default behavior.
Combinations | McCabe Graph |
McCabe Keyword |
ACQC | vsCCM | Source Monitor | Oovcde McCabe |
|
---|---|---|---|---|---|---|---|
Example 1 If Else |
2 | 2 | 2 | 2 | 2 | 3 | 2 |
Example 2 If/Else If |
3 | 3 | 3 | 3 | 3 | 3 | 3 |
Example 3 Nested If |
3 | 3 | 3 | 3 | 3 | 3 | 3 |
Example 4 Sequential If |
4 | 4 | 3 | 3 | 3 | 3 | 3 |
Example 5 3 Sequential Ifs |
8 | 7 | 4 | 4 | 4 | 4 | 4 |
Example 6 case |
3 | 3 | 4 | 4 | 4 | 5 | 4 |
Example 7 case/default |
4 | 4 | 4 | 4 | 4 | 5 | 4 |
Example 8 logical or/and statements |
4 | 4 | 2:3 | 2 | 3 | 3 | 2 |
Example 9 logical or/and |
2 | 2 | 2:3 | 2 | 3 | 3 | 2 |
There are many documents on the web indicating that switch case counts are too high. They may be a little high as seen in this table, but the actual reason is that if statement counts in many cases are too low. I have not found any tools that do not decrease the counts on case statements with no intervening statements. I haven't found any tools that incorrectly account for default statements.
There are many types of complexity, and I have not found a list of the varying types, so some of the names are made up here. They are listed in roughly increasing order here. Types of complexity:
int average(int a, int b) { return (a+b)/2; }McCabe complexity indicates 1, so only one test is needed to fully test this. Obviously this is wrong. It may be possible to use some set theory similar to what is used in abstract interpretation.
There is no mathematically automated way to find the correct number of tests for testing an arithmetic/numeric algorithm. Some examples of algorithms that could be difficult to test would be something like finding n digits of pi, or sorting a set of data.
Structures and classes increase the complexity based on the number of variables in a struct or class are accessed. In C++, this is more difficult to determine the number of unique members accessed because different methods may be accessing the same class variable.
The number of values read adds to the number of tests that must be done. Fewer state variables are also simpler than more state variables. The following piece of code is an example where two independent variables are more complex than one.
val = ((ULONG) (val1 << 24) + (ULONG) (val2 << 16);
The number of tests required for a boolean parameter is 2. The number of tests required for an unsigned is generally at the boundaries (2), and perhaps a test in the middle. The number of tests required increase depending on how the parameter is used. The number of tests for a signed parameter is at the boundaries (2), plus usually at zero, and possibly at other values.
An interesting point for C++ is that the size_t type is an unsigned type, but uses "static_cast<size_t>(-1)", which means the maximum value, as an invalid value. This also should be tested at the min, max, and middle value.
There is a school of thought that in C++, the unsigned value is dangerous, but I think the complexity indication is more important, and that the danger should be understood. An example of the danger is below where the i variable is never less than zero since it is unsigned:
for(size_t i=5; i>=0; i++) // NEVER FINISHES!!! { }
The minimum number of tests required for parameters is additive for parameters that are independent of each other. For two parameters, add the complexity of each parameter to each other. It is impossible to determine if external input parameters are independent and must be tested combinatorially. From the perspective of the function that is being examined, the external parameters are independent.
var.vt = VT_ARRAY | VT_UI1;Output parameters do not increase test complexity of the function being analyzed. They increase complexity of other functions.
Intermediate parameters do not increase test complexity. Buffer indexing and pointers are usually more complex (can introduce more errors). Typically indexing is just increment operations and limit tests. These do not increase the number of tests if they are dependent on input parameters.
Of course, Wikipedia has an article about boundary value analysis:
Boundary
Value Analysis
And for abstract interpretation:
Abstract
Interpretation
This tool calculates statements in a block.
Gnu
complexity tool
If the condition is based on an input parameter, and the parameter is not used elsewhere, the complexity should not be increased over the complexity of the input parameter.
The number of tests on an input parameter can increase the complexity. If there is one conditional based on an input parameter, and another test is added, the complexity increases by two if the parameters are sequential.
Sequential if statements produce 2^n combinations, where n is number of if statements in sequence. This is not useful as the number of tests. The minimum number of tests that is reasonable is main branch(1), plus 3 branches(3), plus additional tests for combinations (n-1) = 2*n.
If a function is dominated by control complexity (instead of data complexity), the control complexity can be up to a factor of 2 more than McCabe complexity. This will happen when conditials are combinatorial (such as sequential if statements).
In general, a const method will not increase complexity as much as a non-const method.
Since the test complexity is about the function being analyzed, complexity introduced by calling other functions is not added to the current function.
Control Flow Details
If the input variable is used in a condition, the complexity is not double counted. The largest value of the condition or variable complexity is used in each case.
Example | Desired Value | Oovcde |
---|---|---|
Example 1 If Else |
2 | 2 |
Example 2 If Else If |
2 or 3 depending on condition | 3 or 4 |
Example 3 Nested If |
3 | 3 |
Example 4 Sequential If |
4 | 4 |
Example 5-a 3 Sequential Ifs-comb |
6 | 6 |
Example 5-b 3 Sequential Ifs-same |
4 | 4 |
Example 6 case |
3 | 4 |
Example 7 case/default |
4 | 4 |
Example 8 logical or/and statements |
4 | ? - Depends on conditions |
Example 9 logical or/and |
2 | ? - Depends on conditions |
Other tools would consider all of the following a complexity of one.
Desired data complexity values:
Example | Desired Value | Oovcde |
---|---|---|
Example D-1 Input Boolean |
2 | 2 |
Example D-2 Input Unsigned |
3 | 3 |
Example D-3 Input Signed |
4 | 4 |
Example D-4 Input Pointer |
4 | 4 |
Example D-5 Read Member |
4 | 4 |
Example D-6 Write Member |
1 | 1 |
Example D-7 Read Pointer Member |
4 | 4 |
Example D-9 Call Function Value Param |
1 | 1 |
Example D-10 Call Function Ref Param |
4 | 4 |
Example D-11 Call Function Return |
4 | 4 |
Example D-8 Write Pointer Member |
1 | 4 |
int ComplexityDataTest::testInputBoolParam(bool v1) { return !v1; }
int ComplexityDataTest::testInputUnsignedParam(unsigned int v1) { return v1 / 2; }
int ComplexityDataTest::testInputSignedParam(int v1) { return v1 / 2; }
int ComplexityDataTest::testInputPointerParam(int *p1) { return *p1 / 2; }
int ComplexityDataTest::testReadMemberRef() { return mMember1 / 2; }
void ComplexityDataTest::testWriteMemberRef() { mMember1 = 8; }
int ComplexityDataTest::testReadPointerMemberRef() { int *p = &mMember1; return *p; }
void ComplexityDataTest::testWritePointerMemberRef() { int *p = &mMember1; *p = 8; }
void ComplexityDataTest::testMemberFuncVal() { mChild.funcVal(1); }
int ComplexityDataTest::testMemberFuncRef() { int val; mChild.funcRef(val); return val; }
int ComplexityDataTest::testMemberFuncRet() { return mChild.funcRet(); }
Adding the data complexity does increase some functions a great deal that initially had a lower complexity. These functions sometimes do not look complex if the input data is not combinatorial, but stringently testing each function with all available input does require more tests.
The Oovcde program outputs both complexity figures (McCabe and combined control and data complexity) so that they can be compared and sorted on easily.