Topics of the lecture:
The role of testing in software construction.
Different types of testing.
Validation and Verification.
Testing is purportedly a procedure to establish whether the system meets the specification requirements, but in reality it seeks to uncover operational faults (bugs) in software. These are properly known as "defects".
The term "bug" has been used for minor faults in
machinery since the 19th Century or earlier. A real "bug"
(a fried moth) was found in an early computer, the Harvard Mark II
machine in Sept. 1947 and noted by programming pioneer Grace
Hopper—but this was certainly not the first use of the term.
Unit testing
Integration testing
Validation and Verification
Resource limitations, errors and recovery (esp. in real-time systems)
Usability testing
Performance testing
Definition
V&V (Verification and Validation) should begin at the earliest
possible stage of the software development cycle,
because the cost of debugging grows exponentially as the
project progresses.
Requirements
Specification
Planning
Design
Implementation
Integration
Maintenance
Phases of the SDLC (Software Development Life Cycle)
Prototype components are built for client demonstration.
Such components need be neither complete nor reliable.
A mathematical description of the software system is constructed.
Deriving and proving properties mathematically can uncover
incompleteness, ambiguity, and contradictions.
IEEE definitions
A failure occurs when the program behaviour violates the specification.
A fault is a problem in the program code, which can lead to a failure when executed. Faults are commonly known as bugs.
An error is a mistaken decision by an implementer that leads them to create program code containing a fault.
Some authors (e.g. Bertrand Meyer) do not use these terms, but call them "fault", "defect" and "error", respectively.
A software product is correct
if it will always behave as specified.
(Which doesn't mean that it is what the customer wants.)
This is hard to do for small software and very difficult for large-scale software; it requires great expertise.
None of these alone is sufficient.
Testing is the most common,
and every programmer needs to know how to test properly
to get programs that are closer to being "correct"
in the least time and with the least wasted effort.
All of these can use a technique called
In unit testing the class is isolated from the other classes:
Problems can be located faster.
Testing can begin before the system is complete.
Testing of components can be more thorough.
Consider a class X under test:
The role of the classes that normally will use X is filled by a test harness.
One way to implement this in Java is to provide a main method in a
class that allows it to be tested by itself. (This is only feasible
if the class can be used in a simple context for testing.)
The method may be retained in the code even when the class is
incorporated into a larger program that has its own main method.
The harness is a small program that is designed to exercise X.
The classes that X in turn expects to use, but which have not been implemented (or would complicate the testing, or would take too long), may be replaced by test stub classes or methods.
The test stubs may be written to report or record just how they were used by X.
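The harness and stub roles described above can be sketched as follows. This is a minimal illustration, not code from the lecture: the names Biller (playing the role of X), RateTable, RecordingRateStub and BillerHarness are all hypothetical, and the fixed rate of 30 cents is chosen only for the test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator that X expects to use (no real implementation yet).
interface RateTable {
    int centsPerMinute(String region);
}

// Hypothetical class under test, playing the role of X.
class Biller {
    private final RateTable rates;

    Biller(RateTable rates) {
        this.rates = rates;
    }

    int charge(String region, int minutes) {
        return rates.centsPerMinute(region) * minutes;
    }
}

// Test stub: returns a canned answer and records how X used it.
class RecordingRateStub implements RateTable {
    final List<String> requestedRegions = new ArrayList<>();

    @Override
    public int centsPerMinute(String region) {
        requestedRegions.add(region);
        return 30; // fixed rate, chosen only for the test
    }
}

// Test harness: a small program that exercises X in isolation.
class BillerHarness {
    public static void main(String[] args) {
        RecordingRateStub stub = new RecordingRateStub();
        Biller biller = new Biller(stub);

        int actual = biller.charge("local", 3);

        System.out.println(actual == 90 ? "PASS" : "FAIL: got " + actual);
        System.out.println("Stub was asked about: " + stub.requestedRegions);
    }
}
```

Because the stub is passed in through the constructor, the real RateTable can later replace it without changing Biller.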
Correctness of a program is not an absolute, but relative (achieved in the conditions which are defined as operational for this program).
Testing is intended to increase our confidence in the correctness of something, in the conditions in which we intend to use it.
The primary objective of testing is to make the system fail!
Two quotes by famous scientists
Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence.
Edsger W. Dijkstra, computer scientist
Compare this with an earlier statement:
Experiment can only prove a theory wrong, never that it is correct.
Albert Einstein, physicist
Suppose you want to test a 64-bit floating point division routine.
There are 2^128 combinations of operands. At 1 test per nanosecond
(1 billionth of a second), it will take about 10^22 years!
(The age of the Universe is currently estimated at about 14
billion years, i.e. 1.4 * 10^10.)
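The arithmetic behind this estimate can be checked with a short program. This is only a sketch: the class and method names are illustrative, and a year is assumed to have 365 days.

```java
import java.math.BigInteger;

class ExhaustiveTestingEstimate {
    // Years needed to try every pair of 64-bit operands,
    // assuming one test per nanosecond and a 365-day year.
    static BigInteger yearsForAllInputs() {
        BigInteger combinations = BigInteger.valueOf(2).pow(128); // 2^64 * 2^64
        BigInteger testsPerSecond = BigInteger.valueOf(1_000_000_000L);
        BigInteger secondsPerYear = BigInteger.valueOf(365L * 24 * 60 * 60);
        return combinations.divide(testsPerSecond).divide(secondsPerYear);
    }

    public static void main(String[] args) {
        BigInteger years = yearsForAllInputs();
        System.out.println("Years needed: " + years);
        System.out.println("Decimal digits: " + years.toString().length());
    }
}
```

The result has 23 decimal digits, that is, it is on the order of 10^22 years.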
The challenge is to find a much smaller number of inputs that are
likely to make the system fail, and to know enough that we can trace
failures back to the faults that cause them.
How do we choose such tests? How do we design our test cases?
Used individually, specification testing is slightly more effective
than structural testing.
But, it is best to use both.
Test selection strategies
/** Telephone call billing information */
public class Call {
    private int minutes; // duration of the call

    // the cost of the call in cents
    private int cost()

SPECIFICATION:
0 minutes costs $0;
less than 2 minutes costs $0.20;
up to 60 minutes costs $0.20 plus 30c per minute over 2 minutes;
over 60 minutes is capped at $20.

| input value (minutes) | expected output (cents) |
|---|---|
| -1 | error |
| 0 | 0 |
| 1 | 20 |
| 2 | 20 |
| 3 | 50 |
| 60 | 1760 |
| 61 | 2000 |
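The table can be turned directly into a table-driven test. The sketch below makes some assumptions: the static method cost(int) is a stand-in for Call.cost() written from the specification alone, and the "error" for -1 is modelled as an IllegalArgumentException, which the lecture does not prescribe.

```java
class CallCostSpecTest {
    // Assumed stand-in for Call.cost(), written from the specification only.
    // The duration is a parameter so that the sketch is self-contained.
    static int cost(int minutes) {
        if (minutes < 0) {
            throw new IllegalArgumentException("negative duration"); // assumed error style
        }
        if (minutes == 0) return 0;
        if (minutes <= 2) return 20;
        if (minutes <= 60) return 20 + (minutes - 2) * 30;
        return 2000;
    }

    // Runs every row of the table and returns the number of rows that failed.
    static int runTable() {
        int[][] rows = { {0, 0}, {1, 20}, {2, 20}, {3, 50}, {60, 1760}, {61, 2000} };
        int failures = 0;
        for (int[] row : rows) {
            int actual = cost(row[0]);
            if (actual != row[1]) {
                failures++;
                System.out.println("FAIL: " + row[0] + " min, expected "
                        + row[1] + ", got " + actual);
            }
        }
        // The -1 row expects an error rather than a value.
        try {
            cost(-1);
            failures++;
            System.out.println("FAIL: -1 min did not raise an error");
        } catch (IllegalArgumentException expected) {
            // this is the outcome the table asks for
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(runTable() == 0 ? "all rows pass" : "some rows fail");
    }
}
```

Keeping the inputs and expected outputs in one array makes it cheap to add further boundary values later.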
These cases cover:
Are these statements true?
If we are testing a method to search for an element in an array, we might test using:
Here the test cases are derived from the implementation of the program. The component is a white box, and its implementation is examined.
Test selection strategies:
In branch coverage we must ensure that every branch is executed at least once.
In path coverage we must ensure that every possible path is executed at least once (which is not feasible in general).
Here is the body of the implementation of the telephone billing function.
/** Telephone call billing information */
public class Call {
    private int minutes; // duration of the call

    // the cost of the call in cents
    private int cost() {
        assert (minutes >= 0);
        if (minutes == 0) {
            return 0;
        } else if (minutes > 0 && minutes <= 2) {
            return 20;
        } else if (minutes > 2 && minutes <= 60) {
            return 20 + (minutes - 2) * 30;
        } else {
            return 2000;
        }
    }
}

Consider a program segment with two sequential binary choice points:
Clearly, path coverage is the more thorough test, but:
Code with n sequential binary branches requires up to 2^n tests (which may be neither feasible nor practical for complex code);
Tests can pass on every path (no bugs revealed), and yet the code can still be wrong.
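The difference between branch and path coverage for two sequential binary choice points can be sketched as below. The method compute and its deliberately planted fault are hypothetical, made up for this illustration. Two tests are enough to take every branch, but only the fourth path exposes the fault.

```java
class CoverageSketch {
    // Two sequential binary choice points: 2^2 = 4 paths,
    // but each branch outcome can be exercised with only 2 tests.
    static int compute(boolean a, boolean b) {
        int divisor = 1;
        if (a) {
            divisor = 0; // planted fault: harmful only in combination with b
        }
        int result = 10;
        if (b) {
            result = result / divisor;
        }
        return result;
    }

    public static void main(String[] args) {
        // Branch coverage: both outcomes of both ifs are exercised, no failure seen.
        System.out.println(compute(true, false));
        System.out.println(compute(false, true));

        // Path coverage adds (false, false) and (true, true); the latter fails.
        System.out.println(compute(false, false));
        try {
            compute(true, true);
        } catch (ArithmeticException e) {
            System.out.println("path (true, true) fails: " + e.getMessage());
        }
    }
}
```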
Consider the following method:
public int maxThree(int x, int y, int z) {
    if (x > y) {
        return x;
    } else {
        return y;
    }
}

In white box testing we rely on the code to infer the behaviour of the program, i.e. we construct our tests based on the code structure. For the above snippet, every path we can come up with (below) passes, yet the code is wrong: it never looks at z. It is dangerous to rely only on the code when designing tests.
| Input | Expected | Actual |
|---|---|---|
| x=3, y=2, z=1 | 3 | 3 |
| x=2, y=3, z=1 | 3 | 3 |
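A test case derived from the specification rather than from the code exposes the fault. The sketch below assumes that the intended behaviour of maxThree is "return the largest of the three arguments" (inferred from its name) and that it returns an int. The helper expected and the third input row are additions for illustration.

```java
class MaxThreeCheck {
    // The faulty method from the text, made static with an int return type.
    static int maxThree(int x, int y, int z) {
        if (x > y) {
            return x;
        } else {
            return y;
        }
    }

    // Reference answer taken from the assumed specification.
    static int expected(int x, int y, int z) {
        return Math.max(x, Math.max(y, z));
    }

    public static void main(String[] args) {
        // The first two rows follow the code paths; the third follows the specification.
        int[][] inputs = { {3, 2, 1}, {2, 3, 1}, {1, 2, 3} };
        for (int[] in : inputs) {
            int actual = maxThree(in[0], in[1], in[2]);
            int wanted = expected(in[0], in[1], in[2]);
            System.out.println("x=" + in[0] + " y=" + in[1] + " z=" + in[2]
                    + " expected=" + wanted + " actual=" + actual
                    + (actual == wanted ? "" : "  <-- failure"));
        }
    }
}
```

Only the row where z is the largest value fails, and no amount of code inspection alone would have suggested it, because z never appears in the method body.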
For loops
Full path coverage is almost impossible: it needs every possible value of the loop limit.
Branch coverage requires only two tests: one in which the loop is not entered, and another in which the loop is entered.
Example of structural testing for binary search:
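A minimal sketch of such tests, assuming a textbook iterative binary search (the method name search and the sample data are illustrative, not taken from the lecture). Each call is chosen to exercise a different branch of the loop.

```java
class BinarySearchBranches {
    // Iterative binary search; returns the index of key, or -1 if absent.
    static int search(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (sorted[mid] == key) {
                return mid;
            } else if (sorted[mid] < key) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 50};
        System.out.println(search(new int[0], 30)); // loop not entered
        System.out.println(search(data, 30));       // found at the first midpoint
        System.out.println(search(data, 50));       // "go right" branch taken
        System.out.println(search(data, 10));       // "go left" branch taken
        System.out.println(search(data, 35));       // loop exits without a match
    }
}
```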
Of all testing activities, unit testing is the closest to code writing. It is what the programmer who writes the code should do; in fact, writing the code for unit tests is the same kind of activity. Debugging starts from knowing that the program is broken and looks for the programming error; testing, by contrast, is a systematic attempt to reveal bugs in a program that is thought to be working correctly. Some high-level programming languages provide a mechanism to check whether the behaviour of a particular unit (class, method) matches its specification and contract. This mechanism is the assertion.
If, say, a method foo() in a class Bar should
return a value valueX, or
leave the object in a certain state,
then assertions can be used to check whether the actual return value, value, is the same
as the required valueX (or to make a similar check on the actual and required states of the object).
assert(value == valueX);
If the two values (or, states) are not equivalent (the equivalence needs to be defined accordingly), the assertion violation causes the testing program to report it.
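In Java this might look as follows. The sketch keeps the names Bar, foo(), value and valueX from the text, but the contract of foo() (increment a counter and return the new value) and the accessor getCount() are assumptions made only for the illustration. Note that Java evaluates assert statements only when the program is run with assertions enabled (java -ea).

```java
class Bar {
    private int count = 0;

    // Assumed contract: foo() increments count and returns the new value.
    int foo() {
        count = count + 1;
        return count;
    }

    int getCount() {
        return count;
    }
}

class BarAssertionDemo {
    public static void main(String[] args) {
        Bar bar = new Bar();
        int value = bar.foo();
        int valueX = 1; // required return value for the first call

        // Check the return value, then the state the object was left in.
        assert value == valueX : "foo() returned " + value;
        assert bar.getCount() == 1 : "unexpected state after foo()";

        System.out.println("done (run with java -ea so the assertions are checked)");
    }
}
```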
A particularly useful setup, which makes unit testing very productive, is provided by a framework of classes (originally developed for Java, but now available for many languages) known as the JUnit Testing Framework (more on this in the next lecture).
The JUnit testing framework allows the developer to define a corresponding test class for every class in the software under development. The test classes can be extended incrementally and organised in suitable collections (called suites) for testing a particular aspect of the unit's behaviour. The test suites can include an ever increasing number of test cases, which can be devised from the specification and contracts, by analysing the equivalence classes, boundary values, etc. Often, the test data (expected, or required, values) represent assets (it may be costly to obtain them, e.g. by real experiments). The JUnit approach allows these assets to be used effectively (selection of tests). JUnit also provides special Runner classes for conducting the test runs, either through an easy-to-use GUI or by generating test reports which you can process using additional tools.
Copyright © 2006, 2007, Alexei Khorev, Chris Johnson, The Australian National University