Archive for September, 2009

Bickering about unit testing

Posted in Testing on September 25th, 2009 by kay – 1 Comment

Doubts on the effectiveness of unit testing

Unit testing has entered the programming mainstream with xUnit packages and their derivatives, which are available for all mainstream programming languages. Today it is no longer normal to ship an OSS project without any tests. Programmers can read test cases as behavioral specifications of APIs, and they often learn a lot about a system from this sort of code reading ( at least I do ).

Still, unit testing is disputed as a reasonable practice by many respected programmers, and I wonder if guys like Joel Spolsky or James Coplien aren’t basically right. Isn’t it true that UTs have to be adapted permanently as our code base changes, and doesn’t this imply a significant maintenance overhead, above all in the early phases? Coplien suggests design-by-contract as a more lightweight and DRY alternative to writing UTs: place pre- and post-conditions directly into the code and check the available units, i.e. the interface specifications. Isn’t this far more agile, and won’t better coding practices make UTs go away, just like many of the once celebrated design patterns go away when using powerful language-level concepts like multimethods and higher-order functions?

Black box testing

When you work as a tester in the industry you essentially specify and implement test suites according to specifications. Your product is not the system under test ( SUT ). You are not interested in the inner workings of a system and its components. The SUT is a black box and the SUT code might change arbitrarily. If any code is exposed, it is SUT API code that is accessible to client applications such as your test app. The API might even be absent entirely, though; instead you’ll test the incoming and outgoing commands sent back and forth between your test app and the SUT according to a specified command protocol. All of those tests are functional or system-level tests and the tested units remain hidden. As a tester you don’t care about the way the system is built but only about how it behaves.

Can we use our standard UT frameworks to implement black box tests? Well, isn’t this actually their most frequent use?

Are there any UTs around?

What if the most common unit tests we find in the wild are functional or system black-box tests applied to API-level functions and classes, implemented in one of the available unit testing frameworks? Some of the system components are abstracted away and replaced by mock objects representing networks or C/S databases, but this just avoids system integration tests. A close reading of what unit testing means might indeed lead to Jim Coplien’s conclusion that unit tests are better implemented as pre- and post-conditions, but you won’t test a system on such a fine-grained level. Using UT frameworks for functional tests has shortcomings, but that doesn’t mean they are not used for them. When the interface is kept small, the likelihood that it gets badly broken as you evolve your system is manageable. This is the prime reason why programmers do not suffer from writing UTs and why maintenance costs are kept under control. Every software tester in the industry knows that writing tests takes much effort and is very costly, but changes in public APIs aren’t a major reason.

UTs and beyond

The missing link between current UT systems and a test system for all kinds of SUTs is a dataflow connection which triggers tests in a particular order. By this I mean that each test can produce data as a side effect, and that data can be required by the setup of another test case. In JUnit 4 we have the @Before and @After annotations for running setups and tear-downs unconditionally. By adding two more annotations, @require and @provide, it becomes possible to make running a test conditional on the data it needs. A test runner has to match the @required data against the @provided ones and determine a schedule.

In the case of Java this can be checked at compile time using an annotation processor. In .NET one might apply those checks once the assemblies are loaded during initialization of the test runner. The only disadvantage of load-time checks is that all available test modules have to be loaded initially and not on demand.

Choosers and ChooserMixins in C++ and Python

Posted in Chooser, CPP, Python, Testing on September 12th, 2009 by kay – 2 Comments

Chooser Objects

From time to time I’m amazed to find a simple algorithm which seems to have been a low-hanging fruit that was just overlooked. In this particular case it is about generating and utilizing test data in a manner that is both simple and flexible. Mark Seaborn described the method in his outstanding blog article How to do model checking of Python code. He distilled what we might call the Chooser Algorithm from a scientific paper which buries the message under all sorts of methodological considerations, special case treatments and other bloat. This is sad because good algorithms are the crown jewels of programming. It also helped that he provided an implementation in Python and not in C or some sloppy computing-scientist-only pseudo code notation which changes from author to author.

We can motivate Chooser objects as follows.

Suppose you have a control flow statement defined in a function f. The path the flow control takes is determined by the value of some variable x:

def f(*args):
    ...
    x = g(*args)
    if x>0:
        ...
    else:
        ...

When we want to test the if-statement alone we can ignore the value of x computed by g. A simple method to achieve this is to introduce a for-loop in the code which iterates over a range of values which represent jumps to the individual if-statement branches:

def f(*args):
    ...
    x = g(*args)
    for x in (1,-1):
        if x>0:
            ...
        else:
            ...

However, this is quite a heavy change and we would likely not want to repeat it at another place. Instead of adding a for-loop we can introduce a non-deterministic choice over the values 1 and -1 and pull the iteration, represented by the loop, out of the function:

def test(chooser):
    def f(*args):
        ...
        x = g(*args)
        x = chooser.choose([1,-1])
        if x>0:
            ...
        else:
            ...
    f(1,2,3)  # call f with appropriate arguments

Here we inserted a call to choose which represents a set of choices. No new control flow is introduced. The function f must be called as many times as there are choices passed to choose.

The repeated call of f is managed by a new function check which is part of the Chooser Algorithm. It actually calls the test function which has a uniform interface and keeps a single chooser parameter.

class ModelCheckEscape(Exception): pass
 
def check(func):
    stack = [[]]
    while stack:
        chosen = stack.pop()
        try:
            func(Chooser(chosen, stack))
        except ModelCheckEscape:
            pass

The check function creates a Chooser object and passes it to func, which represents the system under test. The Chooser constructor takes two arguments. One is a list called chosen, popped from a stack of such lists; the other is the stack itself, which might be filled with new lists.

class ModelCheckEscape(Exception): pass
 
class Chooser(object):
    def __init__(self, chosen, stack):
        self._chosen = chosen
        self._stack  = stack
        self._it     = iter(chosen)
 
    def choose(self, choices):
        try:
            choice = self._it.next()
            if choice not in choices:
                raise Exception("Program is not deterministic")
            return choice
        except StopIteration:
            self._stack+=[self._chosen + [choice] for choice in choices]
            raise ModelCheckEscape()

This is the definition of the Chooser object. It is a tiny bit of elementary but ingenious code. In order to understand what it does, consider the following test function with its three calls of choose:

def test(chooser):
    x = chooser.choose([True, False])
    if x:
        y = chooser.choose(["a", "b"])
    else:
        z = chooser.choose(["y", "z"])

On each choose call a value is returned from the _it iterator. Those values must conform to the choices passed to choose for every call of choose. Otherwise an exception is raised ( "Program is not deterministic" ). So we expect _it to be an iterator wrapped around lists like [True, "a"], [True, "b"], [False, "y"], [False, "z"]. Those lists are associated with the choices being made at (x, y) or (x, z).

In fact we observe some more of those lists, starting with the empty list [] and the incompletely filled lists [True] and [False]. When _it is wrapped around an incomplete list, one of the choose calls will raise a StopIteration exception when it asks _it for the next value. Assume for example that _it = iter([True]). Then _it is already exhausted after the choose call for x and will raise StopIteration at the definition of y. At this point each of the choices at y, i.e. "a" and "b", will produce a new list. Those lists are [True, "a"] and [True, "b"], which are now complete. As long as incomplete lists are popped from the stack in check(), new lists are pushed on the stack.

As a special case we consider a simple linear sequence of choose calls

def test(chooser):
    x = chooser.choose([True, False])
    y = chooser.choose(["a", "b"])
    z = chooser.choose(["y", "z"])

The set of complete lists according to this sequence will be the Cartesian product of the choices: {True, False} x {“a”, “b”} x {“y”, “z”}. If you just want Cartesian products there are more efficient alternatives to create them though.
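
One such alternative in Python is itertools.product from the standard library, which yields the same eight combinations directly, without re-running a test function for every incomplete prefix:

```python
from itertools import product

# Cartesian product {True, False} x {"a", "b"} x {"y", "z"}
combos = list(product([True, False], ["a", "b"], ["y", "z"]))

print(len(combos))  # 8
print(combos[0])    # (True, 'a', 'y')
print(combos[-1])   # (False, 'b', 'z')
```

The Chooser approach earns its keep when later choices depend on earlier ones, as in the if-statement example, where the tree of choices is not a plain product.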

These are the Chooser basics. For Python you can download the used code here.

Choosers in C++

I gave a C++ and STL based implementation of Chooser objects. The Chooser C++ API closely follows the Python example. You can download the code from the linked document.

In its most general form the choose method has the following signature:

    template <typename Container>
    typename Container::value_type choose(Container& choices)

The return type is derived from the container’s value_type member. Other than this the algorithm only relies on iterators, which means that any STL container can be used. We can rewrite the simple test function above in C++:

void test(Chooser& chooser) {
    int x = chooser.choose(2);
    if (x) {
        string s = "ab";
        char y = chooser.choose(s);
    }
    else {
        string s = "yz";
        char z = chooser.choose(s);
    }
}

This is not all that much overhead. In case of the x definition we use an overloaded version of choose which takes a single integer parameter k. This is equivalent to a choice of values within the range {0, 1, …, k-1}. The most relevant case may be choose(2) which is the boolean choice.

The string type is an STL container type as well. More precisely it is a typedef for basic_string<char>. We can create a string object with a string literal but we cannot pass a string literal directly to choose which expects an explicit reference to a container from which the return type is derived ( char in this case ).

ChooserMixin classes

Suppose we want to introduce Chooser objects into arbitrary methods of an existing class. The Chooser Algorithm is implemented such that a Chooser object is explicitly passed as a parameter, but this would require changes to a method’s interface, something we try to avoid.

Visibility of Chooser instances in the local scope of a method can also be achieved by making them global or member variables. An inexpensive method which is safer than using globals is to use a mixin class. The mixin class defines a Chooser instance, and if some class wants to use it, it derives from the mixin.

class ChooserMixin {
protected:
    Chooser chooser;
public:
    // Subclasses implement test() to invoke the method under check.
    virtual void test() = 0;

    void check()
    {
        ...
        this->chooser = Chooser(chosen, queue);
        test();
        ...
    }
};

The test method is abstract. If f is the method we want to check, then the implementation of test would just invoke f with appropriate parameters:

void test() {
    f(arg1, arg2, ...);
}

It’s easy to change test without touching any other source code.

More advantages of ChooserMixins

When we use ChooserMixin we can define the choices C being used in chooser.choose(C) also as member variables. This makes choices configurable. A subclass of a ChooserMixin might read data from an external file or a database and populate the C container.
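
Since the Python part of this article only showed the parameter-passing style, here is a hedged Python sketch of the mixin idea together with configurable choices. The classes `SignReporter` and `ZeroAwareReporter` and the `choices` attribute are hypothetical examples of mine; the Chooser is the one from above, written for Python 3 with the built-in next().

```python
class ModelCheckEscape(Exception):
    pass


class Chooser(object):
    def __init__(self, chosen, stack):
        self._chosen = chosen
        self._stack = stack
        self._it = iter(chosen)

    def choose(self, choices):
        try:
            choice = next(self._it)
        except StopIteration:
            self._stack += [self._chosen + [c] for c in choices]
            raise ModelCheckEscape()
        if choice not in choices:
            raise Exception("Program is not deterministic")
        return choice


class ChooserMixin(object):
    chooser = None  # replaced by check() before every run of test()

    def test(self):
        raise NotImplementedError("call the method under check here")

    def check(self):
        stack = [[]]
        while stack:
            chosen = stack.pop()
            self.chooser = Chooser(chosen, stack)
            try:
                self.test()
            except ModelCheckEscape:
                pass


class SignReporter(ChooserMixin):
    choices = [1, -1]  # a member, so subclasses can reconfigure it

    def __init__(self):
        self.results = []

    def f(self, scale):
        # The signature of f is unchanged; the chooser arrives via self.
        x = self.chooser.choose(self.choices)
        self.results.append("positive" if x * scale > 0 else "negative")

    def test(self):
        self.f(10)


class ZeroAwareReporter(SignReporter):
    choices = [1, 0, -1]  # could equally be read from a file or database


reporter = SignReporter()
reporter.check()
print(reporter.results)  # ['negative', 'positive']

zero_aware = ZeroAwareReporter()
zero_aware.check()
print(zero_aware.results)  # ['negative', 'negative', 'positive']
```

Neither f nor test had to change to add the third choice; overriding the choices member in a subclass was enough.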

I wonder if it’s even possible to get rid of T x = chooser.choose(C) assignments in method source when using data binding techniques? In JavaFX we can restate the assignment in the form

var x = bind chooser.choose(C)

The bound variable x is updated whenever C is changed. Instead of creating a new instance of Chooser on each iteration, we replace the members defined in a single instance and trigger updates of C which in turn causes chooser.choose(C) to produce a new value. It remains to be examined if this idea is somehow practical.

Python – Hibernate – Jynx

Posted in Hibernate, Jynx, Jython on September 4th, 2009 by kay – 2 Comments

Jynx 0.4 goes Hibernate

In Jynx 0.4 JPA/Hibernate annotations are supported. Although this is still work in progress some of the more complex nested annotations were tested as well as Hibernate extension annotations which cannot be single-name imported along with the corresponding JPA annotations without conflicts.

Jynx 0.4 provides other new features as well. One can now use @signature decorators to express Java method overloading. A simple Java parser is integrated. A Java parser was necessary to improve the Java class detection heuristics used to determine required imports when a Java proxy is created from a Jython class and compiled dynamically. Finally there is a new @bean_property decorator which creates a private attribute foo along with public getters and setters given a bean_property decorated method def foo(_):_. Full documentation of Jynx as well as its changes can be found here.

Using Hibernate from Jython

Starting and closing sessions and managing simple transactions is not difficult in Hibernate. Jynx defines two context managers for with-statements which hide the open+close and begin+commit/rollback boilerplate from the programmer. Code for Hibernate sessions and transactions then lives in with-statement blocks.

class hn_session(object):
    '''
    Context manager which opens/closes hibernate sessions.
    '''
    def __init__(self, *classes):
        sessionFactory = buildSessionFactory(*classes)
        self.session   = sessionFactory.openSession()
 
    def __enter__(self):
        return self.session
 
    def __exit__(self, *exc_info):
        self.session.close()
 
class hn_transact(object):
    '''
    Context manager which begins a hibernate transaction and commits it or rolls it back.
    '''
    def __init__(self, session):
        self.tx = session.beginTransaction()
 
    def __enter__(self):
        return self.tx
 
    def __exit__(self, type, value, traceback):
        if type is None:
            self.tx.commit()
        else:
            self.tx.rollback()
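
The commit/rollback decision can be tried out without Hibernate or Jython. In the following sketch `FakeSession` and `FakeTransaction` are hypothetical stand-ins of mine which only record the calls they receive; hn_transact is repeated from above so that the example runs on its own in plain Python.

```python
class FakeTransaction(object):
    """Stand-in for a Hibernate transaction; records the calls it gets."""
    def __init__(self):
        self.log = []

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


class FakeSession(object):
    """Stand-in for a Hibernate session; hands out fake transactions."""
    def __init__(self):
        self.transactions = []

    def beginTransaction(self):
        tx = FakeTransaction()
        self.transactions.append(tx)
        return tx


class hn_transact(object):
    def __init__(self, session):
        self.tx = session.beginTransaction()

    def __enter__(self):
        return self.tx

    def __exit__(self, type, value, traceback):
        # type is None only if the with-block finished without an exception.
        if type is None:
            self.tx.commit()
        else:
            self.tx.rollback()
        # Returning None lets a pending exception propagate to the caller.


session = FakeSession()

with hn_transact(session):
    pass  # block succeeds, so the transaction is committed

try:
    with hn_transact(session):
        raise ValueError("boom")  # block fails, so it is rolled back
except ValueError:
    pass

print([tx.log for tx in session.transactions])
# [['commit'], ['rollback']]
```

Note that the exception is not swallowed: after the rollback it still reaches the code surrounding the with-statement.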

A simple session using a single Entity Bean may then look like:

from __future__ import with_statement
 
from jynx.lib.hibernate import *
 
@Entity
class Course(Serializable):
    @Id
    @Column(name="COURSE_ID")
    @signature("public int _()")
    def getCourseId(self):
        return self.courseId
 
    @Column(name="COURSE_NAME", nullable = False, length=50)
    @signature("public String _()")
    def getCourseName(self):
        return self.courseName
 
    @signature("public void _(String)")
    def setCourseName(self, value):
        self.courseName = value
 
    @signature("public void _(int)")
    def setCourseId(self, value):
        self.courseId = value
 
with hn_session(Course) as session:
    course  = Course()
    course.setCourseId(121)
    course.setCourseName(str(range(5)))
    with hn_transact(session):
        session.saveOrUpdate(course)

Boilerplate Reduction

The standard class decorator for creating a Java class from a Jython class in Jynx is @JavaClass. Jynx 0.4 introduces some slightly extended decorators, in particular @Entity and @Embeddable. Not only do they make Jython code more concise, because one doesn’t have to stack @Entity and @JavaClass, but translating with @Entity also turns some automatically generated Java attributes into transient ones, i.e. a @Transient annotation is applied which prevents those attributes from being mapped to table columns.

The massive boilerplate needed for defining a valid Entity Bean in the preceding example can be reduced using the @bean_property decorator:

@Entity
class Course(Serializable):
    @Id
    @Column(name="COURSE_ID")
    @bean_property(int)
    def courseId(self): pass
 
    @Column(name="COURSE_NAME", nullable = False, length=50)
    @bean_property(String)
    def courseName(self): pass

Applied to def courseId(self): pass the @bean_property decorator will cause the following Java code translation

    @Id @Column(name="COURSE_ID") private int courseId;
    public int getCourseId() { return courseId; }
    public void setCourseId(int value) { courseId = value; }

which specifies a complete Java Bean property.
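
The naming scheme behind this translation is easy to mimic. The function below is not how Jynx generates code; `render_bean_property` is a hypothetical helper which merely shows, in plain Python, how a field, a getter and a setter can be derived from a property name, a Java type name and a list of annotation strings.

```python
def render_bean_property(name, java_type, annotations=()):
    """Render Java source text for one bean property (illustration only)."""
    cap = name[0].upper() + name[1:]  # courseId -> CourseId
    field = " ".join(list(annotations)
                     + ["private %s %s;" % (java_type, name)])
    getter = "public %s get%s() { return %s; }" % (java_type, cap, name)
    setter = ("public void set%s(%s value) { %s = value; }"
              % (cap, java_type, name))
    return "\n".join([field, getter, setter])


source = render_bean_property(
    "courseId", "int", ['@Id', '@Column(name="COURSE_ID")'])
print(source)
# @Id @Column(name="COURSE_ID") private int courseId;
# public int getCourseId() { return courseId; }
# public void setCourseId(int value) { courseId = value; }
```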

Example

In the following example two Entities are associated using a one-to-one mapping between primary keys.

@Entity
class Heart(Serializable):
    @Id
    @bean_property(int)
    def id(self):pass
 
@Entity
class Body(Serializable):
    @Id
    @bean_property(int)
    def id(self):pass
 
    @OneToOne(cascade = CascadeType.ALL)
    @PrimaryKeyJoinColumn
    @bean_property(Heart)
    def heart(self):pass

Now we can check the behavior:

# session 1
with hn_session(Heart, Body) as session:
    body = Body()
    heart = Heart()
    body.heart = heart
    body.id = 1
    heart.id = body.id
    with hn_transact(session):
        session.saveOrUpdate(body)
        session.saveOrUpdate(heart)
 
# session 2
with hn_session(Heart, Body) as session:
    with hn_transact(session):
        b = session.get(Body, 1)
        assert b
        assert b.heart
        assert b.heart.id == 1

Summary

With Hibernate support in Jython we notice another clear departure from the CPython world and its web frameworks and components. Hibernate is distinctively Java, and special techniques are needed to create compile-time Java properties in a dynamic language. Jython has long been a second-class citizen in Python land. I suspect this is going to change with support of Java frameworks which alone have as many users/downloads as Python.