Choosers and ChooserMixins in C++ and Python

Chooser Objects

From time to time I’m amazed to find a simple algorithm which seemed to be a low hanging fruit which was just overlooked. In this particular case it is about generating and utilizing test data in a both simple and flexible manner. Mark Seaborn described the method in his outstanding blog article How to do model checking of Python code. He distilled what we might call the Chooser Algorithm from a scientific paper which buries the message under all sorts of methodological considerations and special case treatments and other bloat. This is sad because good algorithms are the crown jewels of programming. It also helped that he provided an implementation in Python and not in C or some sloppy computing-scientist-only pseudo code notation which changes from author to author.

We can motivate `Chooser` objects as follows.

Suppose you have a control flow statement defined in a function `f`. The path the flow control takes is determined by the value of some variable `x`:

def f(*args):
    ...
    x = g(*args)
    if x>0:
        ...
    else:
        ...

When we want to test the if-statement alone we can ignore the value of `x` computed by `g`. A simple method to achieve this is to introduce a for-loop in the code which iterates over a range of values which represent jumps to the individual if-statement branches:

def f(*args):
    ...
    x = g(*args)
    for x in (1,-1):
        if x>0:
            ...
        else:
            ...

However, this is a quite heavy change and we would likely not want to repeat this at another place. Instead of adding a for-loop we can introduce a non-deterministic choice over the values 1 and -1 and pull the iteration, represented by the loop, out of the function:

def test(chooser):
    def f(*args):
        ...
        x = g(*args)
        x = chooser.choose([1,-1])
        if x>0:
            ...
        else:
            ...
    f(1,2,3)  # call f with appropriate arguments

Here we inserted a call to `choose` which represents a set of choices. No new control flow is introduced. The function `f` must be called as many times as there are choices passed to `choose`.

The repeated call of `f` is managed by a new function `check` which is part of the `Chooser Algorithm`. It actually calls the `test` function which has a uniform interface and keeps a single `chooser` parameter.

class ModelCheckEscape(Exception): pass
 
def check(func):
    stack = [[]]
    while stack:
        chosen = stack.pop()
        try:
            func(Chooser(chosen, stack))
        except ModelCheckEscape:
            pass

The `check` function creates a `Chooser` object and passes it to `func` which is represents the system under test. The `Chooser` constructor takes two arguments. One is a list called chosen popped from a stack of such lists, the other one is the stack itself which might be filled with new lists.

class ModelCheckEscape(Exception): pass
 
class Chooser(object):
    def __init__(self, chosen, stack):
        self._chosen = chosen
        self._stack  = stack
        self._it     = iter(chosen)
 
    def choose(self, choices):
        try:
            choice = self._it.next()
            if choice not in choices:
                raise Exception("Program is not deterministic")
            return choice
        except StopIteration:
            self._stack+=[self._chosen + [choice] for choice in choices]
            raise ModelCheckEscape()

This is the definition of the `Chooser` object. It is a tiny bit of elementary but ingenuous code. In order to understand what it does consider the following test function with its three calls of `choose`:

def test(chooser):
    x = chooser.choose([True, False])
    if x:
        y = chooser.choose(["a", "b"])
    else:
        z = chooser.choose(["y", "z"])

On each `choose` call a value is returned from the `_it` iterator. Those values must conform to the choices passed to `choose` for every call of `choose`. Otherwise a `ChooserException` is raised. So we expect `_it` to be an iterator wrapped around lists like `[True, “a”], [True, “b”], [False, “y”], [False, “z”]`. Those lists are associated with the choices being made at (`x`, `y`) or (`x`, `z`).

In fact we observe some more of those lists, starting with the empty list `[]` and the incompletely filled lists `[True]` and `[False]`. When `_it` is wrapped around an incomplete list one of the `choose` calls will raise a `StopIteration` exception at `_it.next()`. Assume for example that `_it = iter([True])` then `_it` is already exhausted after `x` and `choose` and will raise `StopIteration` at the definition of `y`. At this point each of the choices at `y` i.e. “a” and “b” will produce a new list. Those lists are `[True, “a”]` and `[True, “b”]` which are now complete. New lists are pushed on the stack as long as incomplete lists are popped from the stack in`check()`.

As a special case we consider a simple linear sequence of `choose` calls

def test(chooser):
    x = chooser.choose([True, False])
    y = chooser.choose(["a", "b"])
    z = chooser.choose(["y", "z"])

The set of complete lists according to this sequence will be the Cartesian product of the choices: `{True, False} x {“a”, “b”} x {“y”, “z”}`. If you just want Cartesian products there are more efficient alternatives to create them though.

These are the `Chooser` basics. For Python you can download the used code here.

Choosers in C++

I gave a C++ and STL based implementation of `Chooser` objects. The Chooser C++ API closely follows the Python example. You can download the code from the linked document.

In its most general form the `choose` method has following signature:

    template <typename Container>
    typename Container::value_type choose(Container& choices)

The return type is derived from the containers `value_type` attribute. Other than this the algorithm only relies on iterators which means that any STL container can be used. We can rewrite the simple `test` function above in C++:

void test(Chooser& chooser) {
    int x = chooser.choose(2);
    if x {
        string s = "ab";
        char y = chooser.choose(s);
    }
    else {
        string s = "yz";
        char z = chooser.choose(s);
    }
}

This is not all that much overhead. In case of the `x` definition we use an overloaded version of `choose` which takes a single integer parameter `k`. This is equivalent to a choice of values within the range `{0, 1, …, k-1}`. The most relevant case may be `choose(2)` which is the boolean choice.

The `string` type is an STL container type as well. More precisely it is a `typedef` for `basic_string`<`char`>. We can create a `string` object with a string literal but we cannot pass a string literal directly to `choose` which expects an explicit reference to a container from which the return type is derived ( `char` in this case ).

ChooserMixin classes

Suppose we want to introduce `Chooser` objects into arbitrary methods of an existing class. The Chooser Algorithm is implemented s.t. a `Chooser` object is explicitly passed as a parameter but this would require changes in a methods interface, something we try to avoid.

Visibility of `Chooser` instances in the local scope of a method can also be achieved by making them global or member variables. An inexpensive method which is safer than using globals is to use a mixin class. The mixin class defines a`Chooser` instance and if some class wants to use it, it derives from the mixin.

class ChooserMixin {
protected:
    Chooser chooser;
public:
    void test() = 0;
 
    void check()
    {
        ...
        this-&gt;chooser = Chooser(chosen, queue);
        test();
        ...
    }
}

The `test` method is abstract. If `f` is the method we want to check, then the implementation of `test` would just invoke `f` with appropriate parameters:

void test() {
    f(arg1, arg2, ...);
}

It’s easy to change `test` without touching any other source code.

More advantages of ChooserMixins

When we use `ChooserMixin` we can define the choices `C` being used in `chooser.choose(C)` also as member variables. This makes choices configurable. A subclass of a `ChooserMixin` might read data from an external file or a database and populate the `C` container.

I wonder if it’s even possible to get rid of `T x = chooser.choose(C)`assignments in method source when using data binding techniques? In JavaFX we can restate the assigment in the form

`var x = bind chooser.choose(C)`

The bound variable `x` is updated whenever `C` is changed. Instead of creating a new instance of `Chooser` on each iteration, we replace the members defined in a single instance and trigger updates of `C` which in turn causes `chooser.choose(C)` to produce a new value. It remains to be examined if this idea is somehow practical.

This entry was posted in Chooser, CPP, Python, Testing. Bookmark the permalink.

2 Responses to Choosers and ChooserMixins in C++ and Python

  1. Kris Kowal says:

    This is a great observation. I also find it useful to conflate “choosers”, “validators”, and “normalizers”. For example, a chooser that raises a validation error, like your chooser exception, with a helpful message when the value is outside the domain. A normalizer might convert a variety of strings to possibly one of the valid choices. And, as you point out, classes of choosers, validators, and normalizers make good mixins, to construct a normalizer, validator, chooser pipelines. I also prefer to use the function call as the interface for these one-function objects, overriding call so cheap function adapters are possible to mix with objects with more complex polymorphic needs.

  2. kay says:

    Yes, Kris. When we keep attention on data binding the requirement for validators and normalizers naturally comes up. For example dependency properties in WPF consider them already and provide an API. To connect the pipeline one might just have to study the relationship between choosers and data binding as I briefly mentioned in the final, speculative remarks.

Leave a Reply

Your email address will not be published. Required fields are marked *

*