T3Table

Introduction

A T3Table is a composite data type which is a crossover between a C-struct, a formal grammar and a spreadsheet. As we progress we can add even more dimensions to its description. Since a T3Table looks like a data type which swallowed a whole framework, a complex assemblage, it might be surprising that it comes out quite lean. This is because a T3Table is special purpose and only some of the aspects of the mentioned data structures are reflected in its making.

T3Tables are designed to handle structured binary data like TLVs, ATRs, TCP Headers and so on. A correctly defined T3Table can represent any of those structures and act as a context sensitive parser on flat binary data. What makes T3Tables special is that the parse trees are also T3Tables. Parsing becomes an act of self-reproduction. A T3Table clones itself with special data.

Unparsing a T3Table yields another Hex number. This makes T3Tables ideally suited to control variation on data and the production of test data.

About this document

  • In the 1st section we give a detailed description of the T3Table, its methods and operators.
  • In the 2nd section we add the T3Row and T3Binding classes to the picture.
  • In the 3rd section two subclasses of T3Table are introduced, which are T3Bitmap and T3Set.
  • In the final section we take a closer look at the design of T3Tables.

The T3Table class

Methods

T3Table()

A T3Table is created without arguments.

T3Table.add(pattern = 0, **field)

The add method is used to add one row to the T3Table. The data passed to add must suffice to create a T3Row object. Variants are:

  1. t.add(R) with type(R) = T3Row.

    This is possibly the most obvious construction. In practice it is more convenient to create the T3Row less explicitly, as in the following variants.

  2. t.add(P, s = V).

    This passes a pattern P and a key-value pair s = V. The name s becomes the name of the row and we can access the row s using the notation t.s.

    Both P and V can be of a variety of types:

    1. Types of row-value V

      1. T3Binding

        T3Bindings will be studied in greater detail below. A T3Binding is not a row-value but a ValueBinding. A ValueBinding is a callable that produces a row-value once a function or operator tries to access the row-value.

      2. T3Table

        T3Tables can be row-values. This allows for nested T3Tables.

      3. NoneType

        None is a special value and is considered below.

      4. Other types

        The T3Table._coerce() method is applied to the input data.

    2. Types of pattern P

      1. T3PatternObject

        Pattern objects of this kind are defined in the module t3.pattern.

      2. T3Table

        A T3Table implements the t3.pattern.T3Matcher interface and can therefore be a pattern! This can be interesting for subclasses of T3Table such as T3Bitmap.

      3. int

        An integer k is turned into T3PatternWildcard(k). If data implements __getitem__ this matches data[:k].

      4. str, unicode

        Objects of type str or unicode are parsed into values of type T3PatternObject.

      5. T3Number

        With k = int(P) the argument is used like an int type.

      6. Callable

        A callable must have the signature (T3Table, T3Number) -> T where T is one of the types 1 to 5 listed above. Pattern matching is examined more closely under T3Table.match().

  3. t.add(x = V).

    In this form the pattern is omitted and the default value 0 for it is used. The 0 value will be wrapped into the pattern T3PatternWildcard(0) which is the pattern that matches 0 digits of input data when t.match(data) is applied.

    The effect of the 0-pattern is the following: when t.match(data) is called, which creates a clone of t, say u, then u.x = None.

    Note

    There must be at least one T3Row in a T3Table that matches. Otherwise t.match(data) raises a MatchingFailure exception.

T3Table.match(data)

The match function creates a new T3Table using input data and the pattern defined for the T3Rows of the table.

Example

>>> t = T3Table().add(1, s = 0).add(1, t = 0)
>>> m = t.match("89 56")
>>> t2 = m.value
>>> t2
t2:
    s: 89
    t: 56

The inner working of t.match(data) can be illustrated by the following simplified algorithm

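from copy import copy

def match(t, data):
    # start with an empty match whose unmatched rest is the whole input
    m = T3Match(T3Number.NULL, data)
    table = copy(t)                 # the rows of t are copied as well
    for row in table._rows:
        m = row.match(m.rest)       # each row consumes a piece of the rest
        if not m:
            raise MatchingFailure(m)
    m.value = table                 # attach the filled-in clone to the match
    return m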

Each row matches a piece of the data object, produces a T3Match return value and continues with the unmatched rest. If a T3Row fails to match, the computation is cancelled and a MatchingFailure exception is raised. Otherwise a copy of the input T3Table storing the values of the match is attached to the T3Match object, which is then returned.
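
The following self-contained sketch imitates this algorithm with plain Python lists standing in for T3Number data. The names ToyRow and toy_match are hypothetical and not part of the t3 package. Only integer patterns are modelled, on the assumption that a pattern k consumes data[:k] and that the pattern 0 leaves the row value at None.

```python
from copy import deepcopy


class MatchingFailure(Exception):
    """Raised when a row cannot consume enough data (toy stand-in)."""


class ToyRow(object):
    def __init__(self, name, width, value=None):
        self.name = name      # row name
        self.width = width    # integer pattern: number of bytes to consume
        self.value = value    # default value, replaced on a successful match

    def match(self, data):
        """Consume self.width bytes and return the unmatched rest."""
        if len(data) < self.width:
            raise MatchingFailure("row %r needs %d byte(s), %d left"
                                  % (self.name, self.width, len(data)))
        # A zero-width pattern matches nothing and yields None.
        self.value = data[:self.width] if self.width else None
        return data[self.width:]


def toy_match(rows, data):
    """Return (clone of rows filled with matched values, unmatched rest)."""
    clone = deepcopy(rows)    # the template rows stay untouched
    rest = list(data)
    for row in clone:
        rest = row.match(rest)
    return clone, rest


template = [ToyRow("s", 1, [0x00]), ToyRow("x", 0, [0x00]), ToyRow("t", 1, [0x00])]
parsed, rest = toy_match(template, [0x89, 0x56, 0xFF])
print([(row.name, row.value) for row in parsed])
```

The template keeps its default values, while the clone carries the matched bytes and the caller receives whatever was left over.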

T3Table._coerce(rowvalue)

A T3Table class embodies a fixed default row value type. Types such as integers or strings might then be converted into that type on row value assignment or on other occasions. Known default row value types are

  • Hex – T3Table
  • Bin – T3Bitmap

Override _coerce in subclasses when needed.
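
As an illustration of what such a coercion might do, here is a minimal sketch that turns integers and hex strings into lists of byte values. The functions coerce_hex and show are hypothetical helpers, not the actual T3Table._coerce implementation, and passing None through unchanged is an assumption based on its role as a special value.

```python
def coerce_hex(value):
    """Turn an int, a hex string or an iterable of bytes into a list of byte values."""
    if value is None:
        return None                       # assumed: None is kept as a special value
    if isinstance(value, int):
        if value < 0:
            raise ValueError("negative numbers are not handled in this sketch")
        digits = "%X" % value
        if len(digits) % 2:
            digits = "0" + digits         # pad to whole bytes: 1 -> "01"
        return [int(digits[i:i + 2], 16) for i in range(0, len(digits), 2)]
    if isinstance(value, str):
        digits = "".join(value.split())   # drop blanks: "01 02" -> "0102"
        if len(digits) % 2:
            raise ValueError("odd number of hex digits: %r" % value)
        return [int(digits[i:i + 2], 16) for i in range(0, len(digits), 2)]
    return list(value)                    # assume an iterable of byte values


def show(byte_list):
    """Format byte values the way the examples print them, e.g. '01 02'."""
    return " ".join("%02X" % b for b in byte_list)


print(show(coerce_hex(1)))
print(show(coerce_hex("0f 02")))
print(show(coerce_hex(0x1234)))
```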

T3Table.find(rowname)

This function finds a row with a given name. It is a breadth-first search, i.e. it looks for the row name among the rows of the current table first and descends into sub-T3Tables only if that fails. The function returns the row-value of the row found, or None if there is no such row.

Example

>>> t1 = T3Table().add(s = 1).add(r = 1)
>>> t2 = T3Table().add(t = t1).add(r = 2)
>>> t2
t2:
    t:
        s: 01
        r: 01
    r: 02
>>> t2.find("r")
02
>>> t2.find("s")
01
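
The search order can be sketched with nested lists of (name, value) pairs standing in for T3Tables. This is a hypothetical model, not the library code, and it assumes a strict level-by-level search in which a nested table is only visited after all rows of the shallower levels have been checked.

```python
from collections import deque


def find(table, rowname):
    """Return the value of the first row called rowname, searching level by level."""
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for name, value in current:
            if name == rowname:
                return value
        # Not found on this level: queue the nested tables for a later pass.
        queue.extend(value for _, value in current if isinstance(value, list))
    return None


t1 = [("s", "01"), ("r", "01")]
t2 = [("t", t1), ("r", "02")]
print(find(t2, "r"))
print(find(t2, "s"))
print(find(t2, "q"))
```

The top-level r shadows the nested r, which is the behaviour shown in the example above.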

T3Table.get_value()

t.get_value() concatenates the values of the T3Rows of t, except for those which are None, and returns that concatenation. Often get_value() is used implicitly and one writes Hex(t) instead.

Example

>>> t = T3Table().add(s = 1).add(r = 2)
>>> t
t:
    s: 01
    r: 02

>>> Hex(t)
01 02
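
A minimal sketch of this concatenation, using lists of (name, value) pairs as stand-ins for tables and byte strings as row values. The helper names are hypothetical, and flattening nested tables recursively is an assumption about how sub-T3Tables contribute their value.

```python
def get_value(table):
    """Concatenate the row values of table, skipping rows whose value is None."""
    out = bytearray()
    for name, value in table:
        if value is None:
            continue                      # a None row contributes nothing
        if isinstance(value, list):
            out += get_value(value)       # assumed: nested tables are flattened
        else:
            out += value
    return bytes(out)


def hexdump(raw):
    """Format raw bytes as space-separated hex digits."""
    return " ".join("%02X" % b for b in bytearray(raw))


inner = [("s", b"\x01"), ("r", b"\x01")]
table = [("t", inner), ("v", None), ("r", b"\x02")]
print(hexdump(get_value(table)))
```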

Operators

The operators used on T3Tables are

Operation        Result                                                       Notes
t[s]             T3Row s of t                                                 (1)
t.s              value of T3Row s of t                                        (2)
t.s = v          substitute value of T3Row s of t with new value v            (3)
s in t           True if s is a valid T3Row name in t, False otherwise
t1 // t2         new T3Table which is the concatenation of t1 and t2
t << data        parses data using the definition of t                        (4)
len(t)           number of rows of t
copy(t)          a copy of t. The rows of t are also copied
iter(t)          an iterator over the rows of t
t(x = a, ...)    a copy of t with row value substitutions copy(t).x = a       (5)

Notes :

    (1) For convenience a T3Row implements a subset of the list type protocol, in particular the methods

      • __len__
      • __getitem__
      • __iter__

      This means that a single T3Row can be treated as a 1-element list of T3Rows. So t[s][0] or for row in t[s]: ... can be applied even if t[s] is a single T3Row.

      Example

      >>> t = T3Table().add(s = 0).add(s = 1).add(r = 2)
      >>> for row in t["r"]:
      ...     print row
      ...
      <t3table.T3Row 'r = 02'>
      
      >>> for row in t["s"]:
      ...     print row
      ...
      <t3table.T3Row 's = 00'>
      <t3table.T3Row 's = 01'>
      
      An AttributeError is raised if s is not a valid name of a row in t.

    (2) If s is the name of a T3Row of t, then t.s is the value t[s].get_value() if t[s] is a single T3Row. If t[s] is instead a list of T3Rows, the list [r.get_value() for r in t[s]] is returned.

      Note

      Use the comprehension [r.get_value() for r in t[s]] if you are in doubt about the cardinality of the T3Row with name s. This way you avoid having to deal with both variants.

      Admittedly I haven’t found an elegant solution to this API puzzle. Row/Value access is optimized for a 1-1 relationship between rows and names which is also quite the norm.

      Example

      >>> t = T3Table().add(s = 0).add(s = 1).add(r = 2)
      >>> t.r
      02
      >>> t.s
      [00, 01]
      
      An AttributeError is raised if s is not a valid name of a row in t.

    (3) None assignments are discussed below under "Special row value assignments".

      Unpacking assignments are required for multiple rows with the same name

      >>> t = T3Table().add(s = 0).add(s = 1).add(r = 2)
      >>> t.s
      [00, 01]
      >>> t.s = [7, 8]
      >>> t.s
      [07, 08]
      

      Assigning a wrong number of arguments results in a ValueError

      >>> t.s = [0]
      Traceback (most recent call last):
          File "<interactive input>", line 1, in <module>
      ValueError: need more than 1 value to unpack. 2 expected
      
  (4) The operator t << data always returns either a new T3Table as a “parse tree” or raises a MatchingFailure exception with a T3MatchFail object as the exception value.
  (5) On the surface copy(t) is redundant and can be replaced with t() which also produces a copy of t. But t() is actually very different underneath.

    The reason for this is that a T3Table t2 can be the value of a T3Row t1 and we’d like to create a copy of t1 with some row values modified in t2

    t1.t2(x = a, y = b)   # this should create a copy of t1 with two changes in t2
    

    Each T3Table has a parent, and for t1 and t2 the following assertions hold

    assert t2._parent is t1
    assert t1._parent is None
    

    So when t2(x = a, y = b) is evaluated what is actually copied is the root of the tree in which t2 is a node

    def copy_root(self):
        if self._parent is None:
            return copy(self)
        else:
            return self._parent.copy_root()
    

    Copying the root of t2 will also copy t2. Finally the changes to copy(t2) in x and y are performed just as expected.
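
The copy-from-root idea of note (5) can be sketched with a small tree of dict-based nodes. The Node class and its method names are hypothetical, and locating the copied node by its path of row names is an assumption about how the copy of t2 might be found inside the copied tree.

```python
from copy import deepcopy


class Node(object):
    """Toy table: a dict of row values plus a link to the parent table."""

    def __init__(self, **rows):
        self.rows = dict(rows)
        self.parent = None
        for value in self.rows.values():
            if isinstance(value, Node):
                value.parent = self

    def root(self):
        return self if self.parent is None else self.parent.root()

    def path_from_root(self):
        """Row names leading from the root table down to this table."""
        if self.parent is None:
            return []
        for name, value in self.parent.rows.items():
            if value is self:
                return self.parent.path_from_root() + [name]
        raise LookupError("table is not a row value of its parent")

    def __call__(self, **changes):
        """Copy the whole tree, then apply changes to the copy of this table."""
        new_root = deepcopy(self.root())
        target = new_root
        for name in self.path_from_root():
            target = target.rows[name]
        target.rows.update(changes)
        return new_root


t2 = Node(x=1, y=2)
t1 = Node(head=0, t2=t2)
u1 = t1.rows["t2"](x=7, y=8)      # plays the role of t1.t2(x = a, y = b)
print(u1.rows["t2"].rows)
print(t1.rows["t2"].rows)
```

Calling the nested table returns a copy of the root, so the original tree is left unchanged while the copy carries both substitutions.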

Special row value assignments

The optionality of a T3Table row can be controlled by setting the row value to None. The idea is that None makes a row “invisible”: it doesn’t contribute anything to the value of the T3Table and the row isn’t displayed. It is also skipped by the parsing process.

>>> t = T3Table().add(1, u = 0).add(1, v = 1).add(1, w = 2)
>>> t
t:
    u: 00
    v: 01
    w: 02

Now set a row value to None

>>> t.v = None
>>> t
t:
    u: 00
    w: 02
>>> Hex(t)
00 02
>>> t << '03 04 05'
<__main__.T3Table object at 0x0377F8D0>
    u: 03
    w: 04

Nevertheless the row is still present

>>> t["v"]
<t3table.T3Row 'v = None'>

The rules become somewhat more complex when data binding is involved, which is handled in the next section.

Data Binding and T3Tables

For collection data types such as lists, arrays, dicts or tuples in Python, or in another high-level language like Java, derived data such as the size of the collection are not perceived as an integral part of the object representation. They may well be stored with the object, but that is an implementation detail of the particular language, not part of the language definition. This is very different from lower-level languages like C or Pascal, where strings are composite data: a C string carries a trailing zero that marks its end, and a Pascal string a leading byte that stores its length. The programmer is responsible for the integrity of the data structure and trades comfort for control.

T3Tables are used to represent composite data built in the spirit of Pascal or C. Unlike in those languages, the integrity is maintained directly within the type, using data binding.

Example

Array = T3Table()
Array.add(1, Length = binding.table("Data", len))
Array.add("*", Data = '00')

With this definition we get

>>> Array
Array:
    $Length: 01
    Data: 00

>>> Array.Data = '01 02 03'
>>> Array
Array:
    $Length: 03
    Data: 01 02 03

The expression binding.table("Data", len) works as follows:

Whenever the row Length is accessed, the value of Array.Data is passed to len and the result is assigned to the value of Length

Length.value = len(Array.Data)

We could express the relationship more concisely by introducing a new operator that Python lacks

Length.value <- len(Array.Data)

This would imply that Length.value is updated whenever Array.Data changes.
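
The effect of such a binding can be imitated in plain Python by storing a small marker object as the row value and evaluating it on attribute access. ToyTable and Bound below are hypothetical stand-ins, not the t3 classes, and evaluating on every read is an assumption about when the binding is applied.

```python
class Bound(object):
    """Marks a row whose value is computed from another row on access."""

    def __init__(self, source, func):
        self.source = source    # name of the row to read
        self.func = func        # function applied to that row's value


class ToyTable(object):
    def __init__(self):
        self._rows = {}

    def add(self, **field):
        self._rows.update(field)
        return self

    def __getattr__(self, name):
        # Only called for names that are not regular attributes.
        rows = self.__dict__.get("_rows", {})
        if name not in rows:
            raise AttributeError(name)
        value = rows[name]
        if isinstance(value, Bound):
            # Evaluate the binding now, against the current table content.
            return value.func(getattr(self, value.source))
        return value

    def __setattr__(self, name, value):
        if name.startswith("_"):
            object.__setattr__(self, name, value)
        else:
            self._rows[name] = value


array = ToyTable().add(Length=Bound("Data", len), Data=[0x00])
print(array.Length)
array.Data = [0x01, 0x02, 0x03]
print(array.Length)
```

Because the length is never stored, it cannot get out of step with Data in this sketch.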

Representation

Rows with a value binding are prefixed with a ‘$’ sigil as shown above.

Data binding and cloning

This behaviour doesn’t get lost when Array is cloned

>>> NewArray = Array(Data = '0F 02')
>>> NewArray
NewArray:
    $Length: 02
    Data: 0F 02

Data binding and parsing

When data is parsed into a new T3Table, value bindings will be ignored

>>> P = Array << '02 00'
>>> P
P:
    $Length: '02'
    Data: '00'

This is useful when you want to check the parsed data

>>> assert P.Length == len(P.Data), "FAIL: Length must be '%s'. '%s' found instead."%(len(P.Data), int(P.Length))
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
AssertionError: FAIL: Length must be '1'. '2' found instead.

Also cloning won’t affect the parsed result

>>> Hex(P) == Hex(P())
True

It is also possible to set the value of a data bound row directly

>>> P.Length = '03'
>>> P
P:
    $Length: '03'
    Data: '00'

The value is recomputed as soon as a row other than the data-bound row is updated

>>> P.Data += 0     # a change which updates P without changing Data
>>> P
P:
    $Length: '01'   # the row value is re-computed
    Data: '00'
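
One way to picture these rules is a bound row that can be pinned to an explicit value until some other row changes. The ToyArray class below is a hypothetical model of that behaviour and not the mechanism used by T3Table; in particular the use of None as the "not pinned" marker is an assumption of the sketch.

```python
class ToyArray(object):
    """Toy table with a Data row and a Length row bound to len(Data)."""

    def __init__(self, data):
        self._data = list(data)
        self._length = None          # None means: compute from Data on access

    @classmethod
    def parse(cls, raw):
        """Parse [length, data...] and keep the length exactly as found."""
        table = cls(raw[1:])
        table._length = raw[0]       # the binding is ignored while parsing
        return table

    @property
    def Length(self):
        return len(self._data) if self._length is None else self._length

    @Length.setter
    def Length(self, value):
        self._length = value         # direct assignment pins the value

    @property
    def Data(self):
        return self._data

    @Data.setter
    def Data(self, value):
        self._data = list(value)
        self._length = None          # updating another row re-activates the binding


p = ToyArray.parse([0x02, 0x00])
print(p.Length)                      # parsed length, although Data holds one byte
p.Length = 3
print(p.Length)                      # pinned by direct assignment
p.Data = p.Data                      # any update of Data triggers recomputation
print(p.Length)
```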

Caution

When you try to update multiple rows in a copy expression such as P(Data = '00', Length = '03') and one of those rows is value bound, such as Length, a warning will be issued. This is because a dict is passed to P and the update order remains undetermined. If Length is updated before Data the result might differ from Data being updated before Length.

My advice: never update value bound rows when you copy a T3Table.

Binding to tables

A T3Table can be considered a scope of its own for data bindings. We have already seen how to bind a function to a row of a table

Array.add(1, Length = binding.table("Data", len))

The table isn’t passed to the binding when the binding is assigned; this happens when the value of Length is fetched. The Length row holds a reference to its containing T3Table, and this table is handed to the binding when Length.get_value() is applied.

Passing a row name is not the only possibility to bind to table data. Other options are expressed through match codes. The code strings are listed below

Code           Result                                              Notes
"name"         Row name of this table
"*"            This table
".*"           T3Table built from all rows succeeding this row
"*."           T3Table built from all rows preceding this row
"*/"           Parent table of this table                          may be None
"*/" .. "/"    n-th grandparent table of this table                may be None

binding.table(match_code, callback)

The match_code is described in the table above. The callback is a function that takes a single argument and returns a single value.
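
To make the match codes concrete, the sketch below resolves them against a list of (name, value) rows. The functions resolve and bound_value are hypothetical, the position of the bound row is passed in explicitly as an assumption, and the grandparent code is left out because its exact syntax is not spelled out here.

```python
def resolve(code, rows, index, parent=None):
    """Return the object a match code refers to.

    rows   -- list of (name, value) pairs of the table holding the binding
    index  -- position of the bound row inside rows
    parent -- rows of the parent table, or None
    """
    if code == "*":
        return rows                   # this table
    if code == ".*":
        return rows[index + 1:]       # rows succeeding the bound row
    if code == "*.":
        return rows[:index]           # rows preceding the bound row
    if code == "*/":
        return parent                 # parent table, may be None
    for name, value in rows:          # anything else is taken as a row name
        if name == code:
            return value
    raise KeyError(code)


def bound_value(code, callback, rows, index, parent=None):
    """Evaluate a binding: apply callback to whatever the code selects."""
    return callback(resolve(code, rows, index, parent))


rows = [("Tag", [0x89]), ("Length", None), ("Data", [0x01, 0x02, 0x03])]
print(bound_value("Data", len, rows, 1))
print(bound_value(".*", len, rows, 1))
print(bound_value("*.", len, rows, 1))
```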

Dynamically scoped variables

The data binding mechanism supports bindings to arbitrary objects, not just to the table which contains the binding. This binding type is provided by the function binding.dynamic()

binding.dynamic(name[, callback])

name refers to the name of a variable which gets fetched. The callback parameter is an optional function of a single variable which returns a single value.

The name dynamic was chosen because binding.dynamic(...) is a dynamically scoped variable. If a binding binding.dynamic("X") is defined, the binding doesn’t refer to an X at the location of the definition of the binding, but an X at the location of evaluation or the call of the binding

>>> K = 42
>>> B = binding.dynamic("K")
>>> def foo(binding):
...     K = -42
...     return binding.get_value()
...
>>> B.get_value()  # binding to K evaluated
42
>>> foo(B)         # binding to another K, defined inside foo
-42
>>> B.get_value()  # binding to the original K
42

Compare this to the lexically scoped binding of variables in the Python interpreter

>>> K = 42
>>> def static():
...     return K
...
>>> def dynamic():
...     return binding.dynamic("K").get_value()
...
>>> def foo(binding):
...     K = -42
...     return binding()
...
>>> static()
42
>>> dynamic()
42
>>> foo(static)
42
>>> foo(dynamic)
-42
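
A dynamically scoped lookup of this kind can be approximated in CPython by inspecting the call stack at evaluation time. The DynamicVar class below is a hypothetical sketch of that technique and not the implementation behind binding.dynamic(); it relies on sys._getframe, which is CPython-specific.

```python
import sys


class DynamicVar(object):
    """Looks a name up on the call stack at evaluation time."""

    def __init__(self, name, callback=None):
        self.name = name
        self.callback = callback

    def get_value(self):
        frame = sys._getframe(1)              # frame of the caller (CPython)
        caller_globals = frame.f_globals
        while frame is not None:
            if self.name in frame.f_locals:   # nearest enclosing call wins
                value = frame.f_locals[self.name]
                break
            frame = frame.f_back
        else:
            # No local binding on the stack: fall back to the caller's globals.
            if self.name not in caller_globals:
                raise NameError(self.name)
            value = caller_globals[self.name]
        return self.callback(value) if self.callback else value


K = 42
B = DynamicVar("K")


def foo(binding):
    K = -42                                   # local K, found via the call stack
    return binding.get_value()


print(B.get_value())
print(foo(B))
print(B.get_value())
```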

T3Table subclasses

The classes introduced in this section are used to refine T3Tables ( T3Bitmap ), provide an evaluation context for assertions about parsing results ( T3TableContext ) or they enhance pattern matching ( T3Set, T3Repeater ).

T3Set

A T3Table matches the patterns defined in its T3Rows in sequential order. If the sequential order A B C ... of objects doesn’t matter and a permutation C A B ... of them is equally valid, we would need a pattern of the form (A | B | C | ...)+ to achieve a successful match.

T3Sets are built around the idea of ignoring the sequential order, but unlike the rule (A | B | C | ...)+ a T3Set accepts only permutations of {A, B, C, ...} without repetition of any of its elements: once e.g. B has been matched, the matching process continues with {A, B, C, ...} - {B}. Matching is allowed to terminate before all sub-patterns have matched.

Adding a row to a T3Set S takes the following form

S.add(prefix, key = value)

For example

Tlv = T3Table().add(1, Tag = '00').add(1, Len = ...) ...

S.add(0x89, T_89 = Tlv )
S.add(0xA6, T_A6 = Tlv )
...

The prefix has a different meaning than it had in a T3Table where a 0x89 integer pattern was the length of data to match. In a T3Set it is actually the value of the key = value pair which does the match e.g. the Tlv value. The prefix acts as a selector of the matching value object. If the Tlv table was used to match data unspecifically it would match any TLV whatsoever.
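
The selection rule can be sketched on flat TLV data held in a list of byte values. The functions parse_tlv and match_set are hypothetical, and the single-byte tag and length fields are a simplifying assumption of this sketch.

```python
def parse_tlv(data):
    """Split [tag, length, value...] off the front of data."""
    if len(data) < 2 or len(data) < 2 + data[1]:
        return None, data
    end = 2 + data[1]
    return {"Tag": data[0], "Len": data[1], "Val": data[2:end]}, data[end:]


def match_set(selectors, data):
    """Match each selected entry at most once, in any order.

    selectors maps a prefix byte to a row name. Matching stops at the first
    prefix that is unknown or already used; the unmatched rest is returned.
    """
    remaining = dict(selectors)
    found = {}
    rest = list(data)
    while rest and rest[0] in remaining:
        tlv, new_rest = parse_tlv(rest)
        if tlv is None:
            break                         # truncated entry: stop matching
        found[remaining.pop(rest[0])] = tlv
        rest = new_rest
    return found, rest


selectors = {0x89: "T_89", 0xA6: "T_A6"}
data = [0xA6, 0x01, 0x07, 0x89, 0x02, 0x01, 0x02, 0xA6, 0x00]
found, rest = match_set(selectors, data)
print(sorted(found))
print(rest)
```

The second A6 entry is left in the rest because its selector has already been consumed, which is the difference from the rule (A | B | C | ...)+.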

T3Repeater

Unlike the other classes mentioned in this section the T3Repeater is not a subclass of T3Table. It merely holds a T3Table object as a member variable.

Whereas the T3Set enhances pattern matching through the introduction of alternative row matches of the form A | B, the T3Repeater can be perceived as the Kleene star A* or one of its delimited variants.
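
The Kleene star reading can be sketched as a loop that applies one matcher until it fails. The function repeat and its minimum and maximum parameters are hypothetical and only illustrate the idea of bounded repetition; they are not the T3Repeater interface.

```python
def repeat(match_one, data, minimum=0, maximum=None):
    """Apply match_one repeatedly, like A* with optional bounds.

    match_one(data) must return (value, rest), or (None, data) on failure.
    """
    values = []
    rest = list(data)
    while maximum is None or len(values) < maximum:
        value, new_rest = match_one(rest)
        if value is None or len(new_rest) == len(rest):
            break                       # failure, or no progress: stop
        values.append(value)
        rest = new_rest
    if len(values) < minimum:
        raise ValueError("expected at least %d repetitions, got %d"
                         % (minimum, len(values)))
    return values, rest


def two_bytes(data):
    """Toy pattern: match exactly two bytes."""
    return (data[:2], data[2:]) if len(data) >= 2 else (None, data)


print(repeat(two_bytes, [1, 2, 3, 4, 5]))
print(repeat(two_bytes, [1, 2, 3, 4, 5], maximum=1))
```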

T3Bitmap

T3TableContext