## Posts Tagged ‘Add new tag’

### Universal Set in Python

Posted in Mathematics, Python on April 16th, 2009 by kay – 5 Comments

### Building sets from set() and ANY

When defining a regular expression there is a well known duality between a set of characters specified by enclosing characters or character ranges within square brackets […] and defining the pattern of the complement set [^…] by preceding the caret symbol ^ .

• [0-9] — matches all digits
• [^0-9] — matches all non-digits

The nature of the complement set mentioned here is less precisely defined and is implementation dependent in practice. If a regexp matcher knows about ASCII only then [^0-9] is the set of all ASCII characters which are not digits.

Suppose now we want to build a character matching set from sets in the sense of a set data-type as inhabited in Python as a builtin. This is easy because we start from the empty set() and add just more elements. Each finite set can be constructed in this way and we can establish a set ~ pattern correspondence:

set(range(9)) ~ [0-9]

But how do we find a corresponding set for [^0-9]?

This cannot be finitely constructed from the bottom up using set() as a start value but it has to be finitely de-constructed from the set of all elements. This looks like we are running into unlimited comprehension and the Russel paradox but actually the set theory we are using is quite weak and element membership can be decided just by doing hashtable lookups.

Instead of constructing a set from set() we want to (de-)construct it from ANY. The name ANY is not chosen by chance because we want to express the correspondence with . pattern:

ANY ~ .

( Actually ANY is a little stronger because . doesn’t match newline characters unless explicitly specified by the DOTALL flag in the re.compile function. )

Using ANY we can easily state the above correspondence:

ANY – set(range(9)) ~ [^0-9]

### Implementation of ANY

We define ANY as an instance of univset. The univset has a set-like interface and implements set operations consistently on sets of the type ANY-A, ANY-B,… where A, B… are finite sets constructed from set(). For example

ANY-A ∩ ANY-B = ANY – ( A ∪ B )
ANY-A ∪ B = ANY – ( A – B )

```class univset(object): def __init__(self): self._diff = set()   def __sub__(self, other): S = univset() if type(other) == set: S._diff = self._diff | other return S else: S._diff = self._diff | other._diff return S   def __rsub__(self, other): return other &amp; self._diff   def __contains__(self, obj): return not obj in self._diff   def __and__(self, other): return other - self._diff   def __rand__(self, other): return other - self._diff   def __repr__(self): if self._diff == set(): return "ANY" else: return "ANY - %s"%self._diff   def __or__(self, other): S = univset() S._diff = self._diff - other return S   def __xor__(self, other): return (self - other) | (other - self)   def add(self, elem): if elem in self._diff: self._diff.remove(elem)   def update(self, elem): self._diff = self._diff - other   def __ror__(self, other): return self.__or__(other)   def union(self, other): return self.__or__(other)   def difference(self, other): return self.__sub__(other)   def intersection(self, other): return self.__and__(other)   def symmetric_difference(self, other): return self.__xor__(other)   def issubset(self, other): if type(other) == set: return False if issubset(other._diff, self._diff): return True return False   def issuperset(self, other): if self._diff &amp; other: return False return True   def __lt__(self, other): return self.issubset(other)   def __eq__(self, other): if type(other) == set: return False try: return self._diff == other._diff except AttributeError: return False   def __ne__(self, other): return not self.__eq__(other)   def __le__(self, other): return self.__lt__(other) or self.__eq__(other)   def __gt__(self, other): return self.issuperset(other)   def __gt__(self, other): return self.issuperset(other) or self == other```

Examples

```ANY = univset() NON_DIGITS = ANY - set(range(9)) assert 8 not in NON_DIGITS assert 'a' in NON_DIGITS assert 0 in NON_DIGITS | set([0])```

### The statement that never was

Posted in General, Python on April 15th, 2009 by kay – 1 Comment

The Python release archive is a nice repository for archaeological findings. My biggest surprise was the existence of an access statement and access keyword in the pre-history of Python 1.5.0. The syntax was already specified in release 1.0 with an accompanying accessobject.c and it began to fade in Python 1.4.

### The access statement

That is how the access statement was defined in 1.0

```access_stmt: 'access' ('*' | NAME (',' NAME)*) ':' accesstype (',' accesstype)* accesstype: NAME+ # accesstype should be ('public' | 'protected' | 'private') ['read'] ['write'] # but can't be because that would create undesirable reserved words!```

The access_stmt was a small statement i.e. one that doesn’t contain a block and could be used like print, del or exec.

So the proper use of the access statement would be like

```class Point: access x, y, z: private write, public read```

or more compact:

```class Point: access *: private write, public read```

The word public wasn’t a language keyword which made it possible to use it in this way:

```class AccessModifier: access public: public access private: public access protected: public```

Note that the access statement was purely declarative even more so than Java’s where you can write

`public int x = 10;`
and define x while specifying its type and access modifier. It was also more elaborated in terms of intended protections by means of differentiating between read and write.

### How it never took off and gone away

In the HISTORY file of Python 1.0 we find the notice

There’s a new reserved word: “access”. The syntax and semantics are still subject of of research and debate (as well as undocumented), but the parser knows about the keyword so you must not use it as a variable, function, or attribute name.

In Python 1.4 the access statement was already uncommented from the Grammar.

The story ends with the following remark in the HISTORY of Python 1.5:

‘access’ is no longer a reserved word, and all code related to its implementation is gone (or at least #ifdef’ed out). This should make Python a little speedier too!