{"id":626,"date":"2009-05-14T11:53:47","date_gmt":"2009-05-14T10:53:47","guid":{"rendered":"http:\/\/fiber-space.de\/wordpress\/?p=626"},"modified":"2016-06-11T18:17:12","modified_gmt":"2016-06-11T17:17:12","slug":"pattern-matching-with-tupletrees","status":"publish","type":"post","link":"http:\/\/fiber-space.de\/wordpress\/2009\/05\/14\/pattern-matching-with-tupletrees\/","title":{"rendered":"Pattern matching with TupleTrees"},"content":{"rendered":"<p>As it seems advanced dispatch schemes are <a href=\"http:\/\/siddhi.blogspot.com\/2009\/05\/pattern-matching-with-peak-rules.html\">discussed<\/a> <a href=\"http:\/\/monkey.org\/~marius\/pattern-matching-in-python.html\">right now<\/a> under the <em>pattern matching<\/em> label. In this article I want to discuss another solution using a data-structure called the `TupleTree` ( also known as prefix tree or trie ). The tuple tree solution comes closer to PEAK rules than to Marius Eriksens pattern matching engine. I find it more elegant than the PEAK rules solution and there is less boilerplate. I can&#8217;t say much about the generality of PEAK rules though and they might cover a lot more.<\/p>\n<p>A `TupleTree` is an efficient way to store tuples by factoring out common prefixes. Suppose you have a set of tuples:<\/p>\n<p>`{(a, b, c), (a, b, d), (a, c, d), (b, d, d)}` then you can store the same information using a tree structure<\/p>\n<p>`{(a, (b, ((c,) (d,))), (c, d)), (b, d, d)}`<\/p>\n<p>Searching in the tuple tree is of complexity `O(log(n))` and can degenerate to `O(n)` if there is isn&#8217;t much to factorize.<\/p>\n<p>This isn&#8217;t too interesting in the discussion of pattern matching schemes if we wouldn&#8217;t introduce two different kinds of wildcards or symbolic pattern called `ANY` and `ALL`.<\/p>\n<p><strong>ANY<\/strong> &#8211; a pattern that matches any symbol but with the lowest priority: if there is a tuple `(ANY, Y, &#8230;)` in the tuple tree then `(X, Y, &#8230;)` is matched by `(ANY, Y, &#8230;)` iff there isn&#8217;t a more specific matching tuple `(X, Y, &#8230; )` in the tree.<\/p>\n<p><strong>ALL<\/strong> &#8211; a pattern that matches any symbol but with the same priority as a more specific symbol. If there is a tuple `(ALL, Y, &#8230;)` in the tuple tree then `(X, Y, &#8230;)` is matched by `(ALL, Y, &#8230;)` <em>and<\/em> by a more specific tuple `(X, Y, &#8230; )` if present. This means `ALL` creates an ambiguity.<\/p>\n<p>We can consider `ALL` as a variable and we eliminate the ambiguity using value substitution. Let&#8217;s say we have a set of tuples `{(ANY, Y), (X, Y), (ALL, Z)}` then elimination of `ALL` leads to `{(ANY, Y), (ANY, Z), (X,Y), (X, Z)}` and the tuple tree `{(ANY, (Y,), (Z,)), (X, (Y,), (Z,))}`.<\/p>\n<h4>TupleTree implementation<\/h4>\n<p>First we define the mentioned pattern `ANY` and `ALL`<\/p>\n<pre lang=\"python\">class Pattern:\r\n    def __init__(self, name):\r\n        self.name = name\r\n    def __repr__(self):\r\n        return \"&lt;P:%s&gt;\"%self.name\r\n\r\nANY = Pattern(\"ANY\")\r\nALL = Pattern(\"ALL\")<\/pre>\n<p>Now we create the `TupleTree` object. The `TupleTree` implements two methods `insert` and `find`. The `insert` method takes a tuple and a key as parameters. It inserts the tuple and stores a `key` at the location of the tuple. The `find` method takes a tuple and returns a key if it was inserted at the location of the tuple.<\/p>\n<pre lang=\"python\">class TupleTree(object):\r\n    def __init__(self):\r\n        self.branches = {}\r\n        self.key = None\r\n        self.all = None\r\n\r\n    def insert(self, args, value):\r\n        if len(args) == 0:\r\n            self.key = value\r\n            return\r\n        first = args[0]\r\n        if first == ALL:\r\n            for node in self.branches.values():\r\n                node.insert(args[1:], value)\r\n            self.all = (args[1:], value)\r\n        elif first in self.branches:\r\n            node = self.branches[first]\r\n            node.insert(args[1:], value)\r\n            if self.all:\r\n                node.insert(*self.all)\r\n        else:\r\n            tree  = TupleTree()\r\n            self.branches[first] = tree\r\n            tree.insert(args[1:], value)\r\n            if self.all:\r\n                node.insert(*self.all)\r\n\r\n    def find(self, args):\r\n        first = args[0]\r\n        if first in self.branches:\r\n            node = self.branches[first]\r\n        elif ANY in self.branches:\r\n            node = self.branches[ANY]\r\n        if len(args) == 1:\r\n            return node.key\r\n        else:\r\n            return node.find(args[1:])<\/pre>\n<h4>The Dispatcher<\/h4>\n<p>It is easy to define a dispatcher that matches argument tuples against a tuple in a `TupleTree`. Handler functions which are decorated by a `Dispatcher` object are stored as tuple tree keys. Those handler functions are retrieved from the `TupleTree` when the `apply` method is called with concrete arguments.<\/p>\n<pre lang=\"python\">class Dispatcher(object):\r\n    def __init__(self):\r\n        self.ttree = TupleTree()\r\n\r\n    def __call__(self, *args):\r\n        def handler(f):\r\n            self.ttree.insert(args, f)\r\n            return f\r\n        return handler\r\n\r\n    def apply(self, *args):\r\n        handler = self.ttree.find(args)\r\n        if not handler:\r\n            raise ValueError(\"Failed to find handler that matches arguments\")\r\n        else:\r\n            return handler(*args)<\/pre>\n<h4>Example<\/h4>\n<p>As an example we create a new `Dispatcher` object and decorate handler functions using it.<\/p>\n<pre lang=\"python\">alt = Dispatcher()\r\n\r\n@alt(\"\/\", ANY)\r\ndef not_a_resource(path, method):\r\n    print \"not a resource\"\r\n\r\n@alt(ANY, \"GET\")\r\ndef retrieve_resource(path, method):\r\n    print \"retrieve resource\"\r\n\r\n@alt(ANY, \"POST\")\r\ndef update_resource(path, method):\r\n    print \"update resource\", path\r\n\r\n@alt(ALL, \"PUT\")\r\ndef create_new_resource(path, method):\r\n    print \"create new resource\", path\r\n\r\n@alt(ANY, ANY)\r\ndef invalid_request(path, method):\r\n    print \"invalid request\", path<\/pre>\n<p>Notice that the `create_new_resource` handler is called when the HTTP command is `PUT` is passed even when the path is the root path &#8220;\/&#8221;. This is caused by the `ALL` pattern in the first argument. For all other commands a &#8220;not a resource&#8221; message is printed.<\/p>\n<pre>&gt;&gt;&gt; alt.apply(\"\/home\", \"PUT\")\r\ncreate new resource \/home\r\n&gt;&gt;&gt; alt.apply(\"\/\", \"PUT\")\r\ncreate new resource \/\r\n&gt;&gt;&gt; alt.apply(\"\/\", \"GET\")\r\nnot a resource\r\n&gt;&gt;&gt; alt.apply(\"\/home\", \"GET\")\r\nretrieve resource \/home\r\n&gt;&gt;&gt; alt.apply(\"\/home\", \"PAUSE\")\r\ninvalid request PAUSE<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>As it seems advanced dispatch schemes are discussed right now under the pattern matching label. In this article I want to discuss another solution using a data-structure called the `TupleTree` ( also known as prefix tree or trie ). The &hellip; <a href=\"http:\/\/fiber-space.de\/wordpress\/2009\/05\/14\/pattern-matching-with-tupletrees\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4],"tags":[],"_links":{"self":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/626"}],"collection":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/comments?post=626"}],"version-history":[{"count":16,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/626\/revisions"}],"predecessor-version":[{"id":2225,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/626\/revisions\/2225"}],"wp:attachment":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/media?parent=626"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/categories?post=626"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/tags?post=626"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}