{"id":194,"date":"2009-03-22T19:05:05","date_gmt":"2009-03-22T18:05:05","guid":{"rendered":"http:\/\/fiber-space.de\/wordpress\/?p=194"},"modified":"2022-10-12T10:11:19","modified_gmt":"2022-10-12T09:11:19","slug":"contextual-keywords","status":"publish","type":"post","link":"http:\/\/fiber-space.de\/wordpress\/2009\/03\/22\/contextual-keywords\/","title":{"rendered":"Contextual Keywords"},"content":{"rendered":"\n<h3>Keywords and Contextual Keywords<\/h3>\n\n\n\n<p>A language keyword is a reserved word that cannot be used as an identifier for variables, constants, classes, etc. Often it is an element of the language syntax, and keywords simplify the syntactical structure. A keyword can often be thought of as a &#8220;semantic marker&#8221;, i.e. it denotes a location where something particularly meaningful is happening. The `while` keyword points to a while-loop, the `class` keyword to a class definition, etc. Other keywords like `as` or `in` link expressions together and have a fixed location within a statement.<\/p>\n\n\n\n<p>A <em>contextual keyword<\/em> is a sort of hybrid. It acts much like a keyword but is not reserved. The word `as` was such a contextual keyword in Python until it became a proper keyword in version 2.6. The C# language defines lots of contextual keywords besides the regular ones. MSDN <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/the35c6y.aspx\">defines<\/a> a contextual keyword as:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>A contextual keyword is used to provide a specific meaning in the code, but it is not a reserved word [in C#].<\/p><\/blockquote>\n\n\n\n<p>Contextual keywords in C# appear in the grammar description of the language just like regular keywords. 
For example, `add_accessor_declaration` is defined by the following C# rule:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">add_accessor_declaration: [attributes] \"add\" block<\/pre>\n\n\n\n<h3>Keywords in EasyExtend<\/h3>\n\n\n\n<p>In EasyExtend keywords are implicitly defined by their use in a grammar file. The &#8220;add&#8221; string in the `add_accessor_declaration` rule automatically becomes a keyword. Technically a keyword is just an ordinary name and the tokenizer produces a `NAME` token for it. It&#8217;s easy to verify this: when we inspect the token stream of a function definition `def foo(): pass` we&#8217;ll notice that `def` and `pass` are mapped onto the `NAME` token:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> &gt;&gt;&gt; def foo(): pass\n[token&gt;\n----------------------------------------------------------------.\n Line  | Columns | Token Value      | Token Name    | Token Id  |\n-------+---------+------------------+---------------+-----------+\n 1     | 0-3     | 'def'            | NAME          | 1 -- 1    |\n 1     | 4-7     | 'foo'            | NAME          | 1 -- 1    |\n 1     | 7-8     | '('              | LPAR          | 7 -- 7    |\n 1     | 8-9     | ')'              | RPAR          | 8 -- 8    |\n 1     | 9-10    | ':'              | COLON         | 11 -- 11  |\n 1     | 11-15   | 'pass'           | NAME          | 1 -- 1    |\n 1     | 15-15   | '\\n'             | NEWLINE       | 4 -- 4    |\n 2     | 0-0     | ''               | ENDMARKER     | 0 -- 0    |\n----------------------------------------------------------------'\n<\/pre>\n\n\n\n<p>In the parse tables keywords are preserved, and we will find a state description like `('pass', 1, 273)` which explicitly refers to the `pass` keyword. It is clearly distinguished from the token id of type `NAME`, which is used otherwise (e.g. for `foo`). The `pass` token in the token stream has precisely the following structure: `[1, 'pass', 1, (11, 15)]`. 
Usually the token id is all that the parser needs to know, but when the parser encounters a `NAME` token whose value is a keyword, the keyword itself is used as the token type. We can summarize this using the following function:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"Python\" class=\"language-Python\">\nimport token\n\n# `keywords` is assumed to be the set of keyword strings collected from the grammar\ndef compute_token_type(tok):\n    tok_type  = tok[0]      # the standard token type\n    tok_value = tok[1]\n    if tok_type == token.NAME and tok_value in keywords:\n        return tok_value    # tok_value is a keyword and is used as the token type\n    return tok_type\n<\/code><\/pre>\n\n\n\n<h3>Contextual Keywords in EasyExtend<\/h3>\n\n\n\n<p>Unlike keywords, which can simply be determined from the grammar, contextual keywords cannot be derived this way. They have to be made explicit elsewhere. When I considered contextual keywords in EasyExtend I decided against creating special infrastructure for them and used a simple trick instead. Suppose one defines the following token in a `Token.ext` file:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    CONTEXT_KWD: 'add' | 'remove'<\/pre>\n\n\n\n<p>This is just an ordinary token definition. When `parse_token.py` is generated from `Token`+`Token.ext` we&#8217;ll find the following settings:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    CONTEXT_KWD = 119896<\/pre>\n\n\n\n<p>and<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">    token_map = {\n        ...\n        'add|remove': 119896,\n        ...\n    }<\/pre>\n\n\n\n<p>`parse_token.py` obviously provides sufficient information to determine the contextual keywords from the `CONTEXT_KWD` token definition. 
We exploit this in the next function definition:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"Python\" class=\"language-Python\">def generate_contextual_kwds(parse_token):\n    try:\n        ctx_kwd = parse_token.CONTEXT_KWD\n        t_map   = swap_dict(parse_token.token_map)      # swap_dict turns dict keys\n                                                        # into values and values into keys\n        contextual_keywords = set( t_map[ctx_kwd].split(\"|\") )\n        assert contextual_keywords &lt;= keywords\n        return contextual_keywords\n    except AttributeError:\n        return set()<\/code><\/pre>\n\n\n\n<p>Now we have to understand the semantics of contextual keywords. The following code illustrates the behavior of the parser in the presence of contextual keywords.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"Python\" class=\"language-Python\">def compute_next_selection(token_stream, parse_table):\n    tok = token_stream.current()\n    tok_type = compute_token_type(tok)\n    selection = parse_table.selectable_ids()   # provides all token ids and nonterminal ids which are\n                                               # admissible at this state\n    for s in selection:\n        if s == tok_type or tok_type in first_set(s):\n            return s\n    # here we start to deal with contextual keywords\n    if tok_type in contextual_keywords:\n        # replace the token type of the contextual keyword by NAME\n        tok_type = token.NAME\n        for s in selection:\n            if s == tok_type or tok_type in first_set(s):\n                return s\n    raise ParserError<\/code><\/pre>\n\n\n\n<p>When a keyword is needed in a particular context (`tok_type` is accepted in the first for-loop), either `tok_type` itself is returned or the node id of a non-terminal that derives `tok_type` (`tok_type` is in the `first_set` of the node id). 
If this fails, the contextual keyword is treated as a plain `NAME` token and examined in the same way. So if a particular context requires a contextual keyword, the keyword is provided; otherwise the parser checks whether the context just requires a `NAME` and provides the `NAME` instead. The practical implication is that a contextual keyword can also be used as an identifier, e.g. as a variable name.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Keywords and Contextual Keywords A language keyword is a reserved word that cannot be used as an identifier for variables, constants, classes etc. Often it is an element of language syntax. Syntactical structure is simplified by means of keywords. A &hellip; <a href=\"http:\/\/fiber-space.de\/wordpress\/2009\/03\/22\/contextual-keywords\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[16,13,12,4],"tags":[],"_links":{"self":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/194"}],"collection":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/comments?post=194"}],"version-history":[{"count":45,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/194\/revisions"}],"predecessor-version":[{"id":2288,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/posts\/194\/revisions\/2288"}],"wp:attachment":[{"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/media?parent=194"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp
-json\/wp\/v2\/categories?post=194"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/fiber-space.de\/wordpress\/wp-json\/wp\/v2\/tags?post=194"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}