Trails in a Langscape | Projects and projections

Pyload part I ( Path objects )

Posted on 7. April 2009 by kay

This article is the first in a series that studies the design of a module import system. Although this work is situated in the Python context it can stay on its own to a large extent and ideas may be transferred to other systems and languages. We will provide as much integration with Python as necessary but keep the structure as general as possible. The article series will roughly cover following topics:

Path objects – module paths and their representation
ModuleNode objects – loading and caching modules using Path objects
Import system – binding ModuleNode objects to import statements
Implementation – putting it all together in an EasyExtend langlet as a reference implementation

In this article we will discuss `Path` objects. This topic is foundational and a bit dry but I hope the reader will be compensated by the elegance of some its constructions. `Path` objects are used to establish a relationship between internally used name spaces and external path structures in physical or logical “media” like file-systems, zip-files or the web. Those path structures can also be fully abstract and represented by special data-structures only. We’ll provide several examples.

First some terminology is introduced. Many of the notions given here a rather abstract but they mostly capture what people know about Python modules, packages and file system paths anyway. They provide a conceptual background for specifications given later.

Terminology

A module name is an ordinary Python name: a finite sequence of alphanumeric characters and underscores starting with an alphabetic character or underscore. A module path is a dot-separated sequence of module names. It is possible for a module path to be preceded by dots. In that case a module path is called a relative path. `A.B.C` and `..A.B.C` are both module paths but only the latter one is relative. The names of a module path are also called its components. So `A`, `B`, `C` are the components of the module path `A.B.C`.

Besides module paths we consider external paths. The intuitive meaning of an external path is that of a pointer to a location of a module in some medium. Most commonly file system paths are used as external paths: modules are represented as files and the dot separators are mapped onto file-system separators. Throughout this article we use a slash “/” as an external path separator. So `A.B.C` is a module path and `A/B/C` is an external path. A proper external path definition is given below using a `Path` abstract base class.

A module can be loaded from an external path which yields an interpreter level <`module`> object. Each <`module`> object shall have a unique module name. If `M` is a known module name we write <`M`>. It is also possible to load <`module`> objects from builtins or even create fresh <`module`> objects on the fly. In any case we still consider a <`module`> being loaded from a path. If no such path is available we associate the <`module`> with the empty path.

A module path `A.B.C…` is valid if an external path `…/A/B/C/…` exists and `` can be loaded from `…/A`, `` can be loaded from `…A/B` etc.

Posted in Python | Comments Off on Pyload part I ( Path objects )

Broken beyond repair – Pythons import system

Posted on 3. April 2009 by kay

It often happens that language features we have once distasted become comfortable after a while. We learn to live with them, take their advantages and forget about their annoyance. For many Python developers things like the explicit `self` in the argument list of a method is among them or `len` as a function. Experienced Python programmers are usually weakly excited about alternative proposals and the longer this quirk is discussed their mood turns towards -0 which is encoded as “why bothering?”.

Other features are the cause of continuous struggle and we never get used to them but don’t talk much about them because they aren’t low hanging fruits. They might even turn from bad to worse in the course of improving them in some respects. Those features feel like being broken beyond repair which means that it’s not easy to patch them but they need a fundamental redesign. In my opinion Pythons import machinery has this quality and it was regrettably forgotten in the many “cleanups” of Python 3.

Pythons import machinery is complicated to understand but this alone isn’t an argument against it. Many systems are far more complex but they are well thought out. The import machinery seems to be designed by implementation and cobbled together. This has finally lead to accidental design artifacts like the module/script distinction which makes sense only in the light of Pythons import implementation.

Python always suffered from the problem that two modules `M1.py` and `M2.py` which import a third module `M3.py` from different import paths also receive different module objects ``. This is as if we gave up coordinate independence but believed as we are moving through the space all objects are changing continuously. The objects themselves not their perspective of them. We really get different objects and we can’t compare our data with each other.

Not only are module objects path dependent but the success of a module import may depend on the liveliness of other modules. Suppose `M1.py` defines the following relative import: `from . import M2`. This import fails if the package `P` containing both `M1.py` and `M2.py` has not been imported already. When running `M1.py` from the command line it’s not relevant that we declared `P` as a package by placing an `__init__.py` file into it we still get the increasingly famous error message:

ValueError: Attempted relative import in non-package

This is not a Monty Python skit. The package exists and exists not. The interpreter doesn’t make any use of the availability of package `P` as long as it hasn’t been imported explicitly. But why does it has to be imported only to find `M2.py`? There is no obvious answer. It’s just the way it is implemented.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. (Donald E. Knuth)“

Back to the first problem. Once `M1.py` has imported `M3.py` the `` will be accessible on M1’s import path in the module cache `sys.modules` which is a flat dictionary of `{module-path: }` pairs. The module `M2.py`uses a different import path and a new and different module `` is created which is cached using another module-path.

This lookup with a module-path string is fast and it is usually faster than ordinary attribute lookup in objects which has to jump through a nested dictionary. I do think that’s basically it. If we implement a proper module cache providing the moderate performance of attribute lookups in objects we can get rid of the tedious import machinery that Python has right now. But I’d like to go even a little further and claim that a whole import system can be designed directly from a tiny set of objects and their properties. No Finders, Loaders, Importers and all the clunky stuff which pretends to be objects but is actually a service which serves the void.

I’ll flesh this idea out in a series of articles on this blog soon. This can be understood then as the constructive part of my criticism.

Newsflash: chances are that higher Python officers will succeed in confusing the situation even more. PEP 382 is on the way. Good luck with explaining why Python needs two notions of packages whereas it is difficult enough to justify even one. It’s not a coincidence though that this is placed in the context of setuptools. When Pythons import system is a tragedy than distutils and the layers above are the accompanying farce.

Posted in Python | 9 Comments

Contextual Keywords

Posted on 22. March 2009 by kay

Keywords and Contextual Keywords

A language keyword is a reserved word that cannot be used as an identifier for variables, constants, classes etc. Often it is an element of language syntax. Syntactical structure is simplified by means of keywords. A keyword can be thought of as a “semantic marker” on many occasions i.e. it denotes a location where something particular meaningful is happening. The
`while` keyword points to a while-loop, the `class` keyword to a class definition etc. Other keywords like `as` or `in` link together expressions and have a fixed location within a statement.

A contextual keyword is sort of a hybrid. It acts much like a keyword but is not reserved. The word `as` was such a contextual keyword in Python until it became a proper keyword in version 2.5. The C# language defines lots of contextual keywords besides the regular ones. MSDN defines a contextual keyword as:

A contextual keyword is used to provide a specific meaning in the code, but it is not a reserved word [in C#].

Contextual keywords in C# are used properly in the grammar description of the language – like regular keywords. For example the `add_accessor_declaration` is defined by the following C# rule:

add_accessor_declaration: [attributes] "add" block

Keywords in EasyExtend

In EasyExtend keywords are implicitly defined by their use in a grammar file. The “add” string in the `add_accessor_declaration` becomes automatically a keyword. Technically a keyword is just an ordinary name and the tokenizer produces a `NAME` token. It’s easy to verify this: when we inspect the token stream of a function definition `def foo(): pass` we’ll notice that `def ` and `pass` are mapped onto the `NAME` token:

>>> def foo(): pass [token> ----------------------------------------------------------------. Line | Columns | Token Value | Token Name | Token Id | -------+---------+------------------+---------------+-----------+ 1 | 0-3 | 'def' | NAME | 1 -- 1 | 1 | 4-7 | 'foo' | NAME | 1 -- 1 | 1 | 7-8 | '(' | LPAR | 7 -- 7 | 1 | 8-9 | ')' | RPAR | 8 -- 8 | 1 | 9-10 | ':' | COLON | 11 -- 11 | 1 | 11-15 | 'pass' | NAME | 1 -- 1 | 1 | 15-15 | '\n' | NEWLINE | 4 -- 4 | 2 | 0-0 | '' | ENDMARKER | 0 -- 0 | ----------------------------------------------------------------'

In the parse tables keywords are preserved and we will find a state description like `(‘pass’, 1, 273)` which explicitly refer to the `pass` keyword. It will be clearly distinguished from token id’s of type `NAME` which are used otherwise ( e.g. for `foo` ). Now the `pass` token in the token-stream has precisely following structure `[1, ‘pass’, 1, (11, 15)]`. Usually the token id is all that the parser needs to know but when the parser encounters a `NAME` token the token value is a keyword the keyword is turned into a token type. We can summarize this using following function

def compute_token_type(tok): tok_type = tok[0] # the standard token type tok_value = tok[1] if tok_type == token.NAME: if tok_value in keywords: return tok_value # tok_value is a keyword and used as a token type return tok_type

Contextual Keywords in EasyExtend

Unlike keywords which can simply be determined from the grammar this isn’t possible with contextual keywords. They have to be made explicit elsewhere. When I considered contextual keywords in EasyExtend I refused to create special infrastructure for them but used a simple trick instead. Suppose one defines following token in a`Token.ext` file

CONTEXT_KWD: 'add' | 'remove'

This is just an ordinary token definition. When `parse_token.py` is generated from `Token`+`Token.ext` we’ll find following settings

CONTEXT_KWD = 119896

and

token_map = { ... 'add|remove': 119896, ... }

The `parse_token.py` provides obviously sufficient information to determine contextual keywords form the `CONTEXT_KWD` token definition. We exploit this in the next function definition:

def generate_contextual_kwds(parse_token): try: ctx_kwd = parse_token.CONTEXT_KWD t_map = swap_dict(parse_token.token_map) # swap_dict turns dict keys # into values and values into keys contextual_keywords = set( t_map[ctx_kwd].split("|") ) assert contextual_keywords <= keywords return contextual_keywords except AttributeError: return set()

Now we have to understand the semantics of contextual keywords. The following code illustrates the behavior of the parser in the presence of contextual keywords.

def compute_next_selection(token_stream, parse_table): tok = token_stream.current() tok_type = compute_token_type(tok) selection = parse_table.selectable_ids() # provides all token ids and nonterminal ids which are # admissible at this state for s in selection: if s == tok_type or tok_type in first_set(s): return s # here we start to deal with contextual keywords elif tok_type in contextual_keywords: # replace the token type of the contextual keyword by NAME tok_type = token.NAME for s in selection: if s == tok_type or tok_type in first_set(s): return s raise ParserError

When a keyword is needed in the particular context ( tok_type was accepted in the first for-loop ) either the tok_type itself is returned or a node id of a non-terminal that derives the tok_type ( tok_type is in the `first_set` of the node id ). If this fails it is still permitted that instead of the contextual keyword the `NAME` token is examined in the same way as the keyword. So if a particular context requires a contextual keyword this keyword is provided otherwise it is checked whether the context just requires a `NAME` and the `NAME` is provided instead. This has the practical implication that a contextual keyword can be used as an identifier of a variable.

Posted in csharp, EasyExtend, Grammars, Python | Comments Off on Contextual Keywords

CPython vs IronPython

Posted on 10. March 2009 by kay

Two video shots tell more than 2000 words.

CPython

Get the latest Flash Player to see this player.

IronPython

Get the latest Flash Player to see this player.

Posted in Python | Comments Off on CPython vs IronPython

The thunk_stmt and programmable semantics In Python

Posted on 9. March 2009 by kay

Seems as if there was just a single direction of syntactical improvement of Python these days which is the introduction of Ruby style blocks. Tav suggests to reuse Pythons with-statement:

with_stmt: "with" expression (["as" target] | ["do" [parameter_list]]) ":" suite
with_stmt: "with" expression (["as" target] | ["do" [parameter_list]]) ":" suite

Guido responds promptly and rejects the semantic dispersion of the with-statement. No meaning changer by using just a magical `do` keyword. In Tavs vision the do-branch of the with-statement is turned into an anonymous closure that is called by a special `__do__` function. The `__do__` shall also be customizable which finally leads to DSLs:

But more enterprising __do__ functions could do interesting things like rewriting the AST of the BLOCK_CODE. Given that this is already done by many many Python libraries, it’s not much of a stretch of the imagination. The benefit? Super nice Pythonic DSLs!

Well it is quite easy to put this idea upside down. Instead of re-purposing a special purpose statement like `with` one can simply turn each expression and “small statement” e.g. raise, assert, return, assignment etc. into a compound statement using following definition

thunk_stmt: small_stmt ':' suite
thunk_stmt: small_stmt ':' suite

One has to elaborate on it a little and change rules of Pythons existing grammar for LL(1) conformance

stmt: compound_stmt single_input: NEWLINE | compound_stmt NEWLINE compound_stmt: ( if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | thunk_stmt ) thunk_stmt: small_stmt ( ':' suite | (';' small_stmt)* [';'] NEWLINE )
stmt: compound_stmt single_input: NEWLINE | compound_stmt NEWLINE compound_stmt: ( if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | thunk_stmt ) thunk_stmt: small_stmt ( ':' suite | (';' small_stmt)* [';'] NEWLINE )

Here the `thunk_stmt` is a fusion of the definition stated above and the `simple_stmt` of Pythons grammar. This definition doesn’t lead to programmable syntax but programmable semantics which is what Tav intends to support with a customizable `__do__`.

An obvious use case would be property definitions:

x = property(): def get(self): return self._x def set(self, value): self._x = value
x = property(): def get(self): return self._x def set(self, value): self._x = value

It simply appends a block of code to a valid assignment `x = property()`.

Same goes with

employees.select.do(emp)
employees.select.do(emp)

which might be extended to

employees.select.do(emp): if emp.salary > developer.salary: return fireEmployee(emp) else: return extendContract(emp)
employees.select.do(emp): if emp.salary > developer.salary: return fireEmployee(emp) else: return extendContract(emp)

Without an appended block the expression

employees.select.do(emp)
employees.select.do(emp)

might simply mean

employees.select(lambda emp: emp)
employees.select(lambda emp: emp)

It should be clear that Python needs a protocol to support this which means here that a `__special__` function is attached to an object and that for user defined classes this `__special__` function can be overwritten.

Posted in DSL, Grammars, Python | 4 Comments

Lonely Python is Intelligent Design and how to liberate from it

Posted on 7. March 2009 by kay

Python is doomed. Well, of course it is not yet doomed. Python is an evolutionary dead end living lonely on its own fitness plateau. It is clear that Python won’t have offspring because of significant whitespace aka indentation sensitivity. No children, no future.

You might ask where are the news? There are none and we knew this for ever and a decade but sometimes it is in the news again. Read this post by the author of a programming language called Reia. He struggles endlessly with indentation delimited blocks and finally abandons them for Ruby style `end` terminators ( not before a popularity poll in the Web 2.0 age and consulting the author of the failed Logix project who is now a Ruby advocate). Finally his language looks like Ruby which demonstrates the evolutionary fitness of Ruby. Python on the other hand is a piece of Intelligent Design. Guido van Rossum has managed a ” beautiful set of compromises” when he designed the Python grammar. So it ends up in overspecialization whereas languages like JavaScript are flowing more with the nature. To illustrate this nature you might examine the file jQuery.js that implements the popular jQuery library. The impatients might immediately scroll to the last line. It’s sweet.

Some people have argued that Haskell style layouts may save the soul of significant whitespace. Transplanted to Python this would imply mandatory explicit block delimiters that can also be omitted for the now classical indentation style. If you are not willing to decide just give multiple choice style freedom to anyone.

In the rest of the article I’m discussing optional explicit delimiters for Python built into the grammar and making the statement/expression distinction weightless. I promise not to cringe when it becomes “ugly”. Not behaving like little girls seems to be an evolutionary dead end among male, adult programmers as well but I take it with stoicism.

Grammar Extension

atom: ( '(' [embedded_stmt|yield_expr] ')' |'[' [listmaker] ']' | '{' [dictmaker] '}' | '`' testlist1 '`' | NAME | NUMBER | STRING+) trailer: '(' [arglist|embedded_suite] ')' | '[' subscriptlist ']' | '.' NAME embedded_stmt: ( embedded_funcdef | embedded_if_stmt | embedded_while_stmt | embedded_for_stmt | embedded_try_stmt | embedded_with_stmt | embedded_classdef | embedded_small ) embedded_small: small_stmt (';' small_stmt)* embedded_suite: embedded_stmt+ embedded_funcdef: [decorators] 'def' NAME parameters ':' embedded_suite embedded_if_stmt: ( 'if' test ':' embedded_suite ('elif' test ':' embedded_suite)* ['else' ':' embedded_suite]) embedded_while_stmt: 'while' test ':' embedded_suite ['else' ':' embedded_suite] embedded_for_stmt: ( 'for' exprlist 'in' testlist ':' embedded_suite ['else' ':' embedded_suite] ) embedded_try_stmt: ('try' ':' suite ((except_clause ':' embedded_suite)+ ['else' ':' embedded_suite] ['finally' ':' embedded_suite] | 'finally' ':' embedded_suite)) embedded_with_stmt: 'with' test [ with_var ] ':' embedded_suite embedded_classdef: 'class' NAME ['(' [testlist] ')'] ':' embedded_suite
atom: ( '(' [embedded_stmt|yield_expr] ')' |'[' [listmaker] ']' | '{' [dictmaker] '}' | '`' testlist1 '`' | NAME | NUMBER | STRING+) trailer: '(' [arglist|embedded_suite] ')' | '[' subscriptlist ']' | '.' NAME embedded_stmt: ( embedded_funcdef | embedded_if_stmt | embedded_while_stmt | embedded_for_stmt | embedded_try_stmt | embedded_with_stmt | embedded_classdef | embedded_small ) embedded_small: small_stmt (';' small_stmt)* embedded_suite: embedded_stmt+ embedded_funcdef: [decorators] 'def' NAME parameters ':' embedded_suite embedded_if_stmt: ( 'if' test ':' embedded_suite ('elif' test ':' embedded_suite)* ['else' ':' embedded_suite]) embedded_while_stmt: 'while' test ':' embedded_suite ['else' ':' embedded_suite] embedded_for_stmt: ( 'for' exprlist 'in' testlist ':' embedded_suite ['else' ':' embedded_suite] ) embedded_try_stmt: ('try' ':' suite ((except_clause ':' embedded_suite)+ ['else' ':' embedded_suite] ['finally' ':' embedded_suite] | 'finally' ':' embedded_suite)) embedded_with_stmt: 'with' test [ with_var ] ':' embedded_suite embedded_classdef: 'class' NAME ['(' [testlist] ')'] ':' embedded_suite

This is a conservative grammar extension of the official Python 2.5 Grammar. Conservative means that all expressions / statements written in Python 2.5 remain valid. Those non-terminals that substitute existing ones are `atom` and `trailer`. In detail the `testlist_gexp` non-terminal in `atom` is replaced by the newly defined `embedded_stmt` non-terminal and instead of an `arglist` an `embedded_suite`can be used . The `embedded_stmt` non-terminal is the gateway for all kinds of statements embedded into expressions. The statement-expression distinction becomes de facto obsolete.
What happens to significant whitespace? Post-processing the token stream filters whitespace inside of parentheses. Hence the token stream won’t contain `NEWLINE`, `INDENT`and `DEDENT` token within expressions. All those `embedded_xxx` non-terminals are only adjustments of existing ones reflecting the absence of the mentioned whitespace token in expressions. Notice that you can’t mix. Once you are living in a grammatical expression whitespace becomes definitely insignificant.

The grammar is not LL(1) and you need a stronger parser than Pythons own one to handle it. There is also no behavior associated with the new definitions. Semantics shall be entirely neglected and only the future shape of things shall be highlighted. Here are a few examples:

Multiline Lambda

lambda x,y: ( z = x+y; # semicolons always separate small statements print z; return z )(3, 2)
lambda x,y: ( z = x+y; # semicolons always separate small statements print z; return z )(3, 2)

Long Assertions

assert ( if x>0: x == 2 else: x == -2)
assert ( if x>0: x == 2 else: x == -2)

The Return of the Return

def foo(x): return ( return )
def foo(x): return ( return )

My Property

x = property( def fget(self): return self._x def fset(self, value): self._x = value )
x = property( def fget(self): return self._x def fset(self, value): self._x = value )

Function in the middle

x + (def fact(n): if n<=1: return 1 else: return fact(n-1))(7) + z
x + (def fact(n): if n<=1: return 1 else: return fact(n-1))(7) + z

Posted in DSL, Grammars, Python | 2 Comments

Future !

Posted on 25. February 2009 by kay

Posted in General | 2 Comments

Refactoring the ANSI C grammar

Posted on 21. February 2009 by kay

A nice blog article by Eli Bendersky about the distinction between ASTs and CSTs.

After having clarified the concepts Eli tried to make the case for ASTs. He presents the C grammar or more precisely the part of the C grammar that deals with declarations and mentions the obscurity of the grammar that is immediately reflected in the structure of the CST. ASTs serve as some sort of cleanup. This is all true: the C grammar has obscure parts and his ASTs are cleanups but does it justify the extra effort being spent for creating and maintaining ASTs?

A grammar is just another computer program written in (E)BNF and you can strive for clean and accessible BNF programs just like you do in any other programming language. So what about refactoring? Now the ANSI C grammar might not be among the finest parts of the C language but it’s still authoritative and it’s also not easy to show that another grammar describes the same language syntax.

I’ve nevertheless attempted to rewrite the declaration part of the C grammar. There are a few omissions here. For example storage class specifiers and type qualifiers like `const` are missing and there are also no function declarations. It shall be easy though to add them given the syntactical description of the top level `declaration` rule described below. There are also just a few type names for testing purposes. Determining type names is an entirely different issue we do not want to treat in this blog post.

declaration: typename declarator declarator: simple_declarator | func_ptr_declarator | ptr_declarator | array_declarator simple_declarator: simple_declarator_base | '(' simple_declarator ')' simple_declarator_base: NAME | '(' NAME ')' ptr_declarator: ptr_declarator_base | '(' ptr_declarator ')' ptr_declarator_base: ptr_or_ref declarator func_ptr_declarator: func_ptr_declarator_base | '(' func_ptr_declarator ')' func_ptr_declarator_base: '(' ( ptr_declarator | array_declarator ) ')' arglist array_declarator: array_declarator_base | '(' array_declarator ')' array_declarator_base: ( ptr_declarator | simple_declarator | func_ptr_declarator ) subscript_list subscript_list: ('[' [NAME | Intnumber] ']')+ ptr_or_ref: '*'+ ['&'] | '&' arglist: '(' [parameters] ')' parameters: declaration (',' declaration)* typename: 'int' | 'float' | 'char' | 'void'
declaration: typename declarator declarator: simple_declarator | func_ptr_declarator | ptr_declarator | array_declarator simple_declarator: simple_declarator_base | '(' simple_declarator ')' simple_declarator_base: NAME | '(' NAME ')' ptr_declarator: ptr_declarator_base | '(' ptr_declarator ')' ptr_declarator_base: ptr_or_ref declarator func_ptr_declarator: func_ptr_declarator_base | '(' func_ptr_declarator ')' func_ptr_declarator_base: '(' ( ptr_declarator | array_declarator ) ')' arglist array_declarator: array_declarator_base | '(' array_declarator ')' array_declarator_base: ( ptr_declarator | simple_declarator | func_ptr_declarator ) subscript_list subscript_list: ('[' [NAME | Intnumber] ']')+ ptr_or_ref: '*'+ ['&'] | '&' arglist: '(' [parameters] ')' parameters: declaration (',' declaration)* typename: 'int' | 'float' | 'char' | 'void'

This grammar can be used to accurately parse following expressions:

int x; int *x; int (x); int (*x); int ((*x)); int ((*x)()); int ((*x)())[]; int ((*x)(int y))[]; int ((*x)(int y, char z[4]))[]; int ((*x)(int y, char z[4]))[][1]; char **foo [][8]; char (**foo [][8]); char (**foo [][8])(); char ((**foo [][8])()); char *((**foo [][8])()); char (*(**foo [][8])()); char (*(**foo [][8])())[]; char *(*(**foo [][8])())[];
int x; int *x; int (x); int (*x); int ((*x)); int ((*x)()); int ((*x)())[]; int ((*x)(int y))[]; int ((*x)(int y, char z[4]))[]; int ((*x)(int y, char z[4]))[][1]; char **foo [][8]; char (**foo [][8]); char (**foo [][8])(); char ((**foo [][8])()); char *((**foo [][8])()); char (*(**foo [][8])()); char (*(**foo [][8])())[]; char *(*(**foo [][8])())[];

There is still some overhead because we have pairings like `simple_declarator` and `simple_declarator_base`. These distinctions have the only purpose of putting the basic form into parentheses.

A parse tree display for

int (*x)();
int (*x)();

looks like this in EasyExtend:

declaration -- S`116994 -- 258 typename -- S`117008 -- 272 NAME -- T`116737 -- 1 L`1 int declarator -- S`116995 -- 259 func_ptr_declarator -- S`117000 -- 264 func_ptr_declarator_base -- S`117001 -- 265 LPAR -- T`116743 -- 7 L`1 ( ptr_declarator -- S`116998 -- 262 ptr_declarator_base -- S`116999 -- 263 ptr_or_ref -- S`117005 -- 269 STAR -- T`116752 -- 16 L`1 * declarator -- S`116995 -- 259 simple_declarator -- S`116996 -- 260 simple_declarator_base -- S`116997 -- 261 NAME -- T`116737 -- 1 L`1 f RPAR -- T`116744 -- 8 L`1 ) arglist -- S`117006 -- 270 LPAR -- T`116743 -- 7 L`1 ( RPAR -- T`116744 -- 8 L`1 ) SEMI -- T`116749 -- 13 L`1 ;
declaration -- S`116994 -- 258 typename -- S`117008 -- 272 NAME -- T`116737 -- 1 L`1 int declarator -- S`116995 -- 259 func_ptr_declarator -- S`117000 -- 264 func_ptr_declarator_base -- S`117001 -- 265 LPAR -- T`116743 -- 7 L`1 ( ptr_declarator -- S`116998 -- 262 ptr_declarator_base -- S`116999 -- 263 ptr_or_ref -- S`117005 -- 269 STAR -- T`116752 -- 16 L`1 * declarator -- S`116995 -- 259 simple_declarator -- S`116996 -- 260 simple_declarator_base -- S`116997 -- 261 NAME -- T`116737 -- 1 L`1 f RPAR -- T`116744 -- 8 L`1 ) arglist -- S`117006 -- 270 LPAR -- T`116743 -- 7 L`1 ( RPAR -- T`116744 -- 8 L`1 ) SEMI -- T`116749 -- 13 L`1 ;

Still much clutter but now it is easy to identify the semantically relevant entities and search for them. In EasyExtend we use simple functions like `find_node` and `find_all` which keep a CST node and a `node id` parameter and perform a depth-first search.

Posted in C, Grammars | Comments Off on Refactoring the ANSI C grammar

Countdown

Posted on 8. February 2009 by kay

The Flusser hierarchy

4 the natural man in the state of concrete spatio-temporal experience.
No mind in escape.

3 the modeler and maker whose hands shape the things and brings them to rest.

2 the artist imagines the world. Eternal return and circulation on a magical surface.

1 the author linearizes the flow of things in symbols and historical time.

0 the programmer computes the it from the bit.

Posted in F, General | Comments Off on Countdown

Newer posts →

Search for:

Recent Posts

Recursions in Python 3.8

Kites on a zebra crossing: an algorithmic challenge

Greedy grammars and Any

The state of the Trail parser generator

LL(*) faster than LL(1)?

Archives

July 2019

June 2012

May 2012

April 2012

September 2011

April 2011

January 2011

December 2010

November 2010

August 2010

July 2010

June 2010

April 2010

March 2010

February 2010

January 2010

December 2009

October 2009

September 2009

August 2009

July 2009

June 2009

May 2009

April 2009

March 2009

February 2009

Categories
Categories

Meta

Log in

Entries feed

Comments feed

WordPress.org

4	the natural man in the state of concrete spatio-temporal experience. No mind in escape.
3	the modeler and maker whose hands shape the things and brings them to rest.
2	the artist imagines the world. Eternal return and circulation on a magical surface.
1	the author linearizes the flow of things in symbols and historical time.
0	the programmer computes the it from the bit.

Terminology

Path objects

FSPath objects

ZipPath objects

TuplePath objects

Keywords and Contextual Keywords

Keywords in EasyExtend

Contextual Keywords in EasyExtend

CPython

IronPython

Recent Posts

Archives

Categories

Meta