Redesign of the code.py and codeop.py modules

Brett Cannon is polling for stdlib modules that should be redesigned. I find the idea of initiating a poll for this rather bizarre, but maybe that’s just the future of programming, where the quality of an implementation is judged by democratic vote. So I immersed myself in the hive mind and voted for distutils. It seems Tarek Ziade is addressing this already, but I’m not entirely sure he goes far enough. Last time I looked at the source code there were still all kinds of compiler modules in the lib which contain config information closely coupled with application code. That’s not so nice, and fixing it is mostly a matter of refactoring.

Some of the other stdlib modules I’d rewrite are not mentioned in the voting list. Maybe they are not sexy enough for the majority of web programmers who dominate all the discussions about Python? Among my favorites are code.py and codeop.py. Here is a brief and incomplete list of requirements and refactorings.

  • The heuristics used in _maybe_compile to detect incomplete Python commands are pretty weak.
  • Can you tell the difference between Compile, CommandCompiler and compile_command in codeop.py?
  • Encapsulate the raw_input call in interact within a method that can be overridden.
  • Provide two methods, at_start and at_exit, in InteractiveConsole to make startup and shutdown customizable.
  • Separate the interactive loop from line processing and implement the line processor as a generator. This makes it easier to write custom interactive loops for systems that interface with Python. The default interact method becomes
    def interact(self):
        self.at_start()
        try:
            gen_process = self.process_line()
            line = None
            while True:
                try:
                    prompt = gen_process.send(line)
                    line   = self.user.get_input(prompt)
                except StopIteration:
                    break
        finally:
            self.at_exit()
  • Move the line-terminating heuristics from _maybe_compile into process_line and define a try_parse function together with a try_compile function. I’d even go a little further and define a try_tokenize function, although that isn’t essential.
  • Provide a subclass for interactive sessions that can be recorded and replayed, with corresponding command line options. This is optional though and not part of a redesign strategy.
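
To make the list above a bit more concrete, here is a minimal sketch of how the pieces could fit together in Python 3. It is only an illustration under several assumptions: the class names GeneratorConsole and ScriptedConsole are made up, get_input sits directly on the console instead of on a separate user object, the convention that get_input returns None at end of input is invented for this sketch, and try_compile simply delegates to the existing codeop.CommandCompiler rather than implementing any new heuristics.

```python
import codeop
import traceback


class GeneratorConsole:
    """Hypothetical console; method names follow the post, not the stdlib."""

    def __init__(self, namespace=None):
        self.namespace = {} if namespace is None else namespace
        self.compiler = codeop.CommandCompiler()

    # Hooks that subclasses are expected to override.
    def at_start(self):
        pass

    def at_exit(self):
        pass

    def get_input(self, prompt):
        # Assumption of this sketch: None means "no more input".
        try:
            return input(prompt)
        except EOFError:
            return None

    def try_compile(self, source):
        """Return (code, error). Both are None while source is incomplete."""
        try:
            return self.compiler(source, "<console>", "single"), None
        except (SyntaxError, ValueError, OverflowError) as exc:
            return None, exc

    def process_line(self):
        """Generator: yields the next prompt, receives the next line."""
        buffer, prompt = [], ">>> "
        while True:
            line = yield prompt
            if line is None:          # end of input, stop the generator
                return
            buffer.append(line)
            code, error = self.try_compile("\n".join(buffer))
            if error is None and code is None:
                prompt = "... "       # incomplete command, keep buffering
                continue
            if error is not None:
                print(f"{type(error).__name__}: {error}")
            else:
                try:
                    exec(code, self.namespace)
                except Exception:
                    traceback.print_exc()
            buffer, prompt = [], ">>> "

    def interact(self):
        self.at_start()
        try:
            gen_process = self.process_line()
            line = None               # the first send(None) starts the generator
            while True:
                try:
                    prompt = gen_process.send(line)
                    line = self.get_input(prompt)
                except StopIteration:
                    break
        finally:
            self.at_exit()


class ScriptedConsole(GeneratorConsole):
    """Replays a fixed list of lines instead of reading from the terminal."""

    def __init__(self, lines):
        super().__init__()
        self.lines = iter(lines)
        self.events = []

    def at_start(self):
        self.events.append("start")

    def at_exit(self):
        self.events.append("exit")

    def get_input(self, prompt):
        self.events.append(prompt)
        return next(self.lines, None)


console = ScriptedConsole(["x = 1", "def f():", "    return x + 1", "", "y = f()"])
console.interact()
```

Because the loop only talks to the generator through send, a different front end (a GUI widget, a socket, a replay file) would only need its own get_input, which is the point of separating the loop from the line processor.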

There are other modules I’d like to rewrite, such as tokenize.py. Having a lexer in the stdlib which handles Python as a special case would be quite a big deal IMO. But it’s delicate, and I struggle with writing lexers which can both be extended in a simple way (without the trouble of running into the ordered choice problems of the current regular expression engine) and have high performance. So far I have only accomplished the first of these goals, at least partially, but not the second.
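
For readers who haven’t run into it, the ordered choice problem presumably refers to the following behaviour of the re module: alternatives are tried left to right and the first one that matches wins, even if a later one would match more text. The token list below is a made-up example, and sorting by length is just one common workaround, not necessarily the approach meant above.

```python
import re

# Alternation in `re` is ordered: "=" is tried first and succeeds,
# so "==" is never considered at this position.
naive = re.compile(r"=|==|<|<=")
print(naive.match("==").group())    # '='

# One workaround: build the pattern with the longest alternatives first.
# Every extension of the token list then has to respect this ordering.
operators = ["=", "==", "<", "<="]
by_length = sorted(operators, key=len, reverse=True)
ordered = re.compile("|".join(re.escape(op) for op in by_length))
print(ordered.match("==").group())  # '=='
```

This is why simply appending new token patterns to a regex-based lexer can silently change how existing input is split.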

  1. Brett says:

    Well, the poll is just because I am curious, not because I actually expect the poll to lead to anything happening.

    As for fixing code and codeop, it’s probably needed. =) Both modules are rather old and it’s most of the older code that shows its lack of support for what the current Python API style is.

  2. kay says:

    The poll is a funny idea and I made a little fun of it. So everyone had some benefit 🙂

    Go on, Brett. It’s very good that someone seriously cares. As for code.py and codeop.py I’d volunteer once there is a process (not sure whether you’ll simply use the PEP process for stdlib redesign?) and activities take shape.
