Pyload part I ( Path objects )

This article is the first in a series that studies the design of a module import system. Although this work is situated in the Python context, it can stand on its own to a large extent, and its ideas may be transferred to other systems and languages. We will provide as much integration with Python as necessary but keep the structure as general as possible. The series will roughly cover the following topics:

Path objects – module paths and their representation
ModuleNode objects – loading and caching modules using Path objects
Import system – binding ModuleNode objects to import statements
Implementation – putting it all together in an EasyExtend langlet as a reference implementation

In this article we will discuss `Path` objects. This topic is foundational and a bit dry, but I hope the reader will be compensated by the elegance of some of its constructions. `Path` objects are used to establish a relationship between internally used name spaces and external path structures in physical or logical “media” like file-systems, zip-files or the web. Those path structures can also be fully abstract and represented by special data-structures only. We’ll provide several examples.

First, some terminology is introduced. Many of the notions given here are rather abstract, but they mostly capture what people know about Python modules, packages and file system paths anyway. They provide a conceptual background for specifications given later.

Terminology

A module name is an ordinary Python name: a finite sequence of alphanumeric characters and underscores starting with an alphabetic character or underscore. A module path is a dot-separated sequence of module names. It is possible for a module path to be preceded by dots. In that case a module path is called a relative path. `A.B.C` and `..A.B.C` are both module paths but only the latter one is relative. The names of a module path are also called its components. So `A`, `B`, `C` are the components of the module path `A.B.C`.
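The definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the `Path` design discussed later: the names `MODULE_PATH`, `split_module_path` and `is_relative` are made up for this example, and the pattern assumes ASCII letters only.

```python
import re

# A module name: letters, digits and underscores, not starting with a digit.
# Assumption: ASCII only, following the wording of the definition above.
MODULE_NAME = r"[A-Za-z_][A-Za-z0-9_]*"

# A module path: optional leading dots, then dot-separated module names.
MODULE_PATH = re.compile(r"(\.*)(%s(?:\.%s)*)\Z" % (MODULE_NAME, MODULE_NAME))

def split_module_path(path):
    """Return (number of leading dots, list of components)."""
    m = MODULE_PATH.match(path)
    if m is None:
        raise ValueError("not a module path: %r" % path)
    return len(m.group(1)), m.group(2).split(".")

def is_relative(path):
    """A module path is relative if it is preceded by at least one dot."""
    return split_module_path(path)[0] > 0
```

With these helpers, `split_module_path("..A.B.C")` yields two leading dots and the components `A`, `B`, `C`, while a string such as `A..B` is rejected with a `ValueError`.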

Besides module paths we consider external paths. The intuitive meaning of an external path is that of a pointer to a location of a module in some medium. Most commonly file system paths are used as external paths: modules are represented as files and the dot separators are mapped onto file-system separators. Throughout this article we use a slash “/” as an external path separator. So `A.B.C` is a module path and `A/B/C` is an external path. A proper external path definition is given below using a `Path` abstract base class.
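As a rough illustration of the separator mapping, and not of the `Path` abstract base class itself, the correspondence between `A.B.C` and `A/B/C` might be written as follows. The function names `to_external` and `to_module_path` are hypothetical, and relative module paths are deliberately rejected because resolving them needs a base path.

```python
def to_external(module_path, sep="/"):
    """Map an absolute module path such as 'A.B.C' onto 'A/B/C'."""
    if module_path.startswith("."):
        # A relative path can only be resolved against some base path.
        raise ValueError("cannot map a relative module path without a base")
    return sep.join(module_path.split("."))

def to_module_path(external_path, sep="/"):
    """Map an external path such as 'A/B/C' back onto 'A.B.C'."""
    # Empty parts (leading or trailing separators) are dropped.
    return ".".join(part for part in external_path.split(sep) if part)
```

The `sep` parameter is there only to show that the slash is a convention of this article; a different medium could use another separator.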

A module can be loaded from an external path, which yields an interpreter-level <`module`> object. Each <`module`> object shall have a unique module name. If `M` is a known module name we write <`M`>. It is also possible to load <`module`> objects from builtins or even create fresh <`module`> objects on the fly. In any case we still consider a <`module`> as being loaded from a path. If no such path is available we associate the <`module`> with the empty path.

A module path `A.B.C…` is valid if an external path `…/A/B/C/…` exists and <`A`> can be loaded from `…/A`, <`B`> can be loaded from `…/A/B` etc.
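The validity condition can be read as a walk along the components of the module path. The sketch below assumes a hypothetical predicate `can_load` that tells whether a module can be loaded from a given external path, and a hypothetical `root` that stands in for the elided prefix. Neither name comes from the design described in this series.

```python
def is_valid(module_path, root, can_load):
    """Check that every component can be loaded from its external path.

    `root` stands in for the elided prefix of the external path and
    `can_load` is a caller-supplied predicate; both are assumptions
    made for this sketch.
    """
    external = root
    for name in module_path.split("."):
        external = external + "/" + name
        if not can_load(external):
            return False
    return True

# A fully abstract "medium": just a set of external paths.
medium = {"root/A", "root/A/B", "root/A/B/C"}
print(is_valid("A.B.C", "root", medium.__contains__))
print(is_valid("A.X", "root", medium.__contains__))
```

Here the medium is only a set of strings, so `A.B.C` is valid while `A.X` is not. A file-system medium would supply a different `can_load`.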
