@jckarter
Created August 23, 2012 19:57
Factor modules design

A package system for Factor

Factor's current module system and development model are heavily monolithic, reflecting Factor's history as a small-scale hobby project. The default settings encourage development within the Factor source tree and make it difficult to accommodate packaged read-only installations or projects developed outside of the source tree. Altering the module search path to accommodate these use cases is currently a manual and error-prone process. This proposal describes a new model for module lookup and installation with the following improvements:

  • a standardized search path for modules, allowing for global, user-local, project-local, and package-local module installation and lookup
  • a naming scheme for global and local modules
  • a "package" concept to enable compartmentalization of development work and streamline distribution and use of Factor code outside of the main repository

Goals

  • allow Factor to be installed following native platform conventions (/usr/local, .app bundle/App Store, Program Files) without modification or configuration
  • provide a standardized module search system that accommodates compartmentalized project development and does not require manual manipulation of search paths
  • localize the package and module namespace to mitigate potential name collisions at the word, module, or package levels and further compartmentalize project development
  • enable development of tools to automate installation and management of third-party source code
  • streamline some inconvenient aspects of Factor module layout, collating metadata into a single file and accommodating quick-and-dirty development by not requiring directory hierarchies for every module

Documentation

Modules

A module is a loadable unit of Factor source code. On disk, a module named foo consists of either:

  • A standalone Factor source file foo.factor, or
  • A directory containing a source file foo/foo.factor and none, any, or all of the following optional metadata files:
    • foo/foo-docs.factor, containing documentation for the module
    • foo/foo-tests.factor, containing the test suite for the module
    • foo/module.factormodule, a YAML text file containing a top-level map form with none, any, or all of the following key-value pairs:
      • author, the author's name as a string, or a sequence of authors' names as strings
      • platforms, the platforms under which the module is supported as a sequence of strings. The module will refuse to load on platforms where it is not supported.
      • resources, a sequence of strings of relative pathnames of files that are required by the module
      • summary, a string describing the module
      • tags, a sequence of strings categorizing the module
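As a rough illustration of how a loader might consume this metadata, the sketch below holds a parsed module.factormodule as a plain Python dict. The key names come from the list above; the function names, the platform strings, and the treatment of a missing platforms key are assumptions, not part of the proposal.

```python
# Hypothetical sketch: a parsed foo/module.factormodule held as a plain dict.
# Key names follow the proposal; function names and platform strings are assumptions.


def normalize_authors(metadata):
    """Return authors as a list, accepting a single string or a sequence of strings."""
    author = metadata.get("author", [])
    if isinstance(author, str):
        return [author]
    return list(author)


def is_supported(metadata, current_platform):
    """Return True if the module may load on current_platform.

    A missing platforms key is treated as "no restriction" (an assumption;
    the proposal does not say what an absent key means).
    """
    platforms = metadata.get("platforms")
    return platforms is None or current_platform in platforms


example = {
    "author": "Jane Doe",
    "platforms": ["unix", "macosx"],
    "summary": "An example module",
    "tags": ["example"],
}
```

A loader built along these lines would call is_supported before reading the module's source and refuse to continue when it returns False.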

Symbols defined in a source file are implicitly defined inside that source file's module. The IN: form to set the current module is only valid in interactive contexts. (A private form IN-UNSAFE: may be necessary for some low-level bootstrap code that currently uses IN: in nonstandard ways.)

The module name packages is reserved, since this directory name is used as the package-local package directory.

Factor source code is executed within the current module, the module that corresponds to that source file. The current module controls the following aspects of the Factor system:

  • Definition forms create new words inside the current module.
  • The current package, which controls module and package lookup, is the package in which the current module resides.

In an interactive context, the current module defaults to a virtual module (that is, a module without backing source code) named scratchpad in the current project. The current module may be changed with the IN: module.name form.

Packages

A package is a collection of related modules under a common subdirectory. Any directory containing Factor source files may serve as a package directory. A package comprises zero or more modules, stored inside the package directory hierarchically. The module named foo.bar is stored as a source file foo/bar.factor or directory foo/bar, as described under Modules, inside the package directory. It is an error if both the file foo/bar.factor and directory foo/bar exist in the same package.
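The on-disk rules above can be sketched as a small resolver. This is only an illustration: resolve_module is a hypothetical name, and the directory form is assumed to hold its source at foo/bar/bar.factor, as inferred from the Modules section.

```python
from pathlib import Path


def resolve_module(package_dir, module_name):
    """Map a local module name such as 'foo.bar' to its source file.

    Returns the path of the source file, or None if the module does not exist.
    Raises ValueError if both the file form and the directory form are present.
    """
    parts = module_name.split(".")
    base = Path(package_dir).joinpath(*parts)            # <package>/foo/bar
    file_form = base.with_name(base.name + ".factor")    # <package>/foo/bar.factor
    dir_form = base / (parts[-1] + ".factor")            # <package>/foo/bar/bar.factor (assumed)

    if file_form.is_file() and base.is_dir():
        raise ValueError(
            f"ambiguous module {module_name}: both {file_form} and {base} exist"
        )
    if file_form.is_file():
        return file_form
    if dir_form.is_file():
        return dir_form
    return None
```

Raising on the ambiguous case, rather than preferring one form, matches the rule that having both foo/bar.factor and foo/bar in one package is an error.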

The package directory may contain a packages subdirectory; if present, this directory will be searched for packages used locally by the package.

A package directory may contain a file package.factorpackage, a YAML text file containing a top-level map form with none, any, or all of the following key-value pairs:

  • name, the canonical name of the package as a string. This need not correspond to the directory name or local installed name used to reference the package.

  • uri, a URI string identifying the canonical origin resource for this package, such as a version control repository or tarball.

  • version, a string identifying the version of the package.

  • author, the author's name as a string, or a sequence of authors' names as strings

  • platforms, the platforms under which the package is supported as a sequence of strings. Modules within the package will refuse to load on unsupported platforms.

  • resources, a sequence of strings of relative pathnames of files that are required by the package

  • summary, a string describing the package

  • tags, a sequence of strings categorizing the package

  • requires, a sequence of maps describing packages the package depends on. Each map must have the following key-value pairs:

    • name, the install name for the package as a string.
    • uri, a URI string identifying a resource from which the package can be installed, such as a version control repository or an archive file.
    • version, a string identifying the version of the package that should be installed. If uri identifies a version control repository, this string may be used as a version or tag identifier.

    The name and uri specified for a needed package need not correspond to the package's own canonical name and uri.

  • requires-factor, a string identifying the Factor version required by this package.
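To make the shape of the requires entries concrete, here is a minimal sketch that checks each entry for the three keys named above. The metadata is shown as an already-parsed Python dict; validate_requires and the example.org URI are hypothetical and only serve the illustration.

```python
REQUIRED_KEYS = ("name", "uri", "version")


def validate_requires(package_metadata):
    """Return (name, uri, version) tuples for every entry under 'requires'.

    Raises ValueError if an entry lacks one of the keys the proposal requires.
    """
    deps = []
    for entry in package_metadata.get("requires", []):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"requires entry {entry!r} is missing {missing}")
        deps.append((entry["name"], entry["uri"], entry["version"]))
    return deps


# Hypothetical package metadata; the URI and version are placeholders.
package = {
    "name": "foo",
    "requires": [
        {"name": "bar", "uri": "https://example.org/bar.git", "version": "abc123"},
    ],
}
```

A bundle installer of the kind listed under Implementation work could iterate over the returned tuples and install each dependency under its install name.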

A package directory may contain a file image.factorimage; if present, launching the Factor VM from inside the package directory will load this image by default.

When loading source code, the current package is the package in which the current module resides. The current package controls the following aspects of the Factor system:

  • Local module names are looked up inside the current package's directory.
  • The current package's packages directory is searched first for packages referenced by absolute module names.

In an interactive context, the project is the active package being manipulated by that session. When Factor is launched from the command line, the project defaults to the current directory, and can be changed by the PROJECT: form. From inside the Factor UI, projects can be opened using the 'Open' menu item or by the PROJECT: form from an open listener window.

Module names

Modules may be referenced in USING: or similar forms by global name, absolute name, or local name.

Global names have the form of a URI, in other words, scheme://host/path. The module source will be looked for at that URI. file:, git:, git+ssh:, http:, and https: URIs are supported.

Absolute names have the form package:foo.bar, which refers to the module at the path foo/bar or foo/bar.factor in the package package. An absolute name may appear with an empty package name, such as :foo.bar, which refers to a module in the standard factor package.

Local names have the form foo.bar, which refers to the module at the path foo/bar in the current package. Packages must work independent of their installed name; it is an error for a package to refer to itself by absolute name.
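The three name forms can be told apart syntactically. The following sketch shows one way a loader might classify a name; ModuleRef and parse_module_name are hypothetical names, and rejecting unlisted URI schemes is an assumption rather than something the proposal states.

```python
from typing import NamedTuple, Optional


class ModuleRef(NamedTuple):
    kind: str                # "global", "absolute", or "local"
    package: Optional[str]   # package name for absolute refs, otherwise None
    module: str              # dotted module name, or the full URI for global refs


# URI schemes the proposal lists as supported for global names.
SCHEMES = ("file", "git", "git+ssh", "http", "https")


def parse_module_name(name):
    """Classify a module name as global (URI), absolute (package:module), or local."""
    scheme, sep, _rest = name.partition("://")
    if sep:
        if scheme not in SCHEMES:
            raise ValueError(f"unsupported URI scheme: {scheme}")
        return ModuleRef("global", None, name)
    if ":" in name:
        package, _, module = name.partition(":")
        # An empty package name (":foo.bar") refers to the standard factor package.
        return ModuleRef("absolute", package or "factor", module)
    return ModuleRef("local", None, name)
```

Checking for "://" before ":" matters here, because every URI also contains a colon and would otherwise be mistaken for an absolute name.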

Module lookup

Local module names are looked up only in the current package directory. Absolute module names are looked up in the directory of the first package found by package lookup. Global module names (that is, URIs) are only searched for at the specified URI.

Modules referenced by a global name are considered to be standalone. There is no current package while loading a globally named module, so such modules cannot import by local module name; they may import only by absolute or global module name.

Package lookup

When a package name is referenced, it is searched for in the following places in order:

  • The package-local packages directory. This directory contains Factor packages installed for use only by the current package. This directory is the packages subdirectory of the current package.
  • The user-local packages directory. This directory contains Factor packages installed by the current user for all projects.
    • Unix-style: ~/.factor/<version>/packages
    • Apple-style: ~/Library/org.factorcode.Factor/<version>/packages
    • Windows-style: %USERPROFILE%\AppData\Factor\<version>\packages
  • The site-local packages directory. This directory contains Factor packages installed for all users on the current machine.
    • Unix-style: /usr/local/lib/factor/<version>/packages
    • Apple-style: /Library/org.factorcode.Factor/<version>/packages
    • Windows-style: %FACTOR_INSTALL_PATH%\packages
  • The core directory. This directory contains the packages that make up Factor's standard library and should not be altered outside of Factor's installation process.
    • Unix-style: /usr/local/lib/factor/<version>/core
    • Apple-style: Factor.app/Contents/Resources/core
    • Windows-style: %FACTOR_INSTALL_PATH%\core

If package lookup resolves a package name to the current package, it is an error. Packages should work independent of their installed name. Packages may not be interdependent.
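The lookup order above can be sketched as a first-match search over an ordered list of directories. This illustration covers only the Unix-style layout and assumes that an installed package lives in a subdirectory named after its install name; both function names are hypothetical.

```python
from pathlib import Path


def unix_search_path(current_package_dir, version, home):
    """Directories searched for a package name, in order (Unix-style layout)."""
    return [
        Path(current_package_dir) / "packages",                # package-local
        Path(home) / ".factor" / version / "packages",         # user-local
        Path("/usr/local/lib/factor") / version / "packages",  # site-local
        Path("/usr/local/lib/factor") / version / "core",      # core
    ]


def find_package(name, search_path, current_package_dir):
    """Return the directory of the first package called name on search_path.

    Returns None if no directory matches. Raises ValueError if the match is
    the current package itself, which the proposal treats as an error.
    """
    current = Path(current_package_dir).resolve()
    for root in search_path:
        candidate = root / name
        if candidate.is_dir():
            if candidate.resolve() == current:
                raise ValueError(f"package {name!r} resolves to the current package")
            return candidate
    return None
```

Because the package-local directory comes first, two packages can each carry their own copy of a dependency under the same name without seeing one another's copy.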

Different packages may have different sets of packages installed in their package-local packages directory. These package-local packages are only visible to modules in that package. The same package name may refer to a different package-local package from the perspective of a different package.

Standard package

Factor comes with a standard package named factor, whose name is reserved. It is an error if the package name factor resolves to anything other than the core factor package distributed with Factor.

Import forms

Factor source code may use the following syntax forms to access symbols in other modules. These syntax words are defined in the factor:syntax module.

  • USING: module module ... ; imports all public words from each module listed.
  • USE: module imports all public words from module.
  • FROM: module => word word ; imports the named public words from module.
  • QUALIFIED: module ; imports all public words from module as module:name.
  • QUALIFIED-AS: module => alias ; imports all public words from module as alias:name.

Export form

Imported symbols are private to the module; if a module foo imports blub from module bar, importers of foo will not see blub. The form EXPORT: word will reexport word through the active module. The form EXPORTS: word word ... ; will reexport all the listed words.
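The visibility rule can be modeled in a few lines. The toy class below is not Factor code and ignores private words entirely; it only illustrates that imported words stay hidden from importers until they are explicitly re-exported. The class and method names are hypothetical.

```python
class Module:
    """Toy model of word visibility under the proposed import/export rules."""

    def __init__(self, name, definitions):
        self.name = name
        self.defined = dict(definitions)  # words defined in this module
        self.imported = {}                # words visible only inside this module
        self.reexported = set()           # imported words named in EXPORT:/EXPORTS:

    def use(self, other):
        """Model USE: other by importing every word that other makes public."""
        self.imported.update(other.public_words())

    def export(self, *words):
        """Model EXPORT: / EXPORTS: by re-exporting already imported words."""
        for word in words:
            if word not in self.imported:
                raise KeyError(f"{word} is not imported into {self.name}")
            self.reexported.add(word)

    def public_words(self):
        """Words that an importer of this module will see."""
        public = dict(self.defined)
        public.update({w: self.imported[w] for w in self.reexported})
        return public


bar = Module("bar", {"blub": "bar:blub"})
foo = Module("foo", {"frob": "foo:frob"})
foo.use(bar)            # foo can now call blub internally

before = Module("before", {})
before.use(foo)         # sees frob only

foo.export("blub")      # EXPORT: blub
after = Module("after", {})
after.use(foo)          # sees frob and blub
```

In this model, before.imported contains only frob, while after.imported also contains blub, mirroring the foo/bar/blub example in the paragraph above.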

Packages in the interactive environment

Factor's developer environments, such as the command-line listener and the UI, operate on an active package, referred to as the project. Definition and import forms entered at a listener prompt are resolved with the project as the current package. The IN: module form may be used from a listener prompt to change the active module. If IN: names an absolute module path in another package, the active package for the listener is not changed. A different project may be opened with PROJECT: path, where path is the filesystem path to the new project directory.

The UI provides a document-based interface to projects. The 'Open' command in the UI opens a listener window with the chosen directory as its project.

Implementation work

  • Add package-based lookup to the vocab loader.
  • The vocab system may need structural changes to support packages.
  • All import forms in all source code will need to be altered to use packaged module names.
  • The modules currently shipped under core and basis can be packaged into a factor package. The contents of extra, and perhaps some parts of basis, can be broken up into separate packages and distributed separately. We could perhaps start by shipping a monolithic factor-extra package to put off more detailed breakup work.
  • Module metadata would be coalesced from the current multiple-text-files format into single YAML metadata files as described above.
  • The build system should gain the ability to build and test not only the Factor distribution, but separately-distributed packages.
  • Some new developer tools for maintaining and installing packages:
    • A package installer for searching for and installing packages from a central repository, similar to gem or npm
    • A bundle installer for installing package dependencies derived from package requires metadata
    • A metadata generator for packages, automatically serializing metadata from VCS and import forms used in the package's source code
  • The UI should be given document-based features as described above. UI tool windows would be associated with an active project, and the menu would provide New/Open/Save-type commands for creating, opening, and saving Factor projects

Questions

  • Should modules still be called vocabularies for tradition's sake?
  • There is no YAML parser in Factor currently, but there is a JSON parser. YAML is more convenient to hand-edit, but writing a parser is nontrivial. Since JSON is a subset of YAML, we could use JSON for module and package metadata files to begin with and generalize to YAML when/if a YAML parser is written.
  • The core/basis separation for bootstrapping is useful as a sandbox to prevent the bootstrap process from pulling in more than it's intended to. How to migrate the separation to the package system without exposing two separate standard modules?
  • The package:module naming scheme proposed requires all source code to be modified, which sucks. Should we use a naming scheme that allows some backward compatibility?
  • If we're going to change the module syntax anyway, should we also redesign import forms? A single Java- or Python-style import syntax would be more elegant than USING:/QUALIFIED:/FROM:/etc.
  • Package dependencies may require more detailed version information, given Factor's infrequent release schedule and frequent compatibility breakage
  • Would the resource: pseudo-path still be useful with this scheme?
  • Should the deploy tool work at the package level instead of/in addition to the module level?
@erg

erg commented Aug 23, 2012

What about the case when you have package:foo.bar and the files are foo/bar.factor and foo/bar/bar.factor -- error?

Can we design for different versions of Windows/OS X/Linux Distro+Linux Kernel? Some kind of "kernel 3.2 or greater", "Ubuntu 10.04", etc. requirements.

Modules that are platform-restricted should not load any deeper modules. If package:foo.bar.windows is tagged as Windows7 then you shouldn't have to tag package:foo.bar.windows.baz as well. Perhaps a USE-UNSAFE: would be useful to get around the restriction if you wanted.

I agree we could start with JSON and autoconvert it to YAML someday.

Circular module dependencies would be awesome.

@mrjbq7

mrjbq7 commented Aug 23, 2012

At some point it might be nice to move this conversation to a Factor issue to track it with the repo -

@erg

erg commented Aug 23, 2012

I like the terminology packages, modules, functions, etc.

Is there a way we could overhaul private as well? USE: math. would be math.private maybe?

Could you refer to words defined by foo-tests.factor in another file? I've wanted to do this before--maybe docs and tests files shouldn't be special-cased?

"resource:" is kind of a bad idea. Maybe resource"data.txt" and module"math/numbers.bin".

@erg

erg commented Aug 23, 2012

The above docs could describe a naming convention for platform-specific files. The current Factor naming convention is pretty good--windows/unix/linux/macosx, maybe just document it as part of the platform version section?

@jonenst

jonenst commented Aug 23, 2012

What about inverting ":foo" and "foo", i.e. ":foo" is the local module name and "foo" is an absolute name like "factor:foo". Most USE:'d packages will come from factor, not from the current package. And it will make the non-standard modules stand out with the extra ":".

USING: sequences kernel arrays :utils ;

vs

USING: :sequences :kernel :arrays utils ;

However, this means that for the factor package, we should remove the rule of referencing the current package only with a local name.

@jonenst

jonenst commented Aug 23, 2012

In the description of the new needed UI package actions, regarding "Open/New/Save",

  • open; why do we need it if we have the "PROJECT: path" syntax word?
  • new; is it just a mkdir then open ? Or is there some scaffolding (to be defined)? An eclipse-style wizard ?
  • save; what is there to save ?

IMHO, this seems too early to add to the UI listener. However, those features will feel more important and coherent when Factor gets its own editor (that Slava said was coming in Factor 2.0 :))

@jonenst

jonenst commented Aug 23, 2012

In "Packages in the interactive environment", we should clarify that there is both a current module and a current package. Also clarify the functions of these two things:

  • current module:
    • words defined interactively are created in there
    • always in the search path for words
    • ... what else ?
  • current package
    • prepends "packages" directory of the project to the package search path
    • uses the current project for the "local modules" lookup
    • ... what else ?

@jimmack1963

I was hoping the definition of module could be expanded to include a URI or be expandable to include git and other services (source code in a wiki?) Load a gist or codebin paste in one step?

@jonenst

jonenst commented Aug 23, 2012

Contrary to what is written in the introduction "a standardized search path for modules, allowing for global, user-local, project-local, and package-local module installation and lookup", and as mrjbq7 noted, per project module installation and lookup is not discussed in the rest of the document (or I missed it)

@mrjbq7

mrjbq7 commented Aug 23, 2012

Any thoughts on how you would incorporate a CPAN / CRAN / PyPI type system?

@jckarter
Author

@mrjbq7 @jonenst Re: VCSes and package installation, it was my intention that you could reference a VCS in a package's metadata. For instance, a package could have the following metadata:

name: foo
requires:
  - name: bar
    uri: git+ssh://[email protected]/alansmithee/bar.git
    version: abc123
  - name: bas
    uri: git+ssh://[email protected]/

And a package installer could use that metadata to download and install the refids specified by the versions in the referenced repos. I didn't describe a package installer in detail because that felt like it should be a second project after the infrastructure is set up.

@jckarter
Author

@erg Having both foo/bar.factor and foo/bar should be an error, yes. I think circular module dependencies within a package should be supported, but I don't think packages can circularly reference each other without losing name independence and complicating package management tools.

@jckarter
Author

@jimmack1963 Being able to reference standalone modules by a URI sounds like a useful feature.

@jckarter
Author

@jonenst Re: inverting the meaning of local and absolute module names: That is a good point. I was thinking that in practice local module imports might outnumber standard-library imports, but I'm probably wrong about that.

@jckarter
Author

@jonenst Re: new/open/save, It was my thinking that Factor packages are roughly analogous to Eclipse/VS/Xcode projects, and that it would be a useful familiar UI paradigm to carry over to the Factor environment, with or without a native editor. You could have multiple listeners open for different projects. You could of course also use the command-line forms from a UI listener window if you prefer.

@jckarter
Author

@mrjbq7 @jonenst @jimmack1963 I've updated the gist to (hopefully) address your comments; let me know if I missed anything or if you have further comments.

@jckarter
Author

I've also opened an issue as requested by @mrjbq7: https://github.com/slavapestov/factor/issues/641

@jimmack1963

@jckarter Thanks. I read this to indicate the uri at the package level could be used as an instruction for the initial load and for updates. By putting it at the module level, I was also trying to convey that a module could be dynamically loaded from the web, without any guaranteed/explicit local staging. My goal would be more like push source code deployment, for higher level functionality rather than core routines that one would want to control versions for. Would you see any use to a configuration mechanism where the latest version of the package could be automatically retrieved and built? Clearly bad for many purposes, but a cheap way to build or deploy self-updating applications.

@erg

erg commented Aug 24, 2012

How about twitter: URIs for those tweetable code snippets?

@jckarter
Author

@erg Doesn't twitter.com already have raw text URLs for tweets?

@stylewarning

Should modules have version numbers?

Should it be possible to have multiple versions of the same package/module? And if so, should it be possible to load version-specific packages from source files?

It's inevitable that some packages will rely on older APIs. That's not ideal, but that's the fact of the matter. It might be advantageous to even be able to specify version-specific dependencies (i.e., this package requires the math package, >= 2.0.0.). By default, the bare name (e.g., math) could mean the latest version of the package.

Just some thoughts.

@jckarter
Author

@tarballs-are-good It's intended that, by having package-local packages, you can have multiple versions of the same package loaded to fulfill the dependencies of different modules, in order to address that exact issue.

@erg

erg commented Apr 21, 2013

@erg

erg commented Apr 25, 2013

Support doing things the XDG way for a global install?
http://standards.freedesktop.org/basedir-spec/basedir-spec-0.6.html

@andreaferretti

I admit I have not read the above specification in detail, but I think that an approach in the style of Metacello (from the Smalltalk community) would be simple to develop and fit Factor nicely.

There need not be changes to the way vocabularies are loaded: instead one can make some words that change the vocab roots according to some configuration files. One could have a FACTOR-ROOT/cache directory in addition to work, core, and extra, where packages are arranged by version. Then some machinery could set the right vocab-roots based on a specification.

One would need

  • some words to create structures that describe packages (Maven-like). This would allow one to declare that my project, at version x, consists of this and that vocabulary, and relies on project foo at version y and project bar at version z
  • some words to automate fetching the dependencies of my project recursively from common repositories (one could start with github)
  • a word to set the vocab roots according to my dependencies

Ideally, I would like to write a configuration file that lists my dependencies and then do something like

USE: my project.config
myconfig set-deps

On top of this, one could develop GUI tools to automate writing specification files, or even suggesting officially endorsed packages. See for instance the GUI tools in Pharo for a great example of this.

The advantage would be that we would not have any changes to the way vocabularies are loaded, and the change would be much more incremental.

If anyone thinks that this makes any sense, I can try to add more detail, but just playing with Pharo and Metacello should give a better idea

@lolbinarycat

Would it still be possible to create stand-alone scripts (likely with a shebang)? This is one of my favorite uses of Ruby, and while I haven't tried it with Factor, I would imagine it could be fairly useful.

@lolbinarycat

Also, how would this handle stuff like C library dependencies?

@lolbinarycat

Here's a way this could work while being backward compatible:

  1. they're still called vocabs (so we don't have to rename all the words that refer to them as such)
  2. keep the old import words USING: etc, they keep the same semantics, searching all packages. (now old code still works)
  3. add a new IMPORT: parsing word, that replaces the old ones and uses the package semantics described above (I suggest looking at Haskell's import for reference)
  4. encourage use of new IMPORT: in new code.

@lolbinarycat

I also suggest renaming "absolute names" to "packaged names". I think "absolute" and "global" are too similar.
