fmeum/bzlmod-repo-name.md

## bzlmod-repo-name.md

      
Current situation

I frequently encountered the following two situations when working with and designing for bzlmod.
http_archive as a module extension

A typical usage of the direct analogue of the current http_archive repository rule for MODULE.bazel would probably look as follows:
```starlark
http = use_extension("@bazel_tools//...", "http")
http.archive(name = "foobar", url = "...")
use_repo(http, "foobar")
```
There are two problems with this:

1. "foobar" has to be repeated in http.archive and use_repo.
2. As far as I understand, since all repositories created by a single module extension can see each other, this could lead to clashes if a module that is a transitive dependency of the current module also happens to use http.archive(name = "foobar").
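
The clash described in the second point can be illustrated with a toy model. This is a minimal Python sketch, not Bazel's API: it assumes that one evaluation of the extension sees the tags of every module in the dependency graph and creates all repositories in a single namespace. The module names, URLs, and the `evaluate_http_extension` helper are hypothetical.

```python
# Toy model, not Bazel's API: one extension evaluation sees the tags of every
# module in the dependency graph and creates all repos in a single namespace.


class DuplicateRepoError(Exception):
    pass


def evaluate_http_extension(tags_by_module):
    """Create one repo per http.archive tag, keyed by the user-chosen name."""
    repos = {}  # repo name -> (declaring module, url)
    for module, tags in tags_by_module.items():
        for tag in tags:
            name = tag["name"]
            if name in repos:
                first_module, _ = repos[name]
                raise DuplicateRepoError(
                    f"repo '{name}' requested by '{module}' was already "
                    f"created for '{first_module}'"
                )
            repos[name] = (module, tag["url"])
    return repos


# Hypothetical module names and URLs.
tags_by_module = {
    "my_module": [{"name": "foobar", "url": "https://example.com/a.tar.gz"}],
    "transitive_dep": [{"name": "foobar", "url": "https://example.com/b.tar.gz"}],
}

try:
    evaluate_http_extension(tags_by_module)
    clash = None
except DuplicateRepoError as e:
    clash = str(e)  # both modules picked the name "foobar"
```

In this model neither module did anything wrong on its own; the clash only appears because the user-chosen name doubles as the extension-wide repository name.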

"one repo per non-Bazel dep" module extensions

Rulesets that manage external, non-Bazel dependencies often create one repository per external dep (see, e.g., Gazelle). With bzlmod, the pattern of using one module tag per non-Bazel dep nicely models the dependency requirements, even for transitive module dependencies.
For example, a hypothetical bzlmod-ified Gazelle would allow for the following:
```starlark
go_dep = use_extension("@bazel_gazelle//...", "go_dep")
go_dep.mod(importpath = "golang.org/x/errors", version = "...")
use_repo(go_dep, "org_golang_x_errors")
```
There is no repetition here since there is no need to specify a "name" attribute on the tag. However, there are still some things that are not so nice about this pattern:

1. When the list of Go dependencies gets larger, it becomes difficult to match and sync the list of go_dep.mod lines to the repositories listed in use_repo.
2. Users have to be aware of the "algorithm" that turns an import path into a repository name.

(Let's disregard for the moment that, in the particular case of Go dependencies, Gazelle will probably take over 1. and 2.; it should just serve as a concrete example for the more general external deps case.)
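
To make the "algorithm" from the second point concrete, here is a rough Python approximation of a naming convention that maps golang.org/x/errors to org_golang_x_errors. The helper name `importpath_to_repo_name` is hypothetical, and the rules (reverse the host components, append the path segments, replace other characters with underscores, lowercase) are an assumption that matches the example above, not a statement of Gazelle's exact implementation.

```python
import re


def importpath_to_repo_name(importpath):
    """Approximate the import path -> repo name convention from the example.

    Assumption: the host part is split on dots and reversed, the remaining
    path segments follow, and every character that is not alphanumeric or an
    underscore is replaced by an underscore.
    """
    host, _, rest = importpath.partition("/")
    parts = host.split(".")[::-1]
    if rest:
        parts += rest.split("/")
    name = "_".join(parts)
    return re.sub(r"[^A-Za-z0-9_]", "_", name).lower()


repo_name = importpath_to_repo_name("golang.org/x/errors")
```

Even in this simplified form, a user has to carry out these steps mentally for every dependency in order to write the matching use_repo entry.
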
Proposal

Both situations described above are more verbose with MODULE.bazel than with WORKSPACE because they use the probably very common "one tag, one repo" pattern: while bzlmod can handle more general situations, it doesn't offer any handy shortcuts for this idiom.
I am thus proposing the following:


1. Add a function mark_resolved_to to the module_ctx passed to a module extension. It takes a module tag and a repository name as arguments and internally marks the tag as corresponding to that repository, which must be instantiated by the module extension. Multiple tags can be associated with a single repository in this way, but it is an error if more than one repository is associated with the same tag.

2. Either of the following (option a) is a bit more concise, while option b) mimics the repo_name attribute on bazel_dep, which is nice for consistency):

   a) In MODULE.bazel, an assignment statement some_name = some_ext.some_tag(...) makes the repository associated with the tag visible as some_name. If the module extension hasn't called mark_resolved_to for this tag, fail with an error.

   b) Add a magic repo_name attribute to all tags. If it is set and the tag is not associated with a repository, fail with an error. If it is set and the tag is associated with a repository, make the repository visible under the given name.


Of course, names and/or syntax aren't set in stone; the core of this proposal is merely to let module extensions establish a link between tags and repos.
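
The bookkeeping this proposal asks for can be sketched in a few lines of Python. This is a toy model of the semantics described above, not an implementation inside Bazel: `ModuleCtxModel`, `create_repo`, `visible_repos`, and the tag ids are hypothetical stand-ins, and checking that the repository exists at the time of the mark_resolved_to call (rather than at the end of the extension's evaluation) is an assumption.

```python
class ResolutionError(Exception):
    pass


class ModuleCtxModel:
    """Toy stand-in for module_ctx; only models the tag-to-repo bookkeeping."""

    def __init__(self):
        self.repos = set()   # repos instantiated by the extension
        self.resolved = {}   # tag id -> repo name

    def create_repo(self, name):
        self.repos.add(name)

    def mark_resolved_to(self, tag_id, repo_name):
        if repo_name not in self.repos:
            raise ResolutionError(f"repo '{repo_name}' was not instantiated")
        previous = self.resolved.setdefault(tag_id, repo_name)
        if previous != repo_name:
            raise ResolutionError(
                f"tag '{tag_id}' is already associated with '{previous}'"
            )


def visible_repos(ctx, requested):
    """Option b): map each user-chosen repo_name to the tag's internal repo.

    `requested` maps tag id -> repo_name attribute value (None if unset).
    """
    mapping = {}
    for tag_id, apparent_name in requested.items():
        if apparent_name is None:
            continue
        if tag_id not in ctx.resolved:
            raise ResolutionError(f"tag '{tag_id}' has no associated repo")
        mapping[apparent_name] = ctx.resolved[tag_id]
    return mapping


ctx = ModuleCtxModel()
ctx.create_repo("internal_errors_repo")
# Two tags may share one repository.
ctx.mark_resolved_to("tag_1", "internal_errors_repo")
ctx.mark_resolved_to("tag_2", "internal_errors_repo")
mapping = visible_repos(ctx, {"tag_1": "org_golang_x_errors", "tag_2": None})
```

Option a) would use the same lookup; only the way the user-chosen name reaches it (an assignment instead of an attribute) differs.
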
Benefits

For the introductory examples, the problems mentioned above are solved as follows:
http_archive as a module extension

This could become:
```starlark
http = use_extension("@bazel_tools//...", "http")
# Option 2.a)
foobar = http.archive(url = "...")
# Option 2.b)
http.archive(repo_name = "foobar", url = "...")
```

1. The name of the repository no longer has to be repeated.
2. Internally, the http extension can choose any name for the repository that is certain not to collide with other names generated by the same extension (e.g., module_name.$url_safe_chars.$hash_of_url).
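
As an illustration of the second point, an internal name of the form module_name.$url_safe_chars.$hash_of_url could be derived roughly as follows. This is only a sketch: the helper `internal_repo_name` is hypothetical, and both the character set kept as "URL safe" and the choice of a truncated SHA-256 digest are assumptions.

```python
import hashlib
import re


def internal_repo_name(module_name, url):
    """Build a name of the form module_name.$url_safe_chars.$hash_of_url.

    Assumptions: letters, digits, '_' and '-' are kept, every other character
    becomes '_', and the hash is a truncated SHA-256 hex digest of the URL.
    """
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", url)
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]
    return f"{module_name}.{safe}.{digest}"


# Hypothetical module name and URL.
name_a = internal_repo_name("my_module", "https://example.com/foobar.tar.gz")
```

The readable part keeps the name recognizable in logs, while the hash keeps two URLs apart even when they collapse to the same sanitized string.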

"one repo per non-Bazel dep" module extensions

This could become:
```starlark
go_dep = use_extension("@bazel_gazelle//...", "go_dep")
# Option 2.a)
org_golang_x_errors = go_dep.mod(importpath = "golang.org/x/errors", version = "...")
# Option 2.b)
go_dep.mod(repo_name = "org_golang_x_errors", importpath = "golang.org/x/errors", version = "...")
```

1. An external repository declaration corresponds to a single line, containing both the name of the repo and the tag.
2. Users no longer have to be aware of the naming scheme used internally by the module extension. They can use it (as in the example), but aren't forced to.