We dropped Lerna from our monorepo architecture in PouchDB 6.0.0. I got a question about this from @reconbot, so I thought I'd explain our reasoning.
First off, I don't want this post to be read as "Lerna sucks, don't use Lerna." We started out using Lerna, but eventually outgrew it and replaced it with our own custom tooling. Lerna is still a great idea if you're getting started with monorepos (monorepi?).
Backstory:
- https://pouchdb.com/2016/01/13/pouchdb-5.2.0-a-better-build-system-with-rollup.html
- pouchdb/pouchdb#5545
- https://github.com/boennemann/alle
Second off, it's good to understand why you might want a monorepo in the first place:
- Contributions are easier (no more "this issue is in the wrong repo, please post it here")
- Cross-package changes are easier (just make one git commit)
- Testing is easier (test the whole thing, don't make separate test suites or copy tests)
Third off, it's worth understanding why you might want separate packages instead of sub-modules (e.g. `require('lodash.uniq')` instead of `require('lodash/uniq')`). In the case of PouchDB, 99% of the reason we wanted separate packages was because the main `pouchdb` module has a dependency on `leveldown`, which is a native module that takes foreeeeeever to `npm install` and may actually fail on certain architectures. Ditto `sqlite3`, which is used by the optional `node-websql` adapter.
Combine that with the fact that we have lots of different adapters that you can use to
mix-and-match your own custom build, and the fact that plugins frequently
want to re-use little bits of functionality from PouchDB core (such as the `ajax()` module), and publishing separate packages suddenly makes a lot of sense.
OK, so here's the deal with Lerna. Basically Lerna has three different steps:
- `lerna bootstrap`, which links all of your sub-packages together so you can easily test them without a lot of `npm link`ing. Lerna does this by creating a separate `node_modules` for each sub-package, then inserting pseudo-packages that simply `require()` the parent package. (E.g. `packages/a/node_modules/b/index.js` will simply contain `module.exports = require('../../b');`.) This is a neat trick that avoids a lot of `npm link`ing (which in my experience can be very faily w.r.t. circular dependencies and have surprising side effects due to symbolic links). See the sketch just after this list.
- `lerna run`, which you would normally use to run your build inside each sub-package, e.g. `lerna run build` will run `npm run build` inside each sub-package.
- `lerna publish`, which publishes all packages to npm and does some other magic to update git tags, etc.
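To make the `lerna bootstrap` trick concrete, here's an illustration (not Lerna's literal output) of what the tree might look like for two sub-packages `a` and `b`, where `a` depends on `b`:

```
packages/
  a/
    package.json
    node_modules/
      b/
        index.js    <- pseudo-package that just re-exports packages/b
  b/
    package.json
    index.js
```

Node resolves `require('b')` from inside `packages/a` to the pseudo-package, which hands back the real `packages/b`, so no symlinks are involved.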
Let's dissect each one of those steps. For `lerna bootstrap`, we were actually using this in PouchDB, and this was the main benefit we were getting out of Lerna.
For `lerna run`, we were originally using it to run Rollup in each sub-package, but quickly realized that with ~30 packages, spawning a separate Node process for each one (i.e. doing `npm run build` 30 times) was too slow. It made more sense to just write one big `build.js` script that built each sub-package inside a single Node process. I can't remember the exact speedup, but it was something like 60 seconds vs 5 seconds (those numbers are completely made up).
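For the curious, here's a minimal sketch of what such a single-process build script could look like. This is not PouchDB's actual `build.js`; the package layout and entry/output paths are assumptions, and it assumes a recent Rollup with the `rollup.rollup()` / `bundle.write()` JS API:

```js
// build.js – build every sub-package with Rollup's JS API in one Node
// process, instead of spawning `npm run build` once per package.
const fs = require('fs');
const path = require('path');
const rollup = require('rollup');

async function buildAll() {
  const pkgsDir = path.resolve(__dirname, 'packages'); // hypothetical layout
  for (const name of fs.readdirSync(pkgsDir)) {
    const dir = path.join(pkgsDir, name);
    if (!fs.statSync(dir).isDirectory()) continue;
    // assumed convention: each package has src/index.js and builds to lib/index.js
    const bundle = await rollup.rollup({ input: path.join(dir, 'src/index.js') });
    await bundle.write({ file: path.join(dir, 'lib/index.js'), format: 'cjs' });
    console.log('built', name);
  }
}

buildAll().catch(function (err) {
  console.error(err);
  process.exit(1);
});
```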
For `lerna publish`, we actually don't use Lerna's "independent" mode (~~which is what Babel uses~~ correction: Babel uses "locked" mode, see comment below). Independent mode would mean that every sub-package would have its own semver and would get updated accordingly when its dependencies got updated, but we figured this would be way too complicated for PouchDB users, and it was simpler to just lock everything to a single version. Therefore we didn't really need `lerna publish` – we could just run `npm publish` in a loop, and that was good enough (along with a script to update the version number in every `package.json`, which is equally easy to write).
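A locked-mode publish loop along those lines might look something like this (a sketch with assumed paths, not our actual script):

```js
// publish.js – bump every sub-package to the same locked version,
// then `npm publish` each one in a loop.
const fs = require('fs');
const path = require('path');
const { execSync } = require('child_process');

const version = require('./package.json').version; // the single locked version
const pkgsDir = path.resolve(__dirname, 'packages'); // hypothetical layout

for (const name of fs.readdirSync(pkgsDir)) {
  const pkgPath = path.join(pkgsDir, name, 'package.json');
  if (!fs.existsSync(pkgPath)) continue;
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  pkg.version = version; // lock the sub-package to the top-level version
  fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + '\n');
  execSync('npm publish', { cwd: path.join(pkgsDir, name), stdio: 'inherit' });
}
```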
So that leaves us with `lerna bootstrap`. After talking with Stephan Boennemann, though, and reading his Alle proposal, I realized we could avoid it entirely by simply renaming the `packages/` folder to `packages/node_modules/`. Because of how the `require()` algorithm works, any reference to e.g. `require('pouchdb-ajax')` from within `packages/node_modules/pouchdb` will resolve to `packages/node_modules/pouchdb-ajax`, because `require()` just walks up the file tree until it finds a `node_modules` folder with a sub-folder that matches the package name. This cuts out the `lerna bootstrap` step, which shaved about 30 seconds off of our `npm install` time (which is huge when we have dozens of Travis builds).
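In other words, the layout looks like this (illustrative, with most packages omitted):

```
packages/
  node_modules/       <- a regular folder checked into git, not installed deps
    pouchdb/
      package.json
      lib/index.js    <- require('pouchdb-ajax') starts resolving here...
    pouchdb-ajax/     <- ...and ends here, after walking up to packages/node_modules
      package.json
```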
Using the "Alle" model also allowed us to move all of the sub-packages' dependencies up to the top-level `package.json`, which worked around a current Greenkeeper limitation: it doesn't work with monorepos if your dependencies are declared anywhere but the top-level `package.json` in the repo root. But by having our dependency versions declared at the top level (and a script to update the sub-packages' `package.json`s right before publishing), we can continue using Greenkeeper like normal. (Again, this works because of how `require()` works; it just keeps walking up until it finds the right `node_modules`.) As an added bonus, we don't have to try to coordinate the versions of dependencies used by sub-packages (which is a real problem we ran into).
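That version-syncing script is similarly straightforward. Here's a rough sketch (assumed file names and layout, not our exact script) of copying top-level dependency versions into each sub-package right before publishing:

```js
// sync-deps.js – copy dependency versions from the top-level package.json
// into each sub-package's package.json, so Greenkeeper only has to watch
// the repo root.
const fs = require('fs');
const path = require('path');

const topDeps = require('./package.json').dependencies || {};
const pkgsDir = path.resolve(__dirname, 'packages/node_modules');

for (const name of fs.readdirSync(pkgsDir)) {
  const pkgPath = path.join(pkgsDir, name, 'package.json');
  if (!fs.existsSync(pkgPath)) continue;
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  for (const dep of Object.keys(pkg.dependencies || {})) {
    if (topDeps[dep]) {
      pkg.dependencies[dep] = topDeps[dep]; // use the top-level version
    }
  }
  fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + '\n');
}
```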
So that's it! We switched to the "Alle" model because it worked better for us. On the other hand, I wouldn't discourage anybody from using Lerna, because it provides a lot of good out-of-the-box tools for working with monorepos, and if you have a more complex setup than ours (e.g. you're using independent versioning), then it can save you a lot of boilerplate. And even if you find you can speed up your builds by removing Lerna and going custom, there's no reason not to start with a Lerna-style build system.
Hi!

I also started a project as an alternative to Lerna. Another reason for me to try alternatives was the lack of support for other kinds of deliverables (like a container image, for example).

The tool that I am working on will help you work with multiple packages/deliverables and their versions, but using multiple repositories.

The approach I am taking is to use a separate git repository as a database; that repository holds all the information about the current version of each package, the current commit, dependency relations, and a lot more, to help you know that you are using the right information.

I would love feedback on the approach, and I am also looking for beta testers; if you are interested, please let me know.

You can check the details of the tool and workflow at www.repoflow.com

Thanks a lot