We dropped Lerna from our monorepo architecture in PouchDB 6.0.0. I got a question about this from @reconbot, so I thought I'd explain our reasoning.
First off, I don't want this post to be read as "Lerna sucks, don't use Lerna." We started out using Lerna, but eventually outgrew it because we wrote our own custom thing. Lerna is still a great idea if you're getting started with monorepos (monorepi?).
Backstory:
- https://pouchdb.com/2016/01/13/pouchdb-5.2.0-a-better-build-system-with-rollup.html
- pouchdb/pouchdb#5545
- https://github.com/boennemann/alle
Second off, it's good to understand why you might want a monorepo in the first place:
- Contributions are easier (no more "this issue is in the wrong repo, please post it here")
- Cross-package changes are easier (just make one git commit)
- Testing is easier (test the whole thing, don't make separate test suites or copy tests)
Third off, it's worth understanding why you might want separate packages instead of sub-modules (e.g. require('lodash.uniq') instead of require('lodash/uniq')). In the case of PouchDB, 99% of the reason we wanted separate packages was because the main pouchdb module has a dependency on leveldown, which is a native module that takes foreeeeeever to npm install and may actually fail on certain architectures. Ditto sqlite3, which is used by the optional node-websql adapter.
Combine that with the fact that we have lots of different adapters that you can use to
mix-and-match your own custom build, and the fact that plugins frequently
want to re-use little bits of functionality from PouchDB core (such as the ajax()
module), and publishing separate packages suddenly makes a lot of sense.
OK, so here's the deal with Lerna. Basically Lerna has three different steps:
- lerna bootstrap, which links all of your sub-packages together so you can easily test them without a lot of npm linking. Lerna does this by creating a separate node_modules for each sub-package, then inserting pseudo-packages that simply require() the real package. (E.g. packages/a/node_modules/b/index.js will simply contain module.exports = require('../../../b');.) This is a neat trick that avoids a lot of npm linking (which in my experience can be very faily w.r.t. circular dependencies and have surprising side effects due to symbolic links).
- lerna run, which you would normally use to run your build inside each sub-package, e.g. lerna run build will run npm run build inside each sub-package.
- lerna publish, which publishes all packages to npm and does some other magic to update git tags, etc.
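To make the proxying trick concrete, here is a minimal sketch in plain Node. It is an illustration of the idea, not Lerna's actual implementation: the temp directory layout and the package names a and b are made up, and the relative path is computed with path.relative() rather than hard-coded.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical layout in a throwaway directory: packages/a and packages/b.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'proxy-demo-'));
const pkgA = path.join(root, 'packages', 'a');
const pkgB = path.join(root, 'packages', 'b');
fs.mkdirSync(pkgA, { recursive: true });
fs.mkdirSync(pkgB, { recursive: true });

// The real package b, and a package a that depends on it by name.
fs.writeFileSync(path.join(pkgB, 'index.js'), "module.exports = 'hello from b';\n");
fs.writeFileSync(path.join(pkgA, 'index.js'), "module.exports = require('b');\n");

// The pseudo-package: packages/a/node_modules/b/index.js re-exports packages/b.
const proxyDir = path.join(pkgA, 'node_modules', 'b');
fs.mkdirSync(proxyDir, { recursive: true });
const relative = path.relative(proxyDir, pkgB).split(path.sep).join('/');
fs.writeFileSync(
  path.join(proxyDir, 'index.js'),
  `module.exports = require('${relative}');\n`
);

// Loading a pulls in b through the proxy, with no symlinks involved.
const resultFromA = require(pkgA);
console.log(relative, '->', resultFromA);
```

Because the proxy is an ordinary file, nothing here depends on symbolic links, which is the property that makes this approach less surprising than npm link.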
Let's dissect each one of those steps. For lerna bootstrap, we were actually using this in PouchDB, and this was the main benefit we were getting out of Lerna.
For lerna run, we were originally using it to run Rollup in each sub-package, but quickly realized that with ~30 packages, spawning a separate Node process for each one (i.e. doing npm run build 30 times) was too slow. It made more sense to just write one big build.js script that built each sub-package inside of a single Node process. I can't remember the speedup, but it was something like 60 seconds vs 5 seconds (those numbers are completely made up).
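The shape of such a script is roughly the following. This is only a sketch under assumptions, not PouchDB's real build.js: buildPackage is a hypothetical stand-in that copies a file where the real script would call the bundler, and the package names are made up.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for the real per-package build (e.g. a bundler call).
// Here it just copies src/index.js to lib/index.js.
function buildPackage(pkgDir) {
  const libDir = path.join(pkgDir, 'lib');
  fs.mkdirSync(libDir, { recursive: true });
  fs.copyFileSync(
    path.join(pkgDir, 'src', 'index.js'),
    path.join(libDir, 'index.js')
  );
  return path.basename(pkgDir);
}

// Build every sub-package in a loop, inside this one Node process,
// instead of spawning `npm run build` once per package.
function buildAll(packagesDir) {
  return fs.readdirSync(packagesDir)
    .map((name) => path.join(packagesDir, name))
    .filter((dir) => fs.existsSync(path.join(dir, 'package.json')))
    .map(buildPackage)
    .sort();
}

// Demo on a throwaway directory with two made-up packages.
const packagesDir = fs.mkdtempSync(path.join(os.tmpdir(), 'build-demo-'));
for (const name of ['pkg-one', 'pkg-two']) {
  fs.mkdirSync(path.join(packagesDir, name, 'src'), { recursive: true });
  fs.writeFileSync(
    path.join(packagesDir, name, 'package.json'),
    JSON.stringify({ name })
  );
  fs.writeFileSync(
    path.join(packagesDir, name, 'src', 'index.js'),
    'module.exports = 1;\n'
  );
}
const built = buildAll(packagesDir);
console.log('built', built);
```

The saving comes from paying Node's startup cost (and loading the build tooling) once rather than once per package.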
For lerna publish, we actually don't use Lerna's "independent" mode (correction: I originally said this is what Babel uses, but Babel uses "locked" mode; see comment below). Independent mode would mean that every sub-package would have its own semver and would get updated accordingly when its dependencies got updated, but we figured this would be way too complicated for PouchDB users, and it was simpler to just lock everything to a single version. Therefore we didn't really need lerna publish; we could just run npm publish in a loop, and that was good enough (along with a script to update the version number in every package.json, which is equally easy to write).
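Both scripts fit in a few dozen lines. The sketch below assumes a flat directory of sub-packages and uses made-up package names; listPackages, setVersion and publishAll are hypothetical helpers, not PouchDB's actual scripts. publishAll is defined but deliberately not called, since it would really publish to npm.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { execSync } = require('child_process');

// List sub-package directories (those that contain a package.json).
function listPackages(packagesDir) {
  return fs.readdirSync(packagesDir)
    .map((name) => path.join(packagesDir, name))
    .filter((dir) => fs.existsSync(path.join(dir, 'package.json')))
    .sort();
}

// Lock everything to a single version: write it into every package.json.
function setVersion(packagesDir, version) {
  for (const dir of listPackages(packagesDir)) {
    const file = path.join(dir, 'package.json');
    const pkg = JSON.parse(fs.readFileSync(file, 'utf8'));
    pkg.version = version;
    fs.writeFileSync(file, JSON.stringify(pkg, null, 2) + '\n');
  }
}

// Run `npm publish` in each sub-package. Not called in this demo.
function publishAll(packagesDir) {
  for (const dir of listPackages(packagesDir)) {
    execSync('npm publish', { cwd: dir, stdio: 'inherit' });
  }
}

// Demo on a throwaway directory with two made-up packages.
const packagesDir = fs.mkdtempSync(path.join(os.tmpdir(), 'version-demo-'));
for (const name of ['pkg-one', 'pkg-two']) {
  fs.mkdirSync(path.join(packagesDir, name), { recursive: true });
  fs.writeFileSync(
    path.join(packagesDir, name, 'package.json'),
    JSON.stringify({ name, version: '1.0.0' })
  );
}
setVersion(packagesDir, '2.0.0');
const versions = listPackages(packagesDir).map(
  (dir) => JSON.parse(fs.readFileSync(path.join(dir, 'package.json'), 'utf8')).version
);
console.log(versions);
```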
So that leaves us with lerna bootstrap. After talking with Stephan Boennemann, though, and reading his Alle proposal, I realized we could avoid it entirely by simply renaming the packages/ folder to packages/node_modules. Because of how the require() algorithm works, any reference to e.g. require('pouchdb-ajax') from within packages/node_modules/pouchdb will resolve to packages/node_modules/pouchdb-ajax, because require() just walks up the file tree until it finds a node_modules folder with a sub-folder that matches the package name. This cuts out the lerna bootstrap step, which shaved about 30 seconds off of our npm install time (which is huge when we have dozens of Travis builds).
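You can see the resolution behavior for yourself with a small experiment. This sketch builds a throwaway packages/node_modules folder containing two made-up packages (demo-core and demo-ajax are hypothetical names, not real PouchDB packages) and loads one from the other by name, with no linking step.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'alle-demo-'));
const modulesDir = path.join(root, 'packages', 'node_modules');

// Write a tiny package into packages/node_modules/<name>.
function writePackage(name, source) {
  const dir = path.join(modulesDir, name);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(
    path.join(dir, 'package.json'),
    JSON.stringify({ name, version: '1.0.0', main: 'index.js' })
  );
  fs.writeFileSync(path.join(dir, 'index.js'), source);
  return dir;
}

// Two sibling sub-packages; demo-core requires demo-ajax by bare name.
writePackage('demo-ajax', "module.exports = () => 'ajax';\n");
const coreDir = writePackage(
  'demo-core',
  "module.exports = require('demo-ajax')();\n"
);

// Node first looks in demo-core/node_modules (which doesn't exist), then
// walks up to packages/node_modules, where it finds demo-ajax.
const resolved = require(coreDir);
console.log(resolved);
```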
Using the "Alle" model also allowed us to move all of the sub-packages' dependencies up to the top-level package.json, which worked around a current Greenkeeper limitation: it doesn't work with monorepos if your dependencies are declared anywhere but the package.json in the repo root. But by having our dependency versions declared at the top level (and a script to update the sub-packages' package.json files right before publishing), we can continue using Greenkeeper like normal. (Again, this works because of how require() works; it just keeps walking up until it finds the right node_modules.) As an added bonus, we don't have to try to coordinate the versions of dependencies used by sub-packages (which is a real problem we ran into).
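The pre-publish step could look something like this. It is a sketch under an assumption that isn't spelled out above: that each sub-package lists the names of the dependencies it uses with a placeholder range, and the script fills in the real range from the top level. syncDependencies and all the package names are hypothetical.

```javascript
// Hypothetical top-level package.json: the one place versions are declared.
const topLevelPkg = {
  name: 'my-monorepo',
  dependencies: { 'made-up-dep': '^1.2.3', 'other-made-up-dep': '~4.5.6' }
};

// Hypothetical sub-package: names the dependencies it uses, with a
// placeholder range that gets overwritten right before publishing.
const subPkg = {
  name: 'my-sub-package',
  version: '1.0.0',
  dependencies: { 'made-up-dep': '*' }
};

// Return a copy of the sub-package manifest whose dependency ranges are
// taken from the top-level manifest. The inputs are not modified.
function syncDependencies(topLevel, sub) {
  const declared = topLevel.dependencies || {};
  const dependencies = {};
  for (const name of Object.keys(sub.dependencies || {})) {
    if (!Object.prototype.hasOwnProperty.call(declared, name)) {
      throw new Error(
        `${sub.name} uses ${name}, which is not declared at the top level`
      );
    }
    dependencies[name] = declared[name];
  }
  return Object.assign({}, sub, { dependencies });
}

const synced = syncDependencies(topLevelPkg, subPkg);
console.log(JSON.stringify(synced, null, 2));
```

Since every sub-package gets its ranges from the same place, two sub-packages can never end up publishing with conflicting versions of a shared dependency.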
So that's it! We switched to the "Alle" model because it worked better for us. On the other hand, I wouldn't discourage anybody from using Lerna, because it provides a lot of good out-of-the-box tools for working with monorepos, and if you have a more complex setup than ours (e.g. you're using independent versioning), then it can save you a lot of boilerplate. And even if you find you can speed up your builds by removing Lerna and going custom, there's no reason not to start with a Lerna-style build system.
Some clarifications/lerna issues/prs (I think I've talked about this with nolan before). Thanks for writing this, it's super important for people/users to know what issues projects have with monorepos/lerna/etc and we need to know the pain points to move forward and make management easier!
Babel currently uses the "fixed/locked" mode of lerna not independent mode. That was added after in [email protected] - used in https://github.com/cloudflare/cf-ui, and we are trying it in https://github.com/babel/babili at the moment.
lerna bootstrap currently uses the proxying feature, but that is because there were some issues with symlinking which should be resolved in lerna/lerna#237. Also I'd be totally cool with trying out the packages/node_modules idea for lerna repos. Not having a bootstrap step at all sounds great. If it can work the same as before and without that setup 👍
We actually don't use lerna run either in Babel. You can just build with your own script/gulp task on a glob.
We've mentioned this before but I think the next big step is just to do: lerna/lerna#174 (make lerna a monorepo) and separate out the commands. Lerna is just an api/common set of commands that can be used in a lot of projects. I think the idea there is to move commands into userland and then figure out what can be useful to everyone and standardize on it somehow so it's not all custom (may be difficult to do, but worth figuring out/exploring).
Haven't had much time recently for this/to think about it much though. If anyone has more ideas we'd appreciate discussion in the issues I've linked or a new issue!