Skip to content

Instantly share code, notes, and snippets.

@djspiewak
Last active May 22, 2024 19:02
Show Gist options
  • Save djspiewak/7a81a395c461fd3a09a6941d4cd040f2 to your computer and use it in GitHub Desktop.
Save djspiewak/7a81a395c461fd3a09a6941d4cd040f2 to your computer and use it in GitHub Desktop.

Explaining Miles's Magic

Miles Sabin recently opened a pull request fixing the infamous SI-2712. First off, this is remarkable and, if merged, will make everyone's life enormously easier. This is a bug that a lot of people hit often without even realizing it, and they just assume that either they did something wrong or the compiler is broken in some weird way. It is especially common for users of scalaz or cats.

But that's not what I wanted to write about. What I want to write about is the exact semantics of Miles's fix, because it does impose some very specific assumptions about the way that type constructors work, and understanding those assumptions is the key to getting the most of it his fix.

For starters, here is the sort of thing that SI-2712 affects:

def foo[F[_], A](fa: F[A]): String = fa.toString

foo { x: Int => x * 2 }

We're calling the foo function, passing a value of type Int => Int, which is syntactic sugar for Function1[Int, Int]. In the meantime, foo expects a parameter of type F[A], where F[_] and A are unsolved type variables. This will not compile.

The reason it does not compile is because Function1 takes two type parameters, whereas F[_] only takes one. The compiler tries to instantiate F[_] = Function1[_, _], but it doesn't work because the number of parameters doesn't line up, and it fails the build.

With Miles's fix, the above snippet compiles without modification and behaves exactly as you would expect. In fact, nearly every example of SI-2712 magically springs to life and just starts working exactly the way you want it to. Specifically in the above example, the compiler will see that we are passing a value of a type constructor with two parameters (Function1) to a function (foo) that expects a value of a type constructor with one parameter (F[_]), and it will handle the situation by assuming that there is some other type constructor which is a partial application from the left of Function1.

The key thing here though is that the compiler is assuming we want a partial application from the left. This is a really specific assumption, and a necessary one, since clearly this situation is a bit ambiguous. The compiler assumes that F[_] = [A]Function1[Int, A] is the correct unification. But if you think about it, there's nothing really wrong with making the following unification instead: F[_] = [A]Function1[A, Int]. It is, in a sense, an arbitrary choice, and while the compiler will be very consistent about moving left-to-right in this way, it is something we need to keep in mind.

One way to think about this situation is to "pretend" that all Scala type constructors are curried functions, rather than uncurried functions of multiple parameters. This is a bit of a fantasy, since Scala type constructors are not curried and you cannot use them as such. For example, the following does not work:

def foo[F[_], A](fa: F[A]) = fa.toString

foo[Function1[Int], Int](x => x * 2)

Notice how we're "partially applying" the Function1 type constructor by only specifying its first parameter. This doesn't work. Not now, and not with Miles's change. But this is a useful mnemonic, as it were, for understanding the precise semantics of Miles's fix.

The fix assumes that type constructors are always partially applied in a left-to-right order. Whenever the compiler infers a type constructor which has a higher arity than the type constructor it really needs, it will assume that the type constructor it was supposed to infer was some partial application of the larger one. For example:

// required
F[_]

// provided
Foo[Int, String, Boolean]

// inferred
[A]Foo[Int, String, A]

Or another example:

// required
F[_, _, _]

// provided
Foo[Monad, Int, String, Boolean, Unit]

// inferred
[A, B, C]Foo[Monad, Int, A, B, C]

The compiler will only ever infer these "placeholder" parameters on the right side of the type constructor. Never on the left. This has some very significant implications for the way in which you design data types.

A good example of this is Either (\/ in scalaz; Xor in cats). The \/ type has a Functor instance, which defines a map function that transforms the "right" case of the either. Specifically:

def eitherFunctor[T]: Functor[T \/ ?] = new Functor[T \/ ?] {
  def map[A, B](fa: T \/ A)(f: A => B): T \/ B = fa.fold(identity, f)
}

This is what is known as a "right-biased either". We are specifically making the assumption that, when you call the map function, you want it to apply to the right-hand side, and not the left-hand side. There is nothing special about right vs left, we just chose to assume you wanted the right. And as it turns out, the is exactly the sort of assumption which is rewarded by Miles's fix for SI-2712. For example:

def i2s[F[_]: Functor](fa: F[Int]): F[String] = fa map { _.toString }

i2s(\/-(42))    // it works!

The type constructor we have down at the call site for i2s is \/[Nothing, Int], and the type constructor that is expected by the i2s declaration is F[_], and so the compiler will infer a solution for F[_] = [A]\/[Nothing, A]. Which, as it turns out, is exactly what we want, since this inference will find us the right-biased Functor instance for \/ and life will be good.

Unfortunately, if we had made a difference choice with our functor definition, biasing to the left rather than to the right, we would be in serious trouble. For example, here is a Functor instance for Scalactic's Or data type (which is a left-biased either):

def orFunctor[T]: Functor[Or[?, T]] = new Functor[Or[?, T]] {
  def map[A, B](fa: Or[A, T])(f: A => B): Or[B, T] = fa map f    // we can just use the map function defined on Or
}

So far so good, but the problem is that our i2s example will no longer work:

def i2s[F[_]: Functor](fa: F[Int]): F[String] = fa map { _.toString }

i2s(Good(42))    // nooooooope!

The compiler attempts to infer F[_] = [A]Or[Int, A], but that is incorrect and will not correspond to a Functor instance, since our Functor for Or is left-biased and not right-biased.

In other words, right-biasing is really important with this change. You should always, always order the type parameters in your type constructor from least-to-most specific. In other words, if your type constructor takes multiple parameters, you should order them such that the parameters you expect users to vary (i.e. write a map function over, change frequently between functions, etc) should be right-most, while the parameters you expect to be consistent (i.e. the same at multiple unrelated places in the code) should be left-most.

@guizmaii
Copy link

👍

@heyrutvik
Copy link

very well explained! 👍

@heyrutvik
Copy link

heyrutvik commented Apr 29, 2018

Hey @akhileshs,
As @djspiewak has explained in this gist, foo needs a type constructor with one hole and Function1 has two holes. If you directly pass { x: Int => x * 2 } to foo it won't work.
It's working fine for you because you are providing correct type (with one hole) to foo. The type you are creating IntConsumer[A] has one hole and foo expects the same. It should be provided by compiler for you, so you don't have to write type alias by yourself. just imagine every time you have to create type alias to work with some code which expects type constructor with different number of holes from the one you have. 😨

@softinio
Copy link

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment