Recently, I found myself in need to precisely understand Scala's core typechecking rules. I was particulary interested in understanding rules responsible for typechecking signatures of members defined in classes (and all types derived from them). Scala Language Specification (SLS) contains definition of the rules but lacks any examples. The definition of the rules uses mutual recursion and nested switch-like constructs that make it hard to follow. I've written down examples together with explanation how specific set of rules (grouped thematically) is applied. These notes helped me gain confidence that I fully understand Scala's core typechecking algorithm.
Let's quote the Scala spec for As Seen From (ASF) rules numbered for an easier reference:
The notion of a type T in class C seen from some prefix type S makes sense only if the prefix type S has a type instance of class C as a base type, say S′#C[T1,…,Tn]. Then we define as follows.
- If S = ϵ.type, then T in C seen from S is T itself.
- Otherwise, if S is an existential type S′ forSome { Q }, and T in C seen from S′ is T′, then T in C seen from S is T′ forSome {Q}.
- Otherwise, if T is the i'th type parameter of some class D, then
- If S has a base type D[U1,…,Un], for some type parameters [U1,…,Un], then T in C seen from S is Ui.
- Otherwise, if C is defined in a class C′, then T in C seen from S is the same as T in C′ seen from S′.
- Otherwise, if C is not defined in another class, then T in C seen from S is T itself.
- Otherwise, if T is the singleton type D.this.type for some class D then
- If D is a subclass of C and S has a type instance of class D among its base types, then T in C seen from S is S.
- Otherwise, if C is defined in a class C′, then T in C seen from S is the same as T in C′ seen from S′.
- Otherwise, if C is not defined in another class, then T in C seen from S is T itself.
- If T is some other type, then the described mapping is performed to all its type components.
If T is a possibly parameterized class type, where T's class is defined in some other class D, and S is some prefix type, then we use "T seen from S" as a shorthand for "T in D seen from S".
Rule 1 is the base case for the recursive definition of ASF. Altough you cannot write ϵ.type
in a Scala program, the ASF might be sent to the first rule by rules that strip the prefix: 2, 3.2 and 4.2.
Rule 2 is about seeing through existential types. Let's consider an example:
class X
object Rule2 {
abstract class A[V] {
val t: V
}
val a: A[W] forSome { type W <: X } = null
val x: X = a.t
}
To typecheck val x: X = a.t
we need to understand the type of a.t
.
- We ask ASF for
A.V
inA
as seen froma.type
. The prefixS'
isϵ.type
. Rule 2 applies with the existential typeS' forSome {Q}
decomposed into the typesS' = A[W]
andQ = { type W <: X }
. Note that the twoS'
s are not the same -- the first is the prefix of the ASF, the second the prefix of the existential type -- this is an unfortunate name clash in the spec. ASF applies itself recursively to the prefixA[W]
of the existential type. - ASF's query is
A.V
inA
as seen fromA[W]
. Rule 3.1 extracts the type parameterW
and returns it. Check the explanation of Rule 3 for details. - ASF completes the application of Rule 2 by substituting prefix
A[W]
withW
in the existential type and returnsW forSome { type W <: X }
as its final answer.
We can simplify the returned existential type by applying Simplification Rules (SRs) specific to existential types. SLS 3.2.10 contains the definition of SRs.
- By application of SR4, we subsitute the upper bound
X
of the typeW
in the prefix of the existential typeW forSome { type W <: X }
and we simplify the existential type toX forSome { type W <: X }
. - By application of SR2, we drop the unused quantification
type W <: X
inX forSome { type W <: X }
and simplify the existential type toX forSome {}
. - By application of SR3, we drop the empty sequence of quantifications in
X forSome {}
and further simplify the existential type toX
.
The third rule is about resolution of a type parameter for a given type argument.
object Rule3 {
abstract class A[T] {
abstract class B {
abstract class C1 {
val t: T
}
abstract class C2 extends C1
val c: C2
}
val b: B
}
abstract class AA[T] extends A[T]
val x1: AA[X] = null
// the type of x1.b.c.t.type is type A.T in C1 as seen from x1.b.c.type
val x2: x1.b.c.t.type = null
val x: X = x2
}
In order to typecheck val x: X = x2
we need to understand the type of x2
(we'll be looking at the type of the term t
).
- We ask ASF for
A.T
inC1
as seen from the prefixS = x1.b.c.type
. The prefixS'
isx1.b.type
(soC1
can be selected fromS'
with the type projectionx1.b.type#C1
andC1
is a base class forS
). Rule 3 applies withA.T
being the first type paramemter of the classA
; the classD
isA
. The prefixS
doesn't have an application of the parametric typeA
amongst its base types so Rule 3.1 doesn't apply. Rule 3.2 applies because the classC1
is nested in the classB
. ASF applies itself recursively to the same typeA.T
but in the class one level up in the class nesting relationship. - ASF's query is
A.T
inB
as seen from the prefixS = x1.b.type
(soS' = x1.type
). By the same reasoning as in the step above, first Rule 3 applies and then Rule 3.2 applies because the classB
is nested in the classA
. ASF applies itself recursively to the same typeA.T
but in the classA
. - ASF's query now is
A.T
inA
as seen from the prefixS = x1.type
(soS' = ϵ.type
). Rule 3 still applies becauseA.T
is the first type parameter of the classA
. Rule 3.1 applies because the typeA[X]
is a base type of the prefixS
; ASF extracts the type argument from the typeA[X]
and returnsX
as its final answer.
In the example above, we only excercised rules 3.1 and 3.2 but not 3.3. The Rule 3.3 is a fallthrough rule that is triggered when a prefix doesn't encode any relevant information about how to resolve a type parameter. For example, you can ask for the type A.T
in the class X
as seen from the prefix x.type
to trigger Rule 3.3. The definition of ASF ties together the class C
and the prefix type S
but leaves the type T
arbitrary. You can ask ASF for a type unrelated to the passed prefix. The 3.3 cannot be triggered directly by writing a type in user program because in order to access a type parameter, you have to be inside the class and then other rules will apply.
The fourth rule is about resolving this
. Let's see it in action based on this example:
object Rule4 {
abstract class A {
abstract class B1 {
abstract class C1 {
val a: A.this.type
}
abstract class C2 extends C1
val c: C2
}
abstract class B2 extends B1
val b: B2
}
abstract class AA extends A
val x1: AA = null
val x2: x1.b.c.a.type = null
val x: AA = x2
}
In order to typecheck val x: AA = x2
we need to understand the type of x2
(we'll be looking at the type of the term a
).
- We ask ASF for
A.this.type
inC1
as seen from the prefixS = x1.b.c.type
. The prefixS'
isx1.b.type
(soC1
can be selected fromS'
with the type projectionx1.b.type#C1
andC1
is a base class forS
). Rule 4 applies to the singleton typeA.this.type
with the classA
extracted from it (soD = A
). Rule 4.1 is skipped because the classA
is not a subclass of the classC1
. Rule 3.2 applies because the classC1
is nested in the classB1
. ASF applies itself recursively to the same typeA.this.type
but in the class one level up in the class nesting relationship. - ASF's query is
A.this.type
inB1
as seen from the prefixS = x1.b.type
(soS' = x1.type
). By the same reasoning as in the step above, first Rule 4 applies and then Rule 4.2 applies because the classB1
is nested in the classA
. ASF applies itself recursively to the same typeA.this.type
but in the classA
. - ASF's query is now
A.this.type
inA
as seen from the prefixS = x1.type
(soS' = ϵ.type
). Rule 4 still applies because the singleton typeA.this.type
matches the definition of the rule (with the classD = A
). Rule 4.1 applies with the classA
being its own subclass andA
found amongst base types of the prefixS
because the classAA
is a subclass ofA
. ASF's final answer isx1.type
.
Similarly to Rule 3.3, Rule 4.3 is not covered by the example above. It's again a fallthrough rule. Rule 4.3 can be triggered by asking ASF for X.this.type
in AA
as seen from x1.type
.
We saw that rule 3. in ASF definition is about resolution of type parameters. Where's resolution of type members defined? Let's study this with an example:
object TypeMemberOverride {
abstract class B {
type U
// when you write `U`, this a shorthand for `B.this.type#U` (see Scala spec 3.2.3)
val u: U
}
abstract class BB extends B {
type U = X
}
val x1: BB = null
val x2: X = x1.u
}
In order to typecheck the val x2: X = x1.u
we have to understand the type of x1.u
. We ignore the object TypeMemberOverride
to shorten the considered paths so we're looking at x1.u
instead of TypeMemberOverride.x1.u
, etc.
- We ask ASF for
B.this.type#U
inB
as seen from the prefixS = x1.type
. The prefixS'
isϵ.type
(soB
can be selected fromS'
with the type projectionϵ.type#B
andB
is a base class forS
). The type projectionB.this.type#U
is a selection of a type memberU
from the singleton typeB.this.type
. Rule 5 applies to the singleton typeB.this.type
. - ASF's query is now
B.this.type
inB
as seem from the prefixS = x1.type
. The prefixS'
isϵ.type
. By the same reasoning as explainedthis
resolution section, first Rule 4 applies to the singletion typeB.this.type
with the classB
extracted from it (soD = B
) and then Rule 4.1 applies so the answer for this ASF's step is the singleton typex1.type
(the prefixS
). - ASF completes the application of Rule 5 by returning
x1.type#U
as its final answer.
Now we're done with ASF and we select the member U
of either the class BB
, or the class B
because both classes have instances that are base types of the type x1.type
(see SLS 3.4.3 for definition how members of a type are defined). We pick U
as a member of of the class BB
because member U
of the class BB
subsumes member U
of the class B
(see SLS 5.1.4. And we get X
as the final answer by resolving the type alias U
defined in the class BB
.
Thanks to Nada Amin for raising the bar on my note taking skills and Adriaan Moors for comments.
Wait, the empty path? Isn't it
S' = Rule2
, or do I miss something? I guess you're ignoring that. Also, technically it seems we're in fact asking ASF forV
, notA.V
. I know what you mean, but I'm asking just to be sure I'm following along how to apply these rules formally. I'm also not 100% sure on what the empty path means.