2015-01-29 Unofficial Relay FAQ
Compilation of questions and answers about Relay from React.js Conf.
Disclaimer: I work on Relay at Facebook. Relay is a complex system on which we're iterating aggressively. I'll do my best here to provide accurate, useful answers, but the details are subject to change. I may also be wrong. Feedback and additional questions are welcome.
Relay is a new framework from Facebook that provides data-fetching functionality for React applications. It was announced at React.js Conf (January 2015).
Each component specifies its own data dependencies declaratively using a query language called GraphQL. The data are made available to the component via properties on this.props.
Developers compose these React components naturally, and Relay takes care of composing the data queries into efficient batches, providing each component with exactly the data that it requested (and no more), updating those components when the data changes, and maintaining a client-side store (cache) of all data.
GraphQL is a data querying language designed to describe the complex, nested data dependencies of modern applications. It's been in production use in Facebook's native apps for several years.
GraphQL itself is an engine for mapping from queries to the code that is responsible for actually fetching the data, so it is agnostic about what underlying storage is used. Relay uses GraphQL as its query language, but it is not tied to a specific implementation of GraphQL.
By co-locating the queries with the view code, the developer can reason about what a component is doing by looking at it in isolation; it's not necessary to consider the context where the component was rendered in order to understand it. Components can be moved anywhere in a render hierarchy without having to apply a cascade of modifications to parent components or to the server code which prepares the data payload.
Co-location leads developers to fall into the "pit of success", because they get exactly the data they asked for and the data they asked for is explicitly defined right next to where it is used. This means that performance becomes the default (it becomes much harder to accidentally over-fetch), and components are more robust (under-fetching is also less likely for the same reason, so components won't try to render missing data and blow up at runtime).
Relay provides a predictable environment for developers by maintaining an invariant: a component won't be rendered until all the data it requested is available. Additionally, queries are defined statically (ie. we can extract queries from a component tree before rendering) and the GraphQL schema provides an authoritative description of what queries are valid, so we can validate queries early and fail fast when the developer makes a mistake.
The other thing Relay does to prevent errors is "masking" of the data it passes into each component: only the fields of an object that a component explicitly asks for will be accessible to that component, even if other fields are known and cached in the store (because another component requested them). This makes it impossible for implicit data dependency bugs to exist latently in the system.
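To make masking concrete, here is a minimal sketch. The names `maskRecord` and `cachedUser` are hypothetical and are not Relay's API; the sketch only shows a component receiving the fields it asked for, even though the store holds more.

```javascript
// Hypothetical sketch of data masking; none of these names are Relay's API.

// Return a read-only copy of `record` exposing only the requested fields.
function maskRecord(record, requestedFields) {
  const masked = {};
  for (const field of requestedFields) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      masked[field] = record[field];
    }
  }
  return Object.freeze(masked);
}

// The store knows three fields because different components asked for them.
const cachedUser = {
  id: '4',
  name: 'Alice',
  profilePictureUri: 'https://example.com/a.jpg',
};

// A component that asked only for `name` sees only `name`.
const nameOnlyProps = maskRecord(cachedUser, ['name']);
console.log(nameOnlyProps.name);              // 'Alice'
console.log(nameOnlyProps.profilePictureUri); // undefined
```

If the component later starts reading `profilePictureUri` without requesting it, it gets `undefined` straight away, rather than appearing to work only because some other component happened to fetch that field.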
Note that co-location in itself isn't the end goal here. At the moment our queries reside, explicitly, in the components, but through the power of static analysis (specifically, Flow and the type information encoded in the GraphQL schema), you can imagine a state where the queries can be inferred via an analysis of which subcomponents a component renders (and therefore, which subqueries need to be composed into the component query) and which properties it itself accesses. If we can get there, then over-fetching and under-fetching will go from "unlikely" to outright impossible.
By handling all data-fetching via a single abstraction, we're able to handle a bunch of things that would otherwise have to be dealt with repeatedly and pervasively across the application:
- Performance: All queries flow through the framework code, where things that would otherwise be inefficient "N+1" query patterns get automatically collapsed and batched into efficient, minimal queries. Likewise, the framework knows which data have already been fetched and which requests are currently "in flight", so queries can be automatically de-duplicated and reduced to the minimum that still needs to be sent.
- Subscriptions: All data flows into a single store, and all reads from the store are via the framework, so the framework knows which components care about which data and should be re-rendered when data changes; components never have to set up individual subscriptions.
- Common patterns: We can make common patterns such as pagination easy (this is the example that Jing gave at the conference); if you have 10 records initially, getting the next page just means declaring you want 15 records in total, and the framework automatically constructs the minimal query to grab the delta between what you have and what you need, requests it, and re-renders your view when the data becomes available.
- Simplified server implementation: Rather than having a proliferation of end-points (per action, per route), a single GraphQL endpoint can serve as a facade for any number of underlying resources.
- Uniform mutations: There is one consistent pattern for performing mutations (writes), and it is conceptually baked into the data querying model itself. You can think of a mutation as a query with side-effects: you provide some parameters that describe the change to be made (eg. attaching a comment to a record) and a query that specifies the data you'll need to update your view of the world after the mutation completes (eg. the comment count on the record), and the data flows through the system using the normal flow. We can do an immediate "optimistic" update on the client (ie. update the view under the assumption that the write will succeed), and finally commit it or roll it back in the event of an error when the server payload comes back.
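The optimistic update described under "Uniform mutations" can be sketched roughly as follows. `OptimisticStore` and its methods are hypothetical names invented for illustration, not Relay's mutation API; the point is only that the view reads committed data with pending patches layered on top, and that a patch is either replaced by the server payload or dropped.

```javascript
// Hypothetical sketch of an optimistic update; not Relay's mutation API.
class OptimisticStore {
  constructor(initial) {
    this.committed = { ...initial }; // data confirmed by the server
    this.pending = new Map();        // mutation id -> optimistic patch
    this.nextId = 0;
  }

  // Apply a patch immediately, before the server has answered.
  applyOptimistic(patch) {
    const id = this.nextId++;
    this.pending.set(id, patch);
    return id;
  }

  // Server confirmed: replace the optimistic patch with the server payload.
  commit(id, serverPayload) {
    this.pending.delete(id);
    Object.assign(this.committed, serverPayload);
  }

  // Server reported an error: drop the optimistic patch.
  rollback(id) {
    this.pending.delete(id);
  }

  // What the view reads: committed data with pending patches on top.
  read() {
    return Object.assign({}, this.committed, ...this.pending.values());
  }
}

const story = new OptimisticStore({ commentCount: 2 });
const mutationId = story.applyOptimistic({ commentCount: 3 });
console.log(story.read().commentCount); // 3 (optimistic)
story.rollback(mutationId);
console.log(story.read().commentCount); // 2 (rolled back)
```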
In some ways Relay is inspired by Flux, but the mental model is much simpler. Instead of multiple stores, there is one central store that caches all GraphQL data. Instead of explicit subscriptions, the framework itself can track which data each component requests, and which components should be updated whenever the data change. Instead of actions, modifications take the form of "mutations".
There may be use cases where you let Relay manage the bulk of the data flow for your application, but use a Flux store on the side to handle a subset of application state. For example: with Relay, you may be able to model all of your workflow in terms of GraphQL queries (for reading data) and mutations (for writing data). But there may be times where you have "draft" mutations that you want to build up over time on the client (for example, think of a "wizard" workflow) and for some reason you don't want to persist this state to the server or use a real mutation; you could manage these using Flux (and store them ephemerally in an in-memory data structure, or in local storage). We may eventually end up baking some of these workflows into Relay itself, but in the meantime, nothing will prevent you from blending it with other approaches.
Relay does have a notion of routes and routing, but it's one of the APIs that we're currently improving so I'll keep away from details (which may change) and try to speak in generalities.
Relay uses routes to determine which data to fetch to render a given component (it's possible for a component to be composed on any number of different views).
The data required to render a particular view is a function of the route and any query params that may be supplied in the route or by the component itself.
You can think of the route as a URI, which itself may contain "query params" (not necessarily part of a URI query string; the params may be embedded as path components within the URI).
Right now we have our own routing. We've been talking to the react-router people to see if there's a way we can make things work together.
You said that an invariant was that a component won't be rendered until the data it requested is available. How would I render things like placeholders and loading indicators in this case?
Relay allows you to mark part of a query as "deferred". Anything which is not marked as deferred is considered to be required. Relay won't render a component until all of its required data is available. We provide an API for components to check whether their deferred data is available, missing, or on the way. Using these primitives, you can build interfaces which do things like immediately show their navigation and core content, and subsequently load in comments, while showing a loading indicator in the meantime.
Each GraphQL query begins with a "root call" which allows us to start at a particular node, or set of nodes, in the graph. From there, "fields" in the query describe what information we want on these objects, as well as the fields we want on the subsequent arbitrarily nested objects.
The backbone of the GraphQL engine is a schema, created from definition classes that contain rich metadata, which provide two things: (1) a description of all the possible fields, relationships and types that can be represented in a valid query; and (2) the mapping from fields to the actual retrieval mechanism. These classes can be used to wrap business objects, or services, or any other data source we wish to expose to GraphQL.
The GraphQL engine parses queries into an AST representation. Given this tree and the schema, an executor traverses the nodes, using the definitions from the schema to retrieve objects and to access fields on the retrieved objects (for example, a field may map to a property on the object, or to a function that computes derived data or itself performs an arbitrary call to another service).
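As a rough illustration of that traversal, here is a small sketch. The shapes of `schema` and `ast`, the `users` table, and the `execute` function are all assumptions made up for this example; they are not the real engine's data structures. The sketch skips parsing and starts from an already-built AST.

```javascript
// Hypothetical sketch of a query executor; schema and AST shapes are assumptions.

// Stand-in for whatever storage backs the schema.
const users = {
  1: { name: 'Ada Lovelace', bestFriendId: 2 },
  2: { name: 'Alan Turing', bestFriendId: 1 },
};

// Schema: for each type, how to resolve each field and what type it yields.
const schema = {
  User: {
    // A field that maps to a property on the object.
    name: { resolve: (user) => user.name },
    // A derived field computed by a function.
    initials: {
      resolve: (user) => user.name.split(' ').map((word) => word[0]).join(''),
    },
    // A field that retrieves another object.
    bestFriend: { type: 'User', resolve: (user) => users[user.bestFriendId] },
  },
};

// Walk the AST, resolving each field through its schema definition.
function execute(typeName, source, selections) {
  const result = {};
  for (const selection of selections) {
    const definition = schema[typeName][selection.field];
    if (!definition) {
      // The schema is authoritative, so invalid queries fail fast.
      throw new Error(`Unknown field "${selection.field}" on type ${typeName}`);
    }
    const value = definition.resolve(source);
    result[selection.field] = selection.selections
      ? execute(definition.type, value, selection.selections)
      : value;
  }
  return result;
}

// AST for a query along the lines of: node(1) { name, bestFriend { initials } }
const ast = [
  { field: 'name' },
  { field: 'bestFriend', selections: [{ field: 'initials' }] },
];

console.log(execute('User', users[1], ast));
// { name: 'Ada Lovelace', bestFriend: { initials: 'AT' } }
```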
This is made performant by a combination of two things: pervasive caching (so that if the same data is requested multiple times at different places in the tree, the actual data is fetched only once) and extensive use of asynchronous primitives (async/await) which enable us to effectively parallelize and batch operations.
In the video's example, they said they use a node(id) root call, but it's not clear how to determine the model for a given id.
That's all going to be implementation-specific. For example, at Facebook the two root calls that we most commonly use are:
- viewer(): specifies an object representing the current viewer; and you can imagine a query that looks something like viewer() {news_feed.first(10) { ... }}
- node(): returns an object from the Facebook Graph specified by a Facebook ID; this could actually be any of a number of things (such as a page, or a person) that conform to the Node interface
Another thing to note is that IDs passed to the node() root call don't have to be integers, they can be opaque strings that map to pretty much anything you want, so there is considerable flexibility for mapping GraphQL concepts onto those from existing storage systems.
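One way to picture this flexibility is the sketch below. The `"Type:key"` ID format, the `loaders` table, and `resolveNode` are assumptions for illustration only; they say nothing about how Facebook IDs actually work. To the client the ID is just an opaque string, and only the server-side adapter knows how to decode it.

```javascript
// Hypothetical sketch: IDs are opaque strings of the form "Type:key".
// The ID format and the loaders are assumptions, not Facebook's scheme.
const loaders = new Map([
  ['Person', (key) => ({ kind: 'Person', id: `Person:${key}`, name: `Person #${key}` })],
  ['Page', (key) => ({ kind: 'Page', id: `Page:${key}`, title: `Page ${key}` })],
]);

// Resolve a node(id) root call by decoding the type from the ID.
function resolveNode(id) {
  const separator = id.indexOf(':');
  if (separator === -1) {
    throw new Error(`Malformed id: ${id}`);
  }
  const load = loaders.get(id.slice(0, separator));
  if (!load) {
    throw new Error(`No loader for id: ${id}`);
  }
  return load(id.slice(separator + 1));
}

console.log(resolveNode('Person:42').name);       // Person #42
console.log(resolveNode('Page:relay-faq').title); // Page relay-faq
```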
We'll be releasing a spec describing GraphQL in detail and a reference implementation of the GraphQL engine. Note that the engine itself doesn't actually fetch any data; it will rely on some kind of adapter layer to do that, and there are lots of possibilities there for doing this in different languages, talking to caches, services, databases, ORMs, and embedding business logic etc. We intend to provide examples showing what kinds of patterns are possible.
Yes, absolutely. The grammar for GraphQL is relatively tiny, and the engine itself (parser, executor) is small. We'll be releasing a reference implementation, which I fully expect will be able to be ported to other languages in a straightforward manner.
Syntax in GraphQL looks different from JSX (eg. ${name} instead of {name}). Is that intentional or accidental?
These are ES6 template literals, which we use because they're familiar to ES6 developers. The two main places where you'll see this kind of interpolation are when inserting query params (eg. friends.first(${params.count})) and when composing queries from subcomponents (eg. ${ProfilePicture.getQuery('viewer')}).
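For readers who haven't used template literals, the sketch below shows both kinds of interpolation with ordinary strings. The `ProfilePicture` object, its `getQuery` return value, and `buildFriendsQuery` are hypothetical stand-ins, and the query text is illustrative rather than guaranteed to be valid GraphQL.

```javascript
// Plain ES6 template literals; the component and its getQuery are hypothetical.
const ProfilePicture = {
  getQuery() {
    return 'profile_picture { uri }';
  },
};

function buildFriendsQuery(params) {
  // ${params.count} inserts a query param;
  // ${ProfilePicture.getQuery()} composes in a subcomponent's query.
  return `viewer() { friends.first(${params.count}) { name, ${ProfilePicture.getQuery()} } }`;
}

console.log(buildFriendsQuery({ count: 10 }));
// viewer() { friends.first(10) { name, profile_picture { uri } } }
```

The `${...}` syntax belongs to JavaScript, not to GraphQL or JSX, which is why it looks different from JSX's `{...}`.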
There is some additional magic going on here (our transform pipeline actually embeds the AST directly, as JavaScript objects, in place of the template), but generally developers won't need to worry about that at all.
Jing's slide showed us modifying query params using a this.setQueryParams call, rather than React's this.setState.
This is because setting query params is an inherently async operation. Setting query params may trigger a network request. Multiple setQueryParams calls may be issued before the results of prior calls arrive. Later calls should supersede prior calls. The corresponding fetches may fail or need to be retried. This complexity needs to be abstracted away.
The framework preserves an invariant that it won't try to render a component until all the data it requested are available, so it wouldn't do to only have, say, 10 objects in a list and to trigger a render when the query params have been updated to request 15 objects but the next 5 objects haven't arrived yet.
So, the setQueryParams API provides us with an abstraction behind which to hide the details of all this asynchrony. The framework can track both "current" and "pending" values of query params, and make sure that the component always sees the right value for any given query param (ie. the one that reflects the reality of what we have in the store at render time).
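A minimal sketch of that bookkeeping might look like this. `QueryParamsTracker`, its token scheme, and `onDataArrived` are hypothetical and are not how Relay implements `setQueryParams`; network requests, failures and retries are left out entirely so that the current/pending distinction stays visible.

```javascript
// Hypothetical sketch of tracking "current" and "pending" query params.
// This is not Relay's setQueryParams implementation.
class QueryParamsTracker {
  constructor(initialParams) {
    this.current = { ...initialParams }; // matches data already in the store
    this.pending = null;                 // requested, data not yet arrived
    this.latestRequest = 0;
  }

  // Record the requested params and return a token identifying this request.
  setQueryParams(params) {
    this.pending = { ...this.current, ...params };
    this.latestRequest += 1;
    return this.latestRequest;
  }

  // Called when the data for a request arrives. Superseded requests are ignored.
  onDataArrived(token) {
    if (token !== this.latestRequest || this.pending === null) {
      return false;
    }
    this.current = this.pending;
    this.pending = null;
    return true;
  }
}

const tracker = new QueryParamsTracker({ count: 10 });
const firstRequest = tracker.setQueryParams({ count: 15 });
const secondRequest = tracker.setQueryParams({ count: 20 });
console.log(tracker.current.count);                // 10: still what the store can render
console.log(tracker.onDataArrived(firstRequest));  // false: superseded by the later call
console.log(tracker.onDataArrived(secondRequest)); // true
console.log(tracker.current.count);                // 20
```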
This is an API that we're actively working on right now, so it may change between now and the open source release.
Relay is being used in a few places in production. It's being used with React Native in the Facebook Groups app (currently on the iOS app store) and on an experimental new version of the Facebook mobile website that is currently rolled out to a small number of test users.
Isomorphism is indeed a core part of Relay, mostly thanks to React's isomorphism. Relay's server rendering mode works very similarly to other solutions for React, where it renders to a string instead of DOM and then the client runs React again to inflate the component tree. Since all data fetching happens through the framework, the ability to capture all of the fetched data and ship it to the client to prime the client's store is pretty straightforward.
Since the framework can statically build the GraphQL query without rendering, it's also possible to preload the store in the initial server response, but still render on the client. This is really cool because it parallelizes data fetching and bootstrapping the page.
In addition to these "server" and "preload" modes, we also have a "client" mode in which literally everything is performed on the client (building the query, fetching the data, doing the initial render); this can be very useful for debugging purposes.
Relay currently provides a default implementation of shouldComponentUpdate which is aware of all the Relay-managed data flow and can short-circuit updates when the data haven't changed. Immutability as a general concept is still important (to get those cheap === comparisons) so we try to leverage that as much as possible internally, although there is still more that we can do there.
Relay depends on GraphQL as the query language and assumes your app will talk to an endpoint that speaks GraphQL, but the underlying implementation of the endpoint can be anything you want (including talking to other services which may or may not be RESTful). This means that at the application layer you are no longer thinking in terms of individual "resources" but rather the entire hierarchy of data that your application is going to need (ie. not resources, but trees of resources).
Each component specifies the bit of data that it will need in the form of a query fragment, and the framework takes care of composing all of the fragments into a larger hierarchy that represents the entire query. Because all of this is centralized in the framework, even dealing with massive queries (ie. all the data your entire application needs to render a complex, nested view hierarchy) can be made efficient through caching, batching, re-use, and other means to reduce the size of queries.
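The composition step can be sketched as a merge of field trees. The representation below (a leaf field is `true`, a nested selection is an object) and the names `mergeFragments` and `printSelection` are assumptions chosen for brevity, not Relay's internal format, and the sketch assumes a given field is never a leaf in one fragment and nested in another.

```javascript
// Hypothetical sketch: fragments as nested objects, merged into one query tree.
// A leaf field is `true`; a nested selection is another object.
function mergeFragments(...fragments) {
  const merged = {};
  for (const fragment of fragments) {
    for (const [field, selection] of Object.entries(fragment)) {
      if (selection === true) {
        merged[field] = true;
      } else {
        // Overlapping nested selections are merged rather than repeated.
        merged[field] = mergeFragments(merged[field] || {}, selection);
      }
    }
  }
  return merged;
}

// Print a merged tree as illustrative query text.
function printSelection(tree) {
  return Object.entries(tree)
    .map(([field, selection]) =>
      selection === true ? field : `${field} { ${printSelection(selection)} }`)
    .join(', ');
}

// Two components each declare the bit of data they need.
const headerFragment = { name: true, profile_picture: { uri: true } };
const avatarFragment = { profile_picture: { uri: true, width: true } };

const composed = mergeFragments(headerFragment, avatarFragment);
console.log(`viewer() { ${printSelection(composed)} }`);
// viewer() { name, profile_picture { uri, width } }
```

Both fragments ask for `profile_picture { uri }`, but it appears only once in the composed query.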
We're working very hard right now to get this ready for public consumption and we are super excited about sharing it with you, but we can't say yet when that will be. We'll keep you posted!
You can expect to see blog posts and other materials from us with more details.
Where does the parsing of GraphQL take place: on the client or on the server side? I guess it is implemented on the server side, otherwise the server may respond with more fields than the client requires.