Understanding GraphQL engine implementations

When we talk about advantages of GraphQL we often hear one-liners such as “only fetch what you need,” “only requires one generic endpoint,” “data source agnostic,” or “more flexibility for developers.” But as always things are more subtle than a one-liner could ever describe.

Generic and flexible are the key words here and it’s important to realize that it’s hard to keep generic APIs performant. Performance is the number one reason that someone would write a highly customized endpoint in REST (e.g. to join specific data together) and that is exactly what GraphQL tries to eliminate. In other words, it’s a tradeoff, which typically means we can’t have our cake and eat it too. However, is that true? Can’t we get both the generality of GraphQL and the performance of custom endpoints? It depends!

Let me first explain what GraphQL is, and what it does really well. Then I’ll discuss how this awesomeness moves problems toward the back-end implementation. Finally, we’ll zoom into different solutions that boost the performance while keeping the generality, and how that compares to what we at Fauna call “native” GraphQL, a solution that offers an out-of-the-box GraphQL layer on top of a database while keeping the performance and advantages of the underlying database.

GraphQL is a specification

Before we can explain what makes a GraphQL API “native,” we need to explain GraphQL. After all, GraphQL is a multi-headed beast in the sense that it can be used for many different things. First things first: GraphQL is, in essence, a specification that defines three things: schema syntax, query syntax, and a query execution reference.

Schema syntax: Describes your data and API (the schema)

The schema simply defines what your data looks like (attributes and types) and how it can be queried (query name, parameters, and return types). For example, a todo application could have Todos and a List of todos. If it only provides one way to read this data (e.g., get a list of todos via the id), then that schema would look like this:

// todo-schema.gql
type Todo {
   title: String!
   completed: Boolean!
   list: List
}

type List {
   title: String!
   todos: [Todo] @relation
}

type Query {
   getList(id: ID): List!
}

Query syntax: Specifies how you can query

Once you have a schema that defines how your data and queries look, it’s super easy to retrieve data from your GraphQL endpoint. Do you want a list? Just call the getList query and specify which attributes of the list you want returned.

query {
   getList(id: "<some id>"){
      title
   }
}

Do you want to join that data with todos? No problem! Just add a nested selection to the query.

query {
   getList(id: "<some id>"){
      title
      todos {
         title
         completed
      }
   }
}

But how does this join happen? Of course we did not yet define how this query actually maps to data that comes from our data source.

Query execution reference: A reference implementation for execution

It’s relatively easy to understand the schema and see how you can query. It’s harder to understand what the performance implications are since there are so many different implementations out there. GraphQL is not an implementation to retrieve your data. Rather, GraphQL provides guidelines on how a request query should be broken down into multiple “resolvers” and turned into a response.

Resolvers define how one element of the query (what GraphQL calls a field) can be turned into data. Then, depending on the framework or GraphQL provider, you either implement these resolvers yourself or they are provided automagically. When resolvers are provided for you, the implementation will determine whether we can talk about native GraphQL or not. In essence, the resolvers are just functions that have a certain signature. In JavaScript, such a resolver could look like this:

function someresolver(obj, args, context, info) {
   return // do something to get your data 
}

And each of our “fields” will have a corresponding resolver. Fields are just the attributes in your schema.

// todo-schema.gql
type Todo {
   title: String!
   completed: Boolean!
   list: List
}

type List {
   title: String!
   todos: [Todo] @relation
}

type Query {
   getList(id: ID): List!
}

Each of these fields will have a resolver function (either generated by the library or implemented manually). The execution of a GraphQL query starts at a root field resolver, in this case, getList. Since getList promises to return a List, we will also need to fill in the fields of the List; therefore we need to call the resolvers for these fields and so on. In essence, it’s a process of recursive function calls. Let’s look at an example:

query {
   getList(id: "<some id>"){
      todos {
         title
      }
   }
}

For the above query, we would traverse three fields, each with a resolver function:

  • getList returns a List with only the field todos
  • todos receives the List item from getList and returns a list of Todos related to that list.
  • title receives a Todo item from the todos resolver and returns a string from the title.

This is in itself a very elegant recursive way to answer the query. However, we will see that the choices made in the actual implementation will have a huge impact on performance, scalability, and the behavior of your API.

Patterns for writing resolvers

In order to understand the different approaches, we need to learn how to get started building a GraphQL execution engine. The syntax of that depends on the server library we choose, but each library adheres to the resolver guidelines described above.

Resolvers are just functions

As we have explained, implementing a GraphQL engine is all about implementing functions called resolvers with a specific signature.

function someresolver(obj, args, context, info) {
   return // do something to get your data 
}

The arguments serve different purposes. The ones that are most important for this implementation are:

  • obj is the previous object. In the previous example, we mentioned that the todos resolver receives the List object that was resolved by getList. The object parameter is meant to pass on the result of the previous resolver.
  • args is the argument(s). In getList, we pass an ID, so args will be {id: "some id"}.
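
As a minimal sketch of how these two arguments are used, the resolvers below work against a small in-memory data set. The lists and todos objects, their contents, and the resolver names listTodos and todoTitle are assumptions made up for illustration; a real resolver would read from your data source instead.

```javascript
// Hypothetical in-memory data, standing in for a real data source.
const lists = { l1: { id: "l1", title: "Groceries", todoIds: ["t1", "t2"] } };
const todos = {
  t1: { id: "t1", title: "Buy milk", completed: false },
  t2: { id: "t2", title: "Buy bread", completed: true },
};

// Root resolver: there is no previous object, so only args is used.
function getList(obj, args, context, info) {
  return lists[args.id];
}

// Field resolver on List: obj is the List returned by getList.
function listTodos(obj, args, context, info) {
  return obj.todoIds.map((id) => todos[id]);
}

// Field resolver on Todo: obj is one Todo returned by listTodos.
function todoTitle(obj, args, context, info) {
  return obj.title;
}

// Calling the resolvers by hand, in the order an engine would call them.
const list = getList(undefined, { id: "l1" });
const titles = listTodos(list, {}).map((todo) => todoTitle(todo, {}));
console.log(titles); // [ 'Buy milk', 'Buy bread' ]
```

Note how each resolver only needs the result of the one before it, which is exactly what obj carries.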

And these functions form a resolver chain

We can’t do much with one resolver function, and our GraphQL server library will need to know how to map a query to the different resolvers and how we delegate work from one resolver to the next resolver. Each library has a slightly different syntax to specify this, but the general idea remains the same. For example, one syntax to specify the mapping of queries to resolvers could look as follows:

{
   Todo: {
      title(obj, args, context, info) { ... },
      completed(obj, args, context, info) { ... },
      list(obj, args, context, info) { ... }
   },

   List: {
      title(obj, args, context, info) { ... },
      todos(obj, args, context, info) { ... }
   },

   Query: {
      getList(obj, args, context, info) { ... }
   }
}

If we write the query below, we can match it to resolvers by first looking into the root resolvers (the ones in Query), where we will find the getList resolver. Since that resolver returns a List, the GraphQL execution engine knows it needs to search in List for the resolver of the todos field.

query {
   getList(id: "<some id>"){
      todos {
         title
      }
   }
}

The way this resolves is called the resolver chain.

[Figure 1: GraphQL resolver chain (Fauna)]
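
To make the chain concrete, here is a rough sketch of how an engine might walk a resolver map. It is not how any particular GraphQL library is implemented: the query is assumed to be already parsed into a plain selection object, and the db data, the returnTypes lookup, and the execute function are all hypothetical stand-ins for what a real engine derives from the schema.

```javascript
// Hypothetical in-memory data, standing in for a real data source.
const db = {
  lists: { l1: { title: "Groceries", todoIds: ["t1", "t2"] } },
  todos: { t1: { title: "Buy milk" }, t2: { title: "Buy bread" } },
};

// Resolver map keyed by type name, then by field name.
const resolvers = {
  Query: { getList: (obj, args) => db.lists[args.id] },
  List: { todos: (obj) => obj.todoIds.map((id) => db.todos[id]) },
  Todo: { title: (obj) => obj.title },
};

// Which type each field returns; a real engine reads this from the schema.
const returnTypes = { "Query.getList": "List", "List.todos": "Todo" };

// The query from the text, already parsed into a plain selection tree.
const selection = {
  getList: { args: { id: "l1" }, fields: { todos: { fields: { title: {} } } } },
};

// Resolve every selected field of obj, then recurse into its children.
function execute(typeName, obj, fields) {
  const result = {};
  for (const [name, node] of Object.entries(fields)) {
    const value = resolvers[typeName][name](obj, node.args || {});
    const childType = returnTypes[`${typeName}.${name}`];
    if (!node.fields) {
      result[name] = value; // leaf field: nothing left to resolve
    } else if (Array.isArray(value)) {
      result[name] = value.map((item) => execute(childType, item, node.fields));
    } else {
      result[name] = execute(childType, value, node.fields);
    }
  }
  return result;
}

const response = execute("Query", undefined, selection);
console.log(JSON.stringify(response));
```

The recursion in execute is the resolver chain: each level passes its result down as obj for the next level.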

Now that we know how GraphQL libraries want us to write resolvers, we can start thinking about how it affects performance.

The resolver chain is more like a resolver tree

The above explanation might make you think that resolving a GraphQL query is quite linear, but in fact resolver chains are more like chains that keep on splitting… Oh wait, that’s just called a tree!

[Figure 2: GraphQL resolver chain (Fauna)]

GraphQL approach #1: The naive implementation

When we implement resolvers naively in this rather elegant recursive system we can easily write an API that is slow. By playing human interpreter on the previous query we can see how a naive implementation results in a tree-like execution plan that sends many queries to the database.

[Figure 3: GraphQL naive implementation (Fauna)]

Although there are only two database calls in the above implementation, one of these calls, getTodo(id), is called for each Todo that is associated with the List. That means that we will be hammering our database with 1 (getting the list) + N (getting each todo) calls and joining all this data in the back end. This is known as the N+1 problem.
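
The effect is easy to reproduce with a sketch. The in-memory tables and the functions getListFromDb and getTodoFromDb below are hypothetical stand-ins for real database calls; the only point is to count how often the "database" is hit.

```javascript
// Hypothetical in-memory "database" that counts every call made to it.
let dbCalls = 0;
const listsTable = { l1: { title: "Groceries", todoIds: ["t1", "t2", "t3"] } };
const todosTable = {
  t1: { title: "Buy milk" },
  t2: { title: "Buy bread" },
  t3: { title: "Buy eggs" },
};

function getListFromDb(id) {
  dbCalls += 1;
  return listsTable[id];
}

function getTodoFromDb(id) {
  dbCalls += 1;
  return todosTable[id];
}

// Naive resolvers: the todos resolver fetches each Todo on its own.
const naiveResolvers = {
  Query: { getList: (obj, args) => getListFromDb(args.id) },
  List: { todos: (obj) => obj.todoIds.map((id) => getTodoFromDb(id)) },
};

const list = naiveResolvers.Query.getList(undefined, { id: "l1" });
const todos = naiveResolvers.List.todos(list);

// 1 call for the list plus N = 3 calls for its todos.
console.log(dbCalls); // 4
```

With three todos the difference is negligible, but the number of calls grows linearly with the size of every list that a query touches.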

Remember that, in contrast, a clean REST API would allow us to call each of these resources separately from the front end and require us to join these somewhere (e.g. in the front end) or require us to build custom endpoints. There were two things we wanted to improve upon by using GraphQL: 

  • Multiple calls from the front end, requiring that the data also has to be joined in the front end.
  • Workarounds that require you to build a custom endpoint for performance or for each new front end or app requirement. For example, if you have a new UI for your mobile application, it might have different requirements. It might require other joins or fewer attributes.

[Figure 4: REST back-end calls (Fauna)]

An example that depicts the number of calls in a REST back end, both when we follow REST practice by separating endpoints per entity type and when we optimize by writing custom endpoints to join.

Although our naive GraphQL endpoint effectively solves these issues, it creates another problem further down the road.

[Figure 5: GraphQL naive implementation (Fauna)]

The introduction of GraphQL in combination with a naive implementation moves a problem that was visible in the front end to the back end, where it might still be present but is hidden from the API user. In essence, we replaced the problem of multiple REST calls with the N+1 problem. Expressing generic queries is not in itself the hard part, since REST can do similar things with OData. The difficulty is to provide an efficient implementation for such a generic endpoint.

With the naive implementation, we now have an even higher number of queries between the back end and the database, and the results have to be merged in the back end. Joining many small requests requires memory and might cripple our database as we scale. This is clearly not ideal.

GraphQL approach #2: Batching with Dataloader

Facebook has created a great piece of software, called Dataloader, that allows you to solve a part of the problem. If you plug in Dataloader, similar queries will essentially be collected and batched per frame or per tick. Consider the following execution, without Dataloader.

[Figure 6: GraphQL batching with Dataloader (Fauna)]

Dataloader takes in all of these queries, waits for the end of the frame, then combines them into one query.

[Figure 7: GraphQL batching with Dataloader (Fauna)]

Besides that, Dataloader does in-memory caching of certain queries’ results. With Dataloader, we have now improved the number of database calls significantly. However, it only helps for similar queries that fetch the same entity type, which means that we are still joining data in memory and we are still doing multiple calls. If we look at the queries that are sent to the database, the approach now looks like this: 

[Figure 8: GraphQL batching with Dataloader (Fauna)]
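
The batching idea can be sketched in a few lines. To be clear, this is a simplified stand-in written for this article and not Dataloader's actual implementation or API: createBatchLoader, getTodosByIds, and the in-memory table are hypothetical, and the sketch leaves out the caching that Dataloader also provides.

```javascript
// Hypothetical in-memory table and a counter for database round trips.
let roundTrips = 0;
const todosTable = { t1: "Buy milk", t2: "Buy bread", t3: "Buy eggs" };

// One round trip that fetches many todos at once.
async function getTodosByIds(ids) {
  roundTrips += 1;
  return ids.map((id) => todosTable[id]);
}

// Collect every load() requested in the same tick, then fetch them together.
function createBatchLoader(batchFn) {
  let queue = [];
  return function load(key) {
    return new Promise((resolve, reject) => {
      queue.push({ key, resolve, reject });
      if (queue.length > 1) return; // a flush is already scheduled
      queueMicrotask(async () => {
        const batch = queue;
        queue = [];
        try {
          const values = await batchFn(batch.map((item) => item.key));
          batch.forEach((item, i) => item.resolve(values[i]));
        } catch (error) {
          batch.forEach((item) => item.reject(error));
        }
      });
    });
  };
}

const loadTodo = createBatchLoader(getTodosByIds);

// Three separate loads, as three todo resolvers would issue them.
const pending = Promise.all([loadTodo("t1"), loadTodo("t2"), loadTodo("t3")]);
pending.then((titles) => console.log(titles, roundTrips));
```

The three loads end up in a single call to getTodosByIds. Fetching the list itself would still be a separate call, and the results would still be joined in memory.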

This is already great, but we can do much better. After all, joining in memory in the back end is just not scalable as data grows and our schema becomes more complex, requiring that multiple entities be fetched. For example, consider the following model from a Twitter-based example application called Fwitter.

[Figure 9: GraphQL batching with Dataloader (Fauna)]

We need to retrieve not only the Fweets (which are like Twitter’s tweets), but their comments, hashtags, statistics, author, original fweets in case of a refweet, etc. We face a complex in-memory join, and again, many database calls. 

GraphQL approach #3: Generating the query

Ideally, if possible, we want a GraphQL query that translates into one database query. Sounds simple right? However, we’ll see that code generation can be hard or impossible depending on the underlying data storage and query language.

[Figure 10: GraphQL generating the query (Fauna)]

For starters, this approach can raise many questions. Do we generate one query or multiple queries? Can we even generate one query that expresses a complex GraphQL statement? If we generate a big query with many joins, is it still efficient?

Since native GraphQL is a special case of “generating the query,” let’s first (and finally) define what we at Fauna call “native” GraphQL. We’ll answer these questions in the subsequent section, “Native GraphQL challenges.”

GraphQL approach #4: Native GraphQL

We can take the previous approach one step further and run this translation layer as a part of our infrastructure close to the database. That means that an extra hop will be eliminated. And if our database is multi-region, our GraphQL API will not lose the latency benefits of that.

If the GraphQL layer is a part of the database and allows you to do highly efficient queries straight from your application, then your response latencies will be much better. You’ll be fetching all the entities you need in one hop and only one call. 

[Figure 11: Native GraphQL (Fauna)]

At Fauna, we consider a GraphQL implementation native when the GraphQL layer lives on the infrastructure of the database and adheres to the following conditions:

  • One GraphQL query = one database query = one transaction
  • GraphQL queries offer the same ACID guarantees as the underlying database
  • The underlying database allows efficient execution of such queries

In FaunaDB, native GraphQL means that your queries have the same consistency guarantees because one query also results in one transaction. Besides that, the same scalability and multi-region guarantees apply because there is no additional system, service, or back end in between that must be scaled separately. Queries written in GraphQL are just as fast and scalable as queries written in our own Fauna Query Language (FQL).

The last question that remains is, why doesn’t every approach that generates queries offer native GraphQL? That question requires us to investigate the difficulties of mapping GraphQL to a query. 

Native GraphQL challenges

If the main ingredient is code generation close to the data, then there should be many ways to do that, right? As I noted above, the problem is that code generation is not always simple. How difficult it is and whether it’s even feasible depends on the underlying database and query language. Let’s look at the challenges. 

Absence of relations

The most obvious problem would be the lack of relations. Some newer scalable databases like Firebase, Firestore, MongoDB, and DynamoDB do not focus on providing relations, which makes it infeasible to transform a GraphQL query into one database query. The only way out is to do multiple queries and joins in memory in a different location such as the back end.

Generation capabilities of the query language

Query languages are often created to provide easy ad hoc data exploration. They are therefore usually declarative and rarely intended to be generated. In many query languages, it’s therefore extremely difficult to generate an optimal query or even a complex query. For example, there is an impressive project called Join Monster that generates SQL queries from GraphQL. Writing something like that is a significant endeavor that requires arcane skills in string concatenation.

But wait, doesn’t an Object Relational Mapper (ORM) solve this problem? Well, not exactly. An ORM helps us map a GraphQL query to objects. But these objects will also have the same tree-like relations as the resolvers. Hence, although we’ve delegated the complex generation to the ORM tool, the ORM still has to do the work of translating your ORM queries to the underlying query language.

It’s true that ORM tools are great at that and do perform intelligent optimizations, but they will typically decide at a certain moment to break up the query into multiple subqueries due to the complexity of the generation, or more importantly, due to performance implications. This is the next and probably most important point.

Performance of the execution model

When there are many joins, these joins might become a performance issue for the underlying database. In traditional SQL databases, one big query is not always faster than multiple small queries. As a result, a query is typically broken down into multiple queries, which brings us again to the realm of multiple queries and joins in memory. Granted, keeping the splitting of queries to a minimum can yield a huge performance boost. But the solution will not be able to deliver consistency guarantees and will be harder to scale.

The problem lies in the implementation of these joins and boils down to the low-level details of join implementations. It’s the same reason why a graph database is better at certain workloads (and worse at others) than a traditional RDBMS. In an RDBMS, a join typically works on indexes by using an algorithm (nested loops or hash joins). Long story short: Such algorithms are great for joining two giant sets of data together efficiently, but they become less efficient when many joins are in play. In essence, there is a mismatch between the GraphQL join and the SQL join; the RDBMS is not made for the “tree-walking strategy” required by GraphQL.

What is required for native GraphQL?

Although FaunaDB is not a pure graph database, it shares a few similarities with a graph database. One of those similarities is something that graph databases call index-free adjacency. In simple terms, this means that you can directly link different objects together in storage using references.

[Figure 12: GraphQL requirements (Fauna)]

Instead of looping through an index or building a hash (nested loops or hash joins), FaunaDB simply walks through the list of references and dereferences them as it encounters them. In other words, FaunaDB does not need to do joins in this scenario because it offers an alternative solution.  
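
The contrast between the two strategies can be illustrated in memory. The sketch below says nothing about how FaunaDB or any RDBMS is actually implemented; the data, hashJoin, and walkReferences are hypothetical, and a plain object lookup stands in for following a stored reference.

```javascript
// Hypothetical data. In the relational layout, todos point at their list
// through a foreign key; in the reference layout, the list holds direct
// references to its todos.
const listRows = [{ id: "l1", title: "Groceries" }];
const todoRows = [
  { id: "t1", listId: "l1", title: "Buy milk" },
  { id: "t2", listId: "l1", title: "Buy bread" },
];

// Hash join: build a hash table on the join key, then probe it per list.
function hashJoin(lists, todos) {
  const byListId = new Map();
  for (const todo of todos) {
    if (!byListId.has(todo.listId)) byListId.set(todo.listId, []);
    byListId.get(todo.listId).push(todo);
  }
  return lists.map((list) => ({
    title: list.title,
    todos: (byListId.get(list.id) || []).map((todo) => todo.title),
  }));
}

// Reference walk: documents keyed by reference, followed one by one.
const documents = {
  l1: { title: "Groceries", todos: ["t1", "t2"] },
  t1: { title: "Buy milk" },
  t2: { title: "Buy bread" },
};

function walkReferences(listRef) {
  const list = documents[listRef];
  return {
    title: list.title,
    todos: list.todos.map((ref) => documents[ref].title),
  };
}

const joined = hashJoin(listRows, todoRows)[0];
const walked = walkReferences("l1");
console.log(JSON.stringify(joined) === JSON.stringify(walked)); // true
```

Both produce the same result, but the hash join has to process the whole todo set to build its table, while the reference walk only touches the documents that the list points to.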

Walking the tree and dereferencing the references

To walk through a list of references we have a convenient Map function, and to dereference a reference we have a Get function. The pattern to loop through a list and dereference would look as follows.

Map(
   ListOfReferences,
   Lambda('ref', Get(Var('ref')))
)

Imagine that the result of this Get is also a list. Well, no problem. We can just Map and Get over that list too. It’s just like a regular programming language.

Due to the way that this works and the fact that pagination is baked into FQL and therefore also into our GraphQL, we do not have problems executing big joins. The Map and Get pattern works similarly to how a graph database would execute the query (simply by following the references).

[Figure 13: GraphQL requirements (Fauna)]

If we look at this pattern and then remember how the resolvers in GraphQL work recursively, we can see that this is very similar. So, it’s not surprising that FQL maps very well onto GraphQL. The only difference is that this is not happening in memory or in your back end. It’s all happening in the database in one query and one transaction, which dramatically reduces the round trips between your database and the GraphQL client.

In FaunaDB’s native GraphQL implementation, each resolver function resolves a field into an FQL expression. Then, once all the fields are resolved, all these snippets of FQL form one bigger FQL query that can be executed as a whole. In our case, this is made easy thanks to a functional and composable query language.

A functional and composable query language

The previous example might look verbose at first and it’s no doubt different from what you are used to. When the Fauna Query Language was designed, the vision was for it to be a language that could do more than just querying. We wanted to use FQL in user-defined functions (stored procedures) as well as complex conditional transactions because we believe that it makes sense to keep your business logic as close to where the data lives as possible. Traits like this make FQL more of a general purpose programming language than other query languages.

A second requirement was to empower our users to write their own tools on top of FQL, which required the language to be highly composable. The perfect language for that is an expression-oriented functional language. In contrast to many query languages, FQL is not declarative. That means that you do not just write what you want to retrieve and let the database figure it out. Instead, just as in a normal programming language, you write exactly how you want to retrieve the data.

Finally, more and more databases today are being accessed directly from the front end or from a serverless function, and serving clients all over the world. Due to the potential distance between the application and the database, it was important for FQL to be able to fetch data and execute complex logic in one query. That sounds familiar, right? It’s exactly one of the reasons why you would use GraphQL as well.  

The first important insight is that the components of our queries are just functions in the language that we are currently using (in our case JavaScript). This means that we can easily put snippets of FQL in functions or assign them to variables and compose them using the constructs of our host language. The only way to see how easy it is to generate complex queries in FQL is to implement one ourselves, so let’s take the previous query and start implementing it. 

query {
   getList(id: "<some id>"){
      todos {
         title
      }
   }
}

We want to fetch lists by ID so we start off by getting the index. In FaunaDB, indexes are mandatory, which makes it incredibly difficult to write an inefficient GraphQL query. Further, FaunaDB generates all of these indexes for you behind the scenes.

We start off with the Index function to get the index.

Index('list_by_id')

And then place it in a Match function to get the list reference.

Match(Index('list_by_id'), "list id")

This returns Fauna references. From those, we can then get the actual List objects by mapping over this list of IDs and calling Get. The fact that we support the language constructs such as Map that you often see in regular programming languages already gives a sneak preview why it’s so easy to map a GraphQL query to FQL.

Map(
 Paginate(
   Match(Index('list_by_id'), "list id")
 ),
 Lambda('id', Get(Var('id')))
)

If we want to get Todos as well, no problem. We will use Let to structure our query and delegate the retrieval of todos to another function. And this is where it becomes interesting. All we are doing here is defining the query; we are not executing anything yet. If we were implementing this in JavaScript (for which we have an FQL driver), we could now just start using JavaScript variables to compose our query.

const query = Map(
 Paginate(
   Match(Index('list_by_id'), "list id")
 ),
 Lambda('id',
   Let({
       list: Get(Var('id')),
       todos: getTodos(Select(['todos'], Var('list')))
     },
     // Return the variables
     {
       todos: Var('todos'),
       list: Var('list')
     }
   )
 )
)

This means that getTodos could be implemented similarly and will just be placed into the already existing query.

function getTodos(todos){
 return Map(
   todos,
   Lambda('todoId', Get(Var('todoId')))
 )
}

Since we are not constructing a string but just placing functions in other functions, we can now break up this query much more easily. Generating a generic query for a complex model becomes trivial. The root resolver can then just call the complete query. So, in our native GraphQL, what is actually happening behind the scenes is that each GraphQL query is translated into one query in our native language (FQL).
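
To see why composing values is easier than concatenating strings, consider the following sketch. It deliberately does not use the Fauna driver: call and render are hypothetical helpers that build and print a plain expression tree, so the example runs without a database and nothing is executed. The FQL function names are taken from the snippets above.

```javascript
// Stand-in for a query builder: each call returns a plain object describing
// an expression. This is NOT the Fauna driver; nothing here is executed.
const call = (fn, ...args) => ({ fn, args });

// Render an expression tree as FQL-like text so the result can be inspected.
function render(expr) {
  if (Array.isArray(expr)) return `[${expr.map(render).join(", ")}]`;
  if (expr && expr.fn) return `${expr.fn}(${expr.args.map(render).join(", ")})`;
  if (expr && typeof expr === "object") {
    const entries = Object.entries(expr).map(([k, v]) => `${k}: ${render(v)}`);
    return `{ ${entries.join(", ")} }`;
  }
  return JSON.stringify(expr);
}

// Each helper returns an expression, so helpers nest like ordinary functions.
function getTodos(todoRefs) {
  return call("Map", todoRefs, call("Lambda", "todoId", call("Get", call("Var", "todoId"))));
}

function getListWithTodos(listId) {
  return call(
    "Map",
    call("Paginate", call("Match", call("Index", "list_by_id"), listId)),
    call("Lambda", "id", call("Let",
      {
        list: call("Get", call("Var", "id")),
        todos: getTodos(call("Select", ["todos"], call("Var", "list"))),
      },
      { todos: call("Var", "todos"), list: call("Var", "list") }
    ))
  );
}

const query = getListWithTodos("list id");
console.log(render(query));
```

The expression returned by getTodos is simply nested inside the larger one, and the whole tree remains a single query that could be sent in one request.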

Native GraphQL guarantees

GraphQL is so easy to use that it is quickly becoming the language of choice to query a database. The fact that a familiar language can be used to retrieve data from many different databases and APIs is a very positive development. However, the success of GraphQL came at the price of giving up knowledge of how our data is retrieved.

Before GraphQL, the REST API that we accessed, or the SQL and/or the query plans the SQL generated, gave us a clear indication of how our API would perform, scale, and behave (including which guarantees, e.g. ACID, it provided). Being only one endpoint and one familiar language with many different implementations, GraphQL unintentionally obfuscates this. When we use a library/framework/GraphQL provider, the implementation often becomes a black box.

The term native GraphQL indicates that the provided GraphQL API has some desired properties. We found it important to clearly indicate what these properties are and how we implemented them in FaunaDB. We’ve explained which issues you might encounter in do-it-yourself approaches or approaches based on ORMs or SQL generation, and we’ve shown why we did not encounter these problems and why we are able to offer something that we call native GraphQL.

To us at Fauna, native GraphQL is an API that adheres to the same guarantees as the underlying database. In that sense, native GraphQL is indistinguishable from the database query language in terms of performance, scalability, and ACID guarantees.

Brecht De Rooms is senior developer advocate at Fauna. He is a programmer who has worked extensively in IT as a full-stack developer and researcher in both the startup and IT consultancy worlds. It is his mission to shed light on emerging and powerful technologies that make it easier for developers to build apps and services that will captivate users.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
