One of the biggest drawbacks of an out-of-the-box GraphQL solution is its tendency to run headfirst into the notorious N+1 query problem. For example, consider the following GraphQL query:
```graphql
{
  patients {
    name
    bed {
      code
    }
  }
}
```
We’re trying to grab all of the `patients` in our system, and for each patient, we also want their associated `bed`.
While that seems simple enough, the resulting database queries are anything but. Using the most obvious resolvers, our GraphQL server would ultimately make N+1 queries, where N represents the number of patients in our system.
```javascript
const resolvers = {
  Query: {
    patients: (_root, _args, _context) => Patients.find({})
  },
  Patient: {
    bed: ({ bedId }, _args, _context) => Beds.findOne({ _id: bedId })
  }
};
```
Our application first queries for all patients (`Patients.find`), and then makes a `Beds.findOne` query for each patient it finds. Thus, we’ve made N (one bed query per patient) + 1 (the patients query) queries.
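To see exactly where the N+1 comes from, here’s a minimal runnable sketch. The `Patients` and `Beds` objects are hypothetical in-memory stand-ins for the real collections, instrumented to count how many “database” queries we issue; the resolver strategy mirrors the naive resolvers above.

```javascript
// Hypothetical in-memory stand-ins for the Patients and Beds collections,
// instrumented to count how many "database" queries we issue.
let queryCount = 0;

const patientRows = [
  { name: "Ada", bedId: 1 },
  { name: "Grace", bedId: 2 },
  { name: "Edsger", bedId: 3 }
];
const bedRows = [
  { _id: 1, code: "A1" },
  { _id: 2, code: "A2" },
  { _id: 3, code: "B1" }
];

const Patients = {
  find: () => {
    queryCount += 1;
    return Promise.resolve(patientRows);
  }
};
const Beds = {
  findOne: ({ _id }) => {
    queryCount += 1;
    return Promise.resolve(bedRows.find(bed => bed._id === _id));
  }
};

// The naive resolver strategy: one query for all patients,
// then one Beds.findOne query per patient.
const demo = Patients.find()
  .then(patients =>
    Promise.all(patients.map(patient => Beds.findOne({ _id: patient.bedId })))
  )
  .then(() => {
    console.log(queryCount); // 4: N (3 bed queries) + 1 (patients query)
    return queryCount;
  });
```

With three patients we issue four queries; with three thousand patients, three thousand and one.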
This is unfortunate.
We could easily write a traditional REST endpoint that fetches and returns this data to the client using exactly two queries and some post-query transformations:
```javascript
return Patients.find({}).then(patients => {
  return Beds.find({ _id: { $in: _.map(patients, 'bedId') } }).then(beds => {
    let bedsById = _.keyBy(beds, '_id');
    return patients.map(patient => {
      return _.extend({}, patient, {
        bed: bedsById[patient.bedId]
      });
    });
  });
});
```
Despite its elegance, the GraphQL solution’s inefficiency makes it a no-go for many real-world applications.
Thankfully, there’s a solution! 🎉
Facebook’s `dataloader` package is the solution to our GraphQL inefficiency problems.
> DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a consistent API over various backends and reduce requests to those backends via batching and caching.
There are many fantastic resources for learning about DataLoader, and even on using DataLoader in an Apollo-based project. For that reason, we’ll skip some of the philosophical questions of how and why DataLoader works and dive right into wiring it into our Apollo server application.
All we need to get DataLoader working in our application is to create our “batch”, or “loader” functions and drop them into our GraphQL context for every GraphQL request received by our server:
```javascript
import loaders from "./loaders";

...

server.use('/graphql', function(req, res) {
  return graphqlExpress({
    schema,
    context: { loaders }
  })(req, res);
});
```
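One caveat worth noting: DataLoader caches every key it loads, so the `dataloader` documentation recommends creating fresh loader instances per request rather than sharing module-level ones, which can serve stale data across requests. A minimal sketch of that pattern, using a hypothetical `buildLoaders` factory (the `bedLoader` here is a Map-backed stub standing in for a real `DataLoader` instance):

```javascript
// Hypothetical factory: returns a fresh set of loaders, so each
// GraphQL request gets its own cache. The stub below mimics a
// DataLoader's per-instance memoization with a plain Map.
function buildLoaders() {
  const cache = new Map();
  return {
    bedLoader: {
      load(bedId) {
        // Within one request, repeated loads of the same id
        // return the same cached promise.
        if (!cache.has(bedId)) {
          cache.set(bedId, Promise.resolve({ _id: bedId }));
        }
        return cache.get(bedId);
      }
    }
  };
}

// In the Express wiring above, the factory would be called inside the
// request handler, e.g.:
//   context: { loaders: buildLoaders() }

const requestA = buildLoaders();
const requestB = buildLoaders();
console.log(requestA.bedLoader !== requestB.bedLoader); // true: fresh per request
```

The module-level `loaders` object above works for this example, but per-request construction is the safer default once data starts changing between requests.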
Continuing on with our current patient and bed example, we’ll only need a single loader to batch and cache our repeated queries against the `Beds` collection. Let’s call it `bedLoader` and add it to our `loaders.js` file:
```javascript
import DataLoader from "dataloader";

export const bedLoader = new DataLoader(bedIds => {
  // TODO: Implement bedLoader
});
```
Now that `bedLoader` is being injected into our GraphQL context, we can replace our resolvers’ calls to `Beds.findOne` with calls to `bedLoader.load`:
```javascript
const resolvers = {
  Patient: {
    bed: ({ bedId }, _args, { loaders }) => loaders.bedLoader.load(bedId)
  }
};
```
DataLoader will magically aggregate all of the `bedId` values that are passed into our calls to `bedLoader.load`, and pass them into our `bedLoader` DataLoader callback.
Our job is to write our loader function so that it executes a single query to fetch all of the required beds, and then returns them in order. That is, if `bedIds` is `[1, 2, 3]`, we need to return bed `1` first, bed `2` second, and bed `3` third. If we can’t find a bed, we need to return `undefined` in its place:
```javascript
export const bedLoader = new DataLoader(bedIds => {
  return Beds.find({ _id: { $in: bedIds } }).then(beds => {
    const bedsById = _.keyBy(beds, "_id");
    return bedIds.map(bedId => bedsById[bedId]);
  });
});
```
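To convince ourselves the batching actually happens, here’s a self-contained sketch. Since the real `dataloader` package isn’t needed to illustrate the idea, `makeLoader` below is a hypothetical, stripped-down stand-in for `new DataLoader(batchFn)` that collects keys and flushes them on the next tick, and `Beds` is an in-memory stub instrumented to count `find` calls:

```javascript
// In-memory stub for the Beds collection, counting how many
// find() queries actually run.
let findCalls = 0;
const bedRows = [
  { _id: 1, code: "A1" },
  { _id: 2, code: "A2" },
  { _id: 3, code: "B1" }
];
const Beds = {
  find: ({ _id: { $in } }) => {
    findCalls += 1;
    return Promise.resolve(bedRows.filter(bed => $in.includes(bed._id)));
  }
};

// Hypothetical stand-in for `new DataLoader(batchFn)`: queues keys
// loaded during the current tick, then calls batchFn once with all of them.
function makeLoader(batchFn) {
  let queue = [];
  return {
    load(key) {
      return new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        if (queue.length === 1) {
          process.nextTick(() => {
            const batch = queue;
            queue = [];
            batchFn(batch.map(item => item.key))
              .then(values => batch.forEach((item, i) => item.resolve(values[i])))
              .catch(err => batch.forEach(item => item.reject(err)));
          });
        }
      });
    }
  };
}

// The same batch function as our bedLoader, minus lodash.
const bedLoader = makeLoader(bedIds =>
  Beds.find({ _id: { $in: bedIds } }).then(beds => {
    const bedsById = new Map(beds.map(bed => [bed._id, bed]));
    return bedIds.map(bedId => bedsById.get(bedId));
  })
);

// Three loads in the same tick collapse into a single Beds.find query:
const batched = Promise.all([bedLoader.load(1), bedLoader.load(2), bedLoader.load(3)])
  .then(loaded => {
    console.log(findCalls);               // 1
    console.log(loaded.map(b => b.code)); // [ 'A1', 'A2', 'B1' ]
    return loaded;
  });
```

Three `load` calls, one query, and the beds come back in key order, just as the real DataLoader contract requires.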
That’s it!
Now our system will make a single query to grab all of the `patients` in our system. For every patient we find, our `bed` resolver will fire and pass that patient’s `bedId` into our `bedLoader` DataLoader. Our `bedLoader` DataLoader will gather all of our `bedId` values, make a single query against the `Beds` collection, and return the appropriate bed to the appropriate `bed` resolver.
Thanks to DataLoader, we can have the elegance of a GraphQL approach combined with the efficiency and customizability of the manual approach.