2020-06-16

#graphql

Karthik Kalyanaraman

Jun 16, 2020 ⋅ 6 min read

Avoid overfetching with properly designed GraphQL resolvers

Karthik Kalyanaraman Software engineer. Curious about technology and the economics of the tech industry. Follow me on Twitter @karthikkalyan90.

Introduction

If you’re reading this article, then I assume you’re fairly convinced of the benefits GraphQL brings to the table. As you may have heard, GraphQL solves one of the fundamental problems of REST, which is the overfetching and underfetching of data.

These problems become even more apparent when you are building a server for mobile-first apps. In REST, there are two ways you can architect and design new requirements:

Create a new endpoint
Reuse an existing endpoint by fetching the extra information with it

Both approaches have tradeoffs: the first leads to more round trips, which is not ideal for a mobile user on a spotty network; the second wastes bandwidth by sending data the client may not need.

We can solve both of these problems elegantly with GraphQL because it promises to give us exactly what we ask for. But if you don’t understand the quirks of GraphQL resolvers, you may run into overfetching problems even with GraphQL. Well-designed resolvers are fundamental to reaping the benefits of GraphQL.

A brief review of queries

In simple terms, resolvers are functions that resolve the value for a GraphQL type or a field of a GraphQL type. Before we jump into the resolver design process, however, let’s briefly look at the GraphQL query type.

Part of what makes GraphQL queries approachable is that they look a lot like JSON with the values left out, and nearly everyone knows JSON. For the sake of explanation, let’s design a GraphQL API for fetching data from a school database that holds student and course information.

Let’s say you’re writing a query that looks like:

query {
  student(id: "student1") {
    name,
    courses {
      title
    }
  }
}

Before this query hits the corresponding resolver, it is parsed into a tree. As you might already know, Query is a root type. This means the query will be the root node of the tree, which looks something like this:

GraphQL Query Tree Diagram

As you can see, query is the root node, and student, name, courses, and title are the children. The GraphQL query is parsed into a tree like this before hitting the resolvers. It’s useful to visualize your queries this way because efficient resolvers are designed based on the actual structure of the queries.
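To make the tree concrete, here is a minimal sketch that writes the query above as a nested plain object and walks it level by level. The queryTree shape and the printTree helper are illustrative assumptions, not the structure a GraphQL parser actually produces.

```javascript
// Hand-written stand-in for the parsed query; a real parser produces a richer tree.
const queryTree = {
  name: 'query',
  children: [
    {
      name: 'student',
      children: [
        { name: 'name', children: [] },
        { name: 'courses', children: [{ name: 'title', children: [] }] },
      ],
    },
  ],
};

// Collect each node's name, indented by its depth in the tree.
function printTree(node, depth = 0, lines = []) {
  lines.push(`${'  '.repeat(depth)}${node.name}`);
  node.children.forEach((child) => printTree(child, depth + 1, lines));
  return lines;
}

console.log(printTree(queryTree).join('\n'));
```

Each level of indentation in the output corresponds to one parent-child step, which is also the path a value takes as it is handed from one resolver to the next.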

Moving on! Now that we know how to visualize queries as trees, let’s go ahead and write resolvers. A resolver in GraphQL has the following structure:

const resolvers = {
  Query: {
    student: (root, args, context, info) => { return students[args['id']] }
  }
}

root – signifies the result from the parent type
args – arguments passed to the resolver
context – a mutable object that can be used for storing/passing common configs like session data, req (in Express), etc.
info – contains field information like fieldName, fieldNodes, returnType, etc.
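As a rough illustration of these four arguments, the sketch below calls the student resolver by hand. In a real server the GraphQL library makes this call for you; the context and info values here are hypothetical placeholders, and the students object is a one-entry stand-in.

```javascript
// One-entry stand-in for the data source.
const students = { student1: { id: 'student1', name: 'karthik' } };

const resolvers = {
  Query: {
    // root is unused for a top-level field; args carries the query arguments.
    student: (root, args, context, info) => students[args.id],
  },
};

// Hypothetical values that a GraphQL server would normally supply.
const root = undefined;
const args = { id: 'student1' };
const context = { requestId: 'req-1' };
const info = { fieldName: 'student' };

const result = resolvers.Query.student(root, args, context, info);
console.log(result);
```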

As I mentioned before, resolvers can be written for every type and every field. Let’s go ahead and write a resolver for the Student type. Our schema file looks like this:

  type Query {
    student(id: String!): Student
  }

  type Course {
    id: String!
    title: String
  }

  type Student {
    id: String!
    name: String
    courses: [Course]
  }

I like to keep the resolvers in a separate file. For the sake of this example, I am storing the data in global variables. My resolvers.js file looks like this:

var students = {
  'student1': {
    id: 'student1',
    name: 'karthik',
    courses: ['math101', 'geography201']
  },
  'student2': {
    id: 'student2',
    name: 'john',
    courses: ['physics201', 'chemistry103']
  },
};


var courses = {
  'math101': {
    id: 'math101',
    title: 'Intro to algebra',
  },
  'geography201': {
    id: 'geography201',
    title: 'Intro to maps',
  },
  'physics201': {
    id: 'physics201',
    title: 'Intro to physics',
  },
  'chemistry103': {
    id: 'chemistry103',
    title: 'Intro to organic chemistry',
  },
};

const resolvers = {
  Query: {
    student: (root, args, context, info) => { 
      return students[args['id']]
    }
  }
}

module.exports = resolvers

As we can see, the resolver for student takes an id in its args and returns the corresponding student from the students object:

Student Query Results

OK, we just saw how the id argument, "student1", reached the resolver through its args parameter. Let’s explore the other arguments.

`root`

Every field in a GraphQL schema has a default resolver. When you don’t write a resolver for a field, GraphQL automatically looks in the root for a property with the same name as the field. Written out by hand, the default resolver for the name field looks something like this:

const resolvers = {
  Query: {
    student: (root, args, context, info) => { 
      return students[args['id']]
    },
  },
  Student: {
    name: (root, args, context, info) => {
      return root.name;
    }
  }
}

module.exports = resolvers

In the Student.name resolver (lines 8–9), I have basically implemented what the default resolver for the name field does. If you want to test this, return a static string instead of root.name. You will notice that name comes back as that static string for every student(id) query.
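The lookup can be generalized to any field. The following is a simplified sketch of that behavior, not the code a GraphQL library ships; defaultResolverSketch is a hypothetical name, and real implementations handle more cases.

```javascript
// Simplified stand-in for a default field resolver: read the property named
// after the field from the parent value, calling it if it is a function.
function defaultResolverSketch(root, args, context, info) {
  if (root === null || typeof root !== 'object') return undefined;
  const value = root[info.fieldName];
  return typeof value === 'function' ? value.call(root, args, context, info) : value;
}

const parent = { id: 'student1', name: 'karthik', greeting: () => 'hello' };
const resolvedName = defaultResolverSketch(parent, {}, {}, { fieldName: 'name' });
const resolvedGreeting = defaultResolverSketch(parent, {}, {}, { fieldName: 'greeting' });
console.log(resolvedName, resolvedGreeting);
```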

`context`

The context can be used for passing information between resolvers. For instance, if you want to pass the req object down to all fields, you can simply mutate context by adding the req to it.
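Here is a minimal sketch of that idea, with the resolvers called by hand. The req.user property and the requestedStudentId key are assumptions made up for the example; a server would normally build the context once per request.

```javascript
const students = { student1: { id: 'student1', name: 'karthik' } };

const resolvers = {
  Query: {
    student: (root, args, context) => {
      // Record something on the shared context for resolvers further down.
      context.requestedStudentId = args.id;
      return students[args.id];
    },
  },
  Student: {
    name: (root, args, context) => {
      // Both req and the value added by the parent resolver are visible here.
      return `${root.name} (user: ${context.req.user}, asked for: ${context.requestedStudentId})`;
    },
  },
};

// Hypothetical per-request context; in Express, req would be the incoming request.
const context = { req: { user: 'alice' } };
const student = resolvers.Query.student(undefined, { id: 'student1' }, context);
const label = resolvers.Student.name(student, {}, context);
console.log(label);
```

Because every resolver receives the same object, anything one resolver writes is visible to the resolvers that run after it, so it is worth keeping such mutations few and deliberate.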

OK! Basics out of the way. Let’s look at some potential problems we may unknowingly face with the design of resolvers and how we can overcome them.

Issues with resolvers

Overfetching

Yes! You read that right. Isn’t this exactly the reason why we moved away from REST? Absolutely! But there are scenarios in which we could experience overfetching because of the way we have designed our resolvers.

For instance, if we want to fetch the courses along with the student query, we could resolve "courses" inside the student resolver like this:

const resolvers = {
  Query: {
    student: (root, args, context, info) => {
      const studentCourses = students[args['id']]['courses'].map(id => {return courses[id]})
      return {
        ...students[args['id']],
        "courses": studentCourses
      }
    },
  },
  Student: {
    name: (root, args, context, info) => {
      return root.name;
    }
  }
}

Student Courses Query Results

Problem 1: What happens if we write a query that asks only for the student `id` and `name`?

We would still be unnecessarily running the courses lookup on line 4. In a real-world scenario, this could even be an expensive API call. When GraphQL resolves the query, it simply drops the extra data on the floor.

On the surface, it still seems like we are getting only what we asked for. But behind the scenes, we have forced our server to overfetch because of the way we designed our resolver.

Solution: Move the courses resolver to the courses field.

const resolvers = {
  Query: {
    student: (root, args, context, info) => {
      return students[args['id']];
    },
  },
  Student: {
    name: (root, args, context, info) => {
      return root.name;
    },
    courses: (root, args, context, info) => {
      return root.courses.map(id => courses[id]);
    }
  }
}

Notice how I am leveraging the root argument to my advantage. This is the exact reason why it’s useful to visualize the query as a tree and understand the root node and how you can use it. Now, if we just query for id and name, we are not at risk of resolving the courses unnecessarily.
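To check this behavior without a server, the sketch below uses a toy executor that resolves only the requested fields and counts how often the courses lookup runs. runStudentQuery and courseLookups are hypothetical helpers for this illustration; a real GraphQL executor is far more general.

```javascript
const students = { student1: { id: 'student1', name: 'karthik', courses: ['math101'] } };
const courses = { math101: { id: 'math101', title: 'Intro to algebra' } };

let courseLookups = 0; // counts how often the (potentially expensive) lookup runs

const resolvers = {
  Query: { student: (root, args) => students[args.id] },
  Student: {
    courses: (root) => {
      courseLookups += 1;
      return root.courses.map((id) => courses[id]);
    },
  },
};

// Toy executor: resolve the student, then only the fields that were requested.
function runStudentQuery(id, requestedFields) {
  const student = resolvers.Query.student(undefined, { id });
  const result = {};
  for (const field of requestedFields) {
    const resolver = resolvers.Student[field];
    // Fall back to reading the property, as a default resolver would.
    result[field] = resolver ? resolver(student, {}) : student[field];
  }
  return result;
}

const nameOnly = runStudentQuery('student1', ['id', 'name']);
const lookupsAfterNameOnly = courseLookups;
const withCourses = runStudentQuery('student1', ['name', 'courses']);
console.log(nameOnly, lookupsAfterNameOnly, withCourses, courseLookups);
```

The counter stays at zero for the first query and only increases once courses is actually requested.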

Great! So does adopting this pattern solve all our problems? Unfortunately not!

Problem 2: What happens if we write a query to fetch only the courses?

You might think it is going to resolve only the courses resolver. But, again, think about the tree — what gets resolved first? The student resolver!

The student node is the parent of the courses node, and GraphQL resolves a parent field before its children, level by level from the root. This means the student node gets resolved before the courses node so that the courses node’s root argument is populated.
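The ordering can be sketched by calling the two resolvers by hand in the order an executor would and recording each call. The callOrder array is a hypothetical probe added for this illustration.

```javascript
const students = { student1: { id: 'student1', courses: ['math101'] } };
const courses = { math101: { id: 'math101', title: 'Intro to algebra' } };
const callOrder = [];

const resolvers = {
  Query: {
    student: (root, args) => {
      callOrder.push('Query.student');
      return students[args.id];
    },
  },
  Student: {
    courses: (root) => {
      callOrder.push('Student.courses');
      return root.courses.map((id) => courses[id]);
    },
  },
};

// Mimic the query { student(id: "student1") { courses { title } } } by hand:
// the parent's return value becomes the child's root argument.
const student = resolvers.Query.student(undefined, { id: 'student1' });
const studentCourses = resolvers.Student.courses(student, {});
console.log(callOrder, studentCourses);
```

The courses resolver cannot run first, because its root argument is exactly what the student resolver returns.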

Again, we run into overfetching. What to do now?

Let’s move the student resolver down to its fields and resolve the fields separately.

const resolvers = {
  Query: {
    student: (root, args, context, info) => {
      return args['id'];
    },
  },
  Student: {
    id: (root, args, context, info) => {
      return students[root]['id'];
    },
    name: (root, args, context, info) => {
      return students[root]['name'];
    },
    courses: (root, args, context, info) => {
      return students[root]['courses'].map(id => courses[id]);
    }
  }
}

Notice how I just returned the id from the student resolver and transferred the concern down to the individual fields. Each field is now responsible for resolving its own value.

OK, great! So does this solve our problem of overfetching now? It does, but with a caveat.

Problem 3: What happens if we query for `id`, `name`, and `courses`?

The student record is looked up three times, once in each of the id, name, and courses resolvers, while the course records are fetched only once. This duplication of requests is a problem, but it is much easier to work with when the code is written in a clean, testable fashion like this.

We can clearly see how many times a particular API will be called by simply counting the fields of a type that use this API in their resolvers. Fortunately, request batching and per-request caching, for example with a utility such as DataLoader, can help solve the request duplication problem.

This issue is also known as the N + 1 problem in GraphQL because we make one call to resolve the student and then one more call for each of the N fields nested in the root type (student).

Is it necessary to use these solutions in my GraphQL server?

It depends! If you already have a rich codebase of database APIs (or ORMs) for talking to your underlying database — and if you suspect that some of these calls will be reused to resolve different fields in your GraphQL schema — it is a good practice to adopt a data deduplication solution like those above to fully realize GraphQL’s performance benefits.

You might have noticed that we are repeatedly doing similar operations in the id and name resolvers (lines 9 and 12 of the last resolver example). In a real-world project, this could be an API call, and calling APIs multiple times to resolve different fields may look bad on the surface and tempt you to refactor it.

But having an understanding of the problem it solves is much more important. When you have a data deduplication solution set up, it’s possible that the API call is made only once and cached for reuse.
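As a rough sketch of what such deduplication does, the example below puts a small per-request cache in front of the student lookup used by the id, name, and courses resolvers from Problem 3. This is a hand-rolled memoization helper, not the API of any particular library; createStudentLoader, fetchStudent, and fetchCount are hypothetical names, and dedicated libraries typically add batching on top of caching.

```javascript
const students = { student1: { id: 'student1', name: 'karthik', courses: ['math101'] } };
const courses = { math101: { id: 'math101', title: 'Intro to algebra' } };

let fetchCount = 0;
// Stand-in for an expensive API or database call.
function fetchStudent(id) {
  fetchCount += 1;
  return students[id];
}

// Hypothetical helper: cache results for the lifetime of one request.
function createStudentLoader() {
  const cache = new Map();
  return (id) => {
    if (!cache.has(id)) cache.set(id, fetchStudent(id));
    return cache.get(id);
  };
}

const resolvers = {
  Query: { student: (root, args) => args.id },
  Student: {
    id: (root, args, context) => context.loadStudent(root).id,
    name: (root, args, context) => context.loadStudent(root).name,
    courses: (root, args, context) => context.loadStudent(root).courses.map((id) => courses[id]),
  },
};

// One loader per request, so cached data is never shared between users.
const context = { loadStudent: createStudentLoader() };
const studentId = resolvers.Query.student(undefined, { id: 'student1' });
const response = {
  id: resolvers.Student.id(studentId, {}, context),
  name: resolvers.Student.name(studentId, {}, context),
  courses: resolvers.Student.courses(studentId, {}, context),
};
console.log(response, fetchCount);
```

The resolvers stay independent of one another, yet the underlying fetch runs once per student per request.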

Database calls

So far we have seen how to avoid some of the common problems while writing resolvers. Now let’s take a look at how we can structure the database calls and what options are available to us.

MySQL/PostgreSQL

If you are using a SQL database, chances are you already use an ORM such as Sequelize (Node.js) or SQLAlchemy (Python) to fetch data from your database. If that’s the case, it’s ideal to call the ORM’s APIs inside the resolver functions. This way, each call is scoped to the particular field it resolves.

It’s also generally a good idea to pass the database connection or config through the resolvers’ context argument.
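A minimal sketch of that pattern follows. The fakeDb object and its findById method are hypothetical in-memory stand-ins, not the API of Sequelize or any other ORM; with a real ORM, the model or connection would sit on the context in the same place.

```javascript
// Hypothetical in-memory stand-in for an ORM model or database client.
const fakeDb = {
  students: {
    findById: async (id) => ({ student1: { id: 'student1', name: 'karthik' } }[id]),
  },
};

const resolvers = {
  Query: {
    // The resolver reads the database handle from context instead of a global.
    student: async (root, args, context) => context.db.students.findById(args.id),
  },
};

// Normally built once per request by the server; here it is constructed by hand.
const context = { db: fakeDb };
const studentPromise = resolvers.Query.student(undefined, { id: 'student1' }, context);
studentPromise.then((student) => console.log(student));
```

Reading the handle from context also makes the resolver easy to test, since a stub like fakeDb can be swapped in without touching the resolver code.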

MongoDB

In the case of MongoDB, you can directly use the Mongo CRUD APIs inside the resolvers.

Conclusion

I hope you enjoyed reading this post about GraphQL resolvers, and hopefully you found it useful. Feel free to leave any questions or feedback. 🙂

2 Replies to "Avoid overfetching with properly designed GraphQL resolvers"

Carlos Mercado says:

August 4, 2020 at 10:50 pm

I really enjoyed your post. I learned a lot. Thank you!!

tylim says:

November 24, 2020 at 4:31 pm

I don’t understand why, in problem 3, student gets called twice.

student, id, name, and courses should each get called once.

If you have a Course schema (singular of courses), then that schema will get called twice; if a title schema exists, then title also gets called twice, which means Course and title are where the N + 1 problem occurs.
