Leonardo Losoviz Freelance developer and writer, with an ongoing quest to integrate innovative paradigms into existing PHP frameworks, and unify all of them into a single mental model.

What we can learn from GraphQLConf 2021


GraphQLConf 2021 took place a few weeks ago, with talks by many great speakers on several hot topics in the GraphQL ecosystem.

The videos were recently uploaded, so I watched them all, selected a few talks on schema stitching, federation, architectural design, and ecommerce, and summarized them in this article. Enjoy!

Schema stitching: Enriching data in headless GraphQL architectures

In this video, Roy Derks walks us through an example of using schema stitching via @graphql-tools/stitch, and makes available a GitHub repo with the demo code.

Schema stitching is the art of combining the GraphQL schemas from different services into a single, unified GraphQL schema. The goal is to produce a gateway service from which we can access all services in our company.

Schema stitching was first introduced by Apollo, which later discontinued it in favor of Apollo Federation. More recently, The Guild has taken over the concept, re-implementing it and improving on the previous design, producing a solution that offers comparable benefits to federation while providing a simpler way of thinking about the problem.

Roy uses schema stitching to demonstrate how to combine the data from two external services:

  1. A content management system, available (as a mock) under localhost:3001
  2. An ecommerce API, available (as a mock) under localhost:3002

These two data sources are also combined with the schema from the local GraphQL server, available under localhost:3000.

Accessing a unified GraphQL schema — Screenshot from Roy Derks’ talk

The employed stack is based on:

An API route from Next.js is created under api/graphql.js, which exposes the GraphQL endpoint under localhost:3000/api/graphql. We can interact with it via the GraphQL Playground, by opening the endpoint’s URL in the browser:

An image of the GraphQL Playground

The relevant code in the API route to merge the schemas, in its simplest form, is the following:

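import { makeExecutableSchema } from '@graphql-tools/schema';
import { stitchSchemas } from '@graphql-tools/stitch';
// createRemoteSchema is not imported here; it is assumed to be a helper
// defined in the demo code

const localSchema = makeExecutableSchema({
  // Code here to create the local GraphQL schema
});

export default async function graphql(req, res) {
  // Set up subschema configurations
  const localSubschema = { schema: localSchema };

  // Load the remote schema exposed by the (mock) CMS
  const cmsSubschema = await createRemoteSchema({
    url: 'http://localhost:3001/graphql/',
  });

  // Load the remote schema exposed by the (mock) ecommerce API
  const productsSubschema = await createRemoteSchema({
    url: 'http://localhost:3002/graphql/',
  });

  // Build the combined schema and set up the extended schema and resolver
  const schema = stitchSchemas({
    subschemas: [localSubschema, productsSubschema, cmsSubschema],
  });
}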

This code sets up the local GraphQL schema using makeExecutableSchema, loads the remote GraphQL schemas from localhost:3001/graphql and localhost:3002/graphql using createRemoteSchema, and finally merges them all together using stitchSchemas.

Next, we need to relate the schemas with each other, so that we can retrieve the products (provided by the ecommerce API) by the posts’ user IDs (provided by the external CMS):

{
  cms_allPosts { # This data comes from the CMS
    id
    User {
      name
      Products { # This data comes from the e-commerce API
        id
        title
      }
    }
  }
}

Combining the schemas is configured in stitchSchemas:

// Build the combined schema and set up the extended schema and resolver
const schema = stitchSchemas({
  subschemas: [localSubschema, productsSubschema, cmsSubschema],
  typeDefs: `
    extend type Product {
      cmsMetaData: [Cms_Product]!
    }
  `,
  resolvers: {
    Product: {
      cmsMetaData: {
        selectionSet: `{ id }`,
        resolve(product, args, context, info) {
          // Get the data for the extended type from the subschema for the CMS
          return delegateToSchema({
            schema: cmsSubschema,
            operation: 'query',
            fieldName: 'cms_allProducts',
            args: { filter: { id: product.id } },
            context,
            info,
          });
        },
      },
    },
  },
});

Finally, we must check whether the different schemas expose their data under the same type or field name. For instance, the ecommerce API could also have a type called Post, or expose it under a field called allPosts, which would produce a conflict.

We avoid these conflicts via createRemoteSchema's transforms parameter, which allows us to rename the types and fields of a schema. In this case, the subschema from the CMS renames its type Post to Cms_Post and its field allPosts to cms_allPosts:

const cmsSubschema = await createRemoteSchema({
  url: 'http://localhost:3001/graphql/',
  transforms: [
    new RenameRootFields(
      (operationName, fieldName, fieldConfig) => `cms_${fieldName}`,
    ),
    new RenameTypes((name) => `Cms_${name}`),
  ],
});
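
To see why the prefixes remove the clash, here is a minimal, dependency-free sketch in plain JavaScript. It does not use GraphQL Tools: the helpers `findConflicts` and `prefixNames` and the sample type lists are hypothetical, made up only to illustrate the idea.

```javascript
// Hypothetical type names exposed by two services (illustrative data only).
const cmsTypes = ['Post', 'User', 'Product'];
const shopTypes = ['Product', 'Order', 'Post'];

// Return the names that appear in both lists, i.e. the ones that would
// clash if both schemas were merged without renaming.
function findConflicts(namesA, namesB) {
  const seen = new Set(namesA);
  return namesB.filter((name) => seen.has(name));
}

// Apply a prefix to every name, mirroring what a renaming transform does.
function prefixNames(names, prefix) {
  return names.map((name) => `${prefix}${name}`);
}

const conflictsBefore = findConflicts(cmsTypes, shopTypes);
const conflictsAfter = findConflicts(prefixNames(cmsTypes, 'Cms_'), shopTypes);

console.log(conflictsBefore); // names that clash without renaming
console.log(conflictsAfter); // names that still clash after prefixing
```

Prefixing only one of the two schemas is enough here, because the goal is simply that no name is shared between them.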

We can finally execute a query that fetches data from all separate services, accessed through a single, unified schema:

Executing a GraphQL query from a merged schema — Screenshot from Roy Derks’s talk

As previously mentioned, here are links if you’d like to look at the repo with the code or watch the full GraphQLConf video.

GraphQL with data federation vs. federation on GraphQL services

Tanmai Gopal is the co-founder and CEO of Hasura, a service providing realtime GraphQL APIs over Postgres.

Hasura is currently building their own version of federation, which is based on a different architecture than the one from Apollo Federation. In his talk, Tanmai walks us through all the challenges that Hasura’s federation solution is attempting to solve, and why Apollo Federation’s approach fails at addressing them.

Tanmai first describes what federation is, and when and why its use is justified:

What is federation in the GraphQL context? A slide from Tanmai Gopal's talk

Why should we even have federation? A slide from Tanmai Gopal's talk

Good signs for needing a unified GraphQL API

Tanmai then dives deep into two different ways of thinking about federation:

  1. Federation of GraphQL services (the approach followed by Apollo), which has each underlying source execute its portion of the query
  2. GraphQL on federated data, which centralizes the data from the different services before resolving the query

Federating GraphQL services vs. GraphQL on federated data

Tanmai then explains that while these two different approaches produce the same results, there is a sizable gain in performance when using the GraphQL on federated data approach:

A massive impact on performance

Tanmai provides a few example queries to describe why federating GraphQL services doesn’t scale well. He stresses that aggregating data across different services cannot be done efficiently, because, when resolving complex queries that contain cross-database joins across services, the gateway does not have a unified context for all entities across the involved subdomains.
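
The difference between the two approaches can be sketched with a "top N" query over in-memory data. This is only a simulation under stated assumptions: the data, the function names, and the way service calls are counted are all hypothetical, and the code is neither Hasura's nor Apollo's implementation.

```javascript
// Illustrative data; in practice each array would live in a separate service.
const users = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Linus' },
  { id: 3, name: 'Grace' },
];
const orders = [
  { userId: 1, total: 40 },
  { userId: 2, total: 15 },
  { userId: 3, total: 90 },
  { userId: 1, total: 25 },
];

// Approach 1: federating services. The gateway asks the users service for
// every user, then asks the orders service once per user, and only then
// can it sort and keep the top N.
function topSpendersViaServices(n) {
  let serviceCalls = 0;
  const fetchUsers = () => { serviceCalls += 1; return users; };
  const fetchOrderTotal = (userId) => {
    serviceCalls += 1;
    return orders
      .filter((order) => order.userId === userId)
      .reduce((sum, order) => sum + order.total, 0);
  };
  const ranked = fetchUsers()
    .map((user) => ({ name: user.name, spent: fetchOrderTotal(user.id) }))
    .sort((a, b) => b.spent - a.spent)
    .slice(0, n);
  return { ranked, serviceCalls };
}

// Approach 2: GraphQL on federated data. One engine sees both data sets,
// so it can aggregate, join, sort, and limit in a single operation.
function topSpendersViaFederatedData(n) {
  const totals = new Map();
  for (const { userId, total } of orders) {
    totals.set(userId, (totals.get(userId) ?? 0) + total);
  }
  const ranked = users
    .map((user) => ({ name: user.name, spent: totals.get(user.id) ?? 0 }))
    .sort((a, b) => b.spent - a.spent)
    .slice(0, n);
  return { ranked, serviceCalls: 1 };
}

console.log(topSpendersViaServices(2));
console.log(topSpendersViaFederatedData(2));
```

Both functions return the same ranking, but the first one needs one call per user on top of the initial call, which is the cost that grows with the size of the data.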

Example 2: Top N Queries

Finally, Tanmai explains that Hasura is working on a solution based on using GraphQL with federated data, which is able to fetch all relevant data from the different services and resolve the query from a centralized location, instead of having each service resolve its part of the query, thus providing better performance for complex queries.

Faster performance

In order to compare Apollo’s and Hasura’s federation approaches, we will need to wait a bit more: Tanmai promises this new feature will be available soon, but doesn’t mention exactly when. The Hasura website doesn’t talk about it either, and there are no docs available yet. Currently, we only have the talk from this conference.

However, I imagine that Hasura’s approach is more restrictive than Apollo’s, since it will most likely depend on Hasura’s infrastructure to function, and its appeal will largely depend on Hasura’s price tag for the service. Apollo Federation, on the other hand, was designed as a methodology to split the graph based on directives, so it works without the need for any external tool or infrastructure (even though Apollo also offers managed federation), using preexisting syntax in GraphQL.

Finally, Hasura’s federation approach should be contrasted with GraphQL Tools’ schema stitching, which, as we saw from the previous talk, can deliver the same results in a simpler way.

Watch the video here.

Migrating GitHub’s global IDs

Andrew Hoglund works as a senior software engineer for GitHub’s API team. In his talk, Andrew shares how his team is currently migrating the global identifier for all entities in GitHub’s GraphQL API to a different format, why they are doing it, and the challenges they’ve come across.

We can visualize the format for the global ID by querying for field id on any object. For instance, fetching the ID for the leoloso/PoP repository object can be done through this GraphQL query:

{
  user(login: "leoloso") {
    repository(name: "PoP") {
      id
    }
  }
}

Executing the query, we obtain this response:

{
  "data": {
    "user": {
      "repository": {
        "id": "MDEwOlJlcG9zaXRvcnk2NjcyMTIyNw=="
      }
    }
  }
}

The ID for the repository object is MDEwOlJlcG9zaXRvcnk2NjcyMTIyNw==. This format has the following properties:

  • It is base64 encoded
  • It contains the object type and object ID
  • It is intended to be opaque

Decoding the ID, for instance via a Bash command, will reveal the stored information:

$ echo MDEwOlJlcG9zaXRvcnk2NjcyMTIyNw== | base64 --decode
010:Repository66721227

The underlying data, 010:Repository66721227, consists of:

  • A checksum
  • The object type
  • The database ID
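
The same decoding can be sketched in JavaScript on Node.js. The function name `decodeLegacyGlobalId` is hypothetical, and the pattern used to split the payload is inferred from the single example above rather than from any documented format. Since the IDs are intended to be opaque, client code should not rely on this structure.

```javascript
// Decode a legacy-format global ID and split it into its visible parts.
// The regular expression is an assumption based on one sample ID only.
function decodeLegacyGlobalId(globalId) {
  // Buffer is a Node.js global; no import is needed.
  const decoded = Buffer.from(globalId, 'base64').toString('utf8');
  const match = /^(\d+):([A-Za-z]+)(\d+)$/.exec(decoded);
  if (!match) {
    throw new Error(`Unrecognized global ID payload: ${decoded}`);
  }
  const [, prefix, type, databaseId] = match;
  return { decoded, prefix, type, databaseId: Number(databaseId) };
}

console.log(decodeLegacyGlobalId('MDEwOlJlcG9zaXRvcnk2NjcyMTIyNw=='));
```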

The current format of GitHub's global IDs

This simple format initially served GitHub well because GitHub had stored its data on a single database. The entity’s global ID already provided all the information required to locate the entity in the database and retrieve its data:

The entity global ID could be used to locate items in the database

Some time later, GitHub used Vitess to migrate to a sharded database, in which the data is partitioned horizontally across several databases. Sharding can improve performance because each database table holds fewer rows, which reduces the index size and makes searches faster, and because different shards can be placed on different machines, which allows you to optimize the hardware for different pieces of data.

Once the database was sharded, the global ID format became unsuitable. It could still be used to retrieve the data, but the data for an entity exists in only one shard, and the global ID does not indicate which one.

To locate the data, GitHub’s GraphQL API had to execute a query against all of the databases, of which just one would produce a match, making the query execution very inefficient:

The query execution is ineffective

As a result, the GitHub API team decided to migrate the global ID to a new format, which would also provide the name of the database containing the entity data, so it could be retrieved efficiently once again:

GitHub decided to migrate their global IDs to a new format containing the database name for each entity
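
The cost difference can be sketched with a few in-memory "shards". Everything here is hypothetical: the shard names, the data, and in particular the `shard:id` layout, which is invented for illustration and is not GitHub's real ID format.

```javascript
// Three hypothetical shards, each holding a slice of the repositories.
const shards = {
  shard_a: new Map([[1, { name: 'alpha' }]]),
  shard_b: new Map([[2, { name: 'beta' }]]),
  shard_c: new Map([[3, { name: 'gamma' }]]),
};

// Old-style lookup: the ID carries no location, so every shard is queried
// and only one of them produces a match.
function findWithoutShardHint(databaseId) {
  let queries = 0;
  let found = null;
  for (const rows of Object.values(shards)) {
    queries += 1;
    if (rows.has(databaseId)) found = rows.get(databaseId);
  }
  return { found, queries };
}

// New-style lookup: the ID names the shard, so a single query is enough.
function findWithShardHint(globalId) {
  const [shardName, rawId] = globalId.split(':');
  const rows = shards[shardName];
  if (!rows) throw new Error(`Unknown shard: ${shardName}`);
  return { found: rows.get(Number(rawId)) ?? null, queries: 1 };
}

console.log(findWithoutShardHint(3));
console.log(findWithShardHint('shard_c:3'));
```

Both lookups return the same entity, but the number of queries in the first one grows with the number of shards.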

This new format is more complex than the previous one. It is composed of two elements:

  • A type hint, indicating what type of entity it is
  • An ownership scheme, which contains the data needed to retrieve the entity of that type

The new format for GitHub's global IDs

The ownership scheme is customized per entity, since different entities require different pieces of data to be identified. For instance, a workflow run requires the ID of the pull request that triggered it.

This new format works well, allowing GitHub to solve the current issues and anticipate some in the future. In particular, GitHub may eventually store its data in a multi-region setup, and the new format could also indicate the name of the region where the data is stored.

Because this new ID format is incompatible with the previous one, GitHub will need to implement a slow, progressive rollout to make sure it doesn’t break services. For this task, GitHub has put in place a deprecation period during which the two ID formats will coexist, and created tools to help services migrate from the old to the new format:

A progressive rollout of the new global IDs will avoid breaking GitHub services

You can watch the video to learn more.

Powering ecommerce with GraphQL

Stuart Guest-Smith is the principal architect lead at BigCommerce. In his talk, Stuart explores how merchants are able to build performant, scalable, and personalized ecommerce experiences with GraphQL.

Stuart opens the talk by declaring that:

Ecommerce requires companies to be fast, flexible, and personalized. GraphQL makes this possible, in a big way.

An ecommerce service will normally span multiple pieces of software, including, among others:

  • Enterprise resource planning (ERP)
  • Order management system (OMS)
  • Product information management (PIM)
  • Customer relationship management (CRM)
  • Content management system (CMS)

GraphQL enables us to access and relate the data from all these backend systems, achieving one of the latest trends of ecommerce, “composable commerce”:

Modern commerce is composable

The data from these multiple backend systems can be made accessible via a GraphQL API to different frontends, such as:

  • An ecommerce storefront
  • A native mobile app
  • A headless CMS
  • A headless Digital Experience Platform (DXP)
  • A custom Progressive Web Application (PWA) storefront

This flexibility allows us to create a personalized and enhanced experience for our users.

Build differentiated shopper experiences with GraphQL

Stuart then explains that GraphQL can act as the interface between the multiple ecommerce backends and the multiple frontends by using the Backend for Frontend (BfF) architecture pattern, which aims to provide a customized backend for each user experience.

Using the BfF model

Tame the commerce class with a unified backend in GraphQL

GraphQL can help performance since it can retrieve only the required data, without under- or over-fetching. But in addition, we must add a caching layer and manage the query complexity (or risk having malicious actors execute expensive queries that could slow down the system):

Scalable and performance commerce
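
One simple way to manage query complexity is to reject queries that are nested too deeply. The sketch below is deliberately naive and is not taken from the talk: the function names and the limit are hypothetical, and counting braces ignores fragments, braces inside arguments, and per-field costs, all of which a real implementation would handle by working on the parsed query.

```javascript
// Compute the maximum nesting depth of a query by tracking curly braces.
function queryDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  for (const char of query) {
    if (char === '{') {
      depth += 1;
      maxDepth = Math.max(maxDepth, depth);
    } else if (char === '}') {
      depth -= 1;
    }
  }
  return maxDepth;
}

// Reject queries nested deeper than a limit (the limit is an arbitrary example).
function ensureWithinDepth(query, limit) {
  const depth = queryDepth(query);
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`);
  }
  return depth;
}

// Hypothetical query against an imaginary ecommerce schema.
const sampleQuery = `{
  products {
    id
    reviews {
      author {
        name
      }
    }
  }
}`;

console.log(ensureWithinDepth(sampleQuery, 5));
```

A check like this would run before the query is executed, so that an expensive query is refused without touching the backend systems.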

In conclusion, Stuart explains that one of the most important benefits that GraphQL delivers for ecommerce is the personalization of the customer experience:

Delivering personalized customer experiences with GraphQL

You can watch the full video here.

Conclusion

GraphQLConf 2021 brought us a glimpse of what’s being discussed in the GraphQL ecosystem right now, demonstrating that five years after GraphQL’s introduction there are still plenty of developments happening, with new methodologies being created to address the ongoing needs from the community.

This article summarizes talks on four different new developments:

  • Roy Derks explains how to use the new @graphql-tools/stitch library for schema stitching
  • Tanmai Gopal shows how Hasura is creating a competitor service to Apollo Federation
  • Andrew Hoglund shares how the GitHub GraphQL API’s global ID format is being migrated to a new format that can support database sharding and a multi-region setup
  • Stuart Guest-Smith explains how GraphQL can be leveraged, using the BfF pattern, to enable composable commerce and personalized user experiences

Even though GraphQL is mature, it is very exciting to see that new developments are happening! To find out what else is taking place, check out all videos from GraphQLConf 2021.
