As TanStack tools continue to gain adoption, developers are demanding more efficient and predictable ways to manage data without creating endless endpoints. Query-Driven Sync, TanStack’s newest feature, delivers exactly that.
As an application grows in size and complexity, the tendency for API sprawl – the need to create numerous endpoints to meet the needs of different clients or components – also increases. This can impede the developer experience, as developers must navigate extensive documentation to find the right API. It can also lead to complex lifecycle management, where teams may struggle to track which APIs require updates or deprecation.
In the latest update to TanStack DB (v0.5), TanStack introduced a feature designed to address this problem: Query-Driven Sync. This feature shifts how developers approach data fetching, caching, and updates. Instead of writing custom backend endpoints or GraphQL resolvers for every data view, Query-Driven Sync turns your component’s query into the API call. You define the query directly inside your client components, and TanStack DB automatically transforms it into a precise network request.
In this article, we’ll explore how Query-Driven Sync works and how you can leverage it to build efficient, scalable React applications.
To follow along with this article, you should have some prior knowledge of TanStack DB and its core concepts. If you’re new to the library, we recommend starting with our introductory guide on TanStack DB.
TanStack DB is built on a declarative philosophy: you describe what data you need, and the system handles fetching, caching, and updating. This approach works well for the most part, but as datasets grow, overfetching and unnecessary upfront data loading begin to take a noticeable toll on performance.
Query-Driven Sync was introduced to address this issue by giving developers control over how data is loaded into collections using queries defined on the client. In other words, Query-Driven Sync is the process of syncing TanStack DB collections using live queries.
Before this update, collections would sync all records the moment they were initialized. In practice, this meant that once a collection began syncing, the client had to fetch the entire dataset from the server before the collection could be considered ready.
Take the following code example:
import { createCollection } from '@tanstack/db'
import { queryCollectionOptions } from '@tanstack/query-db-collection'
import { useLiveQuery } from '@tanstack/react-db'

const todoCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['todos'],
    queryFn: async () => {
      // This fetch returns all todos from the backend
      const response = await fetch('/api/todos')
      return response.json()
    },
    getKey: (todo) => todo.id,
  })
)

// In a React component
function TodoList() {
  const { data: todos, status } = useLiveQuery((q) =>
    q.from({ todos: todoCollection })
  )

  if (status === 'loading') {
    return <div>Loading …</div>
  }

  return (
    <ul>
      {todos.map((t) => (
        <li key={t.id}>{t.text}</li>
      ))}
    </ul>
  )
}
When todoCollection is created and starts to sync, it immediately calls queryFn(), which makes a network request to the /api/todos endpoint and fetches all the todos from the server. Only after the full list is returned and written into the collection does collection.status become ready, at which point you can use useLiveQuery to query the data locally.
You can imagine how problematic this becomes if the server sends a large dataset, for example, 50,000 rows or more. Query-Driven Sync solves this by allowing collections to define the schema and security, while queries determine the specific subset of data to load into a collection.
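To make that split concrete, here’s a minimal sketch (our illustration, not from the TanStack docs) of a collection that owns the row shape and request authorization, assuming the collection options accept a standard-schema validator such as a Zod schema; which rows actually get loaded is left to the live queries you write against it:

import { z } from 'zod'
import { createCollection } from '@tanstack/db'
import { queryCollectionOptions } from '@tanstack/query-db-collection'

// Hypothetical token helper used only for this sketch
const getToken = () => localStorage.getItem('token') ?? ''

// The collection defines the row shape and how requests are authorized...
const todoSchema = z.object({
  id: z.string(),
  text: z.string(),
  completed: z.boolean(),
})

const todoCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['todos'],
    schema: todoSchema, // assumption: collection options accept a standard-schema validator
    queryFn: async () => {
      const response = await fetch('/api/todos', {
        headers: { Authorization: `Bearer ${getToken()}` },
      })
      return response.json()
    },
    getKey: (todo) => todo.id,
  })
)

// ...while the subset of todos that actually gets loaded is decided by the
// live queries you write against it, using the sync modes described below.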
TanStack DB 0.5 provides three distinct sync modes for implementing Query-Driven Sync: Eager, On-demand, and Progressive. These modes are designed to address different data-loading needs across components and applications.
Eager mode is the default sync mode and was the only option available prior to version 0.5. As shown in the previous example, this mode loads the entire dataset upfront. Once the data is loaded, all subsequent queries, such as filters, joins, sorts, and reads, run entirely on the client with sub-millisecond performance.
Enabling eager mode in your collections is straightforward; you simply add a syncMode property with the value 'eager' to your collection's loader, as follows:
const todoCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['todos'],
    queryFn: async () => {
      const resp = await fetch('/api/todos')
      return resp.json()
    },
    getKey: (todo) => todo.id,
    syncMode: 'eager',
  })
)
But as we pointed out in the earlier example, eager mode does not scale efficiently for larger datasets because it fetches the entire collection immediately, which can result in longer load times and higher bandwidth consumption.
This makes the eager mode ideal for datasets that are small (under 10,000 rows) and static, such as user preferences or small reference tables.
On-demand mode improves upon the shortcomings of the previous approach by allowing collections to selectively load large, dynamic datasets based on query demands. Basically, when on-demand mode is enabled, the collection only loads the data that your queries actively request.
However, using on-demand mode is not as straightforward as the eager approach. TanStack DB does not automatically translate queries into network requests; this must be done manually using a technique called predicate mapping.
But before predicate mapping comes into play, TanStack DB initiates an internal process known as predicate pushdown. As soon as you define your queries, TanStack DB pushes those query predicates (logical expressions such as filters, limits, where clauses, and orderBy) down to the collection’s loader through the loadSubsetOptions metadata on the context.
Let’s look at how this works in practice to give you a clearer picture of how everything ties together.
For example, suppose we have a products collection. We can enable the on-demand sync mode by adding syncMode: 'on-demand' to the collection loader, like this:
import { createCollection } from '@tanstack/db'
import { queryCollectionOptions } from '@tanstack/query-db-collection'

const productCollection = createCollection(
  queryCollectionOptions({
    queryKey: ...,
    queryFn: (...),
    syncMode: 'on-demand',
  })
)
With on-demand enabled, queryFn automatically has access to loadSubsetOptions through the query collection’s context. Within this context, query predicates are saved as expression trees. We can parse these expression trees into a simple structured object using the helper function parseLoadSubsetOptions provided by TanStack:
const productCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['products'],
    queryFn: async (ctx) => {
      // Parse your query predicates into a structured object
      const parsed = parseLoadSubsetOptions(ctx.meta?.loadSubsetOptions)

      // GET /api/products with query-specific filters
      const response = await fetch(/* map params to make network request */)
      return response.json()
    },
    getKey: (product) => product.id,
    syncMode: 'on-demand',
  })
)
Now, when you write your live query on the client like this…
const { data: cheapElectronics } = useLiveQuery(q =>
  q
    .from({ products: productCollection })
    .where(({ products }) => eq(products.category, 'electronics'))
    .where(({ products }) => lt(products.price, 100))
    .orderBy(({ products }) => products.price, 'asc')
    .limit(10)
);
…TanStack DB will interpret the query predicates as (category = electronics, price < 100, limit 10), pass them down to the queryFn, and parse them into a structured object. We can then convert this structured object into API parameters that look like the following:
GET /api/products?category=electronics&price_lt=100&sort=price:asc&limit=10
As you may have deduced from the previous section, predicate mapping is the process of converting parsed query predicates into valid query strings that can be used to make network requests.
To properly map predicates, it is important to understand the structure of the parsed output. It typically contains the following elements:

- filters: array of { field, operator, value }
- sorts: array of { field, direction }
- limit: number | null

Here’s an example of what it might output:
{
  filters: [
    { field: ["products", "category"], operator: "eq", value: "electronics" },
    { field: ["products", "price"], operator: "lt", value: 100 }
  ],
  sorts: [
    { field: ["products", "rating"], direction: "asc" }
  ],
  limit: 5
}
As you can see, the object contains a structured description of what the client is asking for:
- category must be equal (eq) to “electronics”
- price must be less than (lt) 100
- results should be sorted by rating in ascending (asc) order
- only five rows should be returned (limit)

We can access these properties through the parsed variable, but it’s more straightforward to destructure them directly from the parseLoadSubsetOptions helper function:
const { filters, sorts, limit } = parseLoadSubsetOptions(ctx.meta?.loadSubsetOptions);
Now, to map the predicates and translate them into a query string, we can do the following:
queryFn: async (ctx) => {
  const { filters, sorts, limit } = parseLoadSubsetOptions(ctx.meta?.loadSubsetOptions);

  const params = new URLSearchParams();

  filters.forEach(({ field, operator, value }) => {
    const name = field.join('.');

    switch (operator) {
      case 'eq':
        params.set(name, String(value));
        break;
      case 'lt':
        params.set(`${name}_lt`, String(value));
        break;
      // handle other operators, e.g. lte, gt, etc.
    }
  });

  sorts.forEach(({ field, direction }) => {
    params.set('sortBy', field.join('.'));
    params.set('order', direction);
  });

  if (limit != null) {
    params.set('limit', String(limit));
  }

  const resp = await fetch(`/api/products?${params.toString()}`);
  return resp.json();
}
What’s happening here isn’t as complicated as it might seem. Let’s break it down step by step.
We start by creating a new URLSearchParams instance. This utility makes it easy to build query strings for API requests without manually concatenating strings.
const params = new URLSearchParams();
Next, we loop through the filters array. For each filter, we join the hierarchical field names into a string using a dot (e.g., products.category), store it in a name variable, and then append the operator to it where needed. So the following switch case with the equality operator (eq):
case 'eq': params.set(name, String(value)); break;
Will map to a normal query parameter like products.category=electronics. Whereas the following with the less than operator (lt) will append _lt to the parameter name, like products.price_lt=100:
case 'lt':
params.set(`${name}_lt`, String(value));
break;
Next, we loop through the sorts array and apply the same logic to each field array: we join the field names into a string and assign it to a sortBy parameter, while also setting an order parameter with the direction (which can be asc or desc):
sorts.forEach(({ field, direction }) => {
params.set('sortBy', field.join('.'));
params.set('order', direction);
});
The above sort definition will map to a query parameter like sortBy=products.rating&order=asc.
Lastly, we check if the limit property is not null before setting the limit parameter:
if (limit != null) {
params.set('limit', String(limit));
}
At the end of this predicate mapping process, the URLSearchParams instance will have a valid and serializable query string that we can then use to make a network request to the server, as demonstrated in the example:
const resp = await fetch(`/api/products?${params.toString()}`);
return resp.json();
This results in precise, minimal network requests that fetch only the required data slices. In this case, only 10 products in the electronics category, in ascending order, and whose price is less than 100. This effectively makes your component’s query the API call, which is where the Query-Driven Sync slogan comes from.
Predicate mapping for APIs with custom formats, such as GraphQL, works slightly differently. The concept is similar, but it uses a different helper function. Instead of parseLoadSubsetOptions, you use parseWhereExpression, which maps the predicates to GraphQL’s whereClause format:
queryFn: async (ctx) => {
  const { where, orderBy, limit } = ctx.meta?.loadSubsetOptions

  // Map to GraphQL's where clause format
  const whereClause = parseWhereExpression(where, {
    handlers: {
      eq: (field, value) => ({ [field.join('_')]: { _eq: value } }),
      lt: (field, value) => ({ [field.join('_')]: { _lt: value } }),
      and: (...conditions) => ({ _and: conditions }),
    },
  })

  // Use whereClause in your GraphQL query...
}
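To round this out, here’s a rough sketch (our illustration, not from the TanStack docs) of how the resulting whereClause and limit might be sent to a Hasura-style /graphql endpoint; the endpoint, query shape, and variable names are all assumptions:

  // ...continuing inside the queryFn above, with whereClause and limit in scope
  const query = `
    query Products($where: products_bool_exp, $limit: Int) {
      products(where: $where, limit: $limit) {
        id
        category
        price
      }
    }
  `

  const response = await fetch('/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables: { where: whereClause, limit } }),
  })

  const { data } = await response.json()
  return data.products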
The good thing about predicate mapping is that it only needs to be implemented once per collection. After that, no matter how your live queries are defined on the client, the same mapping code will translate them into the appropriate API calls.
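For instance, a completely different live query against the same collection needs no loader changes; with a mapping like the one above, it would translate into a request along these lines (exact parameter names depend on your own mapping):

// No changes to productCollection are needed for this query. With the mapping
// shown earlier, it would produce a request roughly like:
// GET /api/products?products.category=books&products.price_lt=25&sortBy=products.price&order=desc&limit=5
const { data: cheapBooks } = useLiveQuery(q =>
  q
    .from({ products: productCollection })
    .where(({ products }) => eq(products.category, 'books'))
    .where(({ products }) => lt(products.price, 25))
    .orderBy(({ products }) => products.price, 'desc')
    .limit(5)
);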
Progressive mode combines the best of both worlds (eager and on-demand functionalities) to deliver a more efficient way of loading data. In this mode, the initial queried batch is loaded immediately using on-demand fetch techniques to ensure a quick first paint or initial rendering for the user.
While the user interacts with this subset of data, TanStack DB progressively syncs the rest of the dataset in the background:
const projectCollection = createCollection(
  electricCollectionOptions({
    table: 'projects',
    syncMode: 'progressive',
  })
)

// On the client
const { data: myProjects } = useLiveQuery(q =>
  q
    .from({ projects: projectCollection })
    .where(({ projects }) => eq(projects.ownerId, currentUserId))
    .limit(20)
);
Although Query-Driven Sync is designed to work with REST, GraphQL, or tRPC APIs, the TanStack team recommends using progressive mode with sync engines such as Electric, Trailbase, and PowerSync, as demonstrated above. These engines integrate well with Query-Driven Sync and provide additional benefits that enhance data synchronization and performance.
Sync engines are specialized systems designed to handle real-time synchronization of data across distributed environments, typically between multiple clients and a central database. They automatically detect changes in the database and efficiently propagate these updates as incremental changes, or deltas, to all subscribed clients.
Using progressive mode with a traditional fetch approach means making additional fetch requests for the entire dataset, albeit in the background (after the initial request), which can become expensive really fast. In contrast, sync engines only send the deltas (the rows that have changed), allowing you to maintain large client-side datasets without the network cost of repeatedly fetching all the data.
Another benefit of using a sync engine with Query-Driven Sync is that predicates are automatically translated, eliminating the need for manual mapping. This means you only need to define your live queries on the client, and the sync engine handles the rest.
No API endpoint is required; this is Query-Driven Sync working exactly as it was originally intended.
There are many ways Query-Driven Sync can improve the developer experience, the most significant being optimized dataset loading, a set of request economics TanStack DB employs to improve performance:
When two components define the same query (e.g., same filters, same limits), TanStack DB deduplicates the requests and sends only one. For example:
// Component A
const { data: electronics } = useLiveQuery((q) =>
  q.from({ products: productCollection }).where(({ products }) => eq(products.category, "electronics"))
);

// Component B (same query, different component)
const { data: electronics } = useLiveQuery((q) =>
  q.from({ products: productCollection }).where(({ products }) => eq(products.category, "electronics"))
);
Only one network call will be made to /api/products?category=electronics.
If you eventually decide to expand your query to accommodate more data, TanStack DB will only fetch the additional data (delta) rather than reloading everything from scratch.
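Here’s a rough sketch of what that looks like from the component’s point of view; the collection and the exact rows requested are illustrative:

// First render: the query loads (up to) the first 20 matching products.
const { data: firstBatch } = useLiveQuery((q) =>
  q
    .from({ products: productCollection })
    .where(({ products }) => eq(products.category, 'electronics'))
    .limit(20)
);

// Later, the same query with a larger limit. The first 20 rows are already in
// the collection, so only the additional rows need to be fetched from the server.
const { data: expandedBatch } = useLiveQuery((q) =>
  q
    .from({ products: productCollection })
    .where(({ products }) => eq(products.category, 'electronics'))
    .limit(60)
);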
When your query joins two or more collections, TanStack DB figures out exactly which related records you need and fetches only those, instead of fetching entire collections. Imagine the following query:
useLiveQuery((q) =>
  q
    .from({ todos: todoCollection })
    .join({ projects: projectCollection }, ({ todos, projects }) =>
      eq(todos.projectId, projects.id)
    )
    .where(({ todos }) => eq(todos.completed, false))
)
This query asks for all incomplete todos with their project details attached. Before Query-Driven Sync, to create a join like this, you would need to fetch both endpoints…

- /api/todos
- /api/projects

…before joining them on the client, or create a custom endpoint like /api/todos-with-projects. With Query-Driven Sync, TanStack DB will analyze the joins and derive the minimum necessary backend calls to satisfy the joined query:

- /api/todos?completed=false
- /api/projects?ids=abc,def,ghi

This is one of the areas where Query-Driven Sync can outperform hand-written endpoints, because hand-written endpoints often fail to batch requests or end up over-fetching.
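If you’re doing predicate mapping by hand, the joined collection’s loader receives the pushed-down predicate for those ids. As a hypothetical sketch (the operator name and how the ids arrive are assumptions on our part, not confirmed by the TanStack docs), the projects loader might map an in-style filter like this:

queryFn: async (ctx) => {
  const { filters } = parseLoadSubsetOptions(ctx.meta?.loadSubsetOptions)
  const params = new URLSearchParams()

  filters.forEach(({ field, operator, value }) => {
    // Hypothetical: an `in`-style filter over project ids pushed down from the join
    if (operator === 'in' && Array.isArray(value)) {
      params.set('ids', value.map(String).join(',')) // e.g. ids=abc,def,ghi
    }
  })

  const resp = await fetch(`/api/projects?${params.toString()}`)
  return resp.json()
}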
If your backend evolves to support more flexible querying, you only update the mapping layer. Your UI queries can remain the same. And over time, you can plug in a sync engine (e.g., Electric, PowerSync) to get real-time or delta-only syncing without changing your query logic.
TanStack DB integrates with TanStack Query’s cache policies (like staleTime, gcTime). So queries that run within the “fresh” window don’t need to hit the network again; they can be served entirely from the cache. And when you change query parameters (e.g., filtering or sorting), the DB intelligently fetches only the missing data rather than re-fetching all loaded data.
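As a quick sketch, and assuming queryCollectionOptions forwards these standard TanStack Query options to the underlying query (a reasonable but unverified assumption here), cache policies can be set per collection like this:

const productCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['products'],
    // Simplified; in practice, map ctx.meta?.loadSubsetOptions into the request
    // as shown in the predicate mapping example above.
    queryFn: async () => (await fetch('/api/products')).json(),
    getKey: (product) => product.id,
    syncMode: 'on-demand',
    // Assumed to be passed through to TanStack Query: data younger than one
    // minute is served from cache, and unused data is garbage-collected after five.
    staleTime: 60_000,
    gcTime: 5 * 60_000,
  })
)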
Query-Driven Sync gives you a powerful and expressive way to let UI queries drive data loading, but getting the most out of it requires a few intentional design choices. Below are some recommended best practices to ensure your queries stay fast, your network usage stays efficient, and your data layer remains maintainable at scale.
Your collections should represent stable entities in your system, such as tables, resources, or objects that change predictably. Good collection modeling ensures your predicate mapping stays simple and predictable.
Clear collection boundaries give Query-Driven Sync the information it needs to generate optimal subset loads and incremental deltas.
When joining collections, scope your relationships to what the UI actually needs and avoid joining massive datasets without meaningful constraints. The more focused your join predicates are, the more efficiently TanStack DB can batch related lookups and avoid unnecessary network overhead.
An important part of using Query-Driven Sync well is choosing the right sync mode per collection, based on use case and dataset size. Eager mode works well for small, stable datasets, on-demand mode is ideal for large collections, and progressive mode strikes a balance between the two. Mixing sync modes per data domain within the app, as sketched below, optimizes resource use and user experience.
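As a rough illustration (the collection names and endpoints here are ours, not from the docs), a single app might mix modes per domain like this:

// Small, static reference data: load it all upfront
const settingsCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['settings'],
    queryFn: async () => (await fetch('/api/settings')).json(),
    getKey: (setting) => setting.id,
    syncMode: 'eager',
  })
)

// Large, dynamic catalog: load only what queries ask for
const productCollection = createCollection(
  queryCollectionOptions({
    queryKey: ['products'],
    // Simplified; map ctx.meta?.loadSubsetOptions into the request as shown earlier
    queryFn: async () => (await fetch('/api/products')).json(),
    getKey: (product) => product.id,
    syncMode: 'on-demand',
  })
)

// Frequently used working set: fast first paint, then background sync
const projectCollection = createCollection(
  electricCollectionOptions({
    table: 'projects',
    syncMode: 'progressive',
  })
)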
Query-Driven Sync shines when you allow it to take advantage of TanStack DB’s built-in caching and intelligent request deduplication. By keeping query inputs stable and letting the sync engine compute the minimal differences between previous and expanded results, you benefit from efficient delta-fetching rather than full reloads. This pattern becomes particularly useful in components with infinite scrolling or progressive refinement.
In this article, we explored the inner workings of Query-Driven Sync and how it transforms data fetching and synchronization in TanStack DB by turning each component query into a precise API call. In short, it effectively addresses the common challenge of API sprawl in modern applications. No matter the size of your application, it can benefit from the performance, efficiency, and simplicity that Query-Driven Sync provides.
With the 0.5 update marking the completion of the core architecture, TanStack DB is now gearing up for version 1. Trying it out today and providing feedback will help the team refine any remaining issues and prepare for a smoother final release.
Happy hacking!
