Bulk operation performance - nested groupObjects vs 2 flat queries

Soufiane_Ghzal · April 14, 2026, 3:29pm

Hi,

We’re trying to understand the performance impact of groupObjects on bulk operations at scale: think 500k products and millions of variants.

Shopify’s docs mention that grouped output slows down bulk operations and increases the likelihood of timeouts, but they don’t quantify it.

A few questions:

How significant is the performance difference between groupObjects: true and groupObjects: false in practice, on large datasets?
Does splitting the same query into two separate flat queries (products only + productVariants at root level) perform similarly to a single query with groupObjects: false? I mean purely in terms of total processing speed, regardless of whether they run in parallel (since 2026-01, multiple bulk ops can run in parallel, but that is not relevant to my question).

Case 1 — single query, groupObjects: true

mutation {
  bulkOperationRunQuery(groupObjects: true, query: """
  {
    products {
      edges {
        node {
          id
          variants {
            edges {
              node { id }
            }
          }
        }
      }
    }
  }
  """) { bulkOperation { id status } }
}

Case 2 — single query, groupObjects: false (default)

mutation {
  bulkOperationRunQuery(query: """
  {
    products {
      edges {
        node {
          id
          variants {
            edges {
              node { id }
            }
          }
        }
      }
    }
  }
  """) { bulkOperation { id status } }
}

Case 3 — two separate flat queries

# Query 1
{ products { edges { node { id } } } }

# Query 2
{ productVariants { edges { node { id product { id } } } } }

productVariants at root, 1 connection level each, no nesting.

Thanks

Donal-Shopify · April 15, 2026, 11:33am

Hey @Soufiane_Ghzal! The biggest performance lever by far is groupObjects. When it’s true, two things happen that hurt you at scale. First, the query execution phase has additional retry logic with progressively smaller page sizes that can burn time. Second, and more importantly, the file assembly step has to download, parse, and re-sort the JSONL output line-by-line so child objects land directly after their parent. At 500k products with millions of variants, that sorting step is where timeouts happen. When groupObjects is false (the default on 2026-01+), file assembly is a straightforward concatenation without any content-level parsing.

On your Case 2 vs Case 3 question, the difference is smaller but does still exist. Nested queries use reduced pagination limits and require additional pagination work for child connections on each page of parent objects. A flat root-level query like productVariants avoids that entirely, getting the full page size with no extra overhead. At your scale that adds up, though it’s secondary to the groupObjects impact.

One thing to keep in mind with Case 3: since each query is a standalone root-level connection, there’s no automatic __parentId linking in the JSONL. You’d rely on the product { id } field you’ve already included in your variant query to associate variants back to products, which is straightforward. In Case 2 with groupObjects: false, the JSONL does include __parentId automatically because it’s a nested connection. Either way the data is there; it’s just a different shape to parse.
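To make the two shapes concrete, here is a minimal Python sketch that rebuilds the product-to-variant mapping from either output. The function name variants_by_product and the sample lines are made up for illustration; the only things taken from the thread are the __parentId field (Case 2) and the product { id } field (Case 3). It assumes child lines carry __parentId and that root-level product lines carry neither field.

```python
import json
from collections import defaultdict


def variants_by_product(jsonl_lines):
    """Map product GID -> list of variant GIDs from bulk-operation JSONL lines.

    Hypothetical helper: handles both the nested shape (__parentId) and the
    flat shape (product { id }) described above.
    """
    grouped = defaultdict(list)
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue
        obj = json.loads(raw)
        # Case 2 shape: children of a nested connection carry __parentId.
        parent = obj.get("__parentId")
        # Case 3 shape: the variant carries its own product { id } field.
        if parent is None:
            parent = (obj.get("product") or {}).get("id")
        # Root-level product lines have neither field and are skipped.
        if parent is not None:
            grouped[parent].append(obj["id"])
    return dict(grouped)


# Illustrative sample lines, not real API output.
case2_lines = [
    '{"id":"gid://shopify/Product/1"}',
    '{"id":"gid://shopify/ProductVariant/11","__parentId":"gid://shopify/Product/1"}',
    '{"id":"gid://shopify/ProductVariant/12","__parentId":"gid://shopify/Product/1"}',
]
case3_variant_lines = [
    '{"id":"gid://shopify/ProductVariant/11","product":{"id":"gid://shopify/Product/1"}}',
    '{"id":"gid://shopify/ProductVariant/12","product":{"id":"gid://shopify/Product/1"}}',
]

from_nested = variants_by_product(case2_lines)
from_flat = variants_by_product(case3_variant_lines)
```

Both calls yield the same mapping, so downstream code does not need to care which query shape produced the file. At millions of variants you would likely stream the file line by line rather than hold the full mapping in memory.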

So in practice, your best option is Case 3 with groupObjects omitted (defaults to false on 2026-01). Each flat query gets optimal pagination and the simplest file assembly path. You also get the concurrent execution benefit on 2026-01, even though I know you said that’s not the focus. The bulk operations guide covers the JSONL format details, and the 2026-01 changelog entry has more context on the default change.
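As a rough sketch of what submitting Case 3 could look like, the snippet below builds the two request bodies in Python. The names BULK_MUTATION and build_payload are made up for illustration, passing the bulk query through a GraphQL variable is just one way to avoid nested triple quotes, and the HTTP call to the Admin API endpoint is deliberately left out. groupObjects is omitted so the API version's default applies.

```python
import json

# Wrapper mutation; groupObjects is intentionally omitted so the default applies.
BULK_MUTATION = """
mutation RunBulk($query: String!) {
  bulkOperationRunQuery(query: $query) {
    bulkOperation { id status }
    userErrors { field message }
  }
}
"""

# The two flat, root-level queries from Case 3.
PRODUCTS_QUERY = "{ products { edges { node { id } } } }"
VARIANTS_QUERY = "{ productVariants { edges { node { id product { id } } } } }"


def build_payload(bulk_query):
    """Return the JSON request body for one bulk operation (hypothetical helper)."""
    return json.dumps({"query": BULK_MUTATION, "variables": {"query": bulk_query}})


payloads = [build_payload(q) for q in (PRODUCTS_QUERY, VARIANTS_QUERY)]
```

Each payload would be POSTed separately to the GraphQL Admin API, giving one bulk operation per flat query.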

Soufiane_Ghzal · April 15, 2026, 1:36pm

Hi @Donal-Shopify thanks a lot for the very detailed response.

That will help a lot in choosing the right path going forward.

From what you’re saying, it’s still worth it to split queries and avoid nested connections where possible, so we’ll consider that one too.

Thanks!

Soufiane_Ghzal · April 17, 2026, 4:44pm

@Donal-Shopify I’d like to follow up on that one.

We’re trying to stop relying on object grouping for our product bulk operations.

One problem we have is that we need to know which variants belong to a product at the time we fetch the product. Object grouping is a safe way to do that, as it colocates the variants with the product. However, as discussed earlier, it scales poorly.

I couldn’t find a way to simply pull the variant GUIDs attached to a product without using a nested connection to variants.

Is there some way we can do this?

Thank you.
