Very slow bulk operation fetching products+variants

Hello,

We are migrating our app to the GraphQL API. The most costly operation we perform today is retrieving the entire product catalog. To be clear, we currently use the REST API, and it is quite performant for us even though we haven't really tried to optimize it.

In our new design we decided to use GraphQL+BulkOperations.
We chose bulk operations because:

  • we need the full catalog of products to be able to detect deleted products on our side. The webhook is not enough for us.
  • Shopify documentation really advocates for BulkOperations
  • Paginated nested connections look like a pain to implement.
  • Bulk operations better align with our existing pipeline. Using GraphQL pagination would be painful to add in our existing architecture.
  • Not having to worry all that much about rate limits is nice.

This is our query with all the data we need:

mutation {
    bulkOperationRunQuery(query: """
    {
        products(query: "publishable_status:published") {
            edges {
                node {
                    id
                    images { edges { node { id url } } }
                    title
                    handle
                    descriptionHtml
                    featuredImage { url }
                    productType
                    status
                    tags
                    totalInventory
                    vendor
                    variants {
                        edges {
                            node {
                                id
                                barcode
                                price
                                compareAtPrice
                                sku
                                title
                                weight
                                weightUnit
                                inventoryPolicy
                                inventoryQuantity
                                inventoryManagement
                                image { id url }
                            }
                        }
                    }
                }
            }
        }
    }
    """) {
        bulkOperation { id status }
        userErrors { field message }
    }
}
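For completeness, after the mutation returns we poll for completion and then download the JSONL result. A minimal polling sketch, assuming a placeholder shop domain and access token (both hypothetical) and the `currentBulkOperation` query from the Shopify docs:

```python
import json
import time
import urllib.request

# Placeholders -- substitute your own shop domain, API version, and token.
SHOP = "example.myshopify.com"
TOKEN = "shpat_xxx"
ENDPOINT = f"https://{SHOP}/admin/api/2024-10/graphql.json"

def graphql(query: str) -> dict:
    """POST a GraphQL document to the Admin API and return the parsed JSON."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps({"query": query}).encode(),
        headers={
            "Content-Type": "application/json",
            "X-Shopify-Access-Token": TOKEN,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

POLL_QUERY = """
{
  currentBulkOperation {
    id
    status
    errorCode
    objectCount
    url
  }
}
"""

def wait_for_bulk_operation(poll_seconds: int = 30) -> str:
    """Poll currentBulkOperation until it finishes; return the JSONL file URL."""
    while True:
        op = graphql(POLL_QUERY)["data"]["currentBulkOperation"]
        if op["status"] == "COMPLETED":
            return op["url"]
        if op["status"] in ("FAILED", "CANCELED", "EXPIRED"):
            raise RuntimeError(f"Bulk operation ended: {op['status']} ({op.get('errorCode')})")
        time.sleep(poll_seconds)
```

(In production we would subscribe to the completion webhook instead of busy-polling, but the sketch shows the shape of the flow.)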

Our problem:
This works, but for large shops it is very slow compared to the legacy REST API.
For instance, for a particular shop we see that fetching all products+variants+images takes:

  • 15 minutes using the legacy REST API
  • 4 hours using a bulk operation
That is 16× slower. For the record, we are talking about 2.2 million objects.

We frankly cannot understand this behavior from a technical point of view. Isn't the bulk operation implemented using the same primitives as the legacy REST API?

We would like to know if this is something particular to this shop.
This is the operation id (created on November 20, 2024): gid://shopify/BulkOperation/3712286490737

Also, we have seen previous questions in the community forum discussing similar issues. Some of those threads are more than two years old, yet the problem persists.

Is this a known issue that you are addressing, or is it something particular to us?


Hi @rockiedev_plays ,

Thanks for reaching out. We will discuss the bulk operation performance with the relevant team.

Regarding the migration from REST: our team is working on making our GraphQL queries and mutations more performant, and we expect further gains sooner rather than later.

One way to make this export much faster (comparable to REST) would be to pre-fetch cursors only, then run the paginated queries in parallel.

Hey @Asaf_Gitai

Can you explain how we can get 1 million products, plus their variants, images, and inventory items (needed because the inventoryManagement field, which stored the tracked flag, was removed in API version 2024-07), faster than the REST API? It would also be great if you could explain the new cost computation in GraphQL.

The performance gap is frustrating but makes some sense when you dig into how bulk operations work under the hood. They're designed for reliability and resource isolation rather than raw speed. The job gets queued, runs on shared infrastructure with conservative resource allocation, and writes results to a JSONL file that you then download. REST pagination, by contrast, hits the API directly with your allocated rate limit and streams results immediately. For smaller catalogs bulk ops win on simplicity, but at 2M+ objects the queuing overhead and throttled processing really show.
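To make the JSONL point concrete: nested connections come back as separate lines, each child carrying a `__parentId` pointing at its product, and Shopify writes parents before their children. The client has to stitch the tree back together. A minimal re-nesting sketch (the `group_bulk_jsonl` helper name is mine, not an SDK function):

```python
import json

def group_bulk_jsonl(lines):
    """Re-nest a flat bulk-operation JSONL result into products with children.

    Parents appear before their children in the file, so a single pass works:
    lines without __parentId are top-level products; lines with __parentId
    (variants, images, ...) attach to the product already seen.
    """
    products = {}
    for line in lines:
        node = json.loads(line)
        parent = node.pop("__parentId", None)
        if parent is None:
            node.setdefault("children", [])
            products[node["id"]] = node
        else:
            products[parent]["children"].append(node)
    return products

# Usage with two illustrative lines from a result file:
sample = [
    '{"id": "gid://shopify/Product/1", "title": "Tee"}',
    '{"id": "gid://shopify/ProductVariant/11", "__parentId": "gid://shopify/Product/1"}',
]
tree = group_bulk_jsonl(sample)
```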

Asaf’s suggestion about pre-fetching cursors and parallelizing paginated queries is genuinely faster for large catalogs, though painful to implement. The pattern is roughly: first query to get total count and generate cursor breakpoints, then fan out 5-10 parallel workers each paginating through their segment. You burn through your rate limit faster but finish in a fraction of the time. Some teams use a Redis queue to coordinate the workers.
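A rough sketch of that fan-out, assuming a `fetch_page(cursor)` callable that wraps your paginated GraphQL query (all names here are hypothetical; the coordination is just a thread pool rather than Redis):

```python
import concurrent.futures

def segment(cursors, workers):
    """Split pre-fetched page cursors into contiguous slices, one per worker."""
    size = -(-len(cursors) // workers)  # ceiling division
    return [cursors[i:i + size] for i in range(0, len(cursors), size)]

def fetch_segment(fetch_page, cursor_slice):
    """One worker: fetch each page in its slice and collect the nodes."""
    nodes = []
    for cursor in cursor_slice:
        nodes.extend(fetch_page(cursor))
    return nodes

def parallel_fetch(fetch_page, cursors, workers=8):
    """Fan the cursor slices out over a thread pool and merge the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch_segment, fetch_page, s)
                   for s in segment(cursors, workers)]
        results = []
        for future in futures:
            results.extend(future.result())
        return results
```

In practice each worker also has to respect the shared rate limit, so a real implementation would back off on throttled responses, but the slicing structure is the interesting part.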

For the deleted product detection problem specifically, one workaround is to maintain a local product ID set and reconcile against products(first: 250, query: "updated_at:>last_sync_time") on shorter intervals rather than full catalog pulls. Combined with the products/delete webhook as a supplement, you catch most deletions without needing the full catalog every time.
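The reconciliation itself is just set arithmetic. A sketch (function and parameter names are mine), combining a periodic lightweight id-only sweep with deletions reported by the webhook:

```python
def reconcile_deletions(local_ids, remote_ids, webhook_deleted=()):
    """Return product IDs that should be deleted locally.

    local_ids:       product IDs currently stored on our side
    remote_ids:      IDs from the latest id-only catalog sweep
    webhook_deleted: IDs received via the products/delete webhook since the
                     last sweep (a supplement, since webhooks can be missed)
    """
    return (set(local_ids) - set(remote_ids)) | set(webhook_deleted)
```

The id-only sweep is far cheaper than the full-field bulk query, which is what makes this pattern viable on shorter intervals.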

The honest answer though is that syncing millions of objects on a schedule is always going to be expensive regardless of approach. Shopify’s infrastructure isn’t really optimized for “give me everything repeatedly” patterns at that scale. Curious whether you’ve explored event-driven architectures where you maintain state incrementally rather than full refreshes?

Hope some of that helps while you wait on Shopify’s performance improvements.