Very slow bulk operation fetching products+variants

Hello,

We are migrating our app to the GraphQL API. The most expensive thing we do today is retrieving all products in the catalog. To be clear, today we use the REST API, and it performs well for us even though we have not really tried to optimize it.

In our new design we decided to use GraphQL+BulkOperations.
We chose bulk operations because:

  • We need the full catalog of products to detect deleted products on our side. The webhook is not enough for us.
  • The Shopify documentation strongly advocates for bulk operations.
  • Paginated nested connections look painful to implement.
  • Bulk operations align better with our existing pipeline. GraphQL pagination would be painful to add to our existing architecture.
  • Not having to worry much about rate limits is nice.

This is our query with all the data we need:

mutation {
  bulkOperationRunQuery(query: """
    {
      products(query: "publishable_status:published") {
        edges { node {
          id
          images { edges { node { id url } } }
          title
          handle
          descriptionHtml
          featuredImage { url }
          productType
          status
          tags
          totalInventory
          vendor
          variants { edges { node {
            id
            barcode
            price
            compareAtPrice
            sku
            title
            weight
            weightUnit
            inventoryPolicy
            inventoryQuantity
            inventoryManagement
            image { id url }
          } } }
        } }
      }
    }
  """) {
    bulkOperation { id status }
    userErrors { field message }
  }
}
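
For context on how the output of such a query is consumed: bulk operation results are delivered as a JSONL file in which nested connection nodes (here, images and variants) appear as separate lines carrying a `__parentId` field. The following is a minimal sketch of regrouping those lines by product, not our production code. The function name `group_bulk_lines` and the `children` key are hypothetical, and the GIDs in the usage example are made up.

```python
import json
from collections import defaultdict


def group_bulk_lines(lines):
    """Regroup bulk-operation JSONL lines into products with nested children.

    Assumption: root objects (products) have no "__parentId"; nested nodes
    carry the GID of their parent product in "__parentId".
    """
    products = {}
    children = defaultdict(lambda: defaultdict(list))
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        obj = json.loads(line)
        parent_id = obj.pop("__parentId", None)
        if parent_id is None:
            products[obj["id"]] = obj
        else:
            # "gid://shopify/ProductVariant/1" -> "ProductVariant"
            type_name = obj["id"].split("/")[3]
            children[parent_id][type_name].append(obj)
    # Attach children after the loop so line order does not matter.
    for product_id, product in products.items():
        product["children"] = dict(children.get(product_id, {}))
    return products
```

For example, with made-up IDs:

```python
sample = [
    '{"id": "gid://shopify/Product/1", "title": "Shirt"}',
    '{"id": "gid://shopify/ProductVariant/10", "sku": "S-1", "__parentId": "gid://shopify/Product/1"}',
]
grouped = group_bulk_lines(sample)
```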

Our problem:
This works, but for large shops it is very slow compared to the legacy REST API.
For instance, for a particular shop we see that fetching all products+variants+images takes:

  • 15 minutes using the legacy REST API
  • 4 hours using a bulk operation

That makes the bulk operation 16 times slower. For the record, we are talking about 2.2 million objects.

Frankly, we cannot understand this behavior from a technical point of view. Isn’t the bulk operation implemented using the same primitives as the legacy REST API?

We would like to know if this is something particular to this shop.
This is the operation ID (created on November 20, 2024): gid://shopify/BulkOperation/3712286490737

Also, we have seen earlier questions in the community forum discussing similar issues. Some of them are more than two years old, yet we still have the same problem.

Is this a known issue that you are addressing? Or is it something particular to us?


Hi @rockiedev_plays,

Thanks for reaching out. We will discuss the bulk operation performance with the relevant team.

With regard to the migration from REST, our team is working on making our GraphQL mutations and queries more performant, and we expect to see further gains sooner rather than later.

One way to make this export much faster (comparable to REST) would be to prefetch cursors only and then perform the paginated queries in parallel.
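
To illustrate the idea, here is a minimal sketch, not an official implementation. It first walks the `products` connection requesting only `pageInfo`, records the cursor at which each page starts, and then fetches the full pages concurrently. `run_query` is a hypothetical callable that you would supply: it takes a GraphQL query string and a variables dict and returns the `data` part of the response. The field selection in `PAGE_QUERY` is shortened for brevity, and the page size of 250 assumes the usual maximum for `first`.

```python
from concurrent.futures import ThreadPoolExecutor

# Cheap query: no product fields, only the pagination info.
CURSOR_QUERY = """
query Cursors($first: Int!, $after: String) {
  products(first: $first, after: $after, query: "publishable_status:published") {
    pageInfo { hasNextPage endCursor }
  }
}
"""

# Full query (shortened here): add the fields you actually need.
PAGE_QUERY = """
query Page($first: Int!, $after: String) {
  products(first: $first, after: $after, query: "publishable_status:published") {
    edges { node { id title handle } }
  }
}
"""


def prefetch_cursors(run_query, page_size):
    """Sequentially collect the `after` cursor of every page (None = first page)."""
    starts = [None]
    after = None
    while True:
        data = run_query(CURSOR_QUERY, {"first": page_size, "after": after})
        info = data["products"]["pageInfo"]
        if not info["hasNextPage"]:
            return starts
        after = info["endCursor"]
        starts.append(after)


def fetch_all(run_query, page_size=250, workers=4):
    """Fetch all pages in parallel, using the prefetched cursors as start points."""
    starts = prefetch_cursors(run_query, page_size)

    def fetch_page(after):
        data = run_query(PAGE_QUERY, {"first": page_size, "after": after})
        return [edge["node"] for edge in data["products"]["edges"]]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = list(pool.map(fetch_page, starts))  # map preserves page order
    return [node for page in pages for node in page]
```

The parallel requests are still subject to the API rate limits, so `workers` would need tuning per shop, and `run_query` should handle throttled responses with a retry. Nested connections such as variants and images would also need their own pagination inside each page.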