Inconsistent GraphQL Bulk Query Responses

We use bulk operations in our application to sync product data, and we are seeing inconsistent results.

We are running a very simple query, although it sometimes takes up to an hour on some of our clients’ stores:

{
    collections {
        edges {
            node {
                id
                publishedOnCurrentPublication
                title
                handle
                image {
                    url
                }
                storefrontId
                updatedAt
                products {
                    edges {
                        node {
                            id
                            featuredImage {
                                url
                            }
                        }
                    }
                }
            }
        }
    }
}

Here are two responses for the same retailer:

The first one says it returned 369 root objects, but the file at the URL only contains 250.

{
  "id": "gid://shopify/BulkOperation/5285292376217",
  "errorCode": null,
  "rootObjectCount": "369",
  "status": "COMPLETED"
}

Here is a second one, which also says 369, and this time the file really does contain 369.

{
  "id": "gid://shopify/BulkOperation/5287571554457",
  "errorCode": null,
  "rootObjectCount": "369",
  "status": "COMPLETED"
}

This appears to be random. For some retailers, the results have been incorrect all week (which is as far back as we can fetch the files to check).

The trouble is that we use these query responses as a source of truth and delete any content from our system that is not returned, so you can imagine the problem when the bulk operation does not return the right data.

Please advise on what we can do here.

Just spit-balling here, because I do not have experience with bulk query operations, but is rootObjectCount the total number of root objects involved in the bulk operation, whereas objectCount could be used to determine the number of objects processed?

If so, could you check both to see whether they match what you expect? In this case, one of them might be 250 while the other really is 369.
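Whatever the counts turn out to mean, a defensive check before deleting anything might look like the sketch below. This is only an illustration: the function name and the ID sets are made up, and it assumes you have already extracted the root object IDs from the result file. The one detail taken from the responses above is that rootObjectCount arrives as a string.

```python
def ids_safe_to_delete(reported_root_count, returned_ids, known_ids):
    """Return the IDs to delete locally, or an empty set if the result looks truncated."""
    expected = int(reported_root_count)  # the response carries the count as a string, e.g. "369"
    returned = set(returned_ids)
    if len(returned) != expected:
        # The file disagrees with the reported count: treat it as incomplete and delete nothing.
        return set()
    # Counts agree, so anything we know about that was not returned can be pruned.
    return set(known_ids) - returned


# Hypothetical data: 369 known collections, but only 250 came back in the file.
known = {f"gid://shopify/Collection/{n}" for n in range(1, 370)}
truncated = sorted(known)[:250]
print(len(ids_safe_to_delete("369", truncated, known)))  # prints 0
```

With a guard like this, a short file results in no deletions at all instead of 119 collections being wrongly removed.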

Again, just trying to help, because the issue you outlined, being seemingly random, could well affect the workflow of others too.

Thanks for the reply.

From what I can see, rootObjectCount is the number of root objects returned (when it works).

So if you are asking for collections, it’s the number of collections; if you are asking for products, it’s the number of products.

objectCount, on the other hand, is the total number of objects, so if your collections have products, and those products have images, it’s +1 for each.
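For anyone wanting to check the file itself, here is a rough sketch of how the records can be counted. It assumes the JSONL layout described in the bulk operation docs, where each line is one object and nested connection objects carry a `__parentId` field pointing at their parent. The function name and sample lines are made up for illustration.

```python
import json


def count_lines(jsonl_text):
    """Count root and nested objects in a bulk-operation JSONL result."""
    roots = children = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        obj = json.loads(line)
        if "__parentId" in obj:
            children += 1  # nested object, e.g. a product inside a collection
        else:
            roots += 1  # root object, e.g. a collection
    return roots, children


# Hypothetical four-line result: two collections, one of which has two products.
sample = "\n".join([
    '{"id": "gid://shopify/Collection/1", "title": "Summer"}',
    '{"id": "gid://shopify/Product/10", "__parentId": "gid://shopify/Collection/1"}',
    '{"id": "gid://shopify/Product/11", "__parentId": "gid://shopify/Collection/1"}',
    '{"id": "gid://shopify/Collection/2", "title": "Winter"}',
])
print(count_lines(sample))  # prints (2, 2)
```

The first number is the one to compare against rootObjectCount.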

I know for a fact there are 369 collections on this retailer, and both queries return that as rootObjectCount, but for one of them the data file provided at the URL only has 250 records :(

Well, one thought comes to mind: I believe pagination of retrieved resources is limited to 250 per page. Given that the number you are getting is exactly that, I am wondering if some queries are being turned into a normal retrieval instead of a bulk operation, or something like that…

I mean, it’s a bit of a stretch, but my thoughts would be different if the numbers were not the same.

Again, if the queries are the same it does not make much sense, but in the past I have had issues with posting the same data via product forms on the storefront (bundling products) and getting varying results, so I would not be surprised if some similar issue is occurring here.

I thought that too.

But the bulk operation docs explicitly say that pagination is ignored (and I tested it by requesting only one item, and got all 369).

This is extremely infuriating, as this core Shopify API just doesn’t work consistently.

We have one retailer where the root count is 888, but we are getting 764 back (most of the time).

Does it make sense to examine whether any single aspect of this collections query could be sensitive to something, such that during processing Shopify has a moment where it goes “hang on, something is up with this”?

You know there are 369 collections. You run the query 10 times. How many times out of 10 do you get 369? Not every single time, right? So what could possibly make collections somehow become “unavailable”? It seems like they gather 250 (a page) without trouble, but gathering page 2 causes an issue. So if even one collection in that page is “touched” during bulk processing, they bail out and you get 250.

Same with the other case: 888 expected, and you get 764. That is not exactly three neat pages of 250 (750), but it is close, so maybe it gets through three pages and then blows chunks somewhere in the final page of 138 because something is up there?
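To put numbers on that guess, here is a quick sketch. The page size of 250 is purely an assumption on my part and not something the bulk operation docs state, and the function name is made up.

```python
PAGE_SIZE = 250  # assumed internal page size; not confirmed anywhere


def page_breakdown(expected, returned, page_size=PAGE_SIZE):
    """Split the returned count into full pages plus a remainder, and report what is missing."""
    full_pages, remainder = divmod(returned, page_size)
    return full_pages, remainder, expected - returned


print(page_breakdown(369, 250))  # prints (1, 0, 119)
print(page_breakdown(888, 764))  # prints (3, 14, 124)
```

So the first case stops exactly on a page boundary, while the second stops 14 records into what would be the fourth page. The second case does not fit a clean "die at a page boundary" story, which is worth keeping in mind.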

In either case… seems like a “We die processing the last page of a bulk query” problem that needs to be somehow figured out! Interesting stuff. Not really. But kinda… Good luck!