Recent 400 Errors Referencing Cloudflare

Hi,

Similar to another issue I posted related to 5xx errors from Shopify, we’re also seeing an uptick recently in 400 errors coming back from Shopify for GraphQL queries and mutations we know are valid.

Using the same example as that other post (the inventorySetQuantities mutation), we’re seeing many instances where we send an inventory update for an inventory item successfully, then a bit later send that same mutation for that same inventory item (albeit with a different quantity) and receive a 400 response, then a bit later send it once again and get a 200. Each time we get a 400 response code we also get a response body that mentions Cloudflare (pasted below), so I’m guessing the issue lies somewhere between Shopify and Cloudflare.

Like the 5xx errors, this is another “mental blocker” in terms of fully migrating to the GraphQL APIs because it doesn’t give the team a lot of confidence moving forward. Thankfully we have retry logic built into our use of the APIs, but if the 400 errors persist longer than our retry logic then it poses problems for us. Is there any investigation or work being done to improve the reliability of these APIs prior to the deadline for forced migration?
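For illustration, the kind of retry logic involved might look roughly like the sketch below. This is not our exact production code - the set of retryable statuses and the Cloudflare-HTML check for 400s are assumptions based on the behavior described above:

```python
import time
import random

# Statuses treated as transient here are an assumption, not Shopify guidance.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def is_retryable(status, body):
    """Treat standard transient statuses - plus 400s whose body is a
    Cloudflare HTML page rather than a JSON API error - as retryable."""
    if status in RETRYABLE_STATUSES:
        return True
    return status == 400 and "cloudflare" in body.lower() and body.lstrip().startswith("<")

def send_with_retry(send, max_attempts=5, base_delay=0.5):
    """Call `send()` (which returns (status, body)) until it succeeds or
    attempts run out, with exponential backoff plus jitter in between."""
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status, body):
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return status, body
```

The catch, as noted above, is that if the edge-level errors persist longer than the backoff window, the whole operation still fails.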

400 response body

<html>
<head>
    <title>400 Bad Request</title>
</head>
<body>
    <center>
        <h1>400 Bad Request</h1>
    </center>
    <hr>
    <center>cloudflare</center>
</body>
</html>

Hey @ktbishop :waving_hand: - we can take a look into this for you for sure - would you happen to have X-Request-ID values from the response headers for both the 5xx (from the other thread)/4xx errors you’re seeing? I can use those to grab the example errors here directly in our logs.

If not, no worries - I can do a bit of digging using the shop/app IDs as well as a timestamp if you can share one (down to the second would be great if possible!).

Hope to hear from you soon.


Hi @Alan_G,

I’m also talking to @KyleG-Shopify over in the other thread - just letting you know so we aren’t all stepping on each other’s toes.

I do have x-request-id values for the recent 503s and 400s. Just let me know if you want them posted here or if you’d prefer DMs.

I mentioned this to KyleG as well, but we’re also seeing some 502s mixed in. Those don’t have IDs for me to share, unfortunately, but their response bodies reference Cloudflare as well:

<html>
<head>
    <title>502 Bad Gateway</title>
</head>
<body>
    <center>
        <h1>502 Bad Gateway</h1>
    </center>
    <hr>
    <center>cloudflare</center>
</body>
</html>

@KyleG-Shopify @Alan_G I wanted to come back to this thread because we’re noticing this behavior intermittently on our testing environment but with slightly different circumstances this time.

We’re currently in the process of migrating from Heroku to Google Cloud, and so far we’ve fully migrated our testing environment that is hooked up to multiple dev stores. We’re noticing some intermittent timeouts and 400/Cloudflare errors when making various queries that we weren’t really seeing previously on our test environment. We haven’t made any changes at all to the queries, the only change we’ve made is that we’ve migrated to Google Cloud so our requests are coming from a different IP.

For example, on one of our dev stores we have 1 open fulfillment request. When we make the GQL assignedFulfillmentOrders query on that shop we see it fail about 50% of the time with the 400/Cloudflare error mentioned previously. The other 50% of the time the call is successful and returns the 1 fulfillment order with an open request.

On our heroku instance we didn’t see this issue so we’re assuming it’s related to the GCP migration, but we need some help on the Shopify side to determine what’s going on. Is it possible that these calls are failing intermittently because they’re coming from a different/unfamiliar IP?

Hey @ktbishop, thanks for circling back on this, happy to take a look.

Typically, 400 errors from our GraphQL API return JSON error responses explaining what went wrong, but since you’re seeing Cloudflare HTML pages instead, this makes me think the requests might be getting blocked at our edge layer before they even reach Shopify’s API. Given that this started right after your migration to GCP and you’re seeing about 50% failures, my hypothesis is that Cloudflare’s bot detection could be scoring your requests as suspicious and intermittently challenging or blocking them (I’ve seen this pop up before). If you’re able to share a timestamp of a failed call (down to the second with timezone if possible), your shop URL, and the API client ID you’re using, I’d be happy to dig into our logs to see what’s happening on our end here. Hope to hear from you soon!
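For what it’s worth, a quick heuristic for telling the two cases apart on your side could look like the sketch below. This is purely illustrative, based on the behavior described in this thread (HTML page from the edge vs. JSON from the API):

```python
import json

def classify_error_response(status, content_type, body):
    """Rough heuristic: a Shopify GraphQL API error comes back as JSON,
    while a block at the Cloudflare edge returns a bare HTML error page.
    The conventions assumed here come from the examples in this thread."""
    if content_type and "text/html" in content_type and "cloudflare" in body.lower():
        return "edge"      # likely never reached the GraphQL API
    try:
        json.loads(body)
        return "api"       # a real API-level response
    except ValueError:
        return "unknown"
```

That kind of tagging in your logs would also make it easier to correlate the edge-blocked requests with timestamps on our side.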

@Alan_G Thanks for getting back to me. That’s kind of what we were suspecting as well (especially given Cloudflare’s recent struggles).

The dev store in question is block-party-shoppe.myshopify.com
X-Stats-Apiclientid: 5453309

A couple of recent failures:

A request that came back with a 400 and a reference to Cloudflare:
2025-11-24T13:07:40.353853Z UTC

A request that came back with no response body and no response code (may have timed out?):
2025-11-24T15:04:25.937689Z UTC

Let me know if I can get you anything else. Also, if it’s easier, feel free to send me a DM. Whatever works for you!

Thanks @ktbishop - I’ll keep digging into this and loop back with you once I have more info to share, really appreciate the examples here.

Hey @ktbishop - we were able to pull some logs and we’re seeing 499 errors on our end, which usually means the client closed the request. If you have cf-ray and x-request-id values on hand from the response headers for some of those examples, we can take a closer look for sure though - hope to hear from you soon!

@Alan_G Interesting! We’ve always used a 10 second timeout for any requests we send to Shopify, including in our prod environments where there’s a ton of data, and they’ve always been fine until now.

A couple of quick follow up questions:

  1. For the cases where you’re seeing 499s, are you able to see anything about how long those operations took?

  2. Are you seeing anything in relation to the 400s that referenced Cloudflare?

I’d imagine the 499s pertain to some instances we’re seeing where we get no response code or body back, so I’ll check with the GCP folks about any egress settings that might be overriding what we’re setting on our http client. But the 400s that reference Cloudflare don’t seem like they’re getting through to the Shopify backend in the first place.

Thanks for clarifying @ktbishop - very interesting for sure! Looking at our logs there, we found a few 499 errors (client closed request). For these, the average time before the client closed the connection was about 261ms - though that’s measuring when your client disconnected, not how long Shopify was processing, just to clarify. With a 10s timeout on your end, it is weird that the connection would close that quickly.

I’m not seeing the 400 errors with Cloudflare HTML in our logs yet - which supports our theory that those requests are getting blocked at the edge before reaching the API. If you have cf-ray IDs and x-request-id values from those responses though, we can likely trace exactly where they’re getting stopped for sure!

@Alan_G Sorry about that, I didn’t include a request ID on that last example.

Here are a couple of better examples:

  1. An example where the request seemingly ran up against the 10 second timeout:
    x-request-id: aa2978de-fd7f-44e3-8ff7-49b86b79f132
    Would you be able to check whether these requests are getting hung up in Cloudflare - eventually making it past, but stuck long enough to eat up that 10 second window? We’ve had our 10 second timeout in place since 2019 and it’s worked without issue; our prod environment (still on Heroku) uses that same timeout currently and doesn’t seem to run into this.

  2. Here’s a fresh example where we received the 400/Cloudflare error:
    x-request-id: eda097c7-5b4a-4129-bb2c-24e95acd53ac

Thanks again for all your help on this. Once I hear back from you I’ll take this info back to the GCP folks and go from there.

Hey @ktbishop thanks for these, it’s definitely possible that Cloudflare could be blocking these. I’ll dig into this further for you and loop back once I have more info.

@ktbishop - quick follow up! We were able to pull a few more logs, but just wanted to confirm a few things further with you to narrow down the cause. Do you know about when you started to see the issue pop up and are you also able to replicate the timeouts using just a straight cURL request from within your host or outside of it using the same API credentials (or even using an API client like Postman)?

Just wanted to confirm if this is a wider issue, still looking into this, but wanted to follow up to grab that info from you - thanks!

@Alan_G Thanks for all the quick follow up so far.

This issue popped up pretty much as soon as we started running our test/integration environment on GCP. I want to say that was on November 17th. At the time we still had a test server running in parallel on Heroku, and that one was not having this issue. We turned off that Heroku test/integration instance on November 20th. Meanwhile our prod environment is still running on Heroku currently and isn’t seeing this issue, which is why we were thinking it maybe has to do with the Shopify Cloudflare layer detecting the requests suddenly coming from GCP.

If I run the same requests from Postman they return just fine and super fast, but, like Heroku, we’ve been using Postman for a lot of operations for a long time, so maybe it’s a “recognized” source.


Thanks @ktbishop , appreciate the confirmation on this too. If you’re running Postman locally, your IP might be seen as an acceptable source for sure (or it reads the API user info as “Postman”, which I think is the default setting in the request headers in the Postman app). I’ll loop back with you once I have more info on our end to share and keep you up to date though. Thanks again for your patience on this :slight_smile:

Hey @ktbishop - looping back with you. We’ve been tracking this on our end, and for at least the last 24 hours there hasn’t been another instance of the issue. If you’re able to replicate it via cURL and share the response output, we can take a look - just wanted to share that we haven’t seen it recur on our end at the moment.

Let me know if I can help out further.

Hey @Alan_G, thanks for checking back in. It does appear as though the issue has calmed down substantially over the last 3-4 days, but we’ve still seen a few instances of both problems recently:

The most recent instance of the timeout issue was yesterday (Dec 2) at 07:03:01.204584 UTC: X-Request-Id = 7db49950-7219-4ded-b38d-f67f1dbfa3dc
For those we don’t get any response data back, since they seem to be intermittently running up against the 15 second timeout that our client is setting.

The most recent instance of the 400/Cloudflare issue was today (Dec 3) at 10:03:03.005878 UTC:
X-Request-Id = 7ceee344-1247-446c-b227-fa2ac256d7df
For that one we still get back the same response body of
<html>
<head>
    <title>400 Bad Request</title>
</head>
<body>
    <center>
        <h1>400 Bad Request</h1>
    </center>
    <hr>
    <center>cloudflare</center>
</body>
</html>
As well as response headers of
Server: cloudflare
Date: Wed, 03 Dec 2025 10:02:59 GMT
Content-Type: text/html
Content-Length: 155
Cf-Ray: 9a823b4b2b19bfd6-ATL

I suppose all of this raises a larger question: eventually we’ll be migrating our production instances over to GCP as well, and since it seems like Shopify and/or Cloudflare just needed some time to “get used to” our requests coming from somewhere other than our normal Heroku containers, is there some way we can avoid all this for that migration? With our production containers we can’t really afford to wait an untold amount of time for any issues to subside.

Again, I really appreciate the time and effort you’ve put into helping us with this!

Hey @ktbishop – Thanks for the update and those request IDs! Glad to hear things have calmed down.

To dig deeper into this, we do need a cURL replication of the request when you hit the issue again. Essentially, we need the full plain-text HTTP request (URL, headers, and body if applicable) along with the full response including headers - I’d also just make sure to redact any access tokens before sharing. I think the request ID you’re sharing may be coming from your side (usually ours include a sequence of numbers at the end representing the epoch time for the request), since we’re not able to pull anything on our end, but having the complete request/response gives our infrastructure team what they need to trace this through our systems. In Postman, you should be able to export a request as an equivalent cURL command via the code snippet option.
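If it helps with capturing that from your app directly, a small stdlib-only helper along these lines could render a request’s parts as a shareable cURL command with the token redacted. This is purely illustrative - the parameter names are assumptions, not part of any Shopify tooling:

```python
import shlex

def to_curl(url, headers, body=None, method="POST",
            redact=("X-Shopify-Access-Token", "Authorization")):
    """Build a shareable cURL command from a request's parts,
    redacting sensitive headers before the command is logged."""
    redact_lower = {name.lower() for name in redact}
    parts = ["curl", "-X", method, shlex.quote(url)]
    for name, value in headers.items():
        if name.lower() in redact_lower:
            value = "REDACTED"
        parts += ["-H", shlex.quote(f"{name}: {value}")]
    if body is not None:
        parts += ["--data", shlex.quote(body)]
    return " ".join(parts)
```

Logging that string alongside the raw response (status, headers, body) whenever a request fails would give a complete, safely shareable reproduction.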

On your migration question, totally understand the concern. We generally don’t have any kind of allowlisting or warm-up period on our end where Shopify or Cloudflare needs time to “trust” a new IP range. My hunch is that there might be some subtle difference in how your GCP environment handles requests compared to Heroku. I’m thinking things like connection pooling, timeout configurations, keep-alive behavior, or even small differences in header formatting can sometimes cause unexpected issues. The fact that Postman works fine locally and Heroku has been stable points toward something environmental on the GCP side rather than anything that requires time to settle. Once we can hopefully pinpoint the root cause with the cURL data, we should have a decent checklist of what to watch for when you move production over, so you can make that transition a bit more smoothly.

Let me know if you have any questions on capturing that cURL output!

@Alan_G Sorry for the delayed response.

So, that’s kind of been the issue with debugging this: when we send those requests to Shopify from Postman, or when someone sends a cURL from a VM in the same GCP subnet, we don’t see the issues. The only time requests hit the two problems listed above (slowing down to the point of timing out, or receiving that 400/Cloudflare response) is when they’re coming from our app in the GCP container. Since the exact same code has been running in our production containers (Heroku) without issue for years, I’m ruling out issues with the app code.

I’ll see if there are some better logs we can get from GCP for the failing requests and report back.

Hey @ktbishop - no worries on the delay, and apologies for my late reply here too (I’ve been out of office for the last little bit).

Digging into the GCP logs is a great next step. If possible, it would be really helpful to capture the full outgoing request headers as your app is sending them, along with the raw response headers when the issue occurs (especially the cf-ray value for those 400s if you can). If GCP provides any kind of timing breakdown showing connection time vs. TLS handshake vs. time-to-first-byte, that would be useful too. And if there are any GCP-level network logs that might show where the request is stalling or getting rejected, those could help us piece together what’s happening.
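For reference, a rough stdlib-only sketch of that kind of timing split might look like the following. It deliberately skips DNS and TLS timing, which tools like `curl -w` can report in full, and just separates TCP connect time from time-to-first-byte:

```python
import socket
import time

def timing_breakdown(host, port, request_bytes, timeout=10):
    """Send a raw HTTP request and measure TCP connect time and
    time-to-first-byte separately. A simplified diagnostic sketch only -
    it does not handle DNS timing, TLS, or full response parsing."""
    t0 = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    connect_s = time.monotonic() - t0
    try:
        sock.sendall(request_bytes)
        t1 = time.monotonic()
        first_byte = sock.recv(1)       # blocks until the server starts replying
        ttfb_s = time.monotonic() - t1
    finally:
        sock.close()
    return {"connect_s": connect_s, "ttfb_s": ttfb_s, "first_byte": first_byte}
```

If the connect time is normal but time-to-first-byte regularly approaches your client timeout, that would point at the requests stalling somewhere past the TCP handshake rather than at the network layer in GCP.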

Feel free to loop back/ping me here when you can and we’ll take it from there. Appreciate your patience working through this with me.