Hi folks, thanks again for your patience while we looked into this. I have some context to share and an update.
Right now, this is expected behaviour, but it was undocumented. Essentially, the backup rates are being triggered by a connection establishment timeout, which is separate from the response timeout you might be familiar with. We have a 2-second timeout for establishing a TCP connection with carrier service endpoints, in addition to the 10-second response timeout documented in our Carrier Service API docs.

If it takes more than 2 seconds to establish the initial connection, we time out and return backup rates, even if your server eventually connects and responds with a 200 OK within the overall 10-second window. When the connection times out, the actual API request may never be made, which explains why some of you aren’t seeing requests in your logs at all.
Connection establishment typically happens quickly, so 2 seconds is generally plenty of time under normal circumstances. If you’re consistently seeing connection timeouts, that usually points to DNS resolution issues, servers under high load, or network configuration problems. I’d recommend monitoring connection establishment times specifically (not just overall response times) to your carrier service endpoints, and checking your network infrastructure, including DNS resolution, load balancers, and server capacity.
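One way to monitor connection establishment separately from the response, as suggested above, is to time DNS resolution and the TCP handshake individually. A minimal sketch, assuming you point it at your own carrier service hostname and port:

```python
import socket
import time

def connect_breakdown(host: str, port: int = 443) -> dict:
    """Return DNS and TCP-connect durations (seconds) for host:port."""
    t0 = time.monotonic()
    # DNS resolution only
    sockaddr = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)[0][4]
    t1 = time.monotonic()
    # TCP handshake only (no TLS, no HTTP) -- this is the phase the
    # 2-second connection timeout applies to
    sock = socket.create_connection(sockaddr[:2], timeout=2.0)
    t2 = time.monotonic()
    sock.close()
    return {"dns_s": t1 - t0, "tcp_connect_s": t2 - t1}
```

Run this on a schedule against your endpoint and alert when `tcp_connect_s` approaches 2 seconds; that will tell you whether the problem is DNS, the handshake, or (if both are fast) something later like TLS or the application.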
For those seeing issues where rates were selected in draft orders but then replaced with backup rates when the order was created, this is likely also explained by connection timeouts happening at different stages of the order creation flow. This 2-second connection establishment timeout has been in place for some time now.
I hope this helps. Thanks again for all the detailed information you provided and please let me know if I can help out further.
Are you able to share some logs or timestamps about this connection timeout for our specific orders? Which step of the TCP connection was slow (TLS handshake? DNS resolution? ACK reply?…)?
Our DNS is hosted on AWS, and our carrier service is handled by AWS Lambdas, so we’ll need more information to dig into the network issue with the AWS team.
We had another backup rate recently (8291028926770).
No worries @Francois_Rulliere - I’ll send you a DM so we can look into this a bit more. I’ll need to grab some potentially more private info, so don’t want to surface that publicly if we can help it! Speak with you that way shortly.
Thank you for the explanation of the two tiers of timeouts. I’m also experiencing an uptick in backup rates recently, and I don’t think it’s from a TCP connection timeout. Can I kindly ask you to look into order ID 7339225120835, which received a backup rate? According to my logs, it responded well within the overall 10s window. Here are my log timestamps:
Http Post response start: 2025-11-21 11:03:22.904 (MST)
Http Post total latency: 2.029786728s
Hi @Phil_Sebastian_Coffe, happy to dig into this for you to see exactly where the timeout occurred. Could you share your shop domain, the carrier service endpoint URL that Shopify is calling, and which AWS region your Lambda is deployed in? That’ll help me correlate our connection logs with your infrastructure setup and figure out what’s happening. Let me know if you’d rather share this via DM and we can discuss it that way as well.
I suspect these are timeouts related to TCP handshaking, particularly race conditions with keep-alive. Try turning keep-alive off and see if it improves.
Also @Alan_G, can you explain further how the backup rates with draft orders are being triggered? If the shipping rates have already been fetched and displayed to the staff member, and they were able to select a rate, doesn’t that mean Shopify already has the rates? In that case, when the order is created, there should be no reason for a backup rate to suddenly be used, since Shopify already has the rate.