I think your team made a change in the last 1-2 months where you now return 429 errors very aggressively when anyone tries to “scrape” a merchant webpage (i.e., with curl).
I suppose you did this to prevent copycat / theft scraping.
However, this is backfiring for apps that have legitimate reasons to scrape (e.g., SEO apps).
Is there a way you can WHITELIST our server (it’s a static IP)? Or make it less aggressive in general? It’s being triggered by just a couple of requests and causing all kinds of random issues (SEO King scrapes pages for MANY reasons).
I’ve reduced the amount of scraping being done by my server. It was caused by a settings issue, and things do seem to work better now, with fewer 429s (maybe the issue is gone, not sure).
It’s something to think about, though. The scraping wasn’t being done maliciously - it’s actually a way to get data from pages built with page builder apps (we can’t rely on the item data fields). Too many users who aren’t using page builders had it activated. However, if our app had 10x more users, and they often used page builders, it would be a problem.
Hey @jason_engage
Typically limits will be in place to ensure platform stability and security. It’s an interesting use case though, as you’re obviously not doing this maliciously and it’s benefiting merchants.
Can you share more on what you mean when you say you can’t rely on item data fields?
Have you looked into some of the Admin API online store or theme endpoints to see if you can get what you need there without needing to scrape the storefront?
I’ve seen some merchants where the page contents (HTML) are not stored in the item’s body_html field. I think it has to do with PageFly perhaps, or some other page management systems, which may be saving the contents somewhere else (maybe privately). I can’t say I’ve spent a lot of time tracking it all down, but I’ve seen it several times, and added some options to simply scrape the pages.
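Roughly the kind of fallback I mean - a simplified sketch, where the store domain, token, API version, and bot name are all placeholders, not our actual values:

```python
# Simplified sketch: try the Admin API's body_html first, and fall back to
# scraping the rendered storefront page when it's empty. Placeholders throughout.
import requests

SHOP = "example-store.myshopify.com"  # placeholder store domain
TOKEN = "shpat_example"               # placeholder Admin API access token

def get_page_content(product_id: int) -> str:
    # 1) Try the Admin REST API's body_html field first.
    resp = requests.get(
        f"https://{SHOP}/admin/api/2024-01/products/{product_id}.json",
        headers={"X-Shopify-Access-Token": TOKEN},
        timeout=10,
    )
    resp.raise_for_status()
    product = resp.json()["product"]
    body_html = (product.get("body_html") or "").strip()
    if body_html:
        return body_html

    # 2) Page builders may render content outside body_html, so fetch the
    #    live product page instead and parse that.
    page = requests.get(
        f"https://{SHOP}/products/{product['handle']}",
        headers={"User-Agent": "SEOKingBot/1.0 (+https://example.com/bot)"},
        timeout=10,
    )
    page.raise_for_status()
    return page.text  # hand this off to an HTML parser
```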
If you’re familiar with any of these types of scenarios, and some common ways to work around them, I’m all ears!
Thanks for sharing that. To make sure your assumptions match the actual issue, do you have an x-request-id from one of the 429 responses you’ve received recently, preferably from one of your development stores? I can check our logs to confirm this is the case.
From there, my SEO knowledge is probably average, so I’m not fully aware of how other SEO tools that do similar scraping manage this. One thought is to see whether you could use the API to get the bulk of the information you need and then scrape to fill in the gaps. Alternatively, use webhooks (like product update, collection update, etc.) to help narrow down just the resources that need to be re-scraped.
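As a rough illustration of the webhook idea, here’s a minimal sketch of a products/update receiver that queues only the changed product for a re-scrape. The framework, route, and secret are placeholders; the HMAC check follows our documented webhook verification scheme.

```python
# Sketch: re-scrape only products that actually changed, driven by webhooks.
import base64
import hashlib
import hmac
import json

from flask import Flask, abort, request

app = Flask(__name__)
APP_SECRET = b"placeholder-app-secret"  # your app's client secret
rescrape_queue = set()                   # stand-in for a real job queue

def verify_hmac(raw_body: bytes, header_hmac: str) -> bool:
    # Shopify signs the raw body with your app secret (HMAC-SHA256, base64).
    digest = hmac.new(APP_SECRET, raw_body, hashlib.sha256).digest()
    return hmac.compare_digest(base64.b64encode(digest).decode(), header_hmac)

@app.route("/webhooks/products-update", methods=["POST"])
def products_update():
    raw = request.get_data()
    if not verify_hmac(raw, request.headers.get("X-Shopify-Hmac-Sha256", "")):
        abort(401)
    product = json.loads(raw)
    rescrape_queue.add(product["handle"])  # only this page needs a re-scrape
    return "", 200
```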
It may be worth testing to see if updating the shop’s robots.txt to allow your crawler helps as well: Customize robots.txt
I’ve been watching the 429 issues for a few days now. It really looks like you guys ramped up the “aggressivity meter”. Can you check back with the team responsible and confirm or deny this?
Having similar issues using Screaming Frog. Seems like 429 errors are getting way more aggressive.
Hey Kevin, do you have a request id from one of these responses to help us pinpoint one of these occurrences?
Sadly I can’t provide you with request ids at the moment, but requests originated from 2001:4860:7:610::fb / 62.143.48.37 for store https://schleiftitan.myshopify.com/
Will try to gather more from team.
Kevin 
When we use curl, there are no request IDs - it’s not part of the API. It’s direct server access to the webpage. That’s why Screaming Frog would have similar issues.
Would there be anything sent back with the 429 error you’re seeing that could identify the request?
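For example, dumping whatever headers come back with a 429 may surface something we can search for. A quick sketch - the URL is a placeholder, and which headers appear isn’t guaranteed; `curl -i` against the same URL would show the same information.

```python
# Sketch: capture the headers returned with a 429 so there's something
# concrete to check against logs. URL and User-Agent are placeholders.
import requests

resp = requests.get(
    "https://example-store.myshopify.com/products/some-product",
    headers={"User-Agent": "YourBot/1.0 (+https://example.com/bot)"},
    timeout=10,
)
if resp.status_code == 429:
    # Identifiers like x-request-id or cf-ray, if present, plus a timestamp,
    # are the kind of details that help narrow a report down.
    for name, value in resp.headers.items():
        print(f"{name}: {value}")
```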
Kyle, this issue can only be addressed by asking the team responsible for setting the HTTP 429 error codes on the merchants’ sites whether they have recently modified it.
They may have only done so on the ‘.myshopify.com’ domains. They may also be interested to know that it is causing some issues.
We would like to know what they say about it.
You personally would have no knowledge of this, it wouldn’t be written in any docs, and you wouldn’t be able to detect this issue since you don’t seem to know what an HTTP 429 error code is, or how to generate one (i.e., you need to use curl from a server to make GET calls to a merchant website domain).
Do any questions from this forum reach any of the responsible teams? You can’t possibly answer all these questions yourself without consulting others, right?
Thanks again for the additional context, Jason. Happy to unpack this further:
You are 100% right, there’s no way I could possibly answer every question without help. In this case, our storefront team has reviewed this thread. They noted there is room for improvement in our docs around rate limits. With that in mind, they have the following suggestions:
- implement pacing on your crawl rate (see the sketch below)
- correctly advertise your User-Agent (do not hide it)
- apply to Cloudflare’s verified bots registry
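To illustrate the first two suggestions, here’s a minimal sketch of a paced crawler that identifies itself and backs off on 429s. The bot name, delays, and URLs are placeholders, not required values.

```python
# Sketch: paced crawling with an honest User-Agent and backoff on 429.
import time
from typing import Optional

import requests

USER_AGENT = "YourBot/1.0 (+https://example.com/bot)"  # placeholder identity
DELAY_SECONDS = 2.0  # fixed pacing between pages; tune to your needs
MAX_RETRIES = 3

def fetch(url: str) -> Optional[requests.Response]:
    for attempt in range(MAX_RETRIES):
        resp = requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=10)
        if resp.status_code != 429:
            return resp
        # Honor Retry-After when it's a plain seconds value;
        # otherwise back off exponentially.
        retry_after = resp.headers.get("Retry-After", "")
        wait = float(retry_after) if retry_after.isdigit() else DELAY_SECONDS * (2 ** attempt)
        time.sleep(wait)
    return None  # gave up after repeated 429s

for url in [
    "https://example-store.myshopify.com/products/a",
    "https://example-store.myshopify.com/products/b",
]:
    page = fetch(url)
    time.sleep(DELAY_SECONDS)  # pacing between successive pages
```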
I would like to share a little context around our internal processes to assure you that questions we respond to aren’t ignored or going off into the void.
Before bringing issues to our developers, we want to gather as much context as we can: replicating when possible, and finding details in our logs when replication isn’t possible. This ensures we get the issue to the correct team and that they have all the necessary details to address it properly. In this case, while I now realize you don’t have clean response headers like with an API request (thank you), details like the User-Agent, the URL being crawled, and a timestamp could still have helped us narrow this down.
Let me know if there’s any other way I can help.