How can we generate translatableContentDigest locally?

The translationsRegister mutation requires translatableContentDigest, the SHA-256 of the original field after Shopify’s internal sanitizer has run. For most descriptions our locally generated hash is accepted, but for any record containing characters that Shopify rewrites (e.g. &) we get the error “Translatable content hash is invalid.” The reason is clear: our database holds the raw HTML we imported, while Shopify hashes the version produced by its sanitizer.

One workaround is to query translatableResource first, copy the digest Shopify already stores, and pass that value straight into translationsRegister. That works, but it doubles the number of API calls and forces us to query Shopify for every resource we want to translate. What we’d like instead is to reproduce the sanitizer step on our side, so we can generate the same digest without an API call per translation.
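For reference, here is a minimal sketch of the two-call workaround as we understand it. The GraphQL documents follow the Admin API's translatableResource query and translationsRegister mutation; the helper names (`buildDigestRequest`, `buildTranslationInput`, `buildRegisterRequest`) are our own illustrations, and the HTTP transport is deliberately left out.

```typescript
// One entry of translatableResource.translatableContent.
interface TranslatableContent {
  key: string;
  value: string;
  digest: string;
  locale: string;
}

// One entry of the `translations` argument of translationsRegister.
interface TranslationInput {
  key: string;
  value: string;
  locale: string;
  translatableContentDigest: string;
}

interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

// Call 1: read the digests Shopify already stores for a resource.
const TRANSLATABLE_RESOURCE_QUERY = `
  query TranslatableContent($resourceId: ID!) {
    translatableResource(resourceId: $resourceId) {
      translatableContent { key value digest locale }
    }
  }
`;

// Call 2: register the translations, echoing Shopify's own digest back.
const TRANSLATIONS_REGISTER_MUTATION = `
  mutation RegisterTranslations($resourceId: ID!, $translations: [TranslationInput!]!) {
    translationsRegister(resourceId: $resourceId, translations: $translations) {
      userErrors { field message }
    }
  }
`;

// Hypothetical helper: request body for call 1.
function buildDigestRequest(resourceId: string): GraphQLRequest {
  return { query: TRANSLATABLE_RESOURCE_QUERY, variables: { resourceId } };
}

// Hypothetical helper: copy the digest for `key` from call 1's response
// instead of hashing our own raw copy of the field.
function buildTranslationInput(
  content: TranslatableContent[],
  key: string,
  locale: string,
  translatedValue: string,
): TranslationInput {
  const source = content.find((entry) => entry.key === key);
  if (source === undefined) {
    throw new Error(`No translatable content found for key "${key}"`);
  }
  return { key, value: translatedValue, locale, translatableContentDigest: source.digest };
}

// Hypothetical helper: request body for call 2.
function buildRegisterRequest(resourceId: string, translations: TranslationInput[]): GraphQLRequest {
  return { query: TRANSLATIONS_REGISTER_MUTATION, variables: { resourceId, translations } };
}
```

The extra round trip is call 1: every translationsRegister call has to be preceded by a translatableResource query for the same resource.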

Examples:

  1. Correct (same digests locally and on Shopify)
import { createHash } from "crypto";
const text = "Earning 7% annually will double $1000 in roughly ten years.";
const digest = createHash("sha256").update(text).digest("hex");
// Output:  38f46cc9ac49f815de66dcd5e13da3617ca4afe20f40114aba4e5bd500b8695b
// Shopify: 38f46cc9ac49f815de66dcd5e13da3617ca4afe20f40114aba4e5bd500b8695b
// Text stored in Shopify: identical (the sanitizer changed nothing)
  2. With error (digests differ)
import { createHash } from "crypto";
const text = "Saving $100 today & earning <5 % interest will grow your wealth > simply holding cash.";
const digest = createHash("sha256").update(text).digest("hex");
// Output:  e02225c96f45617df927104187ffca5460809b1a81649fb8ca594d0d10dbd905
// Shopify: ef787fac2a975a4c3ce494fd58f9372c87f5621753bbb4d1845a31d85e0f4c6d
// Text stored in Shopify: "Saving $100 today &amp; earning &lt;5 % interest will grow your wealth &gt; simply holding cash."
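
Based only on example 2, one untested guess is that the sanitizer HTML-escapes `&`, `<` and `>` before hashing. The sketch below encodes that guess; `escapeHtmlGuess` and `localDigestGuess` are hypothetical names of ours, not Shopify functions. It reproduces the stored text shown above, but we have not confirmed that the resulting hash equals Shopify's digest, and it would clearly mangle fields that contain real markup such as `<p>` tags, so a real sanitizer must be doing something more selective.

```typescript
import { createHash } from "crypto";

// ASSUMPTION: only these three characters are rewritten. This is inferred
// from a single example and is not documented Shopify behaviour.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
};

// Escape in a single pass so the "&" of an inserted entity is never escaped again.
function escapeHtmlGuess(raw: string): string {
  return raw.replace(/[&<>]/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// SHA-256 (hex) of the escaped text, i.e. our guess at the digest input.
function localDigestGuess(raw: string): string {
  return createHash("sha256").update(escapeHtmlGuess(raw), "utf8").digest("hex");
}

const raw = "Saving $100 today & earning <5 % interest will grow your wealth > simply holding cash.";
console.log(escapeHtmlGuess(raw));
console.log(localDigestGuess(raw)); // compare against the digest Shopify reports
```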

Questions:

  1. Where can we find a detailed description of the sanitizing step Shopify runs before it hashes a field (libraries used, configuration, order of operations, etc.)?
  2. Which JS libraries (and what settings) can reproduce that sanitization locally so we can generate the same digest?
  3. Is there any other way to obtain or calculate the correct digest without extra calls to Shopify?

Hi @tmv

There is no public documentation from Shopify that describes the exact sanitizer implementation, libraries, or configuration used before generating the translatableContentDigest. The API documentation and schema only state that the digest is the SHA-256 of the original field after Shopify’s internal sanitizer has run; they do not specify the sanitizer’s logic, such as which HTML entities are escaped.

If you want to minimize API calls, you could cache digests and only re-fetch when the original content changes, but you cannot avoid at least one call per unique content value unless Shopify publishes their sanitizer logic.
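
As a rough illustration of that caching idea, the sketch below keeps Shopify's digest next to a hash of your own raw content and treats the entry as stale once the raw content changes. `DigestCache` and its methods are hypothetical names, the store is an in-memory map, and it assumes your database is the only place the source content gets edited; an edit made directly in Shopify would not be detected.

```typescript
import { createHash } from "crypto";

interface CacheEntry {
  rawHash: string;       // SHA-256 of our raw, unsanitized content
  shopifyDigest: string; // digest returned by Shopify for that content
}

class DigestCache {
  private readonly entries = new Map<string, CacheEntry>();

  private static hashRaw(rawContent: string): string {
    return createHash("sha256").update(rawContent, "utf8").digest("hex");
  }

  private static cacheKey(resourceId: string, key: string): string {
    return `${resourceId}:${key}`;
  }

  // Returns the cached Shopify digest, or undefined if there is no entry
  // or our raw content has changed since the digest was stored.
  lookup(resourceId: string, key: string, rawContent: string): string | undefined {
    const entry = this.entries.get(DigestCache.cacheKey(resourceId, key));
    if (entry === undefined) return undefined;
    return entry.rawHash === DigestCache.hashRaw(rawContent) ? entry.shopifyDigest : undefined;
  }

  // Records the digest Shopify returned for the current raw content.
  store(resourceId: string, key: string, rawContent: string, shopifyDigest: string): void {
    this.entries.set(DigestCache.cacheKey(resourceId, key), {
      rawHash: DigestCache.hashRaw(rawContent),
      shopifyDigest,
    });
  }
}
```

Call `lookup` before each translationsRegister; only on a miss query translatableResource and `store` the digest it returns. In practice the map would be replaced by a column or table in your own database so the cache survives restarts.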