Liquid 'strip' filter only removes ASCII whitespace, ignores Unicode whitespace (U+00A0, U+202F, U+2009, etc.)

Short description of issue

Liquid ‘strip’ filter only removes ASCII whitespace, ignores Unicode whitespace (U+00A0, U+202F, U+2009, etc.)

Reproduction steps

{%- assign s = 'Couleur ' -%} {# trailing char is U+202F #}
{%- assign stripped = s | strip -%}
{{ stripped | size }} {# expected: 7, actual: 8 #}
{{ stripped == 'Couleur' }} {# expected: true, actual: false #}

Same result for U+00A0:
{%- assign s = 'Color ' -%} {# trailing char is U+00A0 #}
{{ s | strip | size }} {# expected: 5, actual: 6 #}

Additional info

Liquid’s strip filter (and by extension lstrip / rstrip) only removes ASCII
whitespace characters (e.g. space U+0020, tab U+0009, CR U+000D, LF U+000A). It does
not remove other Unicode characters that have the White_Space property in the Unicode
standard, including:

  • U+00A0 NO-BREAK SPACE
  • U+202F NARROW NO-BREAK SPACE
  • U+2009 THIN SPACE
  • U+2007 FIGURE SPACE
  • U+200B ZERO WIDTH SPACE (not technically White_Space, but stripped by some other
    languages’ equivalents)

This is increasingly problematic because Shopify itself can produce these
characters in storefront strings — e.g. the auto-translated French value of
product.options[*].name ends with U+202F (narrow no-break space) due to French
typographic rules requiring a thin space before :. App and theme developers who
normalize with | downcase | strip to compare against a known list end up with
silent comparison failures that are very hard to diagnose (the character renders
identically to a regular space).
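
For anyone who wants to check the characters listed above outside of Liquid, here is a rough Ruby sketch. It assumes plain Ruby with UTF-8 strings; the constant and helper names (CANDIDATES, survives_strip?, unicode_space?) are made up for illustration and are not part of Liquid.

```ruby
# Code points from the list above, keyed by their Unicode names.
CANDIDATES = {
  "NO-BREAK SPACE"        => "\u00A0",
  "NARROW NO-BREAK SPACE" => "\u202F",
  "THIN SPACE"            => "\u2009",
  "FIGURE SPACE"          => "\u2007",
  "ZERO WIDTH SPACE"      => "\u200B",
}.freeze

# Hypothetical helper: true when String#strip leaves the trailing character in place.
def survives_strip?(char)
  "x#{char}".strip.end_with?(char)
end

# Hypothetical helper: true when Ruby's [[:space:]] bracket class matches the character.
def unicode_space?(char)
  char.match?(/\A[[:space:]]\z/)
end

CANDIDATES.each do |name, char|
  puts format("U+%04X %-22s survives strip: %-5s [[:space:]]: %s",
              char.ord, name, survives_strip?(char), unicode_space?(char))
end
```

All five characters should survive strip, and only U+200B should fail the [[:space:]] check, which matches the note that it is not technically White_Space.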

What type of topic is this

Bug report

Not a bad call out! We delegate to Ruby’s String#strip (docs) which only removes ASCII whitespace. I’ll have a look at including Unicode whitespace but it might need to be a separate filter due to compatibility with storefronts that might be unknowingly relying on the ASCII-only behaviour.

Interestingly, Rails’ String#squish includes Unicode whitespace (docs).
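
To illustrate the difference in plain Ruby, here is a minimal sketch. The unicode_strip helper is hypothetical (it is not an existing Liquid or Ruby method) and it assumes that matching on [[:space:]] is the behaviour wanted; it is not how any future filter would necessarily be implemented.

```ruby
# Hypothetical helper: removes leading and trailing characters matched by
# [[:space:]], which covers Unicode whitespace for UTF-8 strings.
def unicode_strip(str)
  str.sub(/\A[[:space:]]+/, "").sub(/[[:space:]]+\z/, "")
end

label = "Couleur\u202F" # trailing NARROW NO-BREAK SPACE

puts label.strip.length                 # String#strip leaves U+202F in place
puts unicode_strip(label).length        # the trailing U+202F is removed
puts unicode_strip(label) == "Couleur"
```

Unlike squish, this sketch leaves interior whitespace untouched, so it only changes what happens at the ends of the string.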


Thanks for looking into this @Gray-Shopify :folded_hands: