Hugging Face Proposes Cross-Origin Storage API for Transformers.js to Slash Redundant Caching

Aira · Updated Jun 23, 2026


Hugging Face has published a proposal for a Cross-Origin Storage API designed to eliminate redundant downloads of AI models and WebAssembly runtimes for Transformers.js apps running on different web origins. The specification targets a browser caching limitation that currently forces duplicate storage of identical resources across isolated origin caches 1.

The Cross-Origin Storage API Proposal for Transformers.js

Transformers.js, Hugging Face’s library for running transformer models directly in the browser, relies on task-specific pipelines that automatically download and cache model weights and runtime dependencies on first use. For popular default models like Xenova/whisper-tiny.en (the default automatic speech recognition model) and Xenova/distilbert-base-uncased-finetuned-sst-2-english (the default sentiment analysis model), this caching works seamlessly when a user revisits the same site, with resources served from the local Cache API on subsequent loads 1.
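The cache-on-first-use behavior described above can be sketched in a few lines of TypeScript. This is a simplified illustration, not Transformers.js code: the `createCacheFirstLoader` helper, the in-memory `Map` standing in for one origin's cache, and the placeholder URL are all assumptions made for the example.

```typescript
// Hypothetical in-memory stand-in for a single origin's cache storage.
type Download = (url: string) => Uint8Array;

function createCacheFirstLoader(download: Download) {
  const cache = new Map<string, Uint8Array>();
  let downloads = 0;
  return {
    // Return cached bytes when present; otherwise download once and store them.
    load(url: string): Uint8Array {
      const cached = cache.get(url);
      if (cached !== undefined) return cached; // repeat visit: no network request
      downloads += 1;
      const bytes = download(url);
      cache.set(url, bytes);
      return bytes;
    },
    downloadCount(): number {
      return downloads;
    },
  };
}

// Placeholder URL and payload, for illustration only.
const DEMO_MODEL_URL = "https://cdn.example/models/whisper-tiny.en/model.onnx";
const demoLoader = createCacheFirstLoader(() => new Uint8Array([1, 2, 3]));

demoLoader.load(DEMO_MODEL_URL); // first use: triggers the download
demoLoader.load(DEMO_MODEL_URL); // same origin, later visit: served from cache
console.log(demoLoader.downloadCount()); // 1
```

Within one origin the second call never reaches the network, which is the seamless revisit behavior the article describes.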

The limitation emerges when the same Transformers.js app is deployed across multiple origins or embedded as an iframe on third-party sites. In these scenarios, the browser treats each origin’s cache as fully isolated, forcing a full re-download of all model and runtime files even when the assets are byte-for-byte identical to those already stored for a different origin. Hugging Face’s demo of the issue shows this can add 177MB of duplicate model storage, plus redundant Wasm downloads, per cross-origin deployment 1.

Root Cause in Browser Cache Partitioning Design

This behavior is not a bug, but a deliberate privacy and security guardrail. Modern browsers, including Chrome, tag each cached entry with a Network Isolation Key that combines the top-level site, the current-frame origin, and the resource URL. This prevents timing attacks in which a malicious site measures response latency to infer whether the browser has already loaded a specific resource, which would leak the user's browsing history 1.
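A toy model makes the effect of this keying concrete. The sketch below is a simplification of the scheme as described here, not browser source code: `partitionedCacheKey`, `needsDownload`, the `.example` sites, and the Wasm URL are hypothetical names chosen for illustration.

```typescript
// Simplified model of a partitioned cache key: the same resource URL
// produces a different key whenever the embedding context differs.
interface CacheContext {
  topLevelSite: string; // site shown in the address bar
  frameOrigin: string; // origin of the frame making the request
}

function partitionedCacheKey(ctx: CacheContext, resourceUrl: string): string {
  return JSON.stringify([ctx.topLevelSite, ctx.frameOrigin, resourceUrl]);
}

const partitionedCache = new Set<string>();

// Returns true when the lookup misses, then records the entry under its key.
function needsDownload(ctx: CacheContext, resourceUrl: string): boolean {
  const key = partitionedCacheKey(ctx, resourceUrl);
  if (partitionedCache.has(key)) return false;
  partitionedCache.add(key);
  return true;
}

// Placeholder URL and sites, for illustration only.
const SHARED_WASM_URL = "https://cdn.example/runtime/ort.wasm";
const siteA: CacheContext = { topLevelSite: "https://a.example", frameOrigin: "https://a.example" };
const siteB: CacheContext = { topLevelSite: "https://b.example", frameOrigin: "https://b.example" };
const appAEmbeddedOnC: CacheContext = { topLevelSite: "https://c.example", frameOrigin: "https://a.example" };

const firstVisitA = needsDownload(siteA, SHARED_WASM_URL); // true: nothing cached yet
const repeatVisitA = needsDownload(siteA, SHARED_WASM_URL); // false: same key, cache hit
const firstVisitB = needsDownload(siteB, SHARED_WASM_URL); // true: same URL, different key
const embeddedOnC = needsDownload(appAEmbeddedOnC, SHARED_WASM_URL); // true: iframe context differs
```

The last two lookups miss even though the URL is unchanged, which mirrors the multi-origin and iframe cases discussed in the article.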

While the security tradeoff is intentional, it creates unnecessary overhead for large, immutable, publicly shared assets like AI model weights and runtime binaries, which are served from the same CDN endpoints regardless of the originating site. Model files resolve to canonical Hugging Face CDN URLs, while Wasm runtimes resolve to fixed jsDelivr endpoints (such as the onnxruntime-web@1.26.0-dev.20260416-b7804b056c build used by default). The final resource URL is therefore identical across origins, even though the cache key differs 1.

For end users, the cumulative impact is measurable: a user who visits three separate sites using the default Xenova/whisper-tiny.en speech recognition model would currently store three separate 177MB copies of the model, 531MB in total, of which 354MB is redundant, plus a separate copy of the 4,733 kB Wasm runtime for each site. For users with limited local storage or metered internet connections, these repeated downloads add data costs and lengthen load times 1.
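The arithmetic behind these figures can be checked with a short calculation. The sizes come from the article; the `storageFootprint` helper and the assumption that exactly one copy would be needed if storage were shared are illustrative only.

```typescript
// Sizes quoted in the article for the default Whisper model and Wasm runtime.
const MODEL_SIZE_MB = 177;
const WASM_SIZE_KB = 4733;

// Storage used when every site keeps its own copy, and the portion that
// exceeds the single copy a shared store would need (an assumption here).
function storageFootprint(siteCount: number) {
  const extraCopies = Math.max(siteCount - 1, 0);
  return {
    totalModelMB: MODEL_SIZE_MB * siteCount,
    redundantModelMB: MODEL_SIZE_MB * extraCopies,
    totalWasmKB: WASM_SIZE_KB * siteCount,
    redundantWasmKB: WASM_SIZE_KB * extraCopies,
  };
}

const threeSites = storageFootprint(3);
console.log(threeSites.totalModelMB); // 531
console.log(threeSites.redundantModelMB); // 354
console.log(threeSites.totalWasmKB); // 14199
console.log(threeSites.redundantWasmKB); // 9466
```

Three sites therefore hold 531MB of model data, of which 354MB goes beyond a single copy, and the same pattern applies to the runtime.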

Bottom line: Hugging Face’s Cross-Origin Storage API proposal targets a concrete caching limitation for browser-based AI workloads, where isolated origin caches force a fresh 177MB model download and a fresh 4,733 kB Wasm runtime download for every cross-origin deployment. A user visiting three sites that use the default Xenova/whisper-tiny.en model ends up storing 531MB of model data where 177MB would suffice, a cost felt most on limited storage or metered connections.
