Nyora for the Web

A browser-native manga reader that runs a full web-scraping parser engine entirely on the user's device with no application backend. It reaches arbitrary manga sites through a single edge proxy of roughly 270 lines and stays in cross-device sync with five native apps.

Stack

  • Vanilla JavaScript (ES modules, no framework)
  • esbuild (code-split, content-hashed production bundle)
  • Cloudflare Pages + Cloudflare Workers
  • Service Worker / PWA (offline app shell)
  • IndexedDB + localStorage
  • Web Crypto (SubtleCrypto SHA-256)
  • Supabase (Auth + Edge Functions) for cross-device sync
  • Google Identity Services (FedCM, id_token flow)
  • AniList GraphQL (client-side tracking)
  • GSAP (view transitions)

Overview

Nyora for the Web is the browser edition of a cross-platform manga reader, and it rests on one unusual claim that the code fully backs up: it is a 100% client-side application with no application backend. Open a tab and you are reading. The catalogue of sources, the search, and the parsers that turn an arbitrary manga website into clean, readable pages all run on the user's own machine, inside the browser tab. The app ships as static files to Cloudflare Pages (a host that serves plain files from servers near each user) and installs as a Progressive Web App. Apart from the managed Supabase service used for cross-device sync, the only server-side code in the system is a single small program of roughly 270 lines whose one job is to relay network requests the browser is otherwise forbidden to make.

The headline achievement is not a slick user-interface framework — there isn't one. It is the act of running a complete web-scraping engine, safely and correctly, inside an environment that is deliberately built to stop a web page from reading other websites. (Web scraping means fetching a site's raw HTML and pulling structured data out of it.) Pulling this off requires defeating the browser's cross-origin security rules, cryptographically verifying third-party code before it runs, and preserving an existing server contract so that fifteen screens of application logic never had to be rewritten. The result is a genuine application — multi-source discovery, a polished reader, cross-device sync with five native apps — delivered with essentially no server to lean on.

Technology stack

The app is written in plain JavaScript ES modules with no UI framework: 62 modules totalling about 21,780 lines across the web/ tree. Skipping React or Vue was deliberate. It keeps the dependency surface tiny, makes every line auditable, and lets the app run unbundled during development with no build step at all — point any static file server at web/ and it works. For production, a roughly 70-line build.mjs drives esbuild (a fast bundler) to collapse those sixty-plus chained module requests into a handful of content-hashed, code-split chunks, and stamps a cache-busting id derived from the bundle's own bytes so returning visitors reliably pick up new code instead of a stale copy.

Local data lives in IndexedDB, the browser's built-in structured database, mirroring the native apps' schema, with a synchronous localStorage cache in front of it for instant reads. Cross-device sync runs on Supabase (a managed backend providing authentication plus a small server-side Edge Function). Sign-in uses Google Identity Services through the browser's FedCM id_token flow — a standardised, privacy-preserving login handshake. AniList progress tracking is done client-side over GraphQL. Web Crypto (the browser's native SubtleCrypto SHA-256) performs the supply-chain verification described below. GSAP handles view transitions, and a service worker provides the offline app shell. Hosting is Cloudflare Pages for the static site and a Cloudflare Worker for the proxy — both free at this scale and globally edge-distributed, so the single piece of server code runs close to every user.

Architecture

A small router maps each route name to one of fifteen screen render() functions — explore, library, history, bookmarks, updates, local, suggestions, stats, downloads, settings, details, reader, search, tracker, and browser — each mounted into one view node. The pivotal design decision lives in core/api.js. Every screen was originally written against the native Kotlin NyoraRestServer HTTP interface, the same local server the desktop builds talk to over loopback (a network call to the machine itself). Rather than rewrite all that screen logic for a serverless browser, api.js keeps the identical method signatures and route strings (/sources, /manga/..., /search/global, /suggestions, /downloads) and quietly intercepts them.
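The interception idea can be sketched in a few lines. This is a minimal illustration, not the project's actual core/api.js: the names createApi, isCatalogueRoute, and localStore are hypothetical, and which routes count as "catalogue" routes is an assumption made for the example. Only the route strings and the parser-runtime handle(path, method) call shape come from the description above.

```javascript
// Hypothetical sketch of a REST-shaped shim. Screens keep calling
// api.get('/manga/...') exactly as they did against the native server.

// Assumption: these prefixes are treated as catalogue routes for illustration.
const CATALOGUE_PREFIXES = ['/sources', '/manga', '/search/global'];

// True when the path (ignoring its query string) falls under a catalogue prefix.
function isCatalogueRoute(path) {
  const pathname = path.split('?')[0];
  return CATALOGUE_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + '/')
  );
}

// parserRuntime and localStore are injected so the shim stays testable.
// Both are assumed to expose handle(path, method, body) returning a Promise.
function createApi({ parserRuntime, localStore }) {
  async function request(path, method, body) {
    if (isCatalogueRoute(path)) {
      // Catalogue data is produced in the browser by the parser engine.
      return parserRuntime.handle(path, method);
    }
    // Everything else is personal state served from local storage.
    return localStore.handle(path, method, body);
  }

  return {
    get: (path) => request(path, 'GET'),
    post: (path, body) => request(path, 'POST', body),
  };
}
```

Because the screens only ever see get and post, swapping the loopback HTTP client for this object would leave their code untouched, which is the point of preserving the contract.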

Here is what happens end to end when a user opens a series. The details screen calls the same /manga/... route it always did. api.js recognises it as a catalogue route and hands it to parser-runtime.handle(path, method), an in-browser router that parses the query string, resolves the correct site parser, fetches the page's HTML through the Cloudflare Worker (the browser cannot fetch the manga site directly), runs the parser to extract the manga, chapters, and page-image URLs, normalises them into the shared data shapes the native apps use, and caches the result in memory with a lifetime tuned per kind of data — 3 minutes for browse lists, 15 for details, 10 for chapter pages. Personal state (favourites, history, bookmarks, preferences) is served instantly from IndexedDB. The screens never learn the server is gone; from their point of view they are still talking to the native REST API.
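The per-kind cache lifetimes lend themselves to a small in-memory cache. The sketch below is an assumption about shape, not the project's code: createTtlCache, the kind names browse, details, and pages, and the injectable clock are all hypothetical. Only the three durations (3, 15, and 10 minutes) come from the text.

```javascript
// Lifetimes from the description: 3 min browse, 15 min details, 10 min pages.
// The kind names themselves are hypothetical labels for this sketch.
const TTL_MS = {
  browse: 3 * 60_000,
  details: 15 * 60_000,
  pages: 10 * 60_000,
};

// The clock is injectable so expiry can be exercised without waiting.
function createTtlCache(now = () => Date.now()) {
  const entries = new Map();

  return {
    // Returns the cached value, or undefined when missing or expired.
    get(key) {
      const hit = entries.get(key);
      if (!hit) return undefined;
      if (now() >= hit.expiresAt) {
        entries.delete(key); // Drop stale entries lazily on read.
        return undefined;
      }
      return hit.value;
    },

    // Stores a value with the lifetime that belongs to its kind.
    set(key, kind, value) {
      if (!(kind in TTL_MS)) throw new Error(`Unknown cache kind: ${kind}`);
      entries.set(key, { value, expiresAt: now() + TTL_MS[kind] });
    },
  };
}
```

Keying the lifetime on the kind of data rather than on the route keeps the policy in one table, so tuning a duration is a one-line change.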

Hard problems solved

Running a scraping engine where the browser forbids it

The problem. Browsers enforce a rule called the same-origin policy: a web page may not read the contents of a different website. That rule exists for sound security reasons, but it makes a client-side manga reader — which by definition must read hundreds of other sites — structurally impossible by default. Manga sites send no permissive CORS headers (CORS, Cross-Origin Resource Sharing, is the opt-in a site uses to allow cross-origin reads), and their images sit behind Referer-based hotlink checks that reject any request not appearing to come from the site itself.

Why the obvious approach fails. The textbook answer is "run a backend that scrapes on the user's behalf." But that reintroduces exactly the server this product exists to avoid, with its hosting cost, its maintenance burden, and its position as a party that can see everything the user reads.

The solution. One origin-locked Cloudflare Worker (272 lines) exposes /proxy for HTML and /image for pictures. It injects a realistic browser User-Agent and a per-source Referer and — crucially — follows HTTP redirects by hand through a fetchWithRedirects helper (up to eight hops, redirect: "manual") instead of letting the platform auto-follow. This is the non-obvious part. Hotlink-protected image CDNs check the source-site Referer, so if a redirect hop silently resets the Referer to the CDN's own host, the request comes back 403 Forbidden. Following redirects manually lets the Worker carry the original Referer across every hop. On the way back it strips Content-Security-Policy, X-Frame-Options, and Content-Encoding so the browser's built-in DOMParser can actually read the returned HTML. The Worker is locked to the app's own origins (checked via the Origin and Referer headers), so it can't be abused as an open proxy for arbitrary sites.
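A manual redirect follower along these lines might look like the sketch below. It is an illustration under assumptions rather than the Worker's actual helper: the signature, the injectable fetchImpl parameter, and the error message are hypothetical. It relies on the Workers runtime returning the real 3xx response when redirect is set to "manual" (a browser page would get an opaque response instead).

```javascript
const MAX_HOPS = 8; // "up to eight hops", per the description

// Follows redirects by hand so the same headers (including Referer)
// are sent on every hop. fetchImpl is injectable for testing.
async function fetchWithRedirects(url, headers, fetchImpl = globalThis.fetch) {
  let current = url;

  // One initial request plus at most MAX_HOPS followed redirects.
  for (let hop = 0; hop <= MAX_HOPS; hop++) {
    const res = await fetchImpl(current, { headers, redirect: 'manual' });
    const location = res.headers.get('location');
    const isRedirect = res.status >= 300 && res.status < 400 && location;

    if (!isRedirect) return res; // Final response: hand it back untouched.

    // Location may be relative, so resolve it against the current URL.
    current = new URL(location, current).toString();
  }

  throw new Error(`Too many redirects (more than ${MAX_HOPS})`);
}
```

A call such as fetchWithRedirects(imageUrl, { Referer: sourceSiteUrl }) would then present the source site's Referer to the CDN on the last hop as well as the first; imageUrl and sourceSiteUrl are placeholder names.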

Trusting parser code you download at runtime

The problem. Manga sites change their markup constantly, so the parsers that read them must be updatable without shipping a new app build. Nyora delivers them over-the-air (OTA — fetched live at runtime rather than baked into the release) from a third-party GitHub Pages host. But executing code you fetched from elsewhere is among the most dangerous things a web app can do: a compromised host could serve malicious JavaScript that runs with full access to the user's session.

Why the obvious approach fails. A plain fetch-and-run trusts both the network and the host completely. HTTPS protects the bytes in transit, but it says nothing about whether the host itself was tampered with — and a tampered host serves bad code over a perfectly valid HTTPS connection.

The solution. The runtime fetches an OTA manifest, then the parser bundle it names, and verifies both with a SHA-256 cryptographic hash before a single line executes. A hash is a short fingerprint of a file; the expected fingerprint is known in advance, and any change to the file — malicious or accidental — produces a different one. The helper fetchJsonVerified and an explicit bundle-hash compare throw on any mismatch. If anything fails (a network error or a hash that doesn't match), the runtime falls back to a roughly 244 KB embedded parser set. That fallback is lazily imported via import('./web-parsers/index.js'), so it never weighs down the initial page load and is only paid for when the OTA path is genuinely unavailable. The result is over-the-air updatability and offline-first resilience without ever blindly trusting a third party.

Making the browser a peer in a six-platform sync mesh

The problem. A user's library, history, and source preferences must stay identical across the web app and five native apps (Android, iOS, macOS, Windows, Linux). The native apps store their data in a SQL database with TEXT identifiers and sync through Supabase. The browser has no such database and a completely different storage model.

Why the obvious approach fails. Inventing a browser-specific schema would fork the data model and break interoperability the moment a single row crossed between platforms.

The solution. core/db.js mirrors the exact shared row shapes into IndexedDB, so a favourite or history entry looks the same whether it was written on a phone or in a tab. Sync runs through a Supabase Edge Function using per-row last-write-wins — when two devices edit the same row, the most recently changed one wins, decided by each row's updated_at timestamp. The subtle bug, called out directly in the source, is that a brand-new browser session must never push its untouched default source list over another device's deliberate choices. The fix: only sources the user explicitly installs, uninstalls, or pins get a fresh per-source timestamp and become eligible to push (the code notes "the user's installed/pinned sources follow their account too"). Defaults stay silent until the user actually touches them.
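The two rules described here (newest updated_at wins, and untouched defaults never push) can be illustrated as follows. This is a sketch under assumptions: mergeRow, touchSource, and sourcesToPush are hypothetical names, a null updated_at stands in for "never touched by the user", and ties are resolved in favour of the local row, none of which is confirmed by the source.

```javascript
// Per-row last-write-wins: the row with the later updated_at survives.
// Assumption: on an exact tie the local row is kept.
function mergeRow(local, remote) {
  if (!local) return remote;
  if (!remote) return local;
  return Date.parse(remote.updated_at) > Date.parse(local.updated_at)
    ? remote
    : local;
}

// Called only for explicit user actions (install, uninstall, pin).
// Stamping the row is what makes it eligible to sync.
function touchSource(source, change, now = () => new Date().toISOString()) {
  return { ...source, ...change, updated_at: now() };
}

// Default sources carry no timestamp in this sketch, so a brand-new
// session has nothing to push and cannot overwrite another device.
function sourcesToPush(sources) {
  return sources.filter((s) => s.updated_at != null);
}
```

Gating on the timestamp rather than on a separate "dirty" flag means the same field decides both whether a row is pushed and which copy wins.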

Reconciling identifiers across different platforms

The problem. The same manga source is named differently on different platforms. Synced rows arrive carrying forms like JS_<id>, parser:<ClassName>, and MangaSourceRef, while the live OTA catalogue keys its parsers by UPPER_SNAKE_CASE ids. A row that says parser:MangaDex simply does not match a catalogue entry called MANGADEX.

Why the obvious approach fails. A naive direct lookup throws "Unsupported source" on every foreign id and silently drops the user's synced library — the data is present but invisible.

The solution. cleanSourceId() canonicalises every inbound form, and parserFor() retries through a className-to-id map built from the live sources list, deriving UPPER_SNAKE from camelCase (className.replace(/([A-Z])/g, '_$1').toUpperCase()) as a last resort. Normalisation trusts the bundle-stamped canonical id rather than falling back to the source URL, which had previously produced non-canonical ids that broke matching across devices.
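A lookup chain of this kind might be sketched as below. The functions stripIdPrefix and resolveParserId are hypothetical stand-ins for cleanSourceId() and parserFor(), the { id, className } shape of the sources list is assumed, and the MangaSourceRef form is omitted because its shape is not described. One detail worth noting: applied to a PascalCase name, the quoted replace expression yields a leading underscore, so this sketch strips it, which is an assumption about the surrounding code.

```javascript
// Splits an inbound id into its payload and what kind of name it is.
// Assumption: "JS_" wraps a catalogue id and "parser:" wraps a class name.
function stripIdPrefix(raw) {
  const id = String(raw).trim();
  if (id.startsWith('JS_')) return { kind: 'id', value: id.slice(3) };
  if (id.startsWith('parser:')) return { kind: 'className', value: id.slice(7) };
  return { kind: 'id', value: id };
}

// The expression quoted in the text, plus removal of the leading underscore
// it produces for names that start with a capital letter.
function camelToUpperSnake(className) {
  return className.replace(/([A-Z])/g, '_$1').toUpperCase().replace(/^_/, '');
}

// Resolution order: exact id, then className map, then derived id.
function resolveParserId(raw, sources) {
  const ids = new Set(sources.map((s) => s.id));
  const byClassName = new Map(sources.map((s) => [s.className, s.id]));
  const { kind, value } = stripIdPrefix(raw);

  if (kind === 'id' && ids.has(value)) return value;
  if (byClassName.has(value)) return byClassName.get(value);

  const derived = camelToUpperSnake(value); // Last resort.
  if (ids.has(derived)) return derived;

  throw new Error(`Unsupported source: ${raw}`);
}
```

The map lookup comes before the derived form because a derived id such as MANGA_DEX would not match a catalogue entry keyed MANGADEX, whereas the className map built from the live list does.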

A real reader and an offline shell out of static files

The problem. A credible reader has to handle both Japanese-style paged manga (read right-to-left) and Korean-style webtoons (one long vertical scroll), remember each reader's preferences, and stay usable when the network drops — all with no server holding state.

The solution. The reader supports webtoon (continuous vertical) and paged left-to-right and right-to-left modes; it remembers direction, fit, and prefetch both globally and per title, cascading a per-manga override over the global default. It warms the next chapter's pages as the user nears the end of the current one, and it is fully keyboard-drivable (arrows step pages with RTL awareness, n/p change chapter, f toggles fit, Escape exits). The service worker (sw.js) uses a tiered caching strategy: stale-while-revalidate for the app shell and parser engine, cache-first for proxied images (which are effectively immutable), and network-first for everything else — so the interface and the bundled parsers keep working on a flaky or absent connection.
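The tiered caching decision can be reduced to a small classifier that a service worker's fetch handler consults. The sketch below is illustrative only: pickStrategy, the origin parameters, and the file-extension test for "app shell and parser engine" are assumptions, while the /image path and the three strategy names come from the text.

```javascript
// Decides which caching tier a request belongs to.
// appOrigin and workerOrigin are passed in because the real hostnames
// are not given in the text.
function pickStrategy(requestUrl, appOrigin, workerOrigin) {
  const url = new URL(requestUrl);

  // Proxied images are effectively immutable, so serve them from cache.
  if (url.origin === workerOrigin && url.pathname === '/image') {
    return 'cache-first';
  }

  // Assumption: shell and parser files are same-origin HTML, JS, or CSS.
  if (url.origin === appOrigin && /^\/$|\.(html|js|css)$/.test(url.pathname)) {
    return 'stale-while-revalidate';
  }

  // Everything else prefers fresh data and falls back to cache offline.
  return 'network-first';
}

// Inside sw.js this would be consulted from the fetch listener, e.g.:
//   self.addEventListener('fetch', (event) => {
//     const tier = pickStrategy(event.request.url, self.location.origin, WORKER_ORIGIN);
//     ...dispatch to the handler for that tier...
//   });
// WORKER_ORIGIN is a placeholder constant, not a name from the project.
```

Keeping the decision in a pure function separates policy from the Cache API plumbing, so the tiers can be checked without a browser.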

Engineering highlights

  • A SHA-256 supply-chain check, enforced in code, on every over-the-air parser bundle and its manifest, with a lazily imported ~244 KB embedded fallback that never touches the initial page load.
  • A manual, Referer-preserving redirect follower in the Worker that defeats hotlink-protection image CDNs which a normal auto-following fetch cannot reach.
  • An in-browser API shim (core/api.js) that reproduces the native Kotlin REST contract route-for-route, so all fifteen screens run unchanged against a serverless backend.
  • A schema-faithful IndexedDB mirror that lets the browser participate as a first-class peer in timestamp-gated last-write-wins sync across six platforms.
  • Identifier reconciliation (cleanSourceId / parserFor) that canonicalises four different cross-platform id formats so foreign synced rows resolve instead of silently vanishing.
  • Build-free local development alongside a code-split, content-hashed, cache-busted esbuild production path driven by a single ~70-line script.
  • An origin-locked, 272-line Worker as the entire server-side surface area — one edge function instead of a service fleet.

What this demonstrates

This build shows the ability to deliver a genuinely complex product (a multi-source scraping engine, cross-device sync, and a polished reader) under unusually tight constraints: no application backend, no UI framework, and a browser security model that works against the core feature at every turn. It reflects mature instincts across several senior concerns at once: supply-chain safety (never execute remote code you haven't verified), contract preservation under refactor (change the implementation, not the interface, and fifteen screens cost nothing to migrate), distributed-systems correctness (last-write-wins that won't let a fresh device clobber real data), and pragmatic infrastructure economics (one edge Worker instead of servers to operate and pay for). The hardest part was knowing what not to build, and that restraint is the achievement.