Mystery Games in 8 Languages: Why Localised Puzzles Beat Translated Ones

May 16, 2026

There’s a tempting shortcut when scaling a daily puzzle to a new language: feed every clue to a translation model, ship eight locales overnight, claim multi-language coverage in your launch post. We tried it as a spike before EveryClue’s launch and threw out the results within a day. Here’s what we learned and how we ended up doing it properly.

What breaks first

The clues. A whodunit clue like “the doctor sits to the left of the inspector” is a single English sentence that carries a lot of grammatical and positional information. Machine-translated into French, it becomes “le docteur s’assoit à gauche de l’inspecteur”, which is fine. Try Japanese, though, and you get a literal “医者は調査官の左に座る” that reads stiff and unnatural; a ウミガメのスープ-style clue would phrase it more like “医者は調査官のすぐ隣、向かって左に座っている”. Both Japanese versions are technically correct. Only one of them sounds like a clue a human would write.

The deeper problem: deduction reads natively or it doesn’t read at all. If a Japanese player has to mentally re-translate every clue back to “what would this sound like if a human wrote it”, they’re not playing a whodunit, they’re doing translation homework.

What we actually did

Three layers, ranked by how much they affect the experience:

1. Clue vocabulary: hand-locked, type-checked. EveryClue ships a clue-vocabulary.ts file with the ~30 verbs and connectors that make up the entire logic-grid vocabulary — is, is not, is to the left of, is between … and …, if … then, exactly one of, and so on. Each of those has a hand-written translation in all eight launch locales, locked at the type level so a missing translation fails the TypeScript build. We will never AI-translate these. The whole logic surface is hand-written.

2. Cultural anchors per locale, not a translation layer. When the AI generates tomorrow’s Japanese grid, it doesn’t translate an English template. It generates the Japanese grid directly, conditioned on cultural anchors specific to Japan — Showa-era Kyoto inns, surname-primary character naming, weapons like 日本刀 and 千枚通し that fit honkaku-style mystery, room categories like 旅館 and 茶室 and 書斎. The Spanish grid is generated with Latin American anchors — Mexico City, Bogotá, Buenos Aires; navaja, veneno, llave inglesa; hacienda, banco, plaza mayor. The two grids are siblings, not translations.

3. Puzzle body: AI-generated, native sampled. The actual clue prose for each daily puzzle is generated directly in the target language by a frontier language model, not translated from an English original. Each week during Wave 1, we sample five puzzles per locale and human-review them for fluency. When a puzzle reads as “translated from English”, we update the locale’s anchors and regenerate the puzzle. By Wave 2, most locales need fewer than one sampling fix per week.
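A minimal sketch of how the type-level lock in layer 1 could look. The locale codes, term names, and template strings here are illustrative assumptions, not the contents of the real clue-vocabulary.ts, and only three locales and three connectors are shown.

```typescript
// Hypothetical subset of locales; the real file covers all eight launch locales.
type VocabLocale = "en" | "es" | "ja";

// A few of the ~30 connectors; these term names are made up for illustration.
type ClueTerm = "is" | "isNot" | "leftOf";

// Record requires every key to be present, so a missing term or a missing
// locale is a compile-time error rather than a runtime surprise.
type ClueVocabulary = Record<ClueTerm, Record<VocabLocale, string>>;

// Illustrative templates only; {a} and {b} are placeholders for the two subjects.
const clueVocabulary: ClueVocabulary = {
  is: { en: "{a} is {b}", es: "{a} es {b}", ja: "{a}は{b}である" },
  isNot: { en: "{a} is not {b}", es: "{a} no es {b}", ja: "{a}は{b}ではない" },
  leftOf: {
    en: "{a} is to the left of {b}",
    es: "{a} está a la izquierda de {b}",
    ja: "{a}は{b}の左にいる",
  },
};

// Fill a hand-written template with the two subjects of the clue.
function renderClue(term: ClueTerm, locale: VocabLocale, a: string, b: string): string {
  return clueVocabulary[term][locale].replace("{a}", a).replace("{b}", b);
}
```

In this sketch, deleting the ja entry from any term makes the TypeScript compiler reject the file, which is the behaviour described in layer 1.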

Why the cultural side matters more than the words

A daily ritual game is mostly about familiarity. When a Brazilian player opens their morning Portuguese puzzle and the cast is João Silva and Maria Costa and the rooms include the sambódromo and the padaria, they’re playing a Brazilian whodunit. When it’s John Smith and Mary Jones and the rooms are manor and library in awkward Portuguese, they’re playing an American whodunit they happen to be reading in Portuguese.

The same applies in reverse. English players who tried our Japanese honkaku-style puzzle for fun reported that even though they read every clue twice and used Google Translate for half the room names, the puzzle felt substantially different from a Baker Street whodunit. That’s the value. Eight languages, eight feels.

What this looks like in production

Every night at 23:30 UTC, the EveryClue pipeline generates 16 puzzles: eight locales times two formats (grid and lateral mystery). Each grid goes through a SAT solver before publishing; each lateral mystery gets a truth field that the AI host is locked from leaking. At 00:00 UTC they go live, with hreflang alternates in eight languages so search engines route each player to their own version.
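The fan-out and the hreflang alternates can be sketched roughly as follows. This post does not enumerate the eight launch locales, so the locale list is a placeholder, and the URL layout (base URL, then locale, then path) is an assumption for illustration rather than EveryClue's actual routing.

```typescript
// Placeholder list: the post does not name all eight launch locales.
const LAUNCH_LOCALES = ["en", "fr", "ja", "es", "pt", "de", "it", "ko"] as const;
const FORMATS = ["grid", "lateral"] as const;

type LaunchLocale = (typeof LAUNCH_LOCALES)[number];
type PuzzleFormat = (typeof FORMATS)[number];

interface PuzzleJob {
  locale: LaunchLocale;
  format: PuzzleFormat;
}

// One generation job per (locale, format) pair: 8 x 2 = 16 per night.
function nightlyJobs(): PuzzleJob[] {
  const jobs: PuzzleJob[] = [];
  for (const locale of LAUNCH_LOCALES) {
    for (const format of FORMATS) {
      jobs.push({ locale, format });
    }
  }
  return jobs;
}

// Build one <link rel="alternate"> tag per locale for a given page path.
// The /{locale}{path} URL shape is hypothetical.
function hreflangLinks(baseUrl: string, path: string): string[] {
  return LAUNCH_LOCALES.map(
    (locale) =>
      `<link rel="alternate" hreflang="${locale}" href="${baseUrl}/${locale}${path}" />`
  );
}
```

Calling hreflangLinks("https://example.com", "/today") yields eight tags, one per locale, each of which would go in the head of every locale's version of the page.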

If you’re curious, today’s case in your language is the easiest way to see it. The puzzles ship simultaneously; there’s no “English-first, others later”. The fact that everyone playing today is playing the same day’s case, each in their own language, is, in our experience, one of the small things that makes a daily ritual feel like a global ritual.