"Fan Fiction"
04/03/2026
echo translation ocr llm
Translating a novella from Spanish to English
One of my favourite authors is prolific and has written over a hundred novellas. But only thirty or so have been translated and published in English. And they (New Directions) seem to be publishing around a book a year (although 2026 will see five, and last year technically saw two published). Still, at this rate, it will be a long, long time before I'm able to read all of his work that is available, unless of course I learned Spanish, which would be the ideal way of handling this situation. It seems like that hasn't happened yet though.
So, instead I wanted to experiment with LLM-assisted translations under what I've deemed as fan fiction, since I have no way to tell whether the output is faithful, or hallucinatory; thus it is Borgesian at least, nonetheless.
Did it work? What happened?
It worked! Not only did the translation seem plenty acceptable to me (after a decent amount of trial and error and experiments with various models and methods), but I also had a full Spanish-to-English pipeline system with concrete artifacts at each stage. In total, it took 6h 32m 9s across 69 chunks to translate this novella, and it maybe cost ~$15?
For this project, I wanted to be able to reproduce the results reliably, so success meant three things: usable output, a resumable workflow, and enough saved state to inspect or rerun the result later, and allow for me to iterate on the process at various phases.
BASH
bun run translate -- --sourceText @.data/runs/clowning-full-context/reconstructed.source.md --target en --source es --mode literary --multiRoleFlow true --revisionPass true --contextBeforeParagraphs 2 --contextAfterParagraphs 2 --holisticPass true --holisticContextCharacters 14000 --styleExamples examples/reference-translation-examples.md --styleExampleCount 5 --storySummaryPath examples/story-summary.en.md --translatorDossierPath examples/translator-dossier.md --faithfulnessCheck true --faithfulnessStrict false --outDir .data/runs/clowning-full-context-v2 --force true
The shape
This is a simplified version of the larger package diagram. The important part is that the system is not translating paragraphs in isolation. Style anchors, a story summary, and a translator dossier all feed into the chunk-level decisions. This is the system that produced the final output file that happened to look/sound/read/feel decent.
The loop
The biggest change was treating translation as a loop instead of a single model call. Scan and inspect seams, translate with context, run faithfulness checks in 'report' mode, sweep candidates where needed, curate, then feed the results back into the dossier and style anchors.
Why this one held together
Two things made this run feel more real. First, resumable runtime artifacts like scan.progress.jsonl, translate.progress.jsonl, and the stitch outputs meant I could see where the run was and pick it back up if needed. Second, the committed archive preserved local inputs and outputs that would otherwise be transient. Also, the translation quality was finally satisfactory.
That archive matters because .data and local-inputs are normally transient or gitignored. Freezing the inputs, logs, hashes, and outputs made the run inspectable later.
What is still rough
It's not perfect. I know an actual real translator would do a much better job and if anything it shone a brief light on how interesting and difficult that work is. I had to read it, make edits etc. I'm sure it's missing subtleties, nuances, details that a bilingual speaker would not mess up; but that's what makes it <b>fan fiction</b> to me.
Next
The next step is to keep trying and improving the quality of the translations while keeping the workflow as simple as possible. I'd like to try different authors too, different languages. I'm also only going to do this work with texts that I have bought the physical book of. I don't want to use found PDFs or anything like that, which also turns this project into a bit of a collection project too. Thanks for reading!