EPUB to TXT Converter
Drop a DRM-free EPUB, get plain text with chapter breaks. Useful for note-taking, search indexing, and feeding ebook content into text-processing tools that don't speak HTML.
Why convert EPUB to TXT?
- Pulling the text of a public-domain book from Project Gutenberg into Obsidian, Notion, or another notes app for highlighting and annotation.
- Feeding an ebook into a script for word frequency analysis, search indexing, or NLP processing.
- Extracting text from your own manuscript draft for a word-count tool or readability checker.
- Generating plain-text versions of manuals so you can run grep / ripgrep searches across multiple titles.
- Reading on an e-ink device, terminal pager, or accessibility tool that takes raw text input.
- Building a personal corpus from public-domain ebooks for language model fine-tuning or data analysis.
How our converter works
Your EPUB is unzipped client-side with fflate. We read META-INF/container.xml to locate the OPF package file, walk its spine in reading order, parse each chapter's XHTML, and extract the body text (paragraphs, headings, lists). That text is flattened to plain text, with chapter breaks marked by '---' separators. Formatting, link URLs, images, and styles are stripped. The result is a single .txt file with one chapter after another. Conversion runs entirely in your browser.
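The steps above can be sketched in Python using only the standard library. This is not the page's own code, which runs as JavaScript with fflate and DOMParser; it is an illustrative equivalent of the same pipeline. The function names (spine_paths, chapter_text, epub_to_text) and the choice of block-level tags are assumptions, and the sketch expects well-formed XHTML without HTML-only named entities such as &nbsp;.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}
# Assumption: these tags start and end a block of text in the output.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li"}


def local_name(el):
    """Tag name without its '{namespace}' prefix."""
    return el.tag.split("}")[-1]


def spine_paths(zf):
    """Return chapter paths inside the ZIP, in spine (reading) order."""
    container = ET.fromstring(zf.read("META-INF/container.xml"))
    opf_path = container.find(".//c:rootfile", CONTAINER_NS).get("full-path")
    opf = ET.fromstring(zf.read(opf_path))
    # The manifest maps item ids to hrefs; the spine lists ids in reading order.
    hrefs = {item.get("id"): item.get("href")
             for item in opf.findall("opf:manifest/opf:item", OPF_NS)}
    base = posixpath.dirname(opf_path)
    # Simplification: hrefs are assumed not to be percent-encoded.
    return [posixpath.normpath(posixpath.join(base, hrefs[ref.get("idref")]))
            for ref in opf.findall("opf:spine/opf:itemref", OPF_NS)]


def chapter_text(xhtml_bytes):
    """Flatten one XHTML chapter to plain text, one block per paragraph."""
    blocks, buf = [], []

    def flush():
        text = " ".join("".join(buf).split())  # collapse runs of whitespace
        if text:
            blocks.append(text)
        buf.clear()

    def walk(el):
        name = local_name(el)
        if name in ("script", "style"):
            return
        is_block = name in BLOCK_TAGS
        if is_block:
            flush()
        buf.append(el.text or "")
        for child in el:
            walk(child)
            buf.append(child.tail or "")
        if is_block:
            flush()

    root = ET.fromstring(xhtml_bytes)
    body = next((e for e in root.iter() if local_name(e) == "body"), root)
    walk(body)
    flush()
    return "\n\n".join(blocks)


def epub_to_text(epub_bytes):
    """Convert a DRM-free EPUB (as bytes) to text with '---' chapter breaks."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        chapters = [chapter_text(zf.read(path)) for path in spine_paths(zf)]
    return "\n\n---\n\n".join(c for c in chapters if c) + "\n"
```

Because only text nodes are collected, link text survives while attributes such as href, and empty elements such as img, contribute nothing to the output.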
Frequently asked questions
How are chapters separated?
By a line of three dashes '---' surrounded by blank lines. Most editors and CLI tools (grep, awk, ripgrep) handle this fine, and you can split on it programmatically with a one-liner.
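As one way to do that split, here is a minimal Python sketch. The function name split_chapters is made up for illustration, and the pattern assumes the separator is a line containing only '---', as described above. A book whose own text contains such a line would be split there too.

```python
import re


def split_chapters(text):
    """Split converter output on lines that contain only '---'."""
    parts = re.split(r"^[ \t]*---[ \t]*$", text, flags=re.MULTILINE)
    # Drop the blank lines around each separator and skip empty pieces.
    return [part.strip() for part in parts if part.strip()]
```

Calling split_chapters on the contents of a converted file returns a list with one string per chapter, in reading order.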
What about footnotes, links, and images?
Markup is stripped, but some text survives. Footnotes that the EPUB stores inline appear as part of the surrounding paragraph. Links lose their URLs but keep their link text. Images and figures are dropped entirely.
Will DRM-protected EPUBs work?
No. We work with DRM-free EPUBs only: Project Gutenberg, Standard Ebooks, Smashwords, your own writing, etc. We don't ship DRM-bypass tooling.
What encoding is the output?
UTF-8, the de facto default for plain text. Special characters (smart quotes, em dashes, accented letters) survive intact. The downloaded file is typed as text/plain;charset=utf-8.
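When reading the output from a script, it is safest to name the encoding explicitly, since some platforms default to a locale encoding instead of UTF-8. A minimal Python sketch, where read_converted is a hypothetical helper name:

```python
from pathlib import Path


def read_converted(path):
    """Read a converted .txt file, decoding it explicitly as UTF-8."""
    return Path(path).read_text(encoding="utf-8")
```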
Are my files uploaded?
No. fflate, DOMParser, and the file I/O all run as JavaScript on this page. Manuscripts, downloaded ebooks, and personal corpora stay on your device.
About the EPUB format
EPUB is the open ebook standard: a ZIP container of XHTML chapters plus a package document (the OPF) whose spine declares the reading order. Plain TXT is the lowest common denominator: every editor, CLI tool, and processing pipeline reads it. Converting EPUB to TXT is the usual extraction step for anything that needs the words without the formatting, such as note-taking from public-domain books, NLP analysis, search indexing, readability metrics, or scripts that process ebook content. The conversion is necessarily lossy on formatting (you lose styling, link targets, images, and layout) but lossless on the text itself, which is usually the point.