pdftoolskit.org
PDF (Portable Document Format) utilities, in the browser

PDF → Text

extract raw text · one .txt per source file

Drop one or more PDFs to extract the text of each into its own plain-text file. Page boundaries are marked with a customisable separator.


    When to use this tool

    Pull text out of a PDF for editing, searching, translating, pasting into a chat, or feeding to another tool. The output is plain UTF-8 text — no formatting, no layout. Great for "I just need the words" tasks.

    Step by step

    1. Drop the PDFs. Each input becomes one .txt file in the output.
    2. Customise the page separator if you need a specific divider between pages. Default is --- Page {n} --- on its own line. Use {n} for the page number.
    3. Click "Extract & download". Each result appears as a download row (no thumbnail); click the row to save the .txt.
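The {n} placeholder from step 2 can be pictured as a plain template substitution. The sketch below is illustrative only: `join_pages` and `DEFAULT_SEPARATOR` are hypothetical names, and it assumes the separator is written on its own line before each page's text, with page numbers starting at 1. The tool's actual internals may differ.

```python
DEFAULT_SEPARATOR = "--- Page {n} ---"


def join_pages(pages: list[str], separator: str = DEFAULT_SEPARATOR) -> str:
    """Join per-page text, writing the expanded separator before each page.

    Assumption (not confirmed by the tool): the separator precedes the page
    it names, and numbering starts at 1.
    """
    chunks = []
    for n, text in enumerate(pages, start=1):
        # str.replace rather than str.format, so stray braces in a custom
        # separator cannot raise an error.
        chunks.append(separator.replace("{n}", str(n)))
        chunks.append(text)
    return "\n".join(chunks) + "\n"
```

For example, `join_pages(["first", "second"], "== {n} ==")` puts `== 1 ==` above the first page and `== 2 ==` above the second.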

    Common use cases

    Common mistakes

    FAQ

    Why is my PDF returning no text?

    Most likely it's a scan. Open it in a viewer: if you can't select or copy any text by hand, this tool can't extract any either. Run OCR first.

    Is OCR coming?

    Tesseract.js is on the wishlist. It's heavy (10+ MB of WASM), so it would have to load on demand, and only when you ask for it. If you have a strong opinion, drop me a note via contact.

    Can I get the text per-page in separate files?

    Not yet — the output is one .txt per source file with page separators. To split per page, run the .txt through any split-on-marker script.
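One possible split-on-marker script is sketched below in Python. It assumes the default separator (`--- Page {n} ---` on its own line); if you customised the separator, the `MARKER` pattern needs adjusting. The function names and the output file naming scheme are illustrative choices, not something the tool prescribes.

```python
import re
from pathlib import Path

# Matches the default separator on its own line, e.g. "--- Page 3 ---".
# Adjust this pattern if you changed the separator.
MARKER = re.compile(r"^--- Page (\d+) ---[ \t\r]*$", re.MULTILINE)


def split_pages(text: str) -> dict[int, str]:
    """Return {page_number: page_text} for each marker found in text."""
    pages = {}
    matches = list(MARKER.finditer(text))
    for i, m in enumerate(matches):
        start = m.end()
        # A page runs until the next marker, or to the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pages[int(m.group(1))] = text[start:end].strip("\r\n")
    return pages


def write_pages(txt_path: str, out_dir: str) -> list[Path]:
    """Write one .txt per page into out_dir and return the paths written."""
    source = Path(txt_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for n, body in split_pages(source.read_text(encoding="utf-8")).items():
        target = out / f"{source.stem}-page-{n:03d}.txt"
        target.write_text(body + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `write_pages("report.txt", "pages")` would produce `pages/report-page-001.txt`, `pages/report-page-002.txt`, and so on. Any text that appears before the first marker is ignored by this sketch.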

    Does it preserve hyphenation and line breaks?

    Line breaks within a paragraph become spaces (so words don't run together). Page-final hyphens are not joined automatically — you may see a trailing dash on some words.
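If trailing dashes matter for your use, a small post-processing pass can list them for manual review. The sketch below is a hypothetical helper, not part of the tool: it only reports lines that end in a word fragment followed by a hyphen and does not try to rejoin anything, since some of those hyphens may be legitimate.

```python
import re

# A word fragment immediately followed by a hyphen at the very end of a line.
TRAILING_HYPHEN = re.compile(r"(\w+)-[ \t]*$")


def find_trailing_hyphens(text: str) -> list[tuple[int, str]]:
    """Return (line_number, fragment) for each line ending in 'fragment-'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = TRAILING_HYPHEN.search(line)
        if m:
            hits.append((lineno, m.group(1)))
    return hits
```

The default `--- Page {n} ---` separator is not reported, because its closing dashes follow a space rather than a word character.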