PDF to TXT

Save every page of a PDF as a plain UTF-8 text file.

About PDF to TXT

PDF to TXT extracts the document's text layer and packages it as a downloadable .txt file. The output is UTF-8 encoded with a BOM so that Windows tools (Notepad, Excel import) detect the encoding correctly.
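If you read the .txt file from a script, the BOM is worth handling explicitly. The Python sketch below decodes the same bytes with and without BOM handling. The sample bytes and the helper name `decode_txt` are assumptions made for illustration and are not part of the tool.

```python
import codecs


def decode_txt(raw: bytes) -> str:
    """Decode bytes from a UTF-8 .txt file, dropping a leading BOM if present."""
    # "utf-8-sig" strips the BOM when it is there and otherwise
    # behaves like plain "utf-8".
    return raw.decode("utf-8-sig")


# Hypothetical file contents: a UTF-8 BOM followed by the text "Page one".
sample = codecs.BOM_UTF8 + "Page one".encode("utf-8")

plain = sample.decode("utf-8")  # BOM survives as U+FEFF at index 0
clean = decode_txt(sample)      # BOM removed
```

With plain "utf-8", the BOM stays in the string as an invisible first character, which can break comparisons against the first line. Using "utf-8-sig" avoids that and is also safe for files that have no BOM.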

Pages are separated by a form-feed character so downstream tools can split them cleanly. For scanned PDFs without a text layer, run PDF OCR first to add one, then convert.
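As a minimal sketch of splitting the output back into pages, the Python below reads the file and splits on the form-feed character. The function names and the file path passed to `load_pages` are hypothetical. It assumes the separator appears only between pages, as described above.

```python
from pathlib import Path


def split_pages(text: str) -> list[str]:
    """Split converted text into pages on the form-feed separator."""
    return text.split("\f")


def load_pages(path: str) -> list[str]:
    """Read a converted .txt file and return one string per page."""
    # "utf-8-sig" drops the BOM so it does not end up in the first page.
    return split_pages(Path(path).read_text(encoding="utf-8-sig"))
```

Note that Python's `str.splitlines()` also treats form feed as a line boundary, so split into pages first and into lines second if you need both.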

How it works

  1. Upload the PDF.
  2. Click Convert.
  3. Download the .txt file.

When to use it

  • Feed PDF content into an LLM prompt, search index, or grep.
  • Move document text into a plain-text editor for cleanup.
  • Archive PDF text in a format that survives 20-year format drift.
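For the search use case, a grep-like lookup that reports page numbers can be sketched in a few lines of Python. The function name `find_pages` and the sample text are illustrative assumptions. The sketch relies only on the form-feed page separator that the converter emits.

```python
def find_pages(text: str, term: str) -> list[int]:
    """Return 1-based page numbers whose text contains term (case-insensitive)."""
    needle = term.casefold()
    return [
        number
        for number, page in enumerate(text.split("\f"), start=1)
        if needle in page.casefold()
    ]


# Hypothetical three-page document.
sample = "Invoice 2024\fTerms and conditions\fInvoice total: 40"
hits = find_pages(sample, "invoice")  # [1, 3]
```

The same page list can be passed to an LLM prompt or a search index one page at a time, which keeps each chunk traceable to its source page.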

Privacy

Files are processed in-house by the Evixpdf engine, which is built on an MIT-licensed stack with no AGPL components. Nothing is uploaded to a third-party cloud, and sessions are purged automatically after processing.

Frequently asked questions

Short answers to the questions people most often ask about PDF to TXT. Read the one that matches your situation — they're written to be skimmed.

1. What encoding is the output?
UTF-8 with a BOM. Notepad, VS Code, and Excel's Text Import wizard detect the encoding from the BOM. Most Unix tools read the file too, though some treat the BOM as three extra bytes at the start of the first line.
2. How are pages separated?
A form-feed character (\f, ASCII 12) appears between pages. Text editors vary in how they show it: some render a page break, others a control character such as ^L. Scripts can split() on it.
3. Does this work on scans?
No. Scans have no text layer. Run PDF OCR first to add one, then convert.

Still stuck?

Browse our hand-written guides or ask us directly — we usually reply within a business day.

Ready when you are

Try PDF to TXT now

No signup, no email required. Drag your file in and you're done in seconds.