PDF to TXT

Save every page of a PDF as a plain UTF-8 text file.

About PDF to TXT

PDF to TXT extracts the document's text layer and packages it as a downloadable .txt file. The output is UTF-8 encoded with a BOM so that Windows tools (Notepad, Excel import) detect the encoding correctly.

Pages are separated by a form-feed character so downstream tools can split them cleanly. For scanned PDFs without a text layer, run PDF OCR first to add one, then convert.
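As a rough illustration of how a downstream script might consume the output, the sketch below reads a converted file and splits it into pages. The function name read_pages and the file path are hypothetical; the only things taken from the description above are the UTF-8 BOM and the form-feed separator.

```python
from pathlib import Path

FORM_FEED = "\f"  # ASCII 12, the page separator described above


def read_pages(path):
    """Return a list of page strings from a converted .txt file."""
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    text = Path(path).read_text(encoding="utf-8-sig")
    return text.split(FORM_FEED)
```

Note that Python's str.splitlines() also breaks on form feeds, so it is safer to split into pages first and into lines afterwards.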

How it works

1. Upload the PDF.
2. Click Convert.
3. Download the .txt file.

When to use it

Feed PDF content into an LLM prompt, search index, or grep.
Move document text into a plain-text editor for cleanup.
Archive PDF text in a plain format that should outlast 20 years of file-format drift.
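For the grep and search-index cases, a minimal sketch of a page-aware search might look like the following. It assumes the text has already been decoded, that pages are separated by form feeds, and that page numbers simply follow file order. The name grep_pages and the sample invoice text are made up for illustration.

```python
import re


def grep_pages(text, pattern):
    """Yield (page_number, line) for every line matching `pattern`."""
    regex = re.compile(pattern)
    # Pages are numbered from 1, in the order they appear in the file.
    for page_number, page in enumerate(text.split("\f"), start=1):
        for line in page.splitlines():
            if regex.search(line):
                yield page_number, line


# Hypothetical two-page document, used only to show the call.
sample = "Invoice 1001\nTotal: 40 EUR\fInvoice 1002\nTotal: 55 EUR"
for page_number, line in grep_pages(sample, r"Total"):
    print(page_number, line)
```

Keeping the page number next to each hit makes it easy to point back to the right page of the original PDF.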

Privacy

Files are processed in-house by the Evixpdf engine, which is built on an MIT-licensed, AGPL-free stack; nothing is uploaded to a third-party cloud. Sessions are purged automatically after processing.

Frequently asked questions

Short answers to the questions people most often ask about PDF to TXT. Read the one that matches your situation — they're written to be skimmed.

1. What encoding is the output?

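UTF-8 with a BOM. Notepad, VS Code, and Excel's Text Import wizard detect the encoding from the BOM. Unix tools read the file as UTF-8 too, though some pass the BOM through as three leading bytes (EF BB BF).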
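Some byte-oriented tools treat the BOM as ordinary data at the start of the file. If that gets in the way, a small sketch like this one removes it before further processing. The helper name strip_bom is hypothetical; codecs.BOM_UTF8 is the standard-library constant for the three BOM bytes.

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):  # b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):]
    return data
```

When the file is read as text rather than bytes, opening it with encoding="utf-8-sig" has the same effect.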

2. How are pages separated?

A form-feed character (\f, ASCII 12) appears between pages. Most text editors treat it as a paragraph break; scripts can split() on it.
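For very large files, a script does not have to load everything before splitting. The sketch below is one possible approach under that assumption: it reads the file in chunks and yields each page as soon as its closing form feed has been seen. The name iter_pages and the default chunk size are illustrative choices, not part of the tool.

```python
def iter_pages(path, chunk_size=64 * 1024):
    """Yield pages one at a time without loading the whole file."""
    buffer = ""
    # newline="" keeps line endings exactly as they are in the file.
    with open(path, encoding="utf-8-sig", newline="") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            buffer += chunk
            # Everything before the last form feed is a complete page;
            # the remainder may continue in the next chunk.
            *complete, buffer = buffer.split("\f")
            yield from complete
    # Text after the final form feed (empty if the file ends with one).
    yield buffer
```

The result matches a plain split on "\f", just produced lazily.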

3. Does this work on scans?

No — scans have no text layer. Run PDF OCR first to add one, then convert.

Still stuck?

Browse our hand-written guides or ask us directly — we usually reply within a business day.


Extract Text

Pull all selectable text out of a PDF as plain text.

TXT to PDF

Turn plain-text files into clean, paginated PDFs.

OCR Recognition

Add a real text layer to scanned PDFs.

Ready when you are

Try PDF to TXT now

No signup, no email required. Drag your file in and you're done in seconds.