PDF to Word

The PDF to Word Converter takes a PDF, pulls out the selectable text, and packages it as an editable .docx Word document that you can open in Microsoft Word, Google Docs, LibreOffice Writer, or Apple Pages and start editing. Honest framing: this is a text-only conversion. Layout, tables, images, embedded fonts, and visual formatting are not preserved, because they don't survive client-side PDF text extraction in a usable form. If you need true layout-preserving conversion (with tables, images, columns, fonts), you need Adobe Acrobat Pro or another paid product with a full document-recovery engine. This tool is for the very common case where you just need to edit the words and don't care about replicating the original design. It runs entirely in your browser, using Mozilla's PDF.js for extraction and the `docx` library for the Word file, so your PDF never leaves your machine.

Built by Bob · Article by Lace · QA by Ben · Shipped

🔒 Everything happens in your browser. The PDF never uploads. Close the tab and it's gone.

What the PDF to Word Converter actually does

The PDF to Word Converter pulls the selectable text out of a PDF and packages it as an editable .docx file. Open it in Microsoft Word, Google Docs, LibreOffice Writer, or Apple Pages, and start editing. The text extraction runs in Mozilla's PDF.js — the same library Firefox uses to render PDFs natively — and the .docx is written by the `docx` library in the browser. Your PDF never uploads.

To be honest up front: this is a text-only conversion. Layout, tables, images, embedded fonts, columns, page-anchored positioning, the carefully tuned margins of the original: none of that survives. The output is your PDF's words, in paragraphs, in reading order, ready to edit. If you need a Word document that opens looking like the original PDF (preserved tables, images, columns, fonts), you need Adobe Acrobat Pro, which costs around $20/month and has a document-recovery engine refined over more than two decades. We don't try to compete with that. We cover the case where you just want to edit the wording (change a name, update a date, fix a paragraph, send a revised version) and don't need to replicate the original design. That's a common case, and for it, this tool is the right pick.

How to use it

One screen, one file, one click. Everything runs locally.

  1. Drop or pick your PDF. Up to 100 MB and 500 pages.
  2. Read the yellow disclaimer — this is a text-only conversion. If you need preserved tables, images, or layout, this is not the right tool; use Adobe Acrobat Pro instead.
  3. Click Convert to Word. The tool reads each page, reconstructs line breaks from the layout, and writes the text into a .docx with one paragraph per visual line and a blank line between pages.
  4. Download the .docx, named after your source PDF (e.g., report.pdf → report.docx). Open it in Word, Google Docs, LibreOffice, or Pages and edit normally.

Open the browser's network tab during conversion: after the page itself loads, the tab is silent. PDF.js reads the bytes locally. The docx library writes the Word file locally. The download is served from a blob URL. Your PDF doesn't leave the machine.
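Steps 3 and 4 can be sketched in a few lines. This is a minimal illustration, not the tool's actual source: `pagesToParagraphs` and `docxNameFor` are hypothetical names, and the input is assumed to be an array of pages, each already split into visual lines. In a real build, each resulting string would presumably be handed to the `docx` library as one paragraph.

```typescript
// Hypothetical helpers for illustration only; not the tool's real code.

/**
 * Flatten per-page lines into a list of paragraph strings.
 * An empty string stands in for the blank line between pages.
 */
function pagesToParagraphs(pages: string[][]): string[] {
  const paragraphs: string[] = [];
  pages.forEach((lines, pageIndex) => {
    if (pageIndex > 0) paragraphs.push(""); // blank paragraph between pages
    for (const line of lines) paragraphs.push(line); // one paragraph per visual line
  });
  return paragraphs;
}

/** Derive the download name from the source name: report.pdf -> report.docx. */
function docxNameFor(pdfName: string): string {
  const base = pdfName.replace(/\.pdf$/i, ""); // strip a trailing .pdf, any case
  return `${base}.docx`;
}
```

For example, `pagesToParagraphs([["Clause 1", "Clause 2"], ["Clause 3"]])` yields four entries, with an empty string standing in for the blank line between the two pages.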

A worked example with real numbers

Take a real case: a 12-page contract in PDF, 240 KB, exported from Microsoft Word originally (so the text is embedded, not scanned). Two columns of legal text per page, footer with page numbers, no images, a few clauses in bold.

Conversion takes about 1.4 seconds. The output is a 28 KB .docx that opens cleanly in Word. The text is all there, paragraph for paragraph, in reading order. The two-column layout is gone (the .docx is single-column). Bold formatting is gone too, because we extract text only, not styling, so you lose the visual emphasis; you can re-bold the key clauses by hand once the file is open in Word. Page numbers from the footer are mixed into the body text at the boundaries between pages.

Net result: usable. A contract you can edit. You can change the party name, update the effective date, revise a clause, accept tracked changes from someone else, then export back to PDF from Word. The two-column layout doesn't matter once it's an editable document, because Word lays it out the way Word wants to.

Flip the input: a 50-page scanned PDF of a 1970s technical manual. The conversion runs, then returns a near-empty .docx. The PDF has no embedded text (it's a stack of page images), so there's nothing for the extractor to extract. The right tool here is OCR. Run the scan through our OCR PDF tool first, then bring the resulting text into Word.
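A near-empty result like this can be spotted before any .docx is written. The check below is a rough sketch under stated assumptions: `looksScanned` is a hypothetical name, the input is the extracted text of each page, and the threshold of 20 non-whitespace characters per page is an arbitrary illustrative value, not a figure from the tool.

```typescript
/**
 * Guess whether a PDF is a scan by measuring how little text was extracted.
 * pageTexts: the extracted text of each page (assumed input shape).
 * minCharsPerPage: illustrative threshold; tune for real documents.
 */
function looksScanned(pageTexts: string[], minCharsPerPage = 20): boolean {
  if (pageTexts.length === 0) return true; // no pages means nothing to extract
  const totalChars = pageTexts.reduce(
    (sum, text) => sum + text.replace(/\s/g, "").length, // ignore whitespace
    0,
  );
  return totalChars / pageTexts.length < minCharsPerPage;
}
```

A heuristic like this will misfire on PDFs that are legitimately sparse (a title page, a one-line certificate), so it is better suited to showing a hint about OCR than to blocking the conversion.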

Why the layout is not preserved

Real layout-preserving PDF-to-Word conversion is genuinely hard. A PDF stores text as a stream of positioned glyphs: each character has an (x, y) coordinate, a font reference, and a glyph index. There's no marker for "this is a heading," "this is a table row," or "this is a footnote." A converter that wants to write a faithful Word document has to infer all that structure from the positions: detect which glyphs form a heading by their font size and weight, detect which lines form a table by spotting a grid pattern in the line positions, detect which content is a sidebar callout, detect column boundaries, detect captions. It's a hard machine-learning problem and a deep engineering problem.
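To make the inference problem concrete, here is a toy version of one such guess: treating any text run noticeably larger than the median font size as a heading. It is only an illustration of the kind of heuristic involved, not something this tool does, and the `TextRun` shape, the `guessHeadings` name, and the 1.3 ratio are all assumptions.

```typescript
// Simplified stand-in for a positioned text run; real PDF data is messier.
interface TextRun {
  text: string;
  x: number;
  y: number;
  fontSize: number;
}

/** Flag runs whose font size is at least `ratio` times the median size. */
function guessHeadings(runs: TextRun[], ratio = 1.3): TextRun[] {
  if (runs.length === 0) return [];
  const sizes = runs.map((r) => r.fontSize).sort((a, b) => a - b);
  const median = sizes[Math.floor(sizes.length / 2)];
  return runs.filter((r) => r.fontSize >= median * ratio);
}
```

The weakness is easy to see: a pull quote or a large-print disclaimer passes the same test, and a heading set in bold at body size fails it. Every structural guess (tables, columns, captions) has failure modes of this sort.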

Adobe Acrobat Pro does it well because Adobe has been refining their recovery engine since the late 1990s, trained on a corpus of millions of documents, with a stack of heuristics most of us never see. They get tables back as tables, images back as images, columns back as columns. It's worth $20/month if your job involves moving documents between PDF and Word all day.

Open-source browser-side libraries can't match that. They either produce broken output on real-world PDFs (heuristics fail in surprising ways), or they punt on layout entirely and just give you the text. Most "free PDF to Word" tools you find online (iLovePDF, SmallPDF, online2pdf, freepdfconvert.com) take a third route: upload your PDF to their server, run a commercial engine on it, send back the .docx. The result is closer to Acrobat's quality. The cost: your file lives on their server for some retention window, the free tier caps you fast, and the paid tier funnels into a $5-15/month subscription.

We chose differently: extract clean text, write a valid .docx, tell you up front what you're getting. For the case where you need to edit the words — which is the most common reason people open these tools — it's the right trade.

How this compares to Adobe Acrobat, SmallPDF, iLovePDF

There are three tiers in this market, and picking the right one saves headaches.

| Tool | Privacy | Layout fidelity | Cost | Best for |
| --- | --- | --- | --- | --- |
| Adobe Acrobat Pro (desktop) | Local: runs on your machine | High: tables, images, columns recovered | ~$20/month | Daily PDF↔Word work, professional document recovery |
| iLovePDF / SmallPDF (web) | Files uploaded, kept for hours | Medium-high: server-side commercial engine | Free with caps, $9-15/month for unlimited | Occasional conversions, layout matters, don't care about upload |
| This tool | Local: runs in your browser | Low: text only, no layout | Free | "I just need to edit the wording," sensitive documents, no upload |
| Google Docs ("Open with") | Uploaded to Google Drive | Medium: Google's converter is decent on simple PDFs | Free if you have a Google account | You're already in Google's ecosystem |

Pick by the trade you care about. Privacy-first → our tool, accepting that you'll lose layout. Fidelity-first → Acrobat Pro on the desktop, or one of the upload services. Free-and-good-enough-for-simple-PDFs → Google Docs if you're OK with the data going to Google.

What you get and what you lose

Knowing the inventory upfront prevents disappointment.

What comes through:

  • Body text. Every selectable glyph in the PDF, in roughly reading order, organized into paragraphs by visual line breaks.
  • Paragraph structure. A blank line between pages, line breaks where the PDF has them, runs of text grouped where the y-coordinate is consistent.
  • Unicode. Accented characters, Cyrillic, Greek, common math symbols, emoji — anything the PDF stored as a Unicode glyph — make it into the .docx as the right characters.
  • Reading order, mostly. Single-column documents come out clean. Two-column documents sometimes interleave the columns; you'll need to fix this by hand in Word.

What gets dropped:

  • Tables. The text inside table cells appears in the .docx as plain paragraphs in roughly the reading order, not as a Word table. Recovering the cell grid would require detecting the table structure from the line positions — out of scope for a text-fidelity tool.
  • Images. Skipped entirely. If you need them, our Extract PDF Images tool will pull them out as separate files for you to insert into Word manually.
  • Formatting. Bold, italic, font sizes, colors, styles — all dropped. The .docx is plain text. Re-format what you need by hand once it's open in Word.
  • Columns. Multi-column layouts collapse to single-column.
  • Headers, footers, page numbers. These often mix into the body text at page boundaries because PDF.js doesn't separately label them.
  • Footnotes. Land in the body text near where they appear on the page, not at the bottom of the page in Word's footnote panel.
  • Hyperlinks. The link text comes through as plain text; the underlying URL is dropped.

The simple test: if your goal is "I want to edit the words," this is the right tool. If your goal is "I want a Word document that opens looking like the PDF," it isn't.

The two-column problem and how to fix it

The single most common complaint about text extraction from real-world PDFs is column interleaving. Academic papers, magazine articles, newspapers, legal documents — anything in two-column or three-column layout — can come out with the columns alternating line-by-line: line 1 of column 1, then line 1 of column 2, then line 2 of column 1, then line 2 of column 2. Unreadable.

This happens because PDF.js returns text items in source order — roughly top-to-bottom, left-to-right within a small y-tolerance — and a two-column layout has lines at the same y-coordinate in both columns. Without column-detection (a layout-recovery step we don't do), the extractor reads them as a single line that crosses the column boundary.
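The effect is easy to reproduce with a simplified model. The sketch below assumes text items have already been reduced to `{ text, x, y }` (PDF.js items carry a string plus a transform matrix that holds the position; that mapping is omitted here), and `groupIntoLines` with its 2-unit tolerance is an illustrative stand-in, not the tool's actual code.

```typescript
// Simplified text item; PDF y-coordinates grow upward, so larger y is higher on the page.
interface TextItem {
  text: string;
  x: number;
  y: number;
}

/**
 * Walk items in source order and start a new line whenever y jumps
 * by more than yTolerance. Items on the same line are joined left to right.
 */
function groupIntoLines(items: TextItem[], yTolerance = 2): string[] {
  const lines: { y: number; parts: TextItem[] }[] = [];
  for (const item of items) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(item.y - last.y) <= yTolerance) {
      last.parts.push(item); // same baseline: extend the current line
    } else {
      lines.push({ y: item.y, parts: [item] }); // y jumped: new line
    }
  }
  return lines.map((line) =>
    line.parts
      .sort((a, b) => a.x - b.x)
      .map((p) => p.text)
      .join(" "),
  );
}

// Two columns stored row by row in the source: the columns merge.
const rowOrdered: TextItem[] = [
  { text: "Alpha one", x: 50, y: 700 },
  { text: "Beta one", x: 320, y: 700 },
  { text: "Alpha two", x: 50, y: 686 },
  { text: "Beta two", x: 320, y: 686 },
];
console.log(groupIntoLines(rowOrdered));
// ["Alpha one Beta one", "Alpha two Beta two"]
```

If the same four items arrive column by column instead (both "Alpha" lines, then both "Beta" lines), every item sits on a different y from its predecessor and the columns come out intact, which is why some two-column PDFs convert cleanly with no extra work.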

Three fixes, in order of effort:

  1. Try the conversion first. Some PDFs encode column boundaries cleanly enough that the y-tolerance separates the columns naturally. You may get clean output without doing anything.
  2. If the columns interleave, split the PDF. Use our Split PDF tool to extract just one page, then crop the page to a single column before converting. Tedious for long documents but bulletproof.
  3. Use a layout-recovery tool. For heavy column-based documents, Adobe Acrobat Pro or one of the server-side services will detect the columns correctly. The trade-off is the upload, the cost, or both.
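Cropping to one column (fix 2) and automatic column detection (fix 3) both come down to deciding which items belong to which column before lines are assembled. The sketch below shows that idea in its simplest form, assuming the boundary is already known: `readColumnsInOrder` and `boundaryX` are hypothetical, and a real layout-recovery tool has to discover the boundary on its own.

```typescript
// Simplified text item; PDF y-coordinates grow upward.
interface TextItem {
  text: string;
  x: number;
  y: number;
}

/**
 * Split items at a known vertical boundary, then read the left column
 * top to bottom followed by the right column top to bottom.
 */
function readColumnsInOrder(items: TextItem[], boundaryX: number): string[] {
  const left = items.filter((i) => i.x < boundaryX);
  const right = items.filter((i) => i.x >= boundaryX);
  const topDown = (a: TextItem, b: TextItem) => b.y - a.y; // higher y first
  return [...left.sort(topDown), ...right.sort(topDown)].map((i) => i.text);
}
```

With a boundary of 300, four items at x = 50 and x = 320 that were stored row by row come back as the whole left column followed by the whole right column. The hard part in practice is everything this sketch skips: finding the boundary, handling full-width headings, and coping with pages where the column count changes.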

When this tool is right, and when it isn't

The right cases:

  • You need to edit the wording. A contract with a name change. An article you want to revise. A report you need to update before sending.
  • Single-column body text. Memos, letters, articles, contracts, eBooks — most "text-shaped" PDFs come through cleanly.
  • Sensitive documents. Anything you'd think twice about uploading: legal, medical, financial, personal. The conversion runs in your browser; nothing leaves the machine.
  • Long PDFs. The 500-page limit is generous because text extraction is cheap. Most online services cap at 25-50 pages on the free tier.

The wrong cases:

  • You need the PDF to look the same in Word. Use Acrobat Pro or a server-side service. We tell you this upfront — there's no point in pretending.
  • The PDF is mostly tables. Financial reports, invoices, structured data — try our PDF to Excel tool, or use Acrobat Pro.
  • The PDF is scanned. No selectable text means nothing to extract. Run it through OCR PDF first to get the text into a usable form.
  • You need images preserved. Use Extract PDF Images to pull them, then insert into Word manually.

Related PDF tools

PDF to Word is one tile in a stack of browser-side PDF tools:

  • Word to PDF — the reverse direction. Runs in the same browser-side mode.
  • Extract Text from PDF — same extraction step, plain .txt output. Pick this if you don't need a .docx wrapper.
  • PDF to Excel — pulls tabular data out of a PDF. The right tool when your PDF is mostly tables.
  • OCR PDF — for scanned PDFs that have no selectable text. Recognizes the words from pixels using Tesseract.
  • Extract PDF Images — pulls the embedded images out as separate files. Pair with this tool to recover both the text and the images.
  • Split PDF — break a long PDF into chunks before converting.

Microapp ships every PDF tool browser-side, with the same trade-offs spelled out on each page. 10% of every dollar of Microapp revenue goes to charity, off the top, audited quarterly — so the tools have to do honest work, which means we tell you when this one isn't the right answer.

Frequently asked questions

Why is the layout not preserved?

Real layout-preserving PDF → Word conversion is a hard problem: the converter has to detect headings, paragraphs, columns, tables, lists, and image placement from a stream of positioned glyphs that has no semantic structure. Adobe Acrobat Pro does it well because its recovery engine has been trained on millions of documents. Open-source client-side libraries can't match that: every honest attempt either produces broken output for real-world PDFs or requires uploading the file to a cloud service. We chose neither: we extract clean text, package it as a valid .docx, and tell you up front that's what you get. It's the right trade for 'I just need to edit the wording.'

What about tables — will they come through?

No. Tables in a PDF are not stored as tables — they're stored as a grid of independently positioned text runs and drawn lines. To reconstruct a table you have to detect the cell grid from the line positions and group the text accordingly, which is exactly the kind of layout recovery this tool deliberately doesn't do. The cell text will appear in your .docx but as plain paragraphs in roughly reading order, not as a Word table. If your PDF is mostly tables (e.g., a financial statement, an invoice), use a dedicated PDF-to-Excel tool or Adobe Acrobat Pro.

Do images come through?

No. Images embedded in the PDF are skipped entirely. The text extraction pass reads glyphs only, and writing images into a .docx requires re-encoding them and computing placement coordinates that match the original page, which is out of scope for a text-fidelity tool. If you need the images, extract them separately with our Extract PDF Images tool and insert them into Word manually.

How is this different from Adobe Acrobat Pro?

Acrobat Pro runs a full document recovery pipeline: it detects headings, paragraphs, columns, lists, tables, and image regions, then writes a Word document that looks visually similar to the original PDF. It is the industry-standard tool for this and it costs ~$20/month. We do not try to compete on fidelity — we cover the case where you don't need the visual fidelity, just the editable text. If your output 'must look like the original PDF when reopened in Word,' use Acrobat Pro. If your output 'must contain the text from the PDF so I can edit it in Word,' use this tool.

Is my PDF really not uploaded?

Correct. Both stages run in the browser. PDF.js (the same library that renders PDFs inside Firefox) extracts the text, and the `docx` library builds the Word file in browser memory. Your bytes never leave your machine. Check your browser's network tab during conversion: zero outbound requests after the page itself loads.

Does this work on scanned PDFs?

No, and we say so clearly. Scanned PDFs are images of text, not selectable text. To get words out of a scan you need OCR (Optical Character Recognition), which is a different operation. This tool extracts text that's already in the PDF. For scans, run the PDF through an OCR tool first (our OCR PDF tool, Adobe Acrobat, macOS Preview, Tesseract, or one of the free online OCR services), save the result, then run it through this tool.

Can I convert a password-protected PDF?

No — PDF.js refuses to open encrypted PDFs. Unlock the PDF first using a desktop reader (Adobe Acrobat: File → Properties → Security → 'Save As' an unprotected copy; or macOS Preview: File → Export → uncheck 'Encrypt') and run the unlocked copy through this tool.

What's the max file size or page count?

100 MB and 500 pages per PDF. Text extraction is faster than full-page rendering, so the limit is generous. For multi-thousand-page documents (legal discovery, large manuscripts), split the PDF first with our Split PDF tool and convert in chunks.
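The limits and the chunking advice translate into simple arithmetic. In this sketch the names `withinLimits` and `planChunks` are hypothetical, and "100 MB" is assumed to mean 100 × 1024 × 1024 bytes; the page does not say which definition of a megabyte the tool uses.

```typescript
// Limits quoted on this page. The byte interpretation of "100 MB" is an assumption.
const MAX_BYTES = 100 * 1024 * 1024;
const MAX_PAGES = 500;

/** True when a PDF fits within both the size and the page limit. */
function withinLimits(sizeBytes: number, pageCount: number): boolean {
  return sizeBytes <= MAX_BYTES && pageCount <= MAX_PAGES;
}

/** Plan 1-based, inclusive page ranges for splitting a long PDF into chunks. */
function planChunks(
  totalPages: number,
  chunkSize = MAX_PAGES,
): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize - 1, totalPages)]);
  }
  return ranges;
}
```

A 1,200-page discovery bundle, for instance, would be planned as pages 1-500, 501-1000, and 1001-1200, giving three files to convert and then stitch together in Word.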

Why are my line breaks weird?

PDF.js returns text items in source order with x/y coordinates; we insert a line break whenever the y-coordinate jumps. Most PDFs come out clean, but two-column documents will interleave the columns and some PDFs have unusual text-positioning that produces extra mid-paragraph breaks. Once the .docx is open in Word, use Find & Replace to clean up: replace `^p` (paragraph mark) with a space, then re-paragraph by hand. It's still faster than retyping.
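If you would rather clean up the text before it reaches Word, the same Find & Replace idea can be expressed in code. This is a minimal sketch, assuming the input is the list of extracted lines with empty strings marking page breaks; `unwrapLines` is a hypothetical name and the tool itself does not perform this step.

```typescript
/**
 * Merge consecutive non-empty lines into one paragraph, joined by spaces.
 * Empty (or whitespace-only) lines end the current paragraph.
 */
function unwrapLines(lines: string[]): string[] {
  const paragraphs: string[] = [];
  let current: string[] = [];
  for (const line of lines) {
    if (line.trim() === "") {
      if (current.length > 0) paragraphs.push(current.join(" "));
      current = [];
    } else {
      current.push(line.trim());
    }
  }
  if (current.length > 0) paragraphs.push(current.join(" ")); // flush the tail
  return paragraphs;
}
```

Like the Word approach, this merges everything between blank lines into a single block, so true paragraph breaks within a page still have to be restored by hand.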