Translating a PDF: Why It Breaks and What to Do Instead

DTP Insights

PDF is the most common file format agencies receive from clients. It is also one of the worst starting points for a translation project. Understanding why, and what to do about it, saves time and money and avoids DTP problems that only surface at the very end of a project, when fixing them is most expensive.

Why PDF is fragile

PDF was designed for one purpose: to look exactly the same on every screen and every printer, forever. It does this brilliantly. But that strength is also its fundamental weakness when it comes to editing and translation.

A PDF does not store text as a flowing document. It stores words, lines, and sometimes individual characters as independently positioned objects on a page, each with fixed coordinates. Unless the file was exported as a tagged PDF, there is no concept of "paragraph" or "heading" in the way a Word document or an InDesign file understands it. The visual result looks like a document. The underlying structure is closer to a technical drawing.

This is why editing a PDF directly — even in Adobe Acrobat, which does allow text editing — is so fragile. The moment a translated sentence is slightly longer than the original, the text frame overflows. Letters break outside their box. Spacing collapses. And because adjacent text boxes are completely independent, fixing one element does not reflow the others. There is no flow to reflow.

💡
Key insight

The PDF is not broken. It is doing exactly what it was designed to do. The problem is asking it to do something it was never designed for.

The PDF-to-Word route: common but costly

The most widespread approach in translation agencies is converting the PDF to a Word document, sending it to a CAT tool for translation, then converting the translated Word back to a formatted file. This works — up to a point.

The conversion from PDF to Word is imperfect by design. Conversion software has to guess at the document structure. It reads coordinates and tries to reconstruct a flowing document from a fixed layout. The result is often:

  • Multi-column layouts that break apart or read in the wrong order
  • Headings extracted as individual text fragments rather than coherent paragraphs
  • Tables split into disconnected cells
  • Soft returns — forced line breaks added by a designer — surviving as hard returns, fragmenting sentences into separate segments inside the CAT tool
  • Paragraph order that does not match the visual reading order of the original

For a translator, this means a single sentence may appear as three separate segments, a heading may appear twice, and translation memory matching suffers because the fragments no longer correspond to real sentences. The DTP work afterwards is far heavier than it would need to be.
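
The reading-order problem can be illustrated with a toy model. The sketch below is not how any particular converter works; the `Fragment` class, the coordinates and the `column_split_x` parameter are all hypothetical, chosen only to show why sorting positioned text by coordinates alone interleaves the columns of a two-column page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """One positioned piece of text, as a converter might see it (hypothetical model)."""
    text: str
    x: float  # left edge, in points
    y: float  # distance from the top of the page, in points


def naive_order(fragments):
    """Top-to-bottom, then left-to-right. Ignores columns entirely."""
    return [f.text for f in sorted(fragments, key=lambda f: (f.y, f.x))]


def column_aware_order(fragments, column_split_x):
    """Read the whole left column first, then the whole right column."""
    left = sorted((f for f in fragments if f.x < column_split_x), key=lambda f: f.y)
    right = sorted((f for f in fragments if f.x >= column_split_x), key=lambda f: f.y)
    return [f.text for f in left + right]


# A two-column page: two lines on the left, two on the right.
page = [
    Fragment("The quick brown", 50, 100),
    Fragment("fox jumps over.", 50, 115),
    Fragment("A second story", 320, 100),
    Fragment("starts here.", 320, 115),
]

print(naive_order(page))              # lines from both columns alternate
print(column_aware_order(page, 300))  # each column is read in full
```

A real converter has to infer where the columns are, which is exactly the guess that goes wrong on complex layouts.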

⚠️
Watch out

Some of these problems can be reduced with careful cleanup before sending to the CAT tool — removing forced line breaks, merging split segments, correcting reading order. But this is manual work and scales poorly across large or complex documents.
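
Part of that cleanup can be scripted. The following is a minimal sketch, assuming the converted text is available as plain text and that a line which does not end in sentence-final punctuation was cut by a layout line break. The function name and the punctuation list are illustrative choices, not part of any CAT tool.

```python
SENTENCE_ENDINGS = (".", "!", "?", ":", ";")


def merge_broken_lines(text):
    """Join lines that were split mid-sentence; keep blank lines as paragraph boundaries."""
    merged = []
    buffer = ""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line always closes whatever has been collected so far.
            if buffer:
                merged.append(buffer)
                buffer = ""
            continue
        buffer = f"{buffer} {line}" if buffer else line
        if line.endswith(SENTENCE_ENDINGS):
            merged.append(buffer)
            buffer = ""
    if buffer:
        merged.append(buffer)
    return merged


converted = "Install the device\nbefore connecting\nthe power cable.\nRead the manual."
for segment in merge_broken_lines(converted):
    print(segment)
```

The heuristic has obvious limits: a heading with no punctuation is merged into the next line unless a blank line separates them, and words hyphenated across lines are not rejoined. The output still needs a human check before it goes to the CAT tool.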

When PDF-to-Word is acceptable

Not every project justifies a full InDesign workflow. For short, simple, text-heavy documents — a one-page letter, a simple form, a short press release — PDF to Word conversion is often the most practical route. The key conditions are:

  • The document has a simple, single-column layout with no complex tables or graphics
  • The page count is low enough that manual cleanup is manageable
  • The final delivery does not need to precisely match the original visual layout
  • The client has no source files and cannot provide them

💡
Tip

When converting, enable "Retain Flowing Text" rather than "Retain Page Layout" where available. Flowing text produces cleaner CAT tool output — it sacrifices visual fidelity for structural integrity, which is what matters for translation.

The right approach: work from source files

If the client has InDesign source files, use them. This is always the correct answer. InDesign exports to IDML — InDesign Markup Language — which CAT tools including Trados Studio can process directly. The translation happens inside the actual layout. Paragraph styles, fonts, spacing, and structure are preserved. Post-translation DTP is minimal.

Fixing the most common Trados error

A common error when receiving IDML files for translation is that Trados Studio refuses to open the file and shows this message:

🚫
Trados error message

"The selected IDML file was created by an unsupported version of Adobe InDesign and will not be processed."

The fix is simple and takes thirty seconds. There are two ways depending on whether you want to fix it for one project or for all future projects:

For a specific project

  1. Go to Project Settings
  2. Expand File Types
  3. Select Adobe InDesign CS4-CC IDML
  4. Click Common
  5. Check "Process unsupported file versions"
  6. Make this change before adding files to the project; the setting is not applied retroactively to files already added

For all future projects

  1. Go to File → Options
  2. Expand File Types
  3. Select Adobe InDesign CS4-CC IDML
  4. Click Common
  5. Check "Process unsupported file versions"

Result

Trados will now process the IDML regardless of which InDesign version created it. In practice, it works correctly almost every time.
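
To see which version a given IDML file reports before it reaches Trados, the package can be inspected directly. This is a hedged sketch: it assumes the IDML is a ZIP package with a `designmap.xml` at its root whose `Document` element carries a `DOMVersion` attribute, which is worth confirming on a sample export. The function name and the file name in the usage comment are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET


def idml_dom_version(idml_file):
    """Return the DOMVersion attribute of designmap.xml, or None if it is absent.

    idml_file may be a path or a binary file-like object.
    Assumption: designmap.xml sits at the root of the package.
    """
    with zipfile.ZipFile(idml_file) as package:
        with package.open("designmap.xml") as designmap:
            root = ET.parse(designmap).getroot()
    return root.get("DOMVersion")


# Usage (hypothetical file name):
# print(idml_dom_version("brochure.idml"))
```

The value is InDesign's internal document model version rather than the marketing name of the release, so it is mainly useful for comparing files against each other.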

When there are no source files: consider redesigning

If the document is short (up to fifteen or twenty pages) and the client cannot provide source files, it is often worth rebuilding the InDesign file from scratch rather than working from a converted PDF. This sounds counterintuitive, but the maths work out.

A clean InDesign rebuild takes a few hours. The resulting IDML can be translated cleanly in a CAT tool. Post-translation DTP is minimal. Compare that to the cumulative hours of cleaning up a broken PDF conversion, fixing missegmented CAT output, and doing heavy DTP correction at the end — often repeated across multiple languages.
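
The comparison can be written down as simple arithmetic: one-off preparation plus DTP effort that repeats per language. The figures below are invented purely to show the shape of the calculation; substitute your own estimates.

```python
def total_hours(one_off_hours, dtp_hours_per_language, languages):
    """One-off preparation plus per-language DTP, in hours."""
    return one_off_hours + dtp_hours_per_language * languages


# Hypothetical figures, not measurements.
REBUILD = {"one_off_hours": 4, "dtp_hours_per_language": 1}     # rebuild once, light DTP
CONVERSION = {"one_off_hours": 2, "dtp_hours_per_language": 3}  # cleanup once, heavy DTP

for languages in (1, 5, 30):
    rebuild = total_hours(languages=languages, **REBUILD)
    conversion = total_hours(languages=languages, **CONVERSION)
    print(f"{languages:>2} languages: rebuild {rebuild} h, conversion {conversion} h")
```

With these made-up inputs the two routes cost the same for a single language, and the rebuild pulls ahead with every language added, because its larger cost is paid only once.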

💡
Rule of thumb

Under 20 pages with no source files → consider rebuilding in InDesign. Over 20 pages → PDF to Word with thorough cleanup is usually the only practical option.

Designing for translation from the start

The cleanest projects are those where the document was designed with translation in mind before anyone opened InDesign. A designer who understands multilingual DTP will:

  • Use paragraph and character styles consistently throughout — never manual overrides
  • Avoid forced line breaks (Shift+Enter) inside paragraphs — these break segments in CAT tools
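  • Avoid manual hyphenation: hyphens typed by hand to break a word at the end of a line stay inside the word in the CAT tool, which spoils translation memory matches and leaves stray hyphens when the text reflows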
  • Leave reasonable space in text frames for text expansion — translated text is often 20–30% longer than English source
  • Design in InDesign from the start — never hand over a PDF when source files could exist

That last point matters most for multi-language projects. A document going into thirty languages should be designed in InDesign as a clean, style-driven file. The IDML goes to translation, comes back, and the DTP work across all languages is straightforward. The alternative — a PDF converted, broken, cleaned up, translated, and reformatted thirty times — multiplies every problem by thirty.
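
The text-expansion allowance mentioned in the list above lends itself to a quick check. This sketch uses character counts as a rough proxy for how much a frame can hold, which is a simplification (real fit depends on font, tracking and hyphenation). The function names and the example numbers are hypothetical.

```python
def expanded_length(source_chars, expansion_percent):
    """Estimated translated length in characters, rounded up."""
    return (source_chars * (100 + expansion_percent) + 99) // 100


def fits(source_chars, frame_capacity_chars, expansion_percent=30):
    """True if the estimated translation still fits the frame."""
    return expanded_length(source_chars, expansion_percent) <= frame_capacity_chars


# A 400-character English paragraph in a frame that holds roughly 500 characters.
print(expanded_length(400, 20), fits(400, 500, 20))
print(expanded_length(400, 30), fits(400, 500, 30))
```

A frame that passes at 20% but fails at 30% is a candidate for extra room in the source layout.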

A note on forced line breaks

This is worth emphasising because it is the single most common DTP-introduced problem in translation-ready files. A forced line break (Shift+Enter in InDesign) looks identical to a paragraph break on screen, but behaves completely differently in a CAT tool. The segment boundary falls mid-sentence. The translator sees half a sentence as one segment and the other half as the next. Translation memory cannot match it correctly.

And because the break is invisible unless hidden characters are shown, it often goes unnoticed until the project is already in translation.

How to check

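Before any IDML file goes to a CAT tool, turn on Type → Show Hidden Characters (Ctrl+Alt+I on Windows, Cmd+Option+I on macOS). Forced line breaks appear as a ¬ mark at the end of the line, where a paragraph break shows ¶. Replace each one with a space or a true paragraph break, whichever the sentence calls for.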
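
The same check can be run on the exported IDML without opening InDesign. The sketch below rests on two assumptions that should be verified on a sample file: that story text lives in `Stories/*.xml` inside the IDML package, and that a forced line break is stored as the Unicode line separator U+2028 inside a `Content` element. The function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

LINE_SEPARATOR = "\u2028"  # assumed representation of a forced line break in IDML


def find_forced_breaks(idml_file):
    """Map each story file name to the Content strings that contain a forced break."""
    hits = {}
    with zipfile.ZipFile(idml_file) as package:
        for name in package.namelist():
            if not (name.startswith("Stories/") and name.endswith(".xml")):
                continue
            with package.open(name) as story:
                root = ET.parse(story).getroot()
            for content in root.iter("Content"):
                if content.text and LINE_SEPARATOR in content.text:
                    hits.setdefault(name, []).append(content.text)
    return hits


# Usage (hypothetical file name):
# for story, texts in find_forced_breaks("brochure.idml").items():
#     print(story, len(texts), "text run(s) with forced breaks")
```

An empty result only means that none were found under these assumptions; the visual check in InDesign remains the reference.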

Summary: how to choose your approach

| Situation | Best approach |
|---|---|
| Client has InDesign source files | Export to IDML, translate in CAT tool directly |
| No source files, short simple document | Rebuild in InDesign, export IDML |
| No source files, long or complex document | PDF to Word with thorough cleanup before CAT |
| New project, will be translated to multiple languages | Design in InDesign from day one, style-driven, no forced breaks |
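
For teams that triage incoming jobs in a script or intake form, the same decision can be sketched as a function. The parameter names are hypothetical, the 20-page default comes from the rule of thumb earlier in this article, and treating a document of exactly 20 pages as "short" is an assumption.

```python
def choose_approach(has_source_files, page_count=0, is_complex=False,
                    is_new_project=False, rebuild_page_limit=20):
    """Return the recommended route for a translation job, following the summary above."""
    if is_new_project:
        return "Design in InDesign from day one, style-driven, no forced breaks"
    if has_source_files:
        return "Export to IDML, translate in CAT tool directly"
    if page_count <= rebuild_page_limit and not is_complex:
        return "Rebuild in InDesign, export IDML"
    return "PDF to Word with thorough cleanup before CAT"


print(choose_approach(has_source_files=True))
print(choose_approach(has_source_files=False, page_count=8))
print(choose_approach(has_source_files=False, page_count=120, is_complex=True))
```

The page limit is a parameter rather than a constant because the break-even point depends on layout complexity and the number of target languages.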