

PDF is the most common file format agencies receive from clients. It is also one of the worst starting points for a translation project. Understanding why, and what to do about it, saves time and money and avoids DTP problems that only surface at the very end of a project, when fixing them is most expensive.
PDF was designed for one purpose: to look exactly the same on every screen and every printer, forever. It does this brilliantly. But that strength is also its fundamental weakness when it comes to editing and translation.
A PDF does not store text as a flowing document. It stores every word, every line, every character as an independently positioned object on a page — fixed coordinates, fixed placement. There is no concept of "paragraph" or "heading" in the way a Word document or an InDesign file understands it. The visual result looks like a document. The underlying structure is closer to a technical drawing.
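To make the "technical drawing" idea concrete, here is a minimal sketch in Python. It does not read a real PDF. The `Fragment` class, the sample coordinates, and the `guess_lines` helper are all hypothetical, chosen only to show what conversion software has to work with: loose pieces of text with positions, from which lines and reading order must be guessed.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One independently positioned run of text (illustrative model only)."""
    x: float  # horizontal position in points
    y: float  # vertical position in points, measured from the bottom of the page
    text: str


# Hypothetical page content, listed in drawing order rather than reading order.
page = [
    Fragment(72, 700, "Quarterly"),
    Fragment(72, 680, "Revenue grew"),
    Fragment(130, 700, "Report"),
    Fragment(150, 680, "by 4%."),
]


def guess_lines(fragments, tolerance=2.0):
    """Group fragments with a similar y into lines, then read left to right."""
    lines = []
    # Top of the page first (largest y), then left to right.
    for frag in sorted(fragments, key=lambda f: (-f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return [
        " ".join(f.text for f in sorted(line, key=lambda f: f.x))
        for line in lines
    ]


print(guess_lines(page))
```

Even in this toy case the grouping depends on a guessed tolerance. Nothing in the data says which line is a heading or where a paragraph ends.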
This is why editing a PDF directly — even in Adobe Acrobat, which does allow text editing — is so fragile. The moment a translated sentence is slightly longer than the original, the text frame overflows. Letters break outside their box. Spacing collapses. And because adjacent text boxes are completely independent, fixing one element does not reflow the others. There is no flow to reflow.
The PDF is not broken. It is doing exactly what it was designed to do. The problem is asking it to do something it was never designed for.
The most widespread approach in translation agencies is converting the PDF to a Word document, sending it to a CAT tool for translation, then converting the translated Word back to a formatted file. This works — up to a point.
The conversion from PDF to Word is imperfect by design. Conversion software has to guess at the document structure. It reads coordinates and tries to reconstruct a flowing document from a fixed layout. The result is often:
For a translator, this means a single sentence may appear as three separate segments, a heading appears twice, and translation memory suffers. The DTP work afterwards is far heavier than it would need to be.
Some of these problems can be reduced with careful cleanup before sending to the CAT tool — removing forced line breaks, merging split segments, correcting reading order. But this is manual work and scales poorly across large or complex documents.
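As a rough illustration of the "merging split segments" step, the sketch below joins lines that do not end in sentence-final punctuation. It is a heuristic under stated assumptions, not a description of any particular tool: the function name `merge_broken_lines` and the punctuation rule are hypothetical, and the input is assumed to be plain text already extracted from the converted Word file.

```python
import re

# A line "ends a sentence" if it finishes with terminal punctuation,
# optionally followed by a closing quote or bracket.
SENTENCE_END = re.compile(r"[.!?:;][\"')\]]*$")


def merge_broken_lines(lines):
    """Join consecutive lines that look like one sentence split by hard breaks."""
    merged = []
    carry = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line is treated as a genuine paragraph boundary.
            if carry:
                merged.append(carry)
                carry = ""
            continue
        carry = f"{carry} {line}" if carry else line
        if SENTENCE_END.search(carry):
            merged.append(carry)
            carry = ""
    if carry:
        merged.append(carry)
    return merged


sample = [
    "The device must be",
    "installed by a qualified",
    "technician.",
    "",
    "Safety notes",
]
print(merge_broken_lines(sample))
```

A rule like this will wrongly glue an unpunctuated heading to the sentence that follows it unless a blank line separates them, which is one reason the cleanup still needs a human check.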
Not every project justifies a full InDesign workflow. For short, simple, text-heavy documents — a one-page letter, a simple form, a short press release — PDF to Word conversion is often the most practical route. The key conditions are:
When converting, enable "Retain Flowing Text" rather than "Retain Page Layout" where available. Flowing text produces cleaner CAT tool output — it sacrifices visual fidelity for structural integrity, which is what matters for translation.
If the client has InDesign source files, use them. This is always the correct answer. InDesign exports to IDML — InDesign Markup Language — which CAT tools including Trados Studio can process directly. The translation happens inside the actual layout. Paragraph styles, fonts, spacing, and structure are preserved. Post-translation DTP is minimal.
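For readers curious about what a CAT tool actually receives: an IDML file is a ZIP package of XML files, with the text held in story files under a `Stories/` folder. The sketch below, using only the Python standard library, lists the text runs found in each story. The function name `extract_story_text` is hypothetical, and the code assumes the usual package layout (story files under `Stories/`, text inside `Content` elements). It is meant for inspection, not as a replacement for a CAT tool's IDML filter.

```python
import zipfile
import xml.etree.ElementTree as ET


def extract_story_text(idml_path):
    """Return {story file name: list of text runs} for an IDML package."""
    stories = {}
    with zipfile.ZipFile(idml_path) as package:
        for name in package.namelist():
            # Assumption: translatable text lives in Stories/*.xml.
            if name.startswith("Stories/") and name.endswith(".xml"):
                root = ET.fromstring(package.read(name))
                # Each <Content> element holds one run of text.
                stories[name] = [el.text or "" for el in root.iter("Content")]
    return stories
```

Calling `extract_story_text("brochure.idml")` (a placeholder file name) returns one entry per story, which is a quick way to check that a file contains real text rather than outlined or rasterised type before it goes to translation.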
When receiving IDML files for translation, a common error appears: Trados Studio refuses to open the file with the message:
"The selected IDML file was created by an unsupported version of Adobe InDesign and will not be processed."
The fix is simple and takes thirty seconds. There are two ways depending on whether you want to fix it for one project or for all future projects:
For a specific project: Project Settings → File Types → Adobe InDesign CS4-CC IDML → Common
For all future projects: File → Options → File Types → Adobe InDesign CS4-CC IDML → Common
Trados will now process the IDML regardless of which InDesign version created it. In practice, it works correctly almost every time.
If the document is short (up to fifteen or twenty pages) and the client cannot provide source files, it is often worth rebuilding the InDesign file from scratch rather than working from a converted PDF. This sounds counterintuitive, but the maths work out.
A clean InDesign rebuild takes a few hours. The resulting IDML can be translated cleanly in a CAT tool. Post-translation DTP is minimal. Compare that to the cumulative hours of cleaning up a broken PDF conversion, fixing missegmented CAT output, and doing heavy DTP correction at the end — often repeated across multiple languages.
Under 20 pages with no source files → consider rebuilding in InDesign. Over 20 pages → PDF to Word with thorough cleanup is usually the only practical option.
The cleanest projects are those where the document was designed with translation in mind before anyone opened InDesign. A designer who understands multilingual DTP will:
That last point matters most for multi-language projects. A document going into thirty languages should be designed in InDesign as a clean, style-driven file. The IDML goes to translation, comes back, and the DTP work across all languages is straightforward. The alternative — a PDF converted, broken, cleaned up, translated, and reformatted thirty times — multiplies every problem by thirty.
This is worth emphasising because it is the single most common DTP-introduced problem in translation-ready files. A forced line break (Shift+Enter in InDesign) looks identical to a paragraph break on screen, but behaves completely differently in a CAT tool. The segment boundary falls mid-sentence. The translator sees half a sentence as one segment and the other half as the next. Translation memory cannot match it correctly.
And because the break is invisible unless hidden characters are shown, it often goes unnoticed until the project is already in translation.
Before any IDML file goes to a CAT tool: Type → Show Hidden Characters (Ctrl + Alt + I on Windows, Cmd + Option + I on macOS). Forced line breaks appear as a downward-left arrow. Remove them all.
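The same check can be scripted as a second line of defence. The sketch below works on plain strings of story text and assumes that a forced line break is stored as the Unicode line separator U+2028. That encoding is an assumption worth verifying against a sample export before relying on it. The names `find_forced_breaks` and `remove_forced_breaks` are hypothetical.

```python
import re

# Assumption: a forced line break (Shift+Enter) is stored as U+2028.
FORCED_LINE_BREAK = "\u2028"


def find_forced_breaks(text_runs):
    """Return (index, text) pairs for runs that contain a forced line break."""
    return [
        (i, run) for i, run in enumerate(text_runs) if FORCED_LINE_BREAK in run
    ]


def remove_forced_breaks(text):
    """Replace each forced break, and any spaces around it, with one space."""
    return re.sub(r" *\u2028 *", " ", text)


runs = ["Install the device\u2028before first use.", "No break here."]
for index, run in find_forced_breaks(runs):
    print(index, repr(remove_forced_breaks(run)))
```

Replacing the break with a space, rather than deleting it outright, avoids gluing two words together. Whether a break should instead become a real paragraph break is a layout decision for the DTP operator.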
| Situation | Best approach |
|---|---|
| Client has InDesign source files | Export to IDML, translate in CAT tool directly |
| No source files, short simple document | Rebuild in InDesign, export IDML |
| No source files, long or complex document | PDF to Word with thorough cleanup before CAT |
| New project, will be translated to multiple languages | Design in InDesign from day one, style-driven, no forced breaks |
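The table can also be written as a small decision function, for example as part of a project intake checklist. This is a sketch of the rules of thumb above, not a fixed policy. The function name, the parameter names, and the choice to treat exactly 20 pages as "short" are assumptions made for illustration.

```python
SHORT_DOCUMENT_MAX_PAGES = 20  # assumption: 20 pages still counts as short


def choose_workflow(has_indesign_source, page_count, is_complex=False,
                    new_multilingual_design=False):
    """Return the suggested approach for a project, following the table."""
    if new_multilingual_design:
        return "Design in InDesign from day one, style-driven, no forced breaks"
    if has_indesign_source:
        return "Export to IDML, translate in CAT tool directly"
    if page_count <= SHORT_DOCUMENT_MAX_PAGES and not is_complex:
        return "Rebuild in InDesign, export IDML"
    return "PDF to Word with thorough cleanup before CAT"


print(choose_workflow(has_indesign_source=False, page_count=12))
print(choose_workflow(has_indesign_source=False, page_count=80))
```

Source files are checked before page count because, once they exist, document length no longer changes the answer.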