Why Python to PDF Conversion Breaks Formatting and How to Fix It

Paste a Python script into a generic PDF converter and the output is usually wrong before you reach the second function. Indentation collapses into a flat wall of text, long variable names split across two lines at random points, and tab characters disappear without a trace. The code itself is fine. The problem is that most converters treat source code the same way they treat a paragraph of prose — proportional fonts, no fixed column width, no awareness of syntactic structure. Understanding exactly where the rendering breaks down is the only way to get a PDF that actually looks like Python.

Why Python indentation changes how the PDF looks

Python is one of the few languages where whitespace is structural rather than cosmetic. The interpreter accepts any indentation width, but every line in a block must be indented consistently, so an indent that collapses in print shows code that would not run as displayed. When a PDF renderer uses a proportional font like Arial or Georgia, each character occupies a different horizontal width. A lowercase i takes less space than a capital W, and the space character is among the narrowest glyphs, so a run of leading spaces no longer maps to a predictable column position. Nested blocks end up looking barely indented relative to the code above them, which makes the visual hierarchy of your script hard to read.

Monospaced fonts fix this entirely. Every character occupies the same horizontal slot, so four spaces always equals four characters of indentation, regardless of what follows. A properly configured Python-to-PDF tool will default to a monospaced font — Courier, Inconsolata, or a similar fixed-width face — and will never substitute a proportional alternative. If your PDF output shows collapsed indentation, the first thing to check is whether the font being applied is genuinely monospaced.
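The width arithmetic behind this can be sketched in a few lines. The proportional widths below approximate Helvetica-style metrics (in thousandths of an em) and are illustrative only; the function name and the three-character table are assumptions made for this example, not part of any real converter.

```python
# Glyph advance widths in 1/1000 em.
# Courier advances 600 units for every glyph; the proportional values
# approximate Helvetica-style metrics and are illustrative only.
MONOSPACE_WIDTH = 600
PROPORTIONAL_WIDTHS = {" ": 278, "i": 222, "W": 944}


def text_width_pt(text, font_size_pt, widths=None):
    """Return the rendered width of `text` in points.

    `widths` maps characters to advance widths; None means monospaced.
    """
    if widths is None:
        units = MONOSPACE_WIDTH * len(text)
    else:
        units = sum(widths[ch] for ch in text)
    return units * font_size_pt / 1000


if __name__ == "__main__":
    for sample in ("    ", "iiii", "WWWW"):
        mono = text_width_pt(sample, 10)
        prop = text_width_pt(sample, 10, PROPORTIONAL_WIDTHS)
        print(repr(sample), "monospaced:", mono, "proportional:", prop)
```

At 10 pt, all three samples are 24 points wide in the monospaced case. With the illustrative proportional widths, four spaces come to roughly 11 points while four capital W characters come to nearly 38, which is why an indent stops lining up with the text above it.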

What happens to long lines and comments

PEP 8, the standard Python style guide, recommends limiting lines to 79 characters. Real-world code rarely follows that rule consistently. When a line exceeds the printable column width, a PDF renderer has two options: truncate the overflow or wrap it. Truncation silently deletes code. Naive wrapping can break the line mid-token, which is visually confusing and can make a chained method call or a long string literal look like two separate statements.

Inline comments are especially vulnerable. A line of code that sits at 75 characters followed by a # clarifying note will frequently push the total past the page margin. The comment either wraps onto the next line — visually detached from the code it describes — or it gets cut off entirely. Neither outcome is acceptable for a code-review printout or a classroom handout where the comment is part of the teaching point. The correct fix is to either reduce the base font size slightly or configure the renderer to use a narrower page margin, giving longer lines room to breathe without wrapping.
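To see how much room a font size or margin change actually buys, the column budget can be estimated before converting. This is a rough sketch that assumes a Courier-like font whose glyphs are 600/1000 of an em wide; the function names, the 4-column tab expansion, and the US Letter dimensions in the usage note are assumptions chosen for illustration.

```python
COURIER_ADVANCE = 600  # glyph advance in 1/1000 em (Courier-like fonts)


def max_columns(page_width_pt, margin_pt, font_size_pt):
    """Estimate how many monospaced characters fit on one line."""
    usable_pt = page_width_pt - 2 * margin_pt
    return int((usable_pt * 1000) // (font_size_pt * COURIER_ADVANCE))


def overlong_lines(source, limit, tabsize=4):
    """Return (line_number, length) for every line longer than `limit`."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        length = len(line.expandtabs(tabsize))
        if length > limit:
            flagged.append((number, length))
    return flagged


if __name__ == "__main__":
    # US Letter is 612 pt wide; 72 pt is a one-inch margin.
    print(max_columns(612, 72, 10))  # default layout
    print(max_columns(612, 72, 9))   # slightly smaller font
    print(max_columns(612, 36, 10))  # half-inch margins
```

Under these assumptions a one-inch margin at 10 pt leaves 78 columns, dropping to 9 pt gives 86, and half-inch margins at 10 pt give 90. Running overlong_lines against the chosen limit shows which lines would still wrap or truncate.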

When tabs, spaces, and Unicode cause problems

Python 3 rejects indentation that mixes tabs and spaces inconsistently, raising a TabError when the file is compiled, before any code runs. Many scripts inherited from Python 2 codebases still contain both. A PDF renderer that converts tabs to spaces using its own tab width will silently shift the indentation of any tab-indented block, making it appear at a different depth than it actually occupies in the source. The error is not obvious on the page, because the code still looks plausible, but it misrepresents the actual structure of the script.

Unicode is a separate category of problem. Copy-pasting code from a web page, a Google Doc, or a Slack message frequently introduces characters that look identical to standard ASCII but are not. Smart quotes replace straight quotes in string literals. Em dashes replace double hyphens in comments. Non-breaking spaces replace regular spaces in indentation. When the PDF renderer encounters these characters and its font or encoding cannot represent them, the output shows a replacement glyph or a blank box, or in the worst case the conversion throws an encoding error and produces a corrupted file. The only reliable fix is to sanitise the source before conversion, replacing smart quotes with straight quotes and replacing non-breaking spaces in indentation with regular spaces. Deleting those spaces outright would collapse the indent they were standing in for.
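A sanitisation pass of this kind can be sketched with the standard library alone. The replacement table below covers only the characters named above and is an assumption about what a given script might contain; the function names are hypothetical.

```python
# Look-alike characters mapped to their ASCII equivalents.
REPLACEMENTS = {
    "\u2018": "'",   # left single smart quote
    "\u2019": "'",   # right single smart quote
    "\u201c": '"',   # left double smart quote
    "\u201d": '"',   # right double smart quote
    "\u2014": "--",  # em dash
    "\u00a0": " ",   # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)


def sanitise(source):
    """Replace common look-alike characters with ASCII equivalents."""
    return source.translate(_TABLE)


def remaining_non_ascii(source):
    """Return (line_number, column, character) for each non-ASCII character."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        for column, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                found.append((number, column, ch))
    return found


if __name__ == "__main__":
    pasted = "\u00a0\u00a0print(\u201chello\u201d)  # greet \u2014 once"
    clean = sanitise(pasted)
    print(clean)
    print(remaining_non_ascii(clean))
```

Because the table is applied to the whole file, it also rewrites smart quotes that were typed on purpose inside string literals. Reviewing the output of remaining_non_ascii, rather than deleting everything it reports, keeps legitimate non-ASCII text intact.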

Tab-only indentation is safer than mixed indentation, but it still requires the renderer to define a tab stop width. If the tool assumes a tab equals eight spaces rather than four, every nested block will appear twice as deep as intended.
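The effect of the tab stop width is easy to check with str.expandtabs, which is what the sketch below relies on. The helper names are hypothetical, and the mixed-indentation check only looks for tabs and spaces within the same line's leading whitespace.

```python
def indent_depths(source, tabsize):
    """Return the leading-whitespace width of each line, in columns,
    after expanding tabs to the given tab stop width."""
    depths = []
    for line in source.splitlines():
        expanded = line.expandtabs(tabsize)
        depths.append(len(expanded) - len(expanded.lstrip(" ")))
    return depths


def has_mixed_indentation(source):
    """Return True if any line's leading whitespace contains both
    a tab and a space."""
    for line in source.splitlines():
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in leading and "\t" in leading:
            return True
    return False


if __name__ == "__main__":
    script = "if ready:\n\tfor item in queue:\n\t\tprint(item)\n"
    print(indent_depths(script, 4))  # tab stops every 4 columns
    print(indent_depths(script, 8))  # tab stops every 8 columns
    print(has_mixed_indentation(script))
```

For this tab-indented script the depths are 0, 4, 8 with four-column tab stops and 0, 8, 16 with eight-column stops. Expanding tabs explicitly before conversion, with a width you choose, removes the renderer's assumption from the equation.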

Best use cases: classroom, code review, and archiving

A teacher preparing a printed handout for a Python lesson needs the indentation to be exact. A student reading a for loop on paper needs to see clearly which lines belong inside the loop body and which do not. A PDF where the indentation has collapsed is not just ugly — it actively teaches the wrong thing about how Python works.

For code review, the requirement is slightly different. Reviewers need line numbers, readable inline comments, and enough page width to see full function signatures without wrapping. A script archiving workflow — saving a finished project as a documented PDF for a client or a compliance record — needs consistent font rendering across all files in the archive, not just the short ones.

The Python to PDF converter at AixKit is built specifically for source code output, applying monospaced rendering and preserving indentation structure rather than treating the input as generic text. For any of these three scenarios, the difference between a code-aware tool and a generic document converter is visible in the first ten lines of output.

Classroom handouts: Indentation accuracy is non-negotiable. Verify the font is monospaced before printing a class set.
Code review printouts: Enable line numbers if the tool supports them. Reviewers need to reference specific lines in discussion.
Script archiving: Run a Unicode sanitisation pass on any script that was edited outside a code editor before converting to PDF.
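If the converter in use has no line-number option, the numbers can be added to the text before conversion. This is a minimal sketch with a hypothetical function name and an arbitrary " | " separator; it assumes the numbered text will still be rendered in a monospaced font so the gutter stays aligned.

```python
def add_line_numbers(source):
    """Prefix each line with a right-aligned line number and a separator."""
    lines = source.splitlines()
    width = len(str(len(lines)))  # digits needed for the last line number
    numbered = [
        f"{number:>{width}} | {line}"
        for number, line in enumerate(lines, start=1)
    ]
    return "\n".join(numbered)


if __name__ == "__main__":
    script = "for n in range(3):\n    print(n)\nprint('done')"
    print(add_line_numbers(script))
```

The gutter consumes the number width plus three characters on every line, so it is worth subtracting that from the available column count when checking for lines that will wrap.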