How Does a PDF *Actually* Work?

An Interactive Explainer

It's Not Just "Digital Paper"

You probably think of a PDF as a static document, like a digital piece of paper or an image. But it's much more complex and powerful than that.

A PDF is closer to a small database of objects plus a set of drawing instructions. Each page's content is written in a compact page-description language derived from PostScript, which tells the PDF reader (such as Acrobat) *exactly* where to place every character, line, and image.

Let's break down the main components.

Section 1: The Building Blocks (Objects)

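A PDF file is structured around "objects." An object is just a piece of data. The main types are booleans, numbers, strings such as (Hello, PDF!), names such as /Type, arrays such as [0 0 612 792], dictionaries written as << /Key value >>, streams (a dictionary followed by raw bytes), and null. Objects can also point to one another through indirect references such as 3 0 R, which means "object number 3, generation 0."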
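As a rough illustration of how these object types look when written out, here is a minimal Python sketch that maps Python values onto PDF object syntax. The `Name` and `Ref` classes and the `to_pdf` function are hypothetical helpers invented for this example, not part of any PDF library. Streams, hex strings, and number-formatting edge cases (such as very small floats) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A PDF name object, written with a leading slash, e.g. /Page."""
    value: str


@dataclass(frozen=True)
class Ref:
    """An indirect reference to another object, e.g. 3 0 R."""
    number: int
    generation: int = 0


def to_pdf(obj) -> str:
    """Serialize a Python value into (simplified) PDF object syntax."""
    if obj is None:
        return "null"
    if isinstance(obj, bool):            # checked before int: bool is an int subclass
        return "true" if obj else "false"
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, str):             # literal string: escape \ ( and )
        escaped = obj.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return f"({escaped})"
    if isinstance(obj, Name):
        return f"/{obj.value}"
    if isinstance(obj, Ref):
        return f"{obj.number} {obj.generation} R"
    if isinstance(obj, (list, tuple)):   # array
        return "[" + " ".join(to_pdf(item) for item in obj) + "]"
    if isinstance(obj, dict):            # dictionary: keys are always names
        items = " ".join(f"/{key} {to_pdf(value)}" for key, value in obj.items())
        return f"<< {items} >>"
    raise TypeError(f"unsupported type: {type(obj).__name__}")


page = {
    "Type": Name("Page"),
    "Parent": Ref(2),
    "MediaBox": [0, 0, 612, 792],
    "Contents": Ref(3),
}
print(to_pdf(page))
# << /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 3 0 R >>
```

The printed dictionary has the same shape as a page object: names, an array of numbers, and references to other objects.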

Section 2: The File Structure (Abridged)

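A PDF file has four main parts: a header (the %PDF-1.7 line), a body of numbered objects (each N 0 obj ... endobj block), a cross-reference table (xref) that records the byte offset of every object, and a trailer that names the root object and, through startxref, gives the byte offset of the cross-reference table itself. The example below is abridged, and its byte offsets are illustrative rather than exact. A complete file would also define the document catalog that /Root must point to, the page tree node referenced by /Parent, and the font /F1 used in the content stream.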

%PDF-1.7
%... binary characters ...

1 0 obj
  << /Type /Page
     /Parent 2 0 R
     /MediaBox [0 0 612 792]
     /Contents 3 0 R
     /Resources << /XObject << /Img1 4 0 R >> >>
  >>
endobj

3 0 obj
  << /Length 44 >>
stream
  BT
  /F1 12 Tf
  72 700 Td
  (Hello, PDF!) Tj
  ET
endstream
endobj

4 0 obj
  << /Type /XObject
     /Subtype /Image
     /Width 200
     /Height 150
     /ColorSpace /DeviceRGB
     /BitsPerComponent 8
     /Filter /DCTDecode
     /Length 12345
  >>
stream
  % ... actual compressed JPEG data ...
endstream
endobj

xref
0 5
0000000000 65535 f
0000000018 00000 n
0000000000 65535 f
0000000115 00000 n
0000000200 00000 n

trailer
<< /Size 5 /Root 1 0 R >>
startxref
230
%%EOF
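The cross-reference table is easier to understand once you compute one. The Python sketch below assembles a tiny one-page file and records each object's real byte offset as it goes. `build_pdf` is a hypothetical helper written for this explainer. The object numbering is the sketch's own (the catalog is object 1 and the content stream is object 4), so it differs from the abridged example above, and it fills in the catalog, page tree, and font that the abridged example leaves out.

```python
def build_pdf(objects: list[bytes]) -> bytes:
    """Assemble header, body, xref table and trailer.

    objects[i] is the body of object number i + 1; objects[0] is
    assumed to be the document catalog that /Root points to.
    """
    out = bytearray(b"%PDF-1.7\n")                      # 1. header
    offsets = []
    for number, body in enumerate(objects, start=1):    # 2. body
        offsets.append(len(out))                        # where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)                               # 3. cross-reference table
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"                     # entry 0 is always free
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset             # every entry is 20 bytes

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)  # 4. trailer
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


content = b"BT /F1 12 Tf 72 700 Td (Hello, PDF!) Tj ET"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
    b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
    b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
]
pdf = build_pdf(objects)

# The number after "startxref" is the byte offset of the xref keyword.
xref_at = int(pdf.split(b"startxref\n")[1].split(b"\n")[0])
print(pdf[xref_at:xref_at + 4])   # b'xref'
```

Writing `pdf` to a file in binary mode should give a document that a typical viewer can open. The point of the exercise is the bookkeeping: a reader jumps to the end of the file, follows startxref to the table, and from there can seek straight to any object without scanning the whole file.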
                

Section 3: The "Drawing" Instructions

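This is the fun part. A PDF doesn't store "Hello, PDF!" as you see it. It stores *instructions* on how to draw it, in a "content stream" like object 3 above. Each instruction is written with its operands first and the operator last. Reading that stream top to bottom: BT begins a text object, /F1 12 Tf selects the font named F1 at size 12, 72 700 Td moves the text position to (72, 700), (Hello, PDF!) Tj draws the string, and ET ends the text object. Coordinates are measured in points (1/72 inch) from the bottom-left corner of the page, so the 612 x 792 MediaBox describes an 8.5" x 11" page, and the text starts one inch from the left edge and about 1.3 inches below the top.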
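To make the idea concrete, here is a small Python sketch of an interpreter for the five operators used in object 3. It tokenizes the stream, collects operands until it meets an operator, and writes a command log. The `run` function and its log format are made up for this explainer. A real reader handles many more operators, string escapes, arrays, and the way each Tj advances the text position glyph by glyph, all of which this sketch ignores.

```python
import re

# One token is a (literal string), a /Name, a number, or an operator keyword.
# Arrays, dictionaries and hex strings are silently skipped by this pattern.
TOKEN = re.compile(r"\((?:\\.|[^\\()])*\)|/[^\s/()\[\]<>]+|[-+]?\d*\.?\d+|[A-Za-z*']+")


def run(stream: str) -> list[str]:
    """Interpret a handful of text operators and return a command log."""
    log: list[str] = []
    operands: list[str] = []
    font, size = "?", 0.0
    x = y = 0.0
    for token in TOKEN.findall(stream):
        if token[0] in "(/+-.0123456789":
            operands.append(token)       # operands come before their operator
            continue
        if token == "BT":
            x = y = 0.0                  # text position resets at BT
            log.append("BT: begin text object")
        elif token == "Tf":
            font, size = operands[0][1:], float(operands[1])
            log.append(f"Tf: select font {font} at size {size:g}")
        elif token == "Td":              # offset relative to the current line start
            x, y = x + float(operands[0]), y + float(operands[1])
            log.append(f"Td: move text position to ({x:g}, {y:g})")
        elif token == "Tj":
            text = operands[0][1:-1]     # strip parentheses; escapes not decoded
            log.append(f"Tj: draw '{text}' at ({x:g}, {y:g}) in {font} size {size:g}")
        elif token == "ET":
            log.append("ET: end text object")
        else:
            log.append(f"{token}: not handled in this sketch")
        operands.clear()
    return log


for line in run("BT\n/F1 12 Tf\n72 700 Td\n(Hello, PDF!) Tj\nET"):
    print(line)
# BT: begin text object
# Tf: select font F1 at size 12
# Td: move text position to (72, 700)
# Tj: draw 'Hello, PDF!' at (72, 700) in F1 size 12
# ET: end text object
```

Changing the operands of Td or the string passed to Tj and rerunning shows how directly the stream controls what ends up on the page.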
