It's Not Just a "Digital Paper"
You probably think of a PDF as a static document, like a digital piece of paper or an image. But it's much more complex and powerful than that.
A PDF is a database of objects plus a set of drawing instructions. Those instructions are written in a small page-description language, descended from PostScript, that tells the PDF reader (like Acrobat) *exactly* where to place every character, line, and image.
Let's break down the main components.
Section 1: The Building Blocks (Objects)
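A PDF file is structured around "objects." An object is just a piece of data. The main types are numbers, strings, names (such as /Type), arrays, dictionaries, and streams. Objects point to one another with indirect references such as "3 0 R".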
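To make the object types concrete, here is a minimal Python sketch that writes ordinary Python values in PDF object syntax. The names Name, Ref, and to_pdf are hypothetical helpers invented for this illustration, not part of any PDF library, and streams are left out to keep it short.

```python
class Name(str):
    """Hypothetical marker type for PDF names such as /Page."""


class Ref:
    """Hypothetical indirect reference, written as 'N G R'."""

    def __init__(self, num, gen=0):
        self.num, self.gen = num, gen


def to_pdf(value):
    """Serialize a Python value using (simplified) PDF object syntax."""
    if value is None:
        return "null"
    if isinstance(value, bool):          # bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, Name):          # Name first: Name is a subclass of str
        return "/" + value
    if isinstance(value, (int, float)):  # simplification: exponent-form floats are not handled
        return str(value)
    if isinstance(value, str):
        # Literal strings sit in parentheses; escape backslashes and parentheses.
        escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + escaped + ")"
    if isinstance(value, Ref):
        return f"{value.num} {value.gen} R"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_pdf(v) for v in value) + "]"
    if isinstance(value, dict):
        # Dictionary keys are always names.
        items = " ".join(f"/{k} {to_pdf(v)}" for k, v in value.items())
        return "<< " + items + " >>"
    raise TypeError(f"no PDF representation for {type(value).__name__}")


page = {
    "Type": Name("Page"),
    "Parent": Ref(2),
    "MediaBox": [0, 0, 612, 792],
    "Contents": Ref(3),
}
print(to_pdf(page))
# prints: << /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 3 0 R >>
```

The printed dictionary has the same shape as a page object in a real file: names as keys, an array for the page size, and references to other objects.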
Section 2: The File Structure (Abridged)
A PDF file has four main parts: a header, a body of objects, a cross-reference (xref) table, and a trailer. The abridged example below shows each in turn.
    %PDF-1.7
    %... binary characters ...

    1 0 obj
    << /Type /Page
       /Parent 2 0 R
       /MediaBox [0 0 612 792]
       /Contents 3 0 R
       /Resources << /XObject << /Img1 4 0 R >> >>
    >>
    endobj

    3 0 obj
    << /Length 44 >>
    stream
    BT
    /F1 12 Tf
    72 700 Td
    (Hello, PDF!) Tj
    ET
    endstream
    endobj

    4 0 obj
    << /Type /XObject
       /Subtype /Image
       /Width 200
       /Height 150
       /ColorSpace /DeviceRGB
       /BitsPerComponent 8
       /Filter /DCTDecode
       /Length 12345
    >>
    stream
    % ... actual compressed JPEG data ...
    endstream
    endobj

    xref
    0 5
    0000000000 65535 f
    0000000018 00000 n
    0000000000 65535 f
    0000000115 00000 n
    0000000200 00000 n

    trailer
    << /Size 5 /Root 1 0 R >>
    startxref
    230
    %%EOF
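The offsets in the xref table are byte positions counted from the start of the file, which is easier to see when a program computes them. The Python sketch below assembles the four parts for a one-page file. The function build_pdf is a hypothetical helper written for this illustration. Unlike the abridged example, it includes the Catalog and Pages objects and a font entry that a complete file needs.

```python
def build_pdf(objects):
    """Assemble header, body, xref table, and trailer.

    objects[i] holds the raw bytes of object number i + 1.
    """
    out = bytearray(b"%PDF-1.7\n")                    # 1. header
    offsets = []
    for num, body in enumerate(objects, start=1):     # 2. body
        offsets.append(len(out))                      # byte offset of "N 0 obj"
        out += f"{num} 0 obj\n".encode("ascii") + body + b"\nendobj\n"

    xref_pos = len(out)                               # 3. cross-reference table
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"                   # entry 0 heads the free list
    for off in offsets:
        out += f"{off:010d} 00000 n \n".encode("ascii")  # 20 bytes per entry

    # 4. trailer: object count, the root object, and where the xref table starts
    out += f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n".encode("ascii")
    out += f"startxref\n{xref_pos}\n%%EOF\n".encode("ascii")
    return bytes(out)


stream = b"BT /F1 12 Tf 72 700 Td (Hello, PDF!) Tj ET"
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R"
    b" /Resources << /Font << /F1 << /Type /Font /Subtype /Type1"
    b" /BaseFont /Helvetica >> >> >> >>",
    b"<< /Length %d >>\nstream\n%s\nendstream" % (len(stream), stream),
]
pdf = build_pdf(objects)

# Writing the bytes to disk is optional; uncomment to inspect the result.
# with open("hello.pdf", "wb") as f:
#     f.write(pdf)
```

A reader works through the file in the opposite order: it starts at the end, follows startxref to the xref table, and uses the table to jump straight to any object without scanning the whole file.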
Section 3: The "Drawing" Instructions
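This is the fun part. A PDF doesn't store "Hello World" as you see it. It stores *instructions* on how to draw it. Those instructions live in a "content stream" (like object 3 from above), where each operator comes after its operands: "/F1 12 Tf" selects font F1 at 12 points, "72 700 Td" moves the text position, and "(Hello, PDF!) Tj" draws the string.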
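To see how a reader might act on these instructions, here is a minimal Python sketch of an interpreter for a handful of text operators (BT, Tf, Td, Tj, ET). The function run_content_stream and its tokenizer are hypothetical, written only for this illustration. A real reader handles many more operators, string escapes, and a full text matrix. Coordinates are in points (1/72 inch) measured from the bottom-left corner of the page.

```python
import re

NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)"
# Token order matters: strings, then names, then numbers, then operators.
TOKEN = re.compile(
    r"\((?:\\.|[^\\()])*\)|/[^\s()<>\[\]/]+|" + NUMBER + r"|[A-Za-z'\"*]+"
)


def run_content_stream(source):
    """Interpret a tiny subset of text operators.

    Returns a list of (x, y, font, size, text) tuples, one per Tj.
    """
    operands = []            # operands pile up until an operator consumes them
    drawn = []
    font, size = None, 0.0
    x = y = 0.0
    for tok in TOKEN.findall(source):
        if tok[0] == "(":
            operands.append(tok[1:-1])       # string operand, escapes left as-is
        elif tok[0] == "/":
            operands.append(tok[1:])         # name operand
        elif re.fullmatch(NUMBER, tok):
            operands.append(float(tok))
        else:                                # anything else is an operator
            if tok == "BT":                  # begin text: reset the position
                x = y = 0.0
            elif tok == "Tf":                # set font and size
                font, size = operands[-2], operands[-1]
            elif tok == "Td":                # move the text position by (tx, ty)
                x, y = x + operands[-2], y + operands[-1]
            elif tok == "Tj":                # show a string at the current position
                drawn.append((x, y, font, size, operands[-1]))
            operands.clear()                 # unknown operators are skipped
    return drawn


for item in run_content_stream("BT /F1 12 Tf 72 700 Td (Hello, PDF!) Tj ET"):
    print(item)
# prints: (72.0, 700.0, 'F1', 12.0, 'Hello, PDF!')
```

Changing the numbers before Td moves the text, and changing the number before Tf resizes it, which is all a content stream is: a list of operands followed by the operator that uses them.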