The PDF library used by the Chromium project

PDFium

Prerequisites

PDFium uses the same build tooling as Chromium. See the platform-specific Chromium build instructions to get started, but replace Chromium's "Get the code" instructions with PDFium's.

CPU Architectures supported

The default architecture for Windows, Linux, and Mac is "x64". On Windows, "x86" is also supported. The GN argument target_cpu = "x86" can be used to override the default value. If you specify an Android build, the default CPU architecture will be "arm".

Some places in the code are still expected not to function properly on big-endian architectures. Bug reports and patches are welcome; however, providing this support is not a priority at this time.

Google employees

Run: download_from_google_storage --config and follow the authentication instructions. Note that you must authenticate with your @google.com credentials. Enter "0" if asked for a project-id.

Once you've done this, the toolchain will be installed automatically for you in the Generate the build files step below.

The toolchain will be in depot_tools\win_toolchain\vs_files\<hash>, and windbg can be found in depot_tools\win_toolchain\vs_files\<hash>\win_sdk\Debuggers.

If you want the IDE for debugging and editing, you will need to install it separately, but this is optional and not needed for building PDFium.

Get the code

The name of the top-level directory does not matter. In the following example, the directory name is "repo". This directory must not have been used before by gclient config as each directory can only house a single gclient configuration.

mkdir repo
cd repo
gclient config --unmanaged https://pdfium.googlesource.com/pdfium.git
gclient sync
cd pdfium

On Linux, additional build dependencies need to be installed by running the following from the pdfium directory.

./build/install-build-deps.sh

Generate the build files

PDFium uses GN to generate the build files and Ninja to execute the build files. Both of these are included with the depot_tools checkout.

Selecting build configuration

PDFium may be built either with or without JavaScript support, and with or without XFA forms support. Both of these features are enabled by default. Also note that the XFA feature requires JavaScript.

Configuration is done by executing gn args <directory> to configure the build. This will launch an editor in which you can set the following arguments. By convention, <directory> should be named out/foo, and some tools / test support code only works if one follows this convention. A typical <directory> name is out/Debug.

use_goma = false  # Googlers only. Ensure goma is installed and running first.
is_debug = true  # Enable debugging features.

# Set true to enable experimental Skia backend.
pdf_use_skia = false

pdf_enable_xfa = true  # Set false to remove XFA support (implies JS support).
pdf_enable_v8 = true  # Set false to remove JavaScript support.
pdf_is_standalone = true  # Set for a non-embedded build.
is_component_build = false  # Disable component build (though it should work).

For sample applications like pdfium_test to build, one must set pdf_is_standalone = true.

By default, the entire project builds with C++17.

By default, PDFium expects to build with a clang compiler that provides additional chrome plugins. To build against a vanilla one lacking these, one must set clang_use_chrome_plugins = false.

When complete, the arguments will be stored in <directory>/args.gn, and GN will automatically use the new arguments to generate build files. Should your files fail to generate, please double-check that you have set use_sysroot as indicated above.
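Because the arguments end up in a plain text file, the same configuration can also be written without the interactive editor. The following is a minimal sketch, assuming a POSIX shell run from the pdfium directory and out/Debug as the build directory. The values simply mirror the arguments listed above, and the gn gen step is left as a comment because it needs depot_tools on the PATH.

```shell
# Sketch: write args.gn directly instead of editing it through `gn args`.
# out/Debug is an assumed directory name following the out/foo convention.
out_dir="out/Debug"
mkdir -p "$out_dir"

cat > "$out_dir/args.gn" <<'ARGS'
is_debug = true
pdf_use_skia = false
pdf_enable_xfa = true
pdf_enable_v8 = true
pdf_is_standalone = true
is_component_build = false
ARGS

# With depot_tools on the PATH, regenerate the build files from the saved
# arguments by running:
#   gn gen out/Debug
```

Keeping the file under version control outside the out/ directory is one way to reproduce a build configuration on another machine.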

Building the code

You can build the sample program by running:

ninja -C <directory> pdfium_test

You can build the entire product (which includes a few unit tests) by running:

ninja -C <directory> pdfium_all

Running the sample program

The pdfium_test program supports reading, parsing, and rasterizing the pages of a .pdf file to .ppm or .png output image files (Windows supports two other formats). For example: <directory>/pdfium_test --ppm path/to/myfile.pdf. Note that this will write output images to path/to/myfile.pdf.<n>.ppm. Run pdfium_test --help to see all the options.

Testing

There are currently several test suites that can be run:

  • pdfium_unittests
  • pdfium_embeddertests
  • testing/tools/run_corpus_tests.py
  • testing/tools/run_javascript_tests.py
  • testing/tools/run_pixel_tests.py

The tests in the testing directory may fail due to font differences between platforms. These tests are reliable on the bots. If you see failures, it can be a good idea to run the tests on a tip-of-tree checkout to see whether the same failures appear.
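As a rough sketch, the two compiled suites can be run straight from the build directory once pdfium_all has been built. The helper name run_suite and the out/Debug directory are assumptions made for illustration. The helper reports a missing binary instead of failing, so it is safe to run before the build has finished.

```shell
# Hypothetical helper: run a compiled test suite from a build directory,
# or report that it has not been built yet.
run_suite() {
  build_dir="$1"
  suite="$2"
  if [ -x "$build_dir/$suite" ]; then
    "$build_dir/$suite"
  else
    echo "skipping $suite: not found in $build_dir"
  fi
}

# out/Debug is an assumed build directory.
run_suite out/Debug pdfium_unittests
run_suite out/Debug pdfium_embeddertests
```

The Python-based runners under testing/tools are separate scripts and are not covered by this helper.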

Pixel Tests

If your change affects rendering, a pixel test should be added. Simply add a .in or .pdf file in testing/resources/pixel and the pixel runner will pick it up at the next run.

Make sure that your test case doesn't have any copyright issues. It should also be a minimal test case that focuses on the bug and renders the same way in many PDF viewers. Try to avoid binary data in streams by using the ASCIIHexDecode filter, simply because it makes the PDF more readable in a text editor.

To try out your new test, you can call the run_pixel_tests.py script:

$ ./testing/tools/run_pixel_tests.py your_new_file.in

To generate the expected image, you can use the make_expected.sh script:

$ ./testing/tools/make_expected.sh your_new_file.pdf

Please make sure to have optipng installed, which optimizes the file size of the resulting PNG.

.in files

.in files are PDF template files. PDF files contain many byte offsets that have to be kept correct or the file won't be valid. The template makes this easier by replacing the byte offsets with certain keywords.

This saves space and also allows an easy way to reduce the test case to the essentials as you can simply remove everything that is not necessary.

A simple example can be found here.

To transform this into a PDF, you can use the fixup_pdf_template.py tool:

$ ./testing/tools/fixup_pdf_template.py your_file.in

This will create a your_file.pdf in the same directory as your_file.in.
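For orientation, here is a hedged sketch of what a tiny template might look like and how it would be expanded. The file name blank_page.in is hypothetical, and the {{...}} placeholders are assumed keywords that should be checked against the docstring in testing/tools/fixup_pdf_template.py before relying on them.

```shell
# Sketch: create a minimal template (blank_page.in is a hypothetical name).
# The {{...}} placeholders are assumed keywords standing in for the header,
# object declarations, and byte offsets that the tool fills in.
cat > blank_page.in <<'TEMPLATE'
{{header}}
{{object 1 0}} <<
  /Type /Catalog
  /Pages 2 0 R
>>
endobj
{{object 2 0}} <<
  /Type /Pages
  /Count 1
  /Kids [3 0 R]
>>
endobj
{{object 3 0}} <<
  /Type /Page
  /Parent 2 0 R
  /MediaBox [0 0 200 200]
>>
endobj
{{xref}}
{{trailer}}
{{startxref}}
%%EOF
TEMPLATE

# Expand the template into blank_page.pdf when run from a PDFium checkout.
if [ -x ./testing/tools/fixup_pdf_template.py ]; then
  ./testing/tools/fixup_pdf_template.py blank_page.in
fi
```

Note that the template contains no numeric offsets at all, which is why objects can be added or removed freely while reducing a test case.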

There is no official style guide for the .in file, but a consistent style is preferred simply to help with readability. If possible, object numbers should be consecutive and /Type and /SubType should be on top of a dictionary to make object identification easier.

Embedding PDFium in your own projects

The public/ directory contains header files for the APIs available for use by embedders of PDFium. The PDFium project endeavors to keep these as stable as possible.

Outside of the public/ directory, code may change at any time, and embedders should not directly call these routines.

Code Coverage

Code coverage reports for PDFium can be generated in Linux development environments. Details can be found here.

Chromium provides code coverage reports for PDFium here. PDFium is located in third_party/pdfium in Chromium's source code. This includes code coverage from PDFium's fuzzers.

Waterfall

The current health of the source tree can be found here.

Community

Several mailing lists are set up for the project.

Note that the Reviews and Bugs lists are typically read-only.

Bugs

PDFium uses this bug tracker, but for security bugs, please use Chromium's security bug template and add the "Cr-Internals-Plugins-PDF" label.

Contributing code

See the CONTRIBUTING document for more information on contributing to the PDFium project.
