

PDF file size optimizer

README for pdfsizeopt

pdfsizeopt is a program for converting large PDF files to small ones, without decreasing visual quality or removing interactive features (such as hyperlinks). More specifically, pdfsizeopt is a free, cross-platform command-line application (for Linux, Windows, macOS and Unix) and a collection of best practices to optimize the size of PDF files, with focus on PDFs created from TeX and LaTeX documents. pdfsizeopt is written in Python, so it is a bit slow, but it offloads some of the heavy work to its faster (C, C++ and Java) dependencies.

Doesn't pdfsizeopt work with your PDF? Report the issue here: https://github.com/pts/pdfsizeopt/issues

Send donations to the author of pdfsizeopt: https://flattr.com/submit/auto?user_id=pts&url=https://github.com/pts/pdfsizeopt

Getting started: how to run pdfsizeopt

If it is your first time trying pdfsizeopt, follow these instructions. (This section was updated on 2023-02-15.)

It's easy to install and run pdfsizeopt on modern Linux and Windows systems with an x86 processor. If you have such a system, jump directly to one of the following sections (Installation instructions and usage on Linux or Installation instructions and usage on Windows). It will take less than 5 minutes.

It's easy to install and run pdfsizeopt on a Mac (both Intel x86 processors and ARM processors with Apple Silicon are supported). If you have such a system, jump directly to the section Installation instructions and usage on macOS (not using Docker). It will take less than 5 minutes.

Alternatively (but not recommended because it's slower), it's possible to run pdfsizeopt within Docker on the following systems: Linux amd64, macOS 64-bit Intel x86 (amd64, x86_64), macOS 64-bit ARM (Apple Silicon, e.g. M1 or M2 chip). After that, jump directly to the section Installation instructions and usage with Docker on Linux and macOS. That last step will take less than 5 minutes.

If you are using an operating system other than Linux, Windows or macOS (on a computer with Intel processor), the easiest way to try pdfsizeopt is to borrow a friend's computer with Linux, Windows or macOS, or to rent a Linux VM in the cloud. It's difficult to run pdfsizeopt on other kinds of systems because some of its required dependencies are old versions (e.g. Python 2.4--2.7, Ghostscript 9.05), so you'll have to compile the right versions of the dependencies first, which may take several hours and lots of frustrating trial and error even for experienced hackers.

It's technically possible to port pdfsizeopt to other systems (and make it easy to install), but the author of pdfsizeopt doesn't have the free time to create and maintain such a port. As an FYI, see #154 about porting to Apple Silicon.

Installation instructions and usage on Linux

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

To install pdfsizeopt on a Linux system (with architecture i386 or amd64), open a terminal window and run these commands (without the leading $):

  $ mkdir ~/pdfsizeopt
  $ cd ~/pdfsizeopt
  $ wget -O pdfsizeopt_libexec_linux.tar.gz https://github.com/pts/pdfsizeopt/releases/download/2023-04-18/pdfsizeopt_libexec_linux-v9.tar.gz
  $ tar xzvf pdfsizeopt_libexec_linux.tar.gz
  $ rm -f    pdfsizeopt_libexec_linux.tar.gz
  $ wget -O pdfsizeopt.single https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single
  $ chmod +x pdfsizeopt.single
  $ ln -s pdfsizeopt.single pdfsizeopt

To optimize a PDF, run the following command:

  ~/pdfsizeopt/pdfsizeopt input.pdf output.pdf

If the input PDF has many images or large images, pdfsizeopt can be very slow. You can speed it up by disabling pngout, the slowest image optimization method, like this:

  ~/pdfsizeopt/pdfsizeopt --use-pngout=no input.pdf output.pdf

pdfsizeopt creates lots of temporary files (psotmp.*) in the output directory, but it also cleans up after itself.

It's possible to optimize a PDF outside the current directory. To do that, specify the pathname (including the directory name) in the command-line.
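If you need to optimize many PDFs, the command lines above can be generated by a small wrapper. The following is a minimal sketch in Python 3, not part of pdfsizeopt: the helper name build_pdfsizeopt_command and its parameters are hypothetical, the default program path is the install location used in this section, and the only flag used is the --use-pngout=no flag shown above.

```python
import os


def build_pdfsizeopt_command(input_pdf, output_pdf, use_pngout=True,
                             program="~/pdfsizeopt/pdfsizeopt"):
    """Return the argument list for one pdfsizeopt run.

    `program` defaults to the install location used in this section;
    change it if you installed pdfsizeopt elsewhere.
    """
    command = [os.path.expanduser(program)]
    if not use_pngout:
        # pngout is the slowest image optimizer; this flag disables it.
        command.append("--use-pngout=no")
    # Pathnames may include directory names, so files outside the
    # current directory work as well.
    command.extend([input_pdf, output_pdf])
    return command
```

The returned list can be passed to subprocess.check_call(...) as is; because it is a list (not a shell string), pathnames with spaces need no extra quoting.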

Please note that the commands above download all dependencies (including Python and Ghostscript) as well. It's possible to install some of the dependencies with your package manager, but these steps are considered alternative and more complicated, and thus are not covered here.

Please note that pdfsizeopt works perfectly on any x86 and amd64 Linux system. There is no restriction on the libc, Linux distribution etc. because pdfsizeopt uses only its statically linked x86 executables, and it doesn't use any external commands (other than pdfsizeopt, pdfsizeopt.single and pdfsizeopt_libexec/*) on the system. pdfsizeopt also works perfectly on x86 FreeBSD systems with the Linux emulation layer enabled.

To avoid typing ~/pdfsizeopt/pdfsizeopt, add "$HOME/pdfsizeopt" to your PATH (probably in your ~/.bashrc), open a new terminal window, and the command pdfsizeopt will work from any directory.

You can also put pdfsizeopt in a directory other than ~/pdfsizeopt, as you like.

Additionally, you can install some extra image optimizers (see more in the Image optimizers section below):

  $ cd ~/pdfsizeopt
  $ wget -O pdfsizeopt_libexec_extraimgopt_linux-v3.tar.gz https://github.com/pts/pdfsizeopt/releases/download/2017-01-24/pdfsizeopt_libexec_extraimgopt_linux-v3.tar.gz
  $ tar xzvf pdfsizeopt_libexec_extraimgopt_linux-v3.tar.gz
  $ rm -f    pdfsizeopt_libexec_extraimgopt_linux-v3.tar.gz

Installation instructions and usage on Windows

There is no installer; you need to run some commands in the command line (black Command Prompt window) to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

Create folder C:\pdfsizeopt, download https://github.com/pts/pdfsizeopt/releases/download/2023-04-18/pdfsizeopt_win32exec-v9.zip , and extract its contents to the folder C:\pdfsizeopt, so that the file C:\pdfsizeopt\pdfsizeopt.exe exists.

Download https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single and save it to C:\pdfsizeopt, as C:\pdfsizeopt\pdfsizeopt.single .

To optimize a PDF, run the following command:

  C:\pdfsizeopt\pdfsizeopt input.pdf output.pdf

Run it in the command line, which is a black Command Prompt window. You can start it with Start menu / Run / cmd.exe, or by finding Command Prompt in the Start menu.

(Press Tab to get filename completion while typing.)

Since you have to type the input filename as a full pathname, it's recommended to create a directory with a short name (e.g. C:\pdfs), and copy the input PDF there first.

If the input PDF has many images or large images, pdfsizeopt can be very slow. You can speed it up by disabling pngout, the slowest image optimization method, like this:

  C:\pdfsizeopt\pdfsizeopt --use-pngout=no input.pdf output.pdf

To avoid typing C:\pdfsizeopt\pdfsizeopt, add C:\pdfsizeopt to (the end of) the system PATH, open a new Command Prompt window, and the command pdfsizeopt will work from any directory.

Depending on your environment, filenames with accented characters may not work in the Windows version of pdfsizeopt. To play it safe, make sure your input and output files have names with letters, numbers, underscore (_), dash (-), dot (.) and plus (+). The backslash (\) and the slash (/) are both OK as the directory separator.

Spaces in filenames and pathnames should work, but you need to put double quotes (") around the name.

Filenames with some punctuation characters (such as double quote ("), question mark (?) and asterisk (*)) and nonprintable characters (such as newline) will not work on Windows. This is because Windows doesn't support these characters ([\x00..\x1f"*:<>?|\x7f]) in filenames at all, and it uses / and \ as directory separator.
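To check a pathname against the play-it-safe character set before passing it to pdfsizeopt on Windows, something like the following sketch can be used. It is written in Python 3 and is not part of pdfsizeopt; the function name is hypothetical, and accepting a leading drive letter such as C: is an assumption of this sketch. It is deliberately conservative: names with spaces are reported as unsafe too, even though they work when wrapped in double quotes.

```python
import re

# Safe characters as described above: ASCII letters, digits, underscore,
# dash, dot and plus, with backslash or slash as the directory separator.
# An optional drive-letter prefix (e.g. "C:") is also accepted; that is an
# assumption of this sketch.
_SAFE_PATHNAME_RE = re.compile(r"\A(?:[A-Za-z]:)?[A-Za-z0-9_.+\-\\/]+\Z")


def is_safe_windows_pathname(pathname):
    """Return True if pathname uses only the play-it-safe characters."""
    return bool(_SAFE_PATHNAME_RE.match(pathname))
```

For example, is_safe_windows_pathname(r"C:\pdfs\input.pdf") is true, while a pathname containing ő, a question mark or a space is reported as unsafe.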

You can also put pdfsizeopt in a directory other than C:\pdfsizeopt, but it won't work if there is whitespace or there are accented characters in any of the folder names.

Please note that pdfsizeopt works perfectly in Wine (tested with wine-1.2 on Ubuntu Lucid and wine-1.6.2 on Ubuntu Trusty), but it's a bit slower than running it natively (as a Linux or Unix program).

Installation instructions and usage with Docker on Linux and macOS

These instructions work on the following systems: Linux amd64, macOS 64-bit Intel x86 (amd64, x86_64), macOS 64-bit ARM (Apple Silicon, e.g. M1 or M2 chip). The version of Linux or macOS doesn't matter (old systems such as macOS Leopard 10.5 also work), as long as it has Docker installed and working.

The programs in the Docker image ptspts/pdfsizeopt are compiled for Linux i386 (32-bit Intel x86), and these binaries happen to work in all platforms mentioned above, even with Apple Silicon. (Tested on 2023-02-21.)

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

First, check that you have Docker installed properly by running this command and checking for the OK at the end:

  docker version && echo OK

If you don't get OK because the docker command was not found, then Docker is not installed on your computer. Installation instructions (as of 2023-02-22):

  • To install Docker on Linux, you have two options: Docker Engine (https://docs.docker.com/engine/install/ , within the Server section) or Docker Desktop (https://docs.docker.com/desktop/install/linux-install/). Either of them works.

  • To install Docker on macOS, install Docker Desktop (https://docs.docker.com/desktop/install/mac-install/).

    Then (on macOS), add the docker command to your PATH by running the following command (copy-paste it, don't type, to avoid typos):

      (echo; echo 'export PATH="/Applications/Docker.app/Contents/Resources/bin:$PATH"') >>~/.profile
    

    Then (on macOS), close the Terminal app, and open it again (so that changes to ~/.profile take effect).

  • After the installation, retry the docker version command above.

Remove any previous Docker images of pdfsizeopt:

  docker image rm ptspts/pdfsizeopt

Do a test optimization run, which exercises all dependencies of pdfsizeopt:

  curl -L -o deptest.pdf https://github.com/pts/pdfsizeopt/raw/master/deptest/deptest.pdf
  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt pdfsizeopt deptest.pdf

If you get a (harmless) warning message like

  WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

, and you don't want to get it again, then add --platform linux/amd64 after the -it:

  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it --platform linux/amd64 ptspts/pdfsizeopt pdfsizeopt deptest.pdf

To optimize a PDF, run this command:

  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt pdfsizeopt input.pdf output.pdf

If the input PDF has many images or large images, pdfsizeopt can be very slow. You can speed it up by disabling pngout, the slowest image optimization method, like this:

  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt pdfsizeopt --use-pngout=no input.pdf output.pdf

pdfsizeopt creates lots of temporary files (psotmp.*) in the output directory, but it also cleans up after itself.

It's possible to optimize a PDF outside the current directory. To do that, specify the pathname (including the directory name) in the command-line.
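The long docker run command lines above all share the same structure, so they can be assembled by a small helper. This is a minimal sketch in Python 3, not part of pdfsizeopt: the function name and its parameters are hypothetical, and it uses only the docker options and image name that appear in the commands above.

```python
import os


def build_docker_command(pdfsizeopt_args, image="ptspts/pdfsizeopt",
                         workdir=None, uid=None, gid=None, platform=None):
    """Return the docker argument list for running pdfsizeopt in a container.

    By default, workdir, uid and gid are taken from the current process,
    like "$PWD" and "$(id -u):$(id -g)" in the shell commands above.
    """
    workdir = os.getcwd() if workdir is None else workdir
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    command = ["docker", "run",
               "-v", "%s:/workdir" % workdir,
               "-u", "%d:%d" % (uid, gid),
               "--rm", "-it"]
    if platform:
        # E.g. "linux/amd64", to silence the platform mismatch warning.
        command.extend(["--platform", platform])
    command.extend([image, "pdfsizeopt"])
    command.extend(pdfsizeopt_args)
    return command
```

For example, build_docker_command(["--use-pngout=no", "input.pdf", "output.pdf"]) produces the same arguments as the pngout-disabling command above. Note that -it expects a terminal, so run the resulting command from an interactive session.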

To avoid typing a long command, run

  (echo '#! /bin/sh'; echo 'exec docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt pdfsizeopt "$@"') >pdfsizeopt && chmod 755 pdfsizeopt

, and then copy the pdfsizeopt script to your PATH, then open a new terminal window, and now this command will also work to optimize a PDF:

  pdfsizeopt input.pdf output.pdf

Please note that the ptspts/pdfsizeopt Docker image is updated very rarely. To use a more up-to-date version of pdfsizeopt, run these commands to download:

  curl -L -o pdfsizeopt.single https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single
  chmod +x pdfsizeopt.single

Then run this command to optimize a PDF:

  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt ./pdfsizeopt.single --use-pngout=no input.pdf output.pdf

If you want to have extra image optimizers included on Linux, use ptspts/pdfsizeopt-with-extraimgopt instead of ptspts/pdfsizeopt in the commands above. Example:

  docker run -v "$PWD:/workdir" -u "$(id -u):$(id -g)" --rm -it ptspts/pdfsizeopt-with-extraimgopt pdfsizeopt --use-image-optimizer=sam2p,jbig2,pngout,zopflipng,optipng,advpng,ECT input.pdf output.pdf

Installation instructions and usage on macOS

These instructions work on Macs with macOS Catalina 10.15 (and even older, maybe macOS Snow Leopard 10.6) -- macOS Ventura 13 (and even newer), having a 64-bit ARM processor (Apple Silicon) or a 64-bit Intel x86 (x86_64, amd64) processor. The programs are compiled for 64-bit Intel x86 processors, and they work on 64-bit ARM processors as well, using the Rosetta 2 emulation in macOS. These instructions were tested and are known to work on macOS Ventura 13.3, both with a 64-bit Intel x86 (x86_64, amd64) processor and with Apple Silicon (ARM processor).

If you have an older Mac running Mac OS X Leopard 10.5 -- macOS Mojave 10.14, follow the section Installation instructions and usage on older macOS instead.

These instructions are not tested yet. See #154 for progress updates.

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

To install pdfsizeopt on a macOS system, open a terminal window and run these commands (without the leading $):

  $ mkdir ~/pdfsizeopt
  $ cd ~/pdfsizeopt
  $ curl -L -o pdfsizeopt_libexec_darwin.tar.gz https://github.com/pts/pdfsizeopt/releases/download/2023-04-18/pdfsizeopt_libexec_darwinc64-v9.tar.gz
  $ tar xzvf pdfsizeopt_libexec_darwin.tar.gz
  $ rm -f    pdfsizeopt_libexec_darwin.tar.gz
  $ curl -L -o pdfsizeopt.single https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single
  $ chmod +x pdfsizeopt.single
  $ ln -s pdfsizeopt.single pdfsizeopt

Do a test optimization run, which exercises all dependencies of pdfsizeopt:

  $ curl -L -o deptest.pdf https://github.com/pts/pdfsizeopt/raw/master/deptest/deptest.pdf
  $ ~/pdfsizeopt/pdfsizeopt deptest.pdf

... and open (view) deptest.pdf and the corresponding optimized deptest.pso.pdf .
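Besides viewing the two files, you may want to see how much smaller the optimized file is. The following is a minimal sketch in Python 3, not part of pdfsizeopt; the function names are hypothetical, and the filenames in the usage note are the ones from the test run above.

```python
import os


def size_reduction_percent(original_size, optimized_size):
    """Return how much smaller the optimized file is, in percent."""
    if original_size <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_size - optimized_size) / original_size


def describe_savings(original_pdf, optimized_pdf):
    """Return a one-line summary of the file sizes before and after."""
    original_size = os.path.getsize(original_pdf)
    optimized_size = os.path.getsize(optimized_pdf)
    return "%d bytes -> %d bytes (%.1f%% smaller)" % (
        original_size, optimized_size,
        size_reduction_percent(original_size, optimized_size))
```

After the test run, print(describe_savings("deptest.pdf", "deptest.pso.pdf")) shows the sizes of the input and the optimized output.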

To optimize a PDF, run the following command:

  ~/pdfsizeopt/pdfsizeopt input.pdf output.pdf

If the input PDF has many images or large images, pdfsizeopt can be very slow. You can speed it up by disabling pngout, the slowest image optimization method, like this:

  ~/pdfsizeopt/pdfsizeopt --use-pngout=no input.pdf output.pdf

Also, if you have a 32-bit Mac, then the pngout bundled with pdfsizeopt won't work (because it needs a 64-bit Mac), so you have to force --use-pngout=no . See the section Image optimizers for alternatives to pngout.

pdfsizeopt creates lots of temporary files (psotmp.*) in the output directory, but it also cleans up after itself.

It's possible to optimize a PDF outside the current directory. To do that, specify the pathname (including the directory name) in the command-line.

Please note that the commands above download most dependencies (including Ghostscript, but excluding Python) as well. Everything should work as instructed above, out of the box. If you are experiencing problems, please report an issue on https://github.com/pts/pdfsizeopt/issues .

To avoid typing ~/pdfsizeopt/pdfsizeopt, add "$HOME/pdfsizeopt" to your PATH (probably in your ~/.bashrc), open a new terminal window, and the command pdfsizeopt will work from any directory.

You can also put pdfsizeopt in a directory other than ~/pdfsizeopt, as you like.

Installation instructions and usage on older macOS

These instructions should work on older Macs running Mac OS X Leopard 10.5 -- macOS Mojave 10.14, and having a 32-bit or 64-bit Intel x86 processor. The programs are compiled for 32-bit Intel x86 (i386) processor (and also work on a 64-bit Intel processor with macOS Mojave 10.14 or earlier), except for the pngout tool, which needs at least Mac OS X Snow Leopard 10.6 and a 64-bit Intel processor.

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

To install pdfsizeopt on an older macOS system, open a terminal window and run these commands (without the leading $):

  $ mkdir ~/pdfsizeopt
  $ cd ~/pdfsizeopt
  $ curl -L -o pdfsizeopt_libexec_darwin.tar.gz https://github.com/pts/pdfsizeopt/releases/download/2023-04-18/pdfsizeopt_libexec_darwin-v9.tar.gz
  $ tar xzvf pdfsizeopt_libexec_darwin.tar.gz
  $ rm -f    pdfsizeopt_libexec_darwin.tar.gz
  $ curl -L -o pdfsizeopt.single https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single
  $ chmod +x pdfsizeopt.single
  $ ln -s pdfsizeopt.single pdfsizeopt

Do a test optimization run, which exercises all dependencies of pdfsizeopt:

  $ curl -L -o deptest.pdf https://github.com/pts/pdfsizeopt/raw/master/deptest/deptest.pdf
  $ ~/pdfsizeopt/pdfsizeopt deptest.pdf

... and open (view) deptest.pdf and the corresponding optimized deptest.pso.pdf .

To optimize a PDF, run the following command:

  ~/pdfsizeopt/pdfsizeopt input.pdf output.pdf

If the input PDF has many images or large images, pdfsizeopt can be very slow. You can speed it up by disabling pngout, the slowest image optimization method, like this:

  ~/pdfsizeopt/pdfsizeopt --use-pngout=no input.pdf output.pdf

Also, if you have a Mac with a 32-bit Intel x86 processor, then the pngout bundled with pdfsizeopt won't work (because it needs a 64-bit processor), so you have to force --use-pngout=no . See the section Image optimizers for alternatives of pngout.

pdfsizeopt creates lots of temporary files (psotmp.*) in the output directory, but it also cleans up after itself.

It's possible to optimize a PDF outside the current directory. To do that, specify the pathname (including the directory name) in the command-line.

Please note that the commands above download most dependencies (including Ghostscript, but excluding Python) as well. Everything should work as instructed above, out of the box. If you are experiencing problems, please report an issue on https://github.com/pts/pdfsizeopt/issues .

To avoid typing ~/pdfsizeopt/pdfsizeopt, add "$HOME/pdfsizeopt" to your PATH (probably in your ~/.bashrc), open a new terminal window, and the command pdfsizeopt will work from any directory.

You can also put pdfsizeopt in a directory other than ~/pdfsizeopt, as you like.

Installation instructions and usage on FreeBSD

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

pdfsizeopt works perfectly on x86 FreeBSD systems with the Linux emulation layer enabled. So, enable the Linux emulation layer on your FreeBSD system, and then follow the Installation instructions and usage on Linux.

Alternatively, you can follow the Installation instructions and usage on generic Unix, but that needs much more work on your part (and it's inconvenient and error-prone), because you need to install many dependencies separately, possibly compiling some of them from source.

Installation instructions and usage on generic Unix

Doing this is increasingly hard in 2023, because pdfsizeopt needs Python 2.4--2.7 and Ghostscript 9.05, both very old, and thus hard to install to a modern system.

There is no installer; you need to run some commands in the command line to download and install pdfsizeopt. pdfsizeopt is a command-line-only application; there is no GUI.

pdfsizeopt is a Python script. It works with Python 2.4, 2.5, 2.6 and 2.7 (but it doesn't work with Python 3.x). So please install Python first.

Create a new directory named pdfsizeopt, and download this link there: https://raw.githubusercontent.com/pts/pdfsizeopt/master/pdfsizeopt.single

Rename it to pdfsizeopt and make it executable by running the following commands (without the leading $):

  $ cd pdfsizeopt
  $ mv pdfsizeopt.single pdfsizeopt
  $ chmod +x pdfsizeopt

If your Python executable is not /usr/bin/python, then edit the first line (starting with #!) in the pdfsizeopt script accordingly.

Try it with:

  $ ./pdfsizeopt --version
  info: This is pdfsizeopt ZIP rUNKNOWN size=105366.

pdfsizeopt has many dependencies. For full functionality, you need all of them. Install as many as you can, and put them to the PATH.

Dependencies:

  • Python (command: python). Version 2.4, 2.5, 2.6 and 2.7 work (3.x doesn't work).
  • Ghostscript (command: gs): Version 9.05 is recommended, 8.50 should also work, and some early 9.x versions such as 9.14.1 also work. The most recent versions don't work, especially for font optimization.
  • jbig2 (command: jbig2): Install from source: https://github.com/pts/pdfsizeopt-jbig2 If you are unable to install, use pdfsizeopt --use-jbig2=no .
  • pngout (command: pngout): Download binaries from here: http://www.jonof.id.au/kenutils Source code is not available. If you are unable to install, use pdfsizeopt --use-pngout=no .
  • imgdataopt (command: imgdataopt): Install from source: https://github.com/pts/imgdataopt To make pdfsizeopt able to use it, copy the imgdataopt program file as sam2p (e.g. /usr/local/bin/sam2p) to your PATH. If you are unable to install it, use pdfsizeopt --do-optimize-images=no . Some Linux distributions have sam2p binaries, but they tend to be too old. Alternatively, sam2p >=0.49.3 + png22pnm also works instead of imgdataopt, but imgdataopt is easier to install.
  • The Multivalent PDF compressor (written in Java) is an optional dependency of pdfsizeopt, turned off by default. Don't bother installing it.

After installation, use pdfsizeopt as:

  $ ./pdfsizeopt input.pdf output.pdf

You can add the directory containing pdfsizeopt to the PATH, so the command pdfsizeopt will work from any directory.

Image optimizers

pdfsizeopt can use the following external tools to make images embedded in PDF files smaller:

  • sam2p (used by default, cannot be disabled)
  • jbig2 (used by default, disable with --use-jbig2=no)
  • pngout (used by default, disable with --use-pngout=no)
  • zopflipng (not enabled by default)
  • optipng (not enabled by default)
  • advpng (not enabled by default)
  • ECT (not enabled by default)

To enable or disable any image optimizer, specify all image optimizers you want to be enabled like this: --use-image-optimizer=optipng,jbig2 . This will also disable the default pngout.

You can also specify custom image optimizer command patterns by specifying separate, additional --use-image-optimizer= flags, like this:

  --use-image-optimizer="optipng %(sourcefnq)s -o6 -fix -force %(optipng_gray_flags)s-out %(targetfnq)s"

You always have to specify %(targetfnq)s in the command pattern.
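The %(name)s placeholders use the syntax of Python's %-formatting with a dictionary. The following sketch in Python 3 illustrates how such a pattern could expand into a command line. It is not pdfsizeopt's actual code: the function name is hypothetical, and it is an assumption here that the q suffix means a shell-quoted filename (approximated with shlex.quote).

```python
import shlex


def expand_pattern(pattern, source_filename, target_filename, gray_flags=""):
    """Expand an image optimizer command pattern (illustration only)."""
    if "%(targetfnq)s" not in pattern:
        # The target placeholder is mandatory, as stated above.
        raise ValueError("command pattern must contain %(targetfnq)s")
    values = {
        # Assumption: the q suffix means a shell-quoted filename.
        "sourcefnq": shlex.quote(source_filename),
        "targetfnq": shlex.quote(target_filename),
        "optipng_gray_flags": gray_flags,
    }
    return pattern % values


pattern = ("optipng %(sourcefnq)s -o6 -fix -force "
           "%(optipng_gray_flags)s-out %(targetfnq)s")
print(expand_pattern(pattern, "in.png", "out.png"))
# Prints: optipng in.png -o6 -fix -force -out out.png
```

Because a literal percent sign would be interpreted by the %-formatting, a pattern in this sketch must not contain any % outside the placeholders.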

Specify --do-debug-image-optimizers=yes to see which image optimizers are enabled (and their full command-line) for the current run.

At startup, pdfsizeopt checks that the requested image optimizers are available (as program files), and fails if some of them are missing. To ignore those which are missing, specify --do-require-image-optimizers=no .

It's your (the user's) responsibility to install the image optimizers and add them to the PATH. If you follow the installation instructions for Windows and Linux above, the default image optimizers (sam2p, jbig2 and pngout) will be installed for you. For Linux, there are also installation instructions above for extra image optimizers (zopflipng, optipng, advpng and ECT).

Troubleshooting

1. pdfsizeopt fails for some fonts.

Specify --do-unify-fonts=no and --do-regenerate-all-fonts=no .

If it still fails, specify --do-optimize-fonts=no .

In either case, please report it on https://github.com/pts/pdfsizeopt/issues

2. pdfsizeopt fails for some images.

Specify --do-optimize-images=no .

Please report it on https://github.com/pts/pdfsizeopt/issues

3. pdfsizeopt is too slow processing images.

Specify --use-pngout=no . This disables pngout, which is the slowest optimization step for images.

4. pdfsizeopt fails without creating the output PDF.

Please report it on https://github.com/pts/pdfsizeopt/issues , attaching the input PDF file and the console output of pdfsizeopt. Your report is very much appreciated.

If pdfsizeopt exits with an uncaught exception, it may leave some temporary files (psotmp.*) behind in the current directory. You can remove these files.
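Removing the leftover files by hand can be replaced with a small script. This is a minimal sketch in Python 3, not part of pdfsizeopt; the function names are hypothetical, and the only thing taken from pdfsizeopt is the psotmp.* naming of its temporary files. Run it only when no pdfsizeopt process is working in that directory, since a running pdfsizeopt still needs its temporary files.

```python
import glob
import os


def find_leftover_temp_files(directory="."):
    """Return the sorted list of psotmp.* pathnames in directory."""
    pattern = os.path.join(glob.escape(directory), "psotmp.*")
    return sorted(glob.glob(pattern))


def remove_leftover_temp_files(directory="."):
    """Remove psotmp.* files in directory, return the removed pathnames."""
    removed = []
    for filename in find_leftover_temp_files(directory):
        if os.path.isfile(filename):  # Leave directories alone.
            os.remove(filename)
            removed.append(filename)
    return removed
```

Calling find_leftover_temp_files() first lets you review the list before anything is deleted.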

Please note that pdfsizeopt is not resilient in processing corrupt PDF files (i.e. those which are not compliant to the PDF standard). So if pdfsizeopt fails, then the reason may be a bug in pdfsizeopt or a corrupt PDF input file. Nevertheless, please report an issue (see above).

5. The output PDF of pdfsizeopt doesn't look the same as the input PDF.

Please report it on https://github.com/pts/pdfsizeopt/issues , attaching the input PDF file and the output PDF file (.pso.pdf) and the console output of pdfsizeopt. Your report is very much appreciated.

6. pdfsizeopt is unable to find some input files on Windows.

This may happen if the filename or the full pathname contains any character other than the ASCII letters (a-z and A-Z), digits (0-9), underscore (_), ASCII dash (-), plus (+), dot (.), backslash (\) or slash (/). Typically these characters don't work:

  • spaces and tabs: This is easy to fix, just wrap the filename in double quotes ("), the usual way.

  • double quotes ("): This can't happen, filenames on Windows are not allowed to contain double quotes. If you need to pass a non-filename argument with a double quote in it to pdfsizeopt, do this: wrap the argument in double quotes ("), replace all double quotes (") with \", and (in parallel to the previous replacement) replace a sequence of backslashes (\) and a double quote (") immediately following them by duplicating the backslashes and replacing the double quote (") with \". This sounds complicated, but this is the usual way for other programs as well, see https://stackoverflow.com/a/4094897/97248 .

  • newlines and other non-space whitespace: This won't work, the Windows Command Prompt (cmd.exe) doesn't allow these characters in command-line arguments. Also Windows doesn't allow them in filenames.

  • accented characters (such as á and ő). These characters won't work (or it may work for only some characters, depending on the active code page) in the PDF filename specified in the commandline, or in the full pathname of pdfsizeopt (so don't install pdfsizeopt to C:\bőr, it won't work).

    Accented characters (outside the active code page) will not work in the full pathname of pdfsizeopt (such as C:\bőr\pdfsizeopt.exe). That's because Python is unable to call external programs (os.system, os.popen, os.spawnl and subprocess.call) with accented characters in their name, because it uses the single-byte API.

  • anything which is not ASCII printable (code between 33 and 126, inclusive): If not covered above, this may not work. See the description of accented characters.
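The double-quote rule in the list above can be sketched in code. The following Python 3 function is an illustration, not part of pdfsizeopt, and its name is hypothetical. Besides the replacements described above, it also doubles backslashes at the very end of the argument, so that they don't escape the closing double quote. It doesn't handle characters that are special to cmd.exe itself (such as % and ^).

```python
def quote_windows_argument(arg):
    """Wrap arg in double quotes, escaping as described in the list above."""
    result = ['"']
    backslashes = 0  # Number of backslashes seen but not emitted yet.
    for ch in arg:
        if ch == "\\":
            backslashes += 1
        elif ch == '"':
            # Duplicate the preceding backslashes, then emit \" for the quote.
            result.append("\\" * (2 * backslashes + 1))
            result.append('"')
            backslashes = 0
        else:
            # Backslashes not followed by a double quote stay as they are.
            result.append("\\" * backslashes)
            result.append(ch)
            backslashes = 0
    # Trailing backslashes would escape the closing quote, so double them.
    result.append("\\" * (2 * backslashes))
    result.append('"')
    return "".join(result)


print(quote_windows_argument('say "hi"'))
# Prints: "say \"hi\""
```

Python's subprocess module applies rules of this kind when it is given an argument list on Windows, so wrapper scripts written in Python usually don't need to quote by hand.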

If some filenames still don't work, the workarounds are:

  • renaming or copying the file (and folders) in Windows Explorer, and passing the renamed file to pdfsizeopt
  • using pdfsizeopt on a Unix system (e.g. Linux, FreeBSD, macOS) instead

Accented characters in the PDF filename could be made to work the following way (as a future improvement to pdfsizeopt):

  • pdfsizeopt.exe should call the 16-bit API (GetCommandLineW) instead of the single-byte API (GetCommandLineA) to get the arguments

  • pdfsizeopt.exe should escape the non-ASCII characters in the arguments (e.g. as U+12AB)

  • pdfsizeopt.exe should run pdfsizeopt.single like this:

    .../pdfsizeopt_win32exec/pdfsizeopt_python.exe .../pdfsizeopt.single --args-u+ ...

  • pdfsizeopt Python code should recognize --args-u+, and when finding the filename, it should convert it to unicode (by keeping ASCII except for U+12AB), and it should pass the unicode-typed value to open(...). Such an open(...) works in Python 2.6 on Windows.

  • When displaying filenames, pdfsizeopt Python code should still display the ASCII with the U+12AB escaping. Thus the win32console module is not needed. Thus filenames will be displayed legibly but incorrectly (not copy-pasteably) in the Command Prompt window.

  • No escaping is needed in command lines of helper programs (e.g. gs, sam2p), because it's all ASCII, because filenames are autogenerated temporary file names, which are all ASCII, and the path to pdfsizeopt itself is required to be ASCII.
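The U+12AB escaping proposed in the list above is not implemented in pdfsizeopt, and the list doesn't specify its details. As an illustration only, here is one possible form in Python 3, with hypothetical function names. The assumptions are: exactly 4 uppercase hexadecimal digits per escape, and only characters up to U+FFFF are supported. A known limitation of this sketch is that a filename which already contains text looking like an escape (such as U+0041) would be changed by the unescaping step.

```python
import re


def escape_non_ascii(text):
    """Replace each non-ASCII character with U+XXXX (4 hex digits)."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code <= 0xFFFF:
            parts.append("U+%04X" % code)
        else:
            raise ValueError("character above U+FFFF not supported: %r" % ch)
    return "".join(parts)


_ESCAPE_RE = re.compile(r"U\+([0-9A-F]{4})")


def unescape_non_ascii(text):
    """Convert U+XXXX escapes back to the characters they stand for."""
    return _ESCAPE_RE.sub(lambda match: chr(int(match.group(1), 16)), text)


print(escape_non_ascii("C:\\b\u0151r\\input.pdf"))
# Prints: C:\bU+0151r\input.pdf
```

The escaped form is pure ASCII, which is what makes it safe to pass through the single-byte API and to display in the Command Prompt window.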

Accented characters in the pathname of pdfsizeopt.single can be made to work this way (as a future improvement to pdfsizeopt):

  • Do the accented characters in the filename above first.

  • pdfsizeopt.exe should use wgetcwd to get the current directory.

  • pdfsizeopt.exe should use wchdir to change to the directory of pdfsizeopt.single .

  • pdfsizeopt.exe should prepend the directories pdfsizeopt_win32exec and pdfsizeopt_win32exec/pdfsizeopt_gswin to the PATH, using wputenv.

  • pdfsizeopt.exe should run pdfsizeopt.single like this:

      pdfsizeopt_python.exe pdfsizeopt.single --args-u+ --cwd=... ...
    

    , where the value of --cwd= is the escaped (U+12AB) version of the result of wgetcwd.

  • pdfsizeopt Python code should prepend the value of --cwd=... to the input filename if it's relative.

  • pdfsizeopt Python code shouldn't modify the PATH if --cwd=... is present. (Does this environment variable propagation work in Python 2.6? Let's try!)

  • It's still true that no escaping is needed in command lines of external programs (e.g. gs, sam2p), because it's all ASCII, because temporary file names are all ASCII, and the path to pdfsizeopt itself is required to be ASCII. Escaping is needed if the pathname of the temporary directory (TEMP variable) needs escaping.

7. Error on Windows: The application failed to initialize properly (0xc0000034). Click on OK to terminate the application.

This error has happened on a Windows XP system. The solution: download msvcr90.dll (or find it somewhere already on your system), and copy it into pdfsizeopt_win32exec (next to python26.dll). Any version of msvcr90.dll will work:

  • msvcr90.dll 9.0.21022.8 (655872 bytes)
  • msvcr90.dll 9.0.30729.6161 (653136 bytes)
  • msvcr90.dll 9.0.30729.9247 (653968 bytes)

8. Error on Windows: The system cannot execute the specified command.

This error has happened on a Windows XP system when the file Microsoft.VC90.CRT.manifest was missing from the pdfsizeopt_win32exec directory. The solution: reinstall pdfsizeopt; the directory pdfsizeopt_win32exec in the newest version has that file.

9. Ghostscript errors with Type1CParser and Type1CConverter

Please install pdfsizeopt by following the installation instructions on https://github.com/pts/pdfsizeopt . By doing so, pdfsizeopt will use Ghostscript 9.05 bundled with it, and it will work.

More documentation

Assembly
4
star
29

pe-setstub

replace DOS stub in Win32 PE .exe
Perl
4
star
30

tinyc32

tiny C library and toolchain for writing SYSLINUX COM32R executables
Shell
4
star
31

py_ssh_keygen_ed25519

ssh-keygen for ed25519 keypairs in Pure Python
Python
4
star
32

muxzcat

tiny and portable .xz and .lzma decompression filter
C
4
star
33

homebrew-utils

`brew install' scripts for some software on https://github.com/pts
Ruby
3
star
34

latex-uni8

Universal inputenc, fontenc and babel for pdflatex + lualatex
TeX
3
star
35

imgdataopt

raster (bitmap) image data size optimizer
C
3
star
36

tiny-ssh-keygen-ed25519

tiny ssh-keygen for ed25519 keypairs in standard C
C
3
star
37

tinyveracrypt

versatile and compatible block device encryption setup
Shell
3
star
38

pyfindimagedupes

Finds similar duplicate images
Python
3
star
39

upxbc

UPX-based compressor for execuables and data files
Shell
3
star
40

ptssh

portable SSH client for Unix
Perl
3
star
41

a2ping

convert image and vector graphics to compatible EPS and PDF
Perl
3
star
42

pts-nasm-fullprog

libraries for writing full executable programs with NASM
Assembly
3
star
43

pdfconcat

C
3
star
44

portable-mariadb

portable, small, precompiled, statically linked MariaDB binary distribution for Linux i386
Perl
3
star
45

com0exe

DOS .com program to .exe converter
DIGITAL Command Language
3
star
46

geek-clock

drawing a geek clock with TikZ
TeX
2
star
47

rsa

Python
2
star
48

pi

Calculate the digits of pi in fast and/or obfuscated ways
Python
2
star
49

mmshget

mmsh:// (MMS-over-HTTP) video stream downloader and reference implementation
Python
2
star
50

pts-tcc64

tiny, self-contained C compiler for Linux amd64 using TCC + glibc
Shell
2
star
51

pts-decompress-nrv

educational C code for NRV decompression
C
2
star
52

pts-quote

display a random quote on the console
Roff
2
star
53

tinygpgs

symmetric key encryption compatible with GPG in Python
Python
2
star
54

minilibc32

size-optimized, minimalistic libc for Linux i386
Assembly
2
star
55

bitcoin_wallet_dump

Dump bitcoin addresses and private keys.
Shell
2
star
56

fast_vector_append

doing std::vector::push_back without copying
C++
2
star
57

mplaylist

Audio playlist player using mplayer, with checkpointing
Python
2
star
58

detect_protocol

detect what protocol the TCP client is speaking
Python
2
star
59

pts-qiv

Enhanced version of qiv (Quick Image Viewer)
C
2
star
60

pts-parse-int

Parse integers from strings
C++
2
star
61

pts-debootstrap

portable debootstrap for i386 and amd64
Shell
2
star
62

pts-cpudetect

library to detect x86 CPU mode and type
Assembly
1
star
63

bip39tgen

bip39tgen: 12-word BIP39 mnemonic generator
Python
1
star
64

ssh_connect_fast

ssh(1) trampoline for faster connection setup
C
1
star
65

py-intersect-sorted

Python code to compute the intersection of two sorted lists
Python
1
star
66

sesamessh

passphrase-based SSH client
Shell
1
star
67

stbx86

precompiled and statically linked binaries and libraries for Linux i386
1
star
68

pts-red-black-tree

balanced binary tree with node insertion and iteration in C
C
1
star
69

stackmat-clock

various tools to communicate with the StackMat display
Python
1
star
70

pts-printf

C++ typesafe version of the ANSI C printf() family of functions
C++
1
star
71

pts-stackless-httpd

experimental HTTP server using Stackless Python
Python
1
star
72

pts-static-libdevmapper

C
1
star
73

javascript-multiplication

benchmarking bigint multiplication implementations in JavaScript and other languages
ActionScript
1
star
74

unzip_scan

tool to extract and scan truncated ZIP files
Shell
1
star
75

pts-hello-zig

small (hello-world) Zig programs and their binary sizes
Shell
1
star
76

uevalrun

self-contained computation sandbox for Linux
C
1
star
77

owtarget16

targeting ELKS and other 8086 operating systems with the OpenWatcom C compiler
C
1
star
78

wnckkeys

fork of xbindkeys for activating windows up, down, left or right to the current window, using libwnck
C
1
star
79

quiz_formatter

Format single-choice and multiple-choice quizzes as HTML for the web and for printed proofreading
Python
1
star
80

ppfiletagger

file tagging and search by tag for Linux
Python
1
star
81

pts-osxcross

compile C and C++ programs for macOS on Linux amd64
Shell
1
star
82

pts-openssl-static

Compiling the openssl tool a statically-linked Linux binary
Shell
1
star
83

python-xattr-compat

portable Python module to query and set POSIX extended attributes of files
Python
1
star
84

hdproductid

Display product info of hard drives on Linux
C
1
star
85

encrypt-script-bash

passphrase-based encryption of Bash shell scripts
Shell
1
star
86

portable-tetex

teTeX binaries for new and old Linux systems
Shell
1
star
87

pts-qpdf

self-contained qpdf executables
C
1
star
88

pts-windows-nt-qemu

install and run Windows NT in QEMU on Linux
Shell
1
star
89

oldfs2tar

extract filesystem images to .tar files
Shell
1
star
90

pts-build-micropython-xstatic

micropython for Linux i386, statically linked
Shell
1
star
91

pts.github.io

Github pages repo for pts
Shell
1
star
92

chbreak

breaking out of chroot using chdir(".."), and mitigation techniques
C
1
star