
tool for analyzing and converting PDF

PDF Utils for node

This library contains tools for analysing and converting PDF files. You can get metadata, extract text, and render pages to SVG or PNG, all in our beloved asynchronous programming style.

Support for extracting links from the document and creating ImageMaps (you remember them, don't you?) on the fly is planned. pdfutils should also support password-protected files, but that's still on the todo list.

The library is currently in beta. This means it has incomplete error handling and lacks a test suite.

Installation

To install pdfutils you have to install libpoppler-glib first.

On Debian, run:

apt-get install libpoppler-glib-dev libpoppler-glib8 libcairo2-dev libcairo2

On CentOS, run:

yum install poppler poppler-glib-devel

On macOS with MacPorts:

port install poppler

or, if you prefer brew:

brew install poppler --with-glib
export PKG_CONFIG_PATH=/usr/X11/lib/pkgconfig

Then install pdfutils:

npm install pdfutils

Usage

See this very basic example:

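var pdfutils = require('pdfutils').pdfutils;

// Load the PDF, then render its first page as a PNG thumbnail.
pdfutils("document.pdf", function(err, doc) {
	if (err) throw new Error(err); // err is a string when loading failed
	doc[0].asPNG({maxWidth: 100, maxHeight: 100}).toFile("firstpage.png");
});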

That's all it takes to generate thumbnails of PDFs. Awesome!

Here is a bit more documentation:

pdfutils(source, callback)

This function is a factory for Documents.

arguments:

  • source: can be a Buffer or a String. If it's a string, it is treated as a path and the PDF is read from that file. If it's a buffer, its content is treated as an in-memory PDF. Make sure not to modify the buffer while pdfutils is using it!
  • callback(err, doc): a callback with the following arguments:
    • err: an error string when the pdf couldn't be loaded successfully, otherwise null
    • doc: an instance of Document when the pdf is loaded successfully, otherwise undefined
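
Because the factory follows the usual Node callback convention, it can be wrapped in a Promise. The following is a minimal sketch and not part of pdfutils: loadDocument is a hypothetical helper name, and the factory is passed in as an argument so the wrapper assumes nothing beyond the (source, callback) signature described above.

```javascript
// Hypothetical helper (not part of pdfutils): wraps a factory with the
// documented (source, callback) signature in a Promise.
function loadDocument(factory, source) {
  return new Promise(function (resolve, reject) {
    factory(source, function (err, doc) {
      if (err) {
        // The README describes err as a string, so wrap it in an Error.
        reject(err instanceof Error ? err : new Error(String(err)));
      } else {
        resolve(doc);
      }
    });
  });
}

// Intended usage, assuming pdfutils is installed:
//   var pdfutils = require('pdfutils').pdfutils;
//   loadDocument(pdfutils, 'document.pdf').then(function (doc) {
//     console.log(doc.length + ' pages');
//   });
```

A Buffer holding the PDF can be passed in place of the path. As noted above, it should not be modified while the document is in use.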

Class PDFDocument

Instances of this class are created by pdfutils(source, callback), described above.

members:

  • 0, 1, 2, 3, 4, ..., n: instances of the Pages contained in the Document. See the description of Page below.
  • length: number of Pages in a document
  • author: the author of the document or null if not known
  • creationDate: the creation date as an integer counted from 1970-01-01
  • creator: creator of the document or null if unknown
  • format: exact format of this PDF file or null if unknown
  • keywords: keywords of the document as string or null if unknown
  • linearized: true if document is linearized, otherwise false
  • metadata: Metadata as string
  • modDate: last modification date of the PDF as an integer counted from 1970-01-01
  • pageLayout: the layout of the pages. Can be one of the following strings or null if unknown:
    • singlePage
    • oneColumn
    • twoColumnLeft
    • twoColumnRight
    • twoPageLeft
    • twoPageRight
  • pageMode: the suggested viewing mode of a page. Can be one of the following strings or null if unknown:
    • none
    • useOutlines
    • useThumbs
    • fullscreen
    • useOc
    • useAttachments
  • permissions: the permissions of this document. Is an object with the following members:
    • print: whether the user is allowed to print
    • modify: whether the user is allowed to modify the document
    • copy: whether the user is allowed to take copies of this document
    • notes: whether the user is allowed to make notes
    • fillForm: whether the user is allowed to fill out forms
  • producer: producer of a document or null if unknown
  • subject: subject of this document or null if unknown
  • title: title of the document or null if unknown
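
As an illustration of reading these members, here is a small sketch that condenses a document into a plain summary object. summarize is a hypothetical helper name. The code assumes that creationDate counts seconds since 1970-01-01, which the description above does not state; adjust the factor if the value turns out to be in milliseconds.

```javascript
// Hypothetical helper (not part of pdfutils): builds a plain summary from
// the documented PDFDocument members.
function summarize(doc) {
  var perms = doc.permissions || {};
  return {
    // Members are documented as null when unknown, so provide fallbacks.
    title: doc.title || '(untitled)',
    author: doc.author || '(unknown)',
    pages: doc.length,
    // Assumption: creationDate is in seconds since 1970-01-01.
    created: doc.creationDate != null
      ? new Date(doc.creationDate * 1000).toISOString()
      : null,
    printable: perms.print === true,
    copyable: perms.copy === true
  };
}
```

The function only reads properties, so it can be called directly on the doc passed to the pdfutils callback.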

Class PDFPage

This class represents a page of a document

members:

  • width: width of the page
  • height: height of the page
  • index: number of this page.
  • label: label of this page or null if no label was defined.
  • links: array containing links of a page
  • asSVG(opts): returns an instance of PageJob (described below). opts is an optional Object with the following optional fields:
    • maxWidth: maximal width of the resulting SVG in px.
    • minWidth: minimal width of the resulting SVG in px.
    • maxHeight: maximal height of the resulting SVG in px.
    • minHeight: minimal height of the resulting SVG in px.
    • width: the width of the resulting SVG in px. Overrides minWidth and maxWidth.
    • height: the height of the resulting SVG in px. Overrides minHeight and maxHeight.
  • asPDF(opts): returns an instance of PageJob (described below). opts is an optional Object with the following optional fields:
    • maxWidth: maximal width of the resulting PDF in pt.
    • minWidth: minimal width of the resulting PDF in pt.
    • maxHeight: maximal height of the resulting PDF in pt.
    • minHeight: minimal height of the resulting PDF in pt.
    • width: the width of the resulting PDF in pt. Overrides minWidth and maxWidth.
    • height: the height of the resulting PDF in pt. Overrides minHeight and maxHeight.
  • asPNG(opts): returns an instance of PageJob (described below). opts is an optional Object with the following optional fields:
    • maxWidth: maximal width of the resulting PNG in px.
    • minWidth: minimal width of the resulting PNG in px.
    • maxHeight: maximal height of the resulting PNG in px.
    • minHeight: minimal height of the resulting PNG in px.
    • width: the width of the resulting PNG in px. Overrides minWidth and maxWidth.
    • height: the height of the resulting PNG in px. Overrides minHeight and maxHeight.
  • asText(opts): returns an instance of PageJob described below. opts is an optional argument with an Object, which is currently ignored.
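
The conversion methods can be combined with the numeric page members of the document to process every page. Below is a minimal sketch that renders each page to a PNG thumbnail. renderThumbnails, the output directory and the file naming scheme are illustrative assumptions, not part of pdfutils.

```javascript
// Hypothetical helper (not part of pdfutils): starts a PNG conversion for
// every page of a loaded document and returns the target file paths.
function renderThumbnails(doc, outDir, maxEdge) {
  var targets = [];
  for (var i = 0; i < doc.length; i++) {
    // File names are numbered from 1 for readability (an arbitrary choice).
    var target = outDir + '/page-' + (i + 1) + '.png';
    // Limit both dimensions so the thumbnail fits into a square box.
    doc[i].asPNG({ maxWidth: maxEdge, maxHeight: maxEdge }).toFile(target);
    targets.push(target);
  }
  // Only the requested paths are returned; the conversions are assumed to
  // run asynchronously, so the files may not be complete yet.
  return targets;
}
```

Called as renderThumbnails(doc, 'thumbs', 100) inside the pdfutils callback, this would request one file per page. The output directory is expected to exist already.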

Class PDFPageJob

This class inherits from Stream. It handles converting a Page (described above) to SVG, PDF, PNG or text.

members:

  • links: array containing links of a page, translated to fit the output page.

events:

  • data: emitted when a new chunk of the converted file is available
  • end: emitted when the file is successfully converted
  • error: emitted when the file cannot be converted. Not implemented yet.

methods:

  • toFile(path, [options]): writes a page to the file in the desired format.
  • see Stream for further members.
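
Instead of writing to a file, the events listed above can be used to gather the converted output in memory, for example the text of a page. The following is a minimal sketch: collectText is a hypothetical helper name, and it assumes that data chunks arrive as Buffers or strings, which the description above does not specify.

```javascript
// Hypothetical helper (not part of pdfutils): gathers all 'data' chunks of
// a PageJob-like stream and passes the joined text to a Node-style callback.
function collectText(job, callback) {
  var chunks = [];
  var done = false;

  function finish(err, text) {
    if (done) return; // make sure the callback fires only once
    done = true;
    callback(err, text);
  }

  job.on('data', function (chunk) {
    // Assumption: chunks are Buffers or strings; normalise to Buffers so
    // multi-byte characters split across chunks are decoded correctly.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(String(chunk)));
  });
  job.on('end', function () {
    finish(null, Buffer.concat(chunks).toString('utf8'));
  });
  // 'error' is documented as not implemented yet; the listener is kept so
  // the helper keeps working once it is.
  job.on('error', function (err) {
    finish(err);
  });
}
```

With a loaded document this would be called as collectText(doc[0].asText(), function (err, text) { ... }).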

License

This module is licensed under the GPL.
