Name

code2ebook - Generate ebooks in various formats from source trees in various programming languages

Description

Want to browse big source code trees in your Kindle or iPad?

This project provides utilities that help generate pretty ebook files in various formats directly from arbitrary source trees.

Generate static HTML sites from source trees

Before you begin, ensure you have installed all the prerequisites (well, don't be scared, just vim, ctags, and perl ;)).

src2html.pl

The src2html.pl script can generate an HTML tree from the source tree that you specify, for example:

export PATH=/path/to/code2ebook:$PATH

cd /path/to/your/project/
src2html.pl --tab-width 4 --color --cross-reference \
            --navigator --line-numbers . 'Your Book Title'

The resulting HTML site can be viewed in a web browser. With the command above, the entry point is html_out/index.html. The default output directory is ./html_out/; you can change it with the --out-dir=DIR option (or -o DIR for short).

The following image shows what a typical HTML page looks like when rendered by a web browser:

[Image: minimal C source file example]

See Sample eBooks for more complicated real-world sample outputs.

Note that for ebook readers without color displays (like Amazon Kindle), you should not specify the --color option for the src2html.pl script.

The output is essentially an HTML-formatted "ebook", which is readily browsable in a web browser on either a PC or a tablet.
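As a rough sketch of the note about colorless readers, a small wrapper could decide whether to pass --color. The function name src2html_flags and its "color"/"mono" argument are hypothetical and not part of this project; only the src2html.pl options themselves come from the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of code2ebook): print src2html.pl options
# for a reader that is either "color" (tablet, PC) or "mono" (e.g. Kindle).
src2html_flags() {
    flags="--tab-width 4 --cross-reference --navigator --line-numbers"
    if [ "$1" = "color" ]; then
        flags="$flags --color"
    fi
    printf '%s\n' "$flags"
}

# Show the command for a monochrome reader without running it.
echo "src2html.pl $(src2html_flags mono) . 'Your Book Title'"
```

Printing the command instead of running it keeps the sketch harmless; paste the printed line into your terminal once it looks right.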

Change CSS style

It is worth mentioning that if you do not like the default colors or have further style requirements, you can just specify the --css FILE option to make the HTML pages use your own CSS file. You can start with the default CSS file (named default.css) in this project.
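One possible workflow, sketched below, is to copy default.css and append your overrides to the copy. The helper name make_custom_css and the file name my.css are made up for illustration, and the "body" rule is a generic CSS example rather than a selector taken from this project's default.css.

```shell
#!/bin/sh
# Hypothetical helper: copy a base stylesheet and append local overrides.
# The "body" rule below is a generic CSS example, not a selector known to
# exist in default.css.
make_custom_css() {
    base=$1
    out=$2
    cp "$base" "$out" || return 1
    cat >> "$out" <<'EOF'

/* local overrides */
body { font-size: 110%; }
EOF
}

# Usage (paths are placeholders):
#   make_custom_css /path/to/code2ebook/default.css my.css
#   src2html.pl --css my.css . 'Your Book Title'
```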

Usage

For the full usage of this script, specify the -h or --help options. One sample output is

src2html.pl [options] dir book-title

Options:
    --charset CHARSET     Specify the charset used by the HTML
                          outputs. Default to UTF-8.

    -c
    --color               Use full colors in the HTML outputs.

    --css FILE            Use FILE as the CSS file to render the HTML
                          pages instead of using the default style.

    -e PATTERN
    --exclude PATTERN     Specify a pattern for the source code files to be
                          excluded.

    -h
    --help                Print this help.

    -i PATTERN
    --include PATTERN     Specify the pattern for extra source code file names
                          to include in the HTML output. Wildcards
                          like * and [] are supported. And multiple occurrences
                          of this option are allowed.

    --include-only PATTERN
                          Specify the files to be processed and all other
                          files are excluded. This option takes higher
                          priority than "--include PATTERN".

    -j N
    --jobs N              Specify the number of jobs to execute simultaneously.
                          Default to 1. CPAN module Parallel::ForkManager is
                          required when the number is bigger than 1.

    -l
    --line-numbers        Display source code line numbers in the HTML
                          output.

    -n
    --navigator           Generate a navigator bar containing the "Top Level"
                          and "One Level Up" links in the HTML output pages.

    -o DIR
    --out-dir DIR         Specify DIR as the target directory holding the HTML
                          output. Default to "./html_out".

    -t N
    --tab-width N         Specify the tab width (number of spaces) in the
                          source code. Default to 8.

    -x
    --cross-reference     Turn on cross referencing links in the HTML output.

Copyright (C) Yichun Zhang (agentzh) <[email protected]>.

Speed up on multi-core processors

We recommend the -j N option to speed things up on a multi-core processor when you have a large code base. As a quick reminder, the CPAN module Parallel::ForkManager is required when N is bigger than 1.
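A minimal sketch for picking N automatically follows. The helper name pick_jobs is hypothetical; it assumes getconf can report the number of online CPUs on your system, and it falls back to 1 whenever perl or Parallel::ForkManager cannot be loaded.

```shell
#!/bin/sh
# Hypothetical helper: choose a value for -j. Falls back to 1 when perl or
# Parallel::ForkManager is unavailable, since N > 1 needs that module.
pick_jobs() {
    if command -v perl >/dev/null 2>&1 &&
       perl -MParallel::ForkManager -e 1 >/dev/null 2>&1; then
        # getconf reports the number of online CPUs on most Unix-like systems.
        n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
        case $n in
            ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: play safe
        esac
        echo "$n"
    else
        echo 1
    fi
}

# Print the command that would be run.
echo "src2html.pl -j $(pick_jobs) --navigator --line-numbers . 'Your Book Title'"
```

You may prefer a value somewhat below the core count to leave headroom for other work on the machine.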

The following example shows that it takes more than 30 minutes with -j 1 to generate an HTML tree from the ngx_openresty-1.9.3.2 code base (https://openresty.org/download/ngx_openresty-1.9.3.2.tar.gz), while it takes less than 3 minutes with -j 18 to generate the same HTML tree on my 24-core processor.

$ time src2html.pl --navigator --color --cross-reference --line-numbers \
                   -j 1 ngx_openresty-1.9.3.2/bundle openresty-1.9.3.2
...
real    30m43.686s
user    30m26.818s
sys     0m15.420s

$ time src2html.pl --navigator --color --cross-reference --line-numbers \
                   -j 18 ngx_openresty-1.9.3.2/bundle openresty-1.9.3.2
...
real    2m49.172s
user    35m56.337s
sys     0m28.412s
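To put a number on the difference, the two "real" times above can be converted to seconds and divided. The helper names to_seconds and speedup are only illustrative.

```shell
#!/bin/sh
# Convert a "time" value such as 30m43.686s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two wall-clock times, rounded to one decimal place.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

speedup 30m43.686s 2m49.172s    # prints 10.9
```

So 18 jobs give roughly an 11x wall-clock speedup in this run, while the total "user" CPU time grows from about 30.4 to about 35.9 minutes.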

HTML output features

The src2html.pl script generates a pretty HTML page for each source code file, featuring:

  1. Summarized data types, macros, global variables, and functions defined in each source code file shown as TOC at the beginning of the corresponding HTML page.
  2. Colorful syntax highlighting via the vim program (enabled by the --color option, or -c for short).
  3. Cross-reference links to the definition lines of the referenced data types, macros, global variables, and functions across all the source code lines (similar to the LXR Cross Referencer but ours is much more lightweight). This is enabled by the --cross-reference option (or the -x option for short).

Source file types recognized

Right now, all file extensions known to your ctags program are supported, but .html and .htm files are always excluded to avoid infinite recursion.

For the full language list supported by this tool, just type the following command:

ctags --list-maps=all

You can explicitly include extra source files by specifying as many --include=PATTERN options as you like, as in

src2html.pl --include='src/*.blah' --include='*foo*' . "my project"

Similarly, you can also exclude files by specifying one or more --exclude=PATTERN options.
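If you are unsure whether a pattern needs an explicit --include, one option is to look it up in the ctags map list first. The sketch below uses a hypothetical helper, include_flag_for, and assumes each line of the "ctags --list-maps=all" output is a language name followed by its file patterns. It only does an exact text match, so a pattern covered by a ctags glob such as *.[ch] is not detected.

```shell
#!/bin/sh
# Hypothetical helper: read "ctags --list-maps=all" output on stdin and print
# an --include option for the given pattern if no language map lists it
# verbatim. Exact string match only; ctags globs are not expanded.
include_flag_for() {
    pattern=$1
    if awk -v p="$pattern" '
            { for (i = 2; i <= NF; i++) if ($i == p) found = 1 }
            END { exit !found }'; then
        :   # already known to ctags, nothing to add
    else
        printf '%s\n' "--include=$pattern"
    fi
}

# Illustrative input only; real input comes from: ctags --list-maps=all
printf 'C        *.c\nLua      *.lua\n' | include_flag_for '*.blah'
```

With a real ctags installed, the call would look like: ctags --list-maps=all | include_flag_for '*.blah'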

Convert the HTML site to ebooks in various formats

Now that we have the HTML-formatted "ebook", we can generate ebooks in other formats like .mobi and .epub using Calibre.

Generate MOBI ebooks for Kindle

For example, to generate a .mobi file for Kindle DX:

cd /path/to/your/project/

# assuming we specified the "." directory while running src2html.pl
ebook-convert html_out/index.html my-project.mobi \
    --output-profile kindle_dx --no-inline-toc \
    --title "Your Book Title" --publisher 'Your Name' \
    --language en --authors 'Your Author Name'

In this example, the resulting ebook file is named my-project.mobi in the current working directory.

Note: On OS X, after installing the Calibre app you have to go to Preferences->Advanced->Miscellaneous and click "Install command line tools" to make the command line tools available. On other platforms, just start a terminal and type the command.

Here we use the value "kindle_dx" for the --output-profile option assuming that we want to view the ebook in Kindle DX. You should use "kindle" for other (smaller) models of Kindle.
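The profile choice can be wrapped in a tiny helper if you convert for several Kindles. This is only a sketch: kindle_profile is a hypothetical function, and it knows just the two profile names mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: map a Kindle model name to an --output-profile value.
# Only the two profile names mentioned in this section are used.
kindle_profile() {
    case $1 in
        dx|DX|kindle_dx) echo kindle_dx ;;
        *)               echo kindle ;;
    esac
}

# Print (rather than run) the resulting conversion command.
echo "ebook-convert html_out/index.html my-project.mobi" \
     "--output-profile $(kindle_profile dx) --no-inline-toc"
```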

The ebook-convert utility is provided by Calibre. See its online documentation for full usage:

https://manual.calibre-ebook.com/generated/en/ebook-convert.html

Well you need both Perl and Python ;)

Generate EPUB ebooks for iPad/iPhone

Below is a simple sample command to generate the .epub ebook from the HTML site:

cd /path/to/your/project/

# assuming we specified the "." directory while running src2html.pl
ebook-convert html_out/index.html my-project.epub \
    --output-profile ipad3 \
    --no-default-epub-cover \
    --title "Your Book Title" --publisher 'Your Name' \
    --language en --authors 'Your Author Name'

In this example, the resulting ebook file is named my-project.epub in the current working directory, which is readily readable in apps like iBooks on iPad or iPhone.

Generate PDF ebooks for Sony Digital Paper

For example, to generate a .pdf file for Sony Digital Paper:

cd /path/to/your/project/

# assuming we specified the "." directory while running src2html.pl
ebook-convert html_out/index.html my-project.pdf \
	--override-profile-size \
	--paper-size a4 \
	--pdf-default-font-size 12 \
	--pdf-mono-font-size 12 \
	--margin-left 10 --margin-right 10 \
	--margin-top 10 --margin-bottom 10 \
	--page-breaks-before='/'

In this example, the resulting ebook file is named my-project.pdf in the current working directory, which is readily readable in Sony Digital Paper or other e-reader devices supporting PDF ebooks (but you may need to adjust the --paper-size a4 option if your device screen is too small for A4 pages).
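If you need to try several paper sizes, the command above can be parameterized. The sketch below only prints the command; pdf_convert_cmd is a hypothetical name, and which size names are valid depends on what your Calibre version accepts for --paper-size ("a4" is the only value taken from the example above).

```shell
#!/bin/sh
# Hypothetical wrapper: print the PDF conversion command for a given paper
# size (default a4, as in the example above). Other size names are an
# assumption; check them against your Calibre version.
pdf_convert_cmd() {
    size=${1:-a4}
    printf '%s\n' "ebook-convert html_out/index.html my-project.pdf \
--override-profile-size --paper-size $size \
--pdf-default-font-size 12 --pdf-mono-font-size 12 \
--margin-left 10 --margin-right 10 --margin-top 10 --margin-bottom 10 \
--page-breaks-before=/"
}

pdf_convert_cmd a5
```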

Prerequisites

You need to install recent versions of the following dependencies (the newer, the better):

  • ctags: should be readily available in almost all Linux distributions by simply installing the ctags package.
  • vim: should be readily available in almost all Linux distributions by simply installing the vim package.
  • perl: should be readily available in almost all Linux distributions by simply installing the perl package. Perl 5.10 or above is highly recommended due to the performance boost in the perl regex engine. Older versions of perl should also work, just much slower (like 30x slower).

All these components are very common programs in the *NIX world.
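A quick way to check that the three programs are on your PATH is sketched below. The helper names are made up for illustration. The second helper assumes that an Exuberant-style ctags prints a banner containing "Exuberant Ctags" for "ctags --version", which is relevant to the Mac OS X note under Known issues.

```shell
#!/bin/sh
# Print the names of any commands from the argument list that are not on PATH.
# Returns non-zero if at least one is missing.
missing_commands() {
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return $status
}

# Succeeds if the version text on stdin mentions Exuberant Ctags.
# Usage: ctags --version 2>/dev/null | is_exuberant_ctags
is_exuberant_ctags() {
    grep -q 'Exuberant Ctags'
}

if missing=$(missing_commands ctags vim perl); then
    echo "all prerequisites found"
else
    echo "missing:" $missing
fi
```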

Sample eBooks

Below are some sample ebooks generated from real-world open source projects like weighttp, LuaJIT, and Nginx.

weighttp

weighttp is a lightweight and small benchmarking tool for webservers.

LuaJIT

LuaJIT is a Just-In-Time Compiler (JIT) for the Lua programming language. Lua is a powerful, dynamic and light-weight programming language.

Nginx

Nginx is an open source reverse proxy server for HTTP, HTTPS, SMTP, POP3, and IMAP protocols, as well as a load balancer, HTTP cache, and a web server (origin server).

GDB

The GNU Project Debugger.

ktap

ktap is a lightweight script-based dynamic tracing tool for Linux.

Known issues

  • vim is the dominant performance bottleneck when running the src2html.pl tool with the --color option.
  • When a tag has multiple targets, only the first one is picked up (in the future, we may provide a middle page listing all the targets for the user to choose, just as in the LXR Cross Referencer).
  • The default ctags program on Mac OS X is not Exuberant ctags, and thus not supported at all. You can install the right ctags utility via Homebrew, for example.
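To see which names are affected by the multiple-targets limitation in your own tree, you could list the tags that have more than one entry. The sketch below assumes you generate a tags file yourself with your ctags and that it uses the usual tab-separated layout with the tag name in the first column. It does not reflect how src2html.pl processes tags internally, and ambiguous_tags is a hypothetical name.

```shell
#!/bin/sh
# Read a ctags "tags" file on stdin and print each tag name that has more
# than one entry, followed by its entry count. Header lines (!_TAG_...)
# are skipped.
ambiguous_tags() {
    awk -F'\t' '
        /^!_TAG_/ { next }
        { count[$1]++ }
        END { for (name in count) if (count[name] > 1) print name, count[name] }
    ' | sort
}

# Usage, assuming a tags file in the current directory:
#   ambiguous_tags < tags
```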

TODO

  • Better support for languages like Java.

Author

Yichun "agentzh" Zhang (章亦春) [email protected], OpenResty Inc.

Copyright and License

This module is licensed under the BSD license.

Copyright (C) 2015-2017, by Yichun "agentzh" Zhang, OpenResty Inc.

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
