Repository Details

A javascript library for diffing text and generating corresponding HTML views

Introduction

I needed a good in-browser visual diff tool, and couldn’t find anything suitable, so I built jsdifflib in Feb 2007 and open-sourced it soon thereafter. It’s apparently been used a fair bit since then. Maybe you’ll find it useful.

If you do find jsdifflib useful, please support my open source work via a bitcoin donation/tip to 19qCqZxAdRF4eZfyZD2GQnAWk2Mz7DZZVf. Thanks!

Overview

jsdifflib is a Javascript library that provides:

  1. a partial reimplementation of Python’s difflib module (specifically, the SequenceMatcher class)

  2. a visual diff view generator, that offers side-by-side as well as inline formatting of file data

Yes, I ripped off the formatting of the diff view from the Trac project. It’s a near-ideal presentation of diff data as far as I’m concerned. If you don’t agree, you can hack the CSS to your heart’s content.

jsdifflib does not require jQuery or any other Javascript library.

Python Interoperability

The main reason why I reimplemented Python’s difflib module in Javascript to serve as the algorithmic basis for jsdifflib was that I didn’t want to mess with the actual diff algorithm — I wanted to concentrate on getting the in-browser view right. However, because jsdifflib’s API matches Python’s difflib’s SequenceMatcher class in its entirety, it’s trivial to do the actual diffing on the server-side, using Python, and pipe the results of that diff calculation to your in-browser diff view. So, you have the choice of doing everything in Javascript on the browser, or falling back to server-side diff processing if you are diffing really large files.

Most of the time, we do the latter, simply because while jsdifflib is pretty fast all by itself, and is totally usable for diffing "normal" files (i.e. fewer than 100K lines or so), we regularly need to diff files that are 1 or 2 orders of magnitude larger than that. For that, server-side diffing is a necessity.
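
The choice between in-browser and server-side diffing can be made explicit with a small size check. This is only a sketch: chooseDiffStrategy is a hypothetical helper (not part of jsdifflib), and the 100000-line default simply mirrors the rough figure mentioned above, so treat it as a starting point to tune.

```javascript
// Hypothetical helper: decide where to diff based on input size.
// The 100000-line default is an assumption taken from the rough
// "100K lines or so" figure; it is not a limit built into jsdifflib.
function chooseDiffStrategy(baseLines, newLines, maxClientLines) {
    var limit = (typeof maxClientLines === "number") ? maxClientLines : 100000;
    var largest = Math.max(baseLines.length, newLines.length);
    // diff in the browser only when both inputs are under the limit
    return largest < limit ? "client" : "server";
}
```

A caller might branch on the result, running the in-browser diff for "client" and posting the text to a backend for "server".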

Demo & Examples

You can give jsdifflib a try without downloading anything. Just click the link below, put some content to be diffed in the two textboxes, and diff away.

That page also contains all of the examples you’ll need to use jsdifflib yourself, but let’s look at them here, anyway.

Diffing using Javascript

Here’s the function from the demo HTML file linked to above that diffs the two pieces of text entered into the textboxes on the page:

function diffUsingJS() {
    // get the baseText and newText values from the two textboxes, and split them into lines
    var base = difflib.stringAsLines($("baseText").value);
    var newtxt = difflib.stringAsLines($("newText").value);

    // create a SequenceMatcher instance that diffs the two sets of lines
    var sm = new difflib.SequenceMatcher(base, newtxt);

    // get the opcodes from the SequenceMatcher instance
    // opcodes is a list of 5-tuples (tag, i1, i2, j1, j2) describing what changes should be made to the base text
    // in order to yield the new text
    var opcodes = sm.get_opcodes();
    var diffoutputdiv = $("diffoutput");
    while (diffoutputdiv.firstChild) diffoutputdiv.removeChild(diffoutputdiv.firstChild);
    var contextSize = $("contextSize").value;
    contextSize = contextSize ? contextSize : null;

    // build the diff view and add it to the current DOM
    diffoutputdiv.appendChild(diffview.buildView({
        baseTextLines: base,
        newTextLines: newtxt,
        opcodes: opcodes,
        // set the display titles for each resource
        baseTextName: "Base Text",
        newTextName: "New Text",
        contextSize: contextSize,
        viewType: $("inline").checked ? 1 : 0
    }));

    // scroll down to the diff view window.
    window.location.hash = "diff";
}

There’s not a whole lot to say about this function. The most notable aspect of it is that the diffview.buildView() function takes an object/map with specific attributes, rather than a list of arguments. Those attributes are mostly self-explanatory, but are nonetheless described in detail in code documentation in diffview.js.
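
To make the opcode format more concrete, here is a minimal sketch that replays a list of opcodes against the base text to rebuild the new text. It assumes each opcode is an array of the form [tag, i1, i2, j1, j2], as in Python's difflib, where base lines i1 to i2 (end exclusive) correspond to new lines j1 to j2. The applyOpcodes helper is hypothetical, and the opcodes in the example are written by hand for illustration rather than produced by running jsdifflib.

```javascript
// Rebuild the new text from the base text by replaying opcodes.
// Assumes each opcode is [tag, i1, i2, j1, j2], where tag is one of
// "equal", "replace", "delete", or "insert".
function applyOpcodes(base, newtxt, opcodes) {
    var out = [];
    for (var k = 0; k < opcodes.length; k++) {
        var tag = opcodes[k][0];
        var i1 = opcodes[k][1], i2 = opcodes[k][2];
        var j1 = opcodes[k][3], j2 = opcodes[k][4];
        if (tag === "equal") {
            // unchanged lines come straight from the base text
            out = out.concat(base.slice(i1, i2));
        } else if (tag === "replace" || tag === "insert") {
            // changed or added lines come from the new text
            out = out.concat(newtxt.slice(j1, j2));
        } else if (tag !== "delete") {
            throw new Error("Unknown opcode tag: " + tag);
        }
        // "delete" contributes nothing: those base lines are dropped
    }
    return out;
}

// Hand-written opcodes for illustration
var base = ["a", "b", "c", "d"];
var newtxt = ["a", "x", "c", "d", "e"];
var opcodes = [
    ["equal", 0, 1, 0, 1],
    ["replace", 1, 2, 1, 2],
    ["equal", 2, 4, 2, 4],
    ["insert", 4, 4, 4, 5]
];
var rebuilt = applyOpcodes(base, newtxt, opcodes);
```

If the opcodes describe the two inputs correctly, rebuilt contains the same lines as newtxt, which makes this a cheap sanity check on opcodes received from elsewhere.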

Diffing using Python

This isn’t enabled in the demo link above, but I’ve included it to exemplify how one might use the opcode output from a web-based Python backend to drive jsdifflib’s diff view.

function diffUsingPython() {
    dojo.io.bind({
        url: "/diff/postYieldDiffData",
        method: "POST",
        content: {
            baseText: $("baseText").value,
            newText: $("newText").value,
            ignoreWhitespace: "Y"
        },
        load: function (type, data, evt) {
            try {
                data = eval('(' + data + ')');
                // look up the output container and context size; neither is defined elsewhere in this function
                var diffoutputdiv = $("output");
                var contextSize = $("contextSize").value || null;
                while (diffoutputdiv.firstChild) diffoutputdiv.removeChild(diffoutputdiv.firstChild);
                diffoutputdiv.appendChild(diffview.buildView({
                    baseTextLines: data.baseTextLines,
                    newTextLines: data.newTextLines,
                    opcodes: data.opcodes,
                    baseTextName: data.baseTextName,
                    newTextName: data.newTextName,
                    contextSize: contextSize
                }));
            } catch (ex) {
                alert("An error occurred updating the diff view:\n" + ex.toString());
            }
        },
        error: function (type, evt) {
            alert('Error occurred getting diff data. Check the server logs.');
        },
        type: 'text/javascript'
    });
}

Warning

This dojo code was written in 2007, and I haven’t looked at dojo for years now. In any case, you should be able to grok what’s going on.

As you can see, I’m partial to using dojo for ajaxy stuff. All that is happening here is that the base and new text are being POSTed to a Python server-side process (we like Pylons, but you could just as easily use a simple Python script as a CGI). That process then needs to diff the provided text using an instance of Python’s difflib.SequenceMatcher class, and return the opcodes from that SequenceMatcher instance to the browser (in this case, using JSON serialization). In the interest of completeness, here’s the controller action from our Pylons application that does this (don’t try to match up the parameters shown below with the POST parameters shown in the Javascript function above; the Javascript function is only here as an example):

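from difflib import SequenceMatcher
from pylons.decorators import jsonify  # assuming Pylons' stock jsonify decorator

@jsonify
def diff(self, baseText, newText, baseTextName="Base Text", newTextName="New Text"):
    # isjunk is the application's junk filter (a callable, or None), defined elsewhere
    opcodes = SequenceMatcher(isjunk, baseText, newText).get_opcodes()
    return dict(baseTextLines=baseText, newTextLines=newText, opcodes=opcodes,
                baseTextName=baseTextName, newTextName=newTextName)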
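
On the browser side, the JSON returned by an action like this can be parsed with JSON.parse rather than eval, and checked before it is handed to the diff view. The sketch below assumes the response body is a JSON object with the same field names as the dict above, and that each opcode arrives as a five-element array (Python tuples serialize to JSON arrays). parseDiffPayload is a hypothetical helper, not part of jsdifflib.

```javascript
// Hypothetical helper: parse the JSON text returned by the server and check
// that it carries the list fields that the diff view is given.
function parseDiffPayload(jsonText) {
    var data = JSON.parse(jsonText); // throws SyntaxError on malformed JSON
    var listFields = ["baseTextLines", "newTextLines", "opcodes"];
    for (var i = 0; i < listFields.length; i++) {
        if (!Array.isArray(data[listFields[i]])) {
            throw new Error("Expected an array for '" + listFields[i] + "'");
        }
    }
    for (var k = 0; k < data.opcodes.length; k++) {
        var op = data.opcodes[k];
        // each opcode is expected to look like [tag, i1, i2, j1, j2]
        if (!Array.isArray(op) || op.length !== 5 || typeof op[0] !== "string") {
            throw new Error("Malformed opcode at index " + k);
        }
    }
    return data;
}
```

The returned object can then be passed field by field to diffview.buildView(), in the same way as in the function above.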

Future Directions

The top priorities would be to implement the ignoring of empty lines, and the indication of diffs at the character level with sub-highlighting (similar to what Trac’s diff view does).
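
As a rough illustration of what character-level sub-highlighting needs, the sketch below finds the changed span between the old and new version of a single line by trimming their common prefix and suffix. This is one simple technique chosen for illustration; it is not necessarily how Trac does it, and changedSpan is a hypothetical helper that does not exist in jsdifflib.

```javascript
// Find the changed span between two versions of a line by trimming the
// common prefix and the common suffix. Returns character offsets such that
// oldLine.slice(start, oldEnd) was replaced by newLine.slice(start, newEnd).
function changedSpan(oldLine, newLine) {
    var maxLen = Math.min(oldLine.length, newLine.length);
    var start = 0;
    while (start < maxLen && oldLine.charAt(start) === newLine.charAt(start)) {
        start++;
    }
    var tail = 0;
    // the suffix must not overlap the prefix already matched
    while (tail < maxLen - start &&
           oldLine.charAt(oldLine.length - 1 - tail) === newLine.charAt(newLine.length - 1 - tail)) {
        tail++;
    }
    return {
        start: start,
        oldEnd: oldLine.length - tail,
        newEnd: newLine.length - tail
    };
}
```

A diff view could wrap the two slices in an extra element to highlight them within lines that a "replace" opcode pairs up. Lines with several separate edits would be highlighted as one wide span, which is the main limitation of this approach.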

I’d also like to see the difflib.SequenceMatcher reimplementation gain some more speed — it’s virtually a line-by-line translation from the Python implementation, so there’s plenty that could be done to make it more performant in Javascript. However, that would mean making the reimplementation diverge even more from the "reference" Python implementation. Given that I don’t really want to worry about the algorithm, that’s not appealing. I’d much rather use a server-side process when the in-browser diffing is a little too pokey.

Other than that, I’m open to suggestions.

Note

I’m no longer actively developing jsdifflib. It’s been sequestered (mostly out of simple neglect) to my company’s servers for too long; now that it’s on github, I’m hoping that many of the people that find it useful will submit pull requests to improve the library. I will do what I can to curate that process.

License

jsdifflib carries a BSD license. As such, it may be used in other products or services with appropriate attribution (including commercial offerings). The license is prepended to each of jsdifflib’s files.

Downloads

jsdifflib consists of three files: two Javascript files and one CSS file. Why two Javascript files? Because I wanted to keep the reimplementation of the Python difflib.SequenceMatcher class separate from the actual visual diff view generator. Feel free to combine and/or optimize them in your deployment environment.

You can download the files separately by navigating the project on github, you can clone the repo, or you can download a zipped distribution via the "Downloads" button at the top of this project page.

Release History

  • 1.1.0 (May 18, 2011): Move project to github; no changes in functionality

  • 1.0.0 (February 22, 2007): Initial release
