• This repository has been archived on 24/Jul/2019
  • Stars: 230
  • Rank: 174,053 (Top 4 %)
  • Language: JavaScript
  • License: MIT License
  • Created over 11 years ago
  • Updated about 8 years ago

Repository Details

A Grunt task that takes HTML snapshots of websites. Useful for making Ajax sites crawlable.

grunt-html-snapshot

Makes it easy to provide HTML snapshots for client-side applications so that they can be indexed by web crawlers.

Getting Started

This plugin requires Grunt ~0.4.0

If you haven't used Grunt before, be sure to check out the Getting Started guide, as it explains how to create a Gruntfile as well as install and use Grunt plugins. Once you're familiar with that process, you may install this plugin with this command:

npm install grunt-html-snapshot --save-dev

Once the plugin has been installed, it may be enabled inside your Gruntfile with this line of JavaScript:

grunt.loadNpmTasks('grunt-html-snapshot');
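For orientation, here is a minimal sketch of where that line might sit in a complete Gruntfile. The function name `gruntfile` is only there for illustration, and the `sitePath` value assumes a local web server on port 8888, as in the configuration example further down; adjust both to your project.

```javascript
// Gruntfile.js: minimal sketch, not an official template.
function gruntfile(grunt) {
  grunt.initConfig({
    htmlSnapshot: {
      all: {
        options: {
          // Assumed local dev server; replace with your own base URL or path.
          sitePath: 'http://localhost:8888/my-website/',
          // Only the start page is fetched in this sketch.
          urls: ['']
        }
      }
    }
  });

  // Makes the `htmlSnapshot` task available to Grunt.
  grunt.loadNpmTasks('grunt-html-snapshot');
}

module.exports = gruntfile;
```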

htmlSnapshot task

Run this task with the grunt htmlSnapshot command.
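If snapshots should be produced as one step of a larger build, the task can also be wired into an alias task with Grunt's `grunt.registerTask`. The following is a sketch only: the alias name `snapshots` is hypothetical, and it assumes a target called `all` like the one in the configuration example.

```javascript
// Gruntfile.js: sketch of an alias task (the name `snapshots` is hypothetical).
function gruntfileWithAlias(grunt) {
  grunt.loadNpmTasks('grunt-html-snapshot');

  // `grunt snapshots` then runs only the `all` target of htmlSnapshot.
  grunt.registerTask('snapshots', ['htmlSnapshot:all']);
}

module.exports = gruntfileWithAlias;
```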

Configuring the htmlSnapshot task

    grunt.initConfig({
        htmlSnapshot: {
            all: {
              options: {
                //that's the path where the snapshots should be placed
                //it's empty by default which means they will go into the directory
                //where your Gruntfile.js is placed
                snapshotPath: 'snapshots/',
                //This should be either the base path to your index.html file
                //or your base URL. Currently the task does not use its own
                //webserver. So if your site needs a webserver to be fully
                //functional, configure it here.
                sitePath: 'http://localhost:8888/my-website/',
                //you can choose a prefix for your snapshots
                //by default it's 'snapshot_'
                fileNamePrefix: 'sp_',
                //by default the task waits 500ms before fetching the html.
                //this is to give the page enough time to assemble itself.
                //if your page needs more time, tweak here.
                msWaitForPages: 1000,
                //function used to turn a requested URL into a snapshot file name.
                //by default it converts '#!/' to '_'. It receives the request URI
                //and must return the sanitized string
                sanitize: function (requestUri) {
                    //returns 'index.html' if the url ends with '/',
                    //otherwise replaces every '/' with 'prefix-'
                    if (/\/$/.test(requestUri)) {
                      return 'index.html';
                    } else {
                      return requestUri.replace(/\//g, 'prefix-');
                    }
                },
                //if you would rather not keep the script tags in the html snapshots
                //set `removeScripts` to true. It's false by default
                removeScripts: true,
                //to strip the link tags from the html snapshots
                //set `removeLinkTags` to true. It's false by default
                removeLinkTags: true,
                //to strip the meta tags from the html snapshots
                //set `removeMetaTags` to true. It's false by default
                removeMetaTags: true,
                //Replace arbitrary parts of the html
                replaceStrings:[
                    {'this': 'will get replaced by this'},
                    {'/old/path/': '/new/path'}
                ],
                //adds a custom attribute to the body of each snapshot
                bodyAttr: 'data-prerendered',
                //here goes the list of all urls that should be fetched
                urls: [
                  '',
                  '#!/en-gb/showcase'
                ],
                //a list of cookies to put into the PhantomJS cookie jar for the visited page
                cookies: [
                  {"path": "/", "domain": "localhost", "name": "lang", "value": "en-gb"}
                ],
                //options for PhantomJS' page object
                //see http://phantomjs.org/api/webpage/ for available options
                pageOptions: {
                    viewportSize: {
                        width: 1200,
                        height: 800
                    }
                }
              }
            }
        }
    });
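Two of the options above are easier to judge when tried in isolation. The sketch below copies the `sanitize` function from the example and calls it directly, and it adds a hypothetical helper, `applyReplacements`, to illustrate what a `replaceStrings` list might do to a snapshot. The helper is not the plugin's code: it assumes each key is matched as a literal string and that the pairs are applied in order, neither of which is stated above.

```javascript
// The sanitize function from the configuration example, pulled out for testing.
function sanitize(requestUri) {
  if (/\/$/.test(requestUri)) {
    return 'index.html';
  }
  return requestUri.replace(/\//g, 'prefix-');
}

// Hypothetical helper (assumption: literal, in-order replacement).
// Each entry of replaceStrings is an object of { search: replacement } pairs.
function applyReplacements(html, replaceStrings) {
  return replaceStrings.reduce(function (result, pair) {
    return Object.keys(pair).reduce(function (acc, search) {
      // split/join replaces every literal occurrence without regex escaping.
      return acc.split(search).join(pair[search]);
    }, result);
  }, html);
}

console.log(sanitize('/'));                 // index.html
console.log(sanitize('#!/en-gb/showcase')); // #!prefix-en-gbprefix-showcase

console.log(applyReplacements(
  '<a href="/old/path/page">this</a>',
  [{ 'this': 'that' }, { '/old/path/': '/new/path/' }]
)); // <a href="/new/path/page">that</a>
```

Note that the example `sanitize` leaves `#!` in the result, so a stricter function may be preferable if those characters are unwanted in file names.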

Release History

  • 0.6.1 - trigger warnings with grunt.warn(msg, 6) instead of grunt.log(msg)
  • 0.6.0 - Provide a function hook for the file name sanitization (by @mrgamer)
  • 0.5.0 - Add option to set cookies. Also fixed a bug for scenarios where multiple instances of the tasks are being used in parallel.
  • 0.4.0 - Add more sophisticated replace functionality to transform the html output (thanks to @okcoker)
  • 0.3.0 - Escape tabs & introduced new option bodyAttr to place a custom attribute on the body
  • 0.2.1 - fixed a bug where quotes were missing from the html
  • 0.2.0 - added option to remove script tags from the output
  • 0.1.0 - Initial release
