• Stars: 1,577
• Rank: 29,545 (Top 0.6%)
• Language: TypeScript
• License: MIT License
• Created over 13 years ago
• Updated 12 days ago

Repository Details

Sitemap-generating framework for node.js

sitemap

sitemap is a high-level streaming sitemap-generating library/CLI that makes creating sitemap XML files easy. What is a sitemap?

Installation

npm install --save sitemap

Generate a one-time sitemap from a list of URLs

If you just want to turn a giant list of URLs into some sitemaps, try out our CLI. The CLI can also parse, update, and validate existing sitemaps.

npx sitemap < listofurls.txt # `npx sitemap -h` for more examples and a list of options.

For programmatic one-time generation of a sitemap, try:

  const { SitemapStream, streamToPromise } = require( 'sitemap' )
  const { Readable } = require( 'stream' )

  // An array with your links
  const links = [{ url: '/page-1/',  changefreq: 'daily', priority: 0.3  }]

  // Create a stream to write to
  const stream = new SitemapStream( { hostname: 'https://...' } )

  // Return a promise that resolves with your XML string
  return streamToPromise(Readable.from(links).pipe(stream)).then((data) =>
    data.toString()
  )
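For orientation, the XML string that the snippet above resolves to follows the sitemaps.org protocol: a `urlset` root element with one `url` element per link. The sketch below builds that shape by hand with no dependencies. It is not how the sitemap package works internally; `renderSitemap` and `escapeXml` are hypothetical helper names, and `https://example.com` is a placeholder hostname.

```javascript
// Hand-rolled illustration of the sitemap XML shape (sitemaps.org protocol).
// NOT the sitemap package's implementation; helper names are made up.

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Resolve each relative url against the hostname and emit one <url> entry.
function renderSitemap(hostname, links) {
  const entries = links.map((link) => {
    const loc = new URL(link.url, hostname).toString();
    const parts = [`<loc>${escapeXml(loc)}</loc>`];
    if (link.changefreq) {
      parts.push(`<changefreq>${escapeXml(link.changefreq)}</changefreq>`);
    }
    if (link.priority !== undefined) {
      parts.push(`<priority>${link.priority.toFixed(1)}</priority>`);
    }
    return `<url>${parts.join('')}</url>`;
  });
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' +
    entries.join('') +
    '</urlset>'
  );
}

const xml = renderSitemap('https://example.com', [
  { url: '/page-1/', changefreq: 'daily', priority: 0.3 },
]);
console.log(xml);
```

The real library emits additional namespaces and handles many more fields; the point here is only the nesting of `urlset`, `url`, `loc`, `changefreq`, and `priority`.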

Serve a sitemap from a server and periodically update it

Use this if you have fewer than 50,000 URLs. See SitemapAndIndexStream if you have more.

const express = require('express')
const { SitemapStream, streamToPromise } = require('sitemap')
const { createGzip } = require('zlib')
const { Readable } = require('stream')

const app = express()
let sitemap

app.get('/sitemap.xml', function(req, res) {
  res.header('Content-Type', 'application/xml');
  res.header('Content-Encoding', 'gzip');
  // if we have a cached entry send it
  if (sitemap) {
    res.send(sitemap)
    return
  }

  try {
    const smStream = new SitemapStream({ hostname: 'https://example.com/' })
    const pipeline = smStream.pipe(createGzip())

    // pipe your entries or directly write them.
    smStream.write({ url: '/page-1/',  changefreq: 'daily', priority: 0.3 })
    smStream.write({ url: '/page-2/',  changefreq: 'monthly',  priority: 0.7 })
    smStream.write({ url: '/page-3/'})    // changefreq: 'weekly',  priority: 0.5
    smStream.write({ url: '/page-4/',   img: "http://urlTest.com" })
    /* or use
    Readable.from([{url: '/page-1'}...]).pipe(smStream)
    if you are looking to avoid writing your own loop.
    */

    // cache the response
    streamToPromise(pipeline).then(sm => sitemap = sm)
    // make sure to attach a write stream such as streamToPromise before ending
    smStream.end()
    // stream write the response
    pipeline.pipe(res).on('error', (e) => {throw e})
  } catch (e) {
    console.error(e)
    res.status(500).end()
  }
})

app.listen(3000, () => {
  console.log('listening')
});
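The handler above keeps the generated sitemap in the `sitemap` variable for the lifetime of the process, so periodic updates need some form of expiry. One possible approach, sketched below with no dependencies, wraps the cached value in a small time-to-live holder. `createTtlCache`, its `ttlMs` parameter, and the injectable `now` clock are illustrative names, not part of the sitemap package.

```javascript
// Minimal time-to-live holder for a single cached value (illustrative only).
// `now` is injectable so the expiry logic can be exercised without waiting.
function createTtlCache(ttlMs, now = Date.now) {
  let value;
  let storedAt = -Infinity; // nothing stored yet, so the first get() misses
  return {
    // Return the cached value, or undefined once it is ttlMs old or older.
    get() {
      return now() - storedAt < ttlMs ? value : undefined;
    },
    // Store a fresh value and restart the clock.
    set(next) {
      value = next;
      storedAt = now();
    },
  };
}

// Usage sketch: a cache whose entry expires after one hour.
const sitemapCache = createTtlCache(60 * 60 * 1000);
sitemapCache.set(Buffer.from('<urlset/>'));
console.log(sitemapCache.get() !== undefined); // true
```

In the route handler, this would mean replacing the `if (sitemap)` check with a lookup such as `const cached = sitemapCache.get()` and storing the result with `streamToPromise(pipeline).then(sm => sitemapCache.set(sm))`.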

Create sitemap and index files from one large list

If you know you are going to have more than 50,000 URLs in your sitemap, you can use this slightly more complex interface to create a new sitemap every 45,000 entries and add each file to a sitemap index.

const { createReadStream, createWriteStream } = require('fs');
const { resolve } = require('path');
const { createGzip } = require('zlib')
const {
  simpleSitemapAndIndex,
  lineSeparatedURLsToSitemapOptions
} = require('sitemap')

// writes sitemaps and index out to the destination you provide.
simpleSitemapAndIndex({
  hostname: 'https://example.com',
  destinationDir: './',
  sourceData: lineSeparatedURLsToSitemapOptions(
    createReadStream('./your-data.json.txt')
  ),
  // or (only works with node 10.17 and up)
  // sourceData: [{ url: '/page-1/', changefreq: 'daily'}, ...],
  // or
  // sourceData: './your-data.json.txt',
}).then(() => {
  // Do follow up actions
})
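To make the splitting described above concrete, the sketch below shows the arithmetic only: it chunks a list into groups of at most 45,000 entries and derives one index entry per group. It does not use the sitemap package; `chunk`, `planSitemaps`, and the `sitemap-N.xml` file naming are assumptions made for illustration.

```javascript
// Split an array into consecutive groups of at most `size` items.
function chunk(items, size) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Plan which sitemap file each group of entries would go to.
// File naming (sitemap-0.xml, sitemap-1.xml, ...) is a hypothetical convention.
function planSitemaps(items, baseUrl, limit = 45000) {
  return chunk(items, limit).map((group, i) => ({
    indexUrl: new URL(`sitemap-${i}.xml`, baseUrl).toString(),
    count: group.length,
  }));
}

// 100,000 placeholder entries end up in three files: 45,000 + 45,000 + 10,000.
const items = Array.from({ length: 100000 }, (_, i) => ({ url: `/page-${i}/` }));
const plan = planSitemaps(items, 'https://example.com/');
console.log(plan.map((p) => `${p.indexUrl} (${p.count})`).join('\n'));
```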

Want to customize that?

const { createReadStream, createWriteStream } = require('fs');
const { resolve } = require('path');
const { createGzip } = require('zlib')
const { Readable } = require('stream')
const {
  SitemapAndIndexStream,
  SitemapStream,
  lineSeparatedURLsToSitemapOptions
} = require('sitemap')

const sms = new SitemapAndIndexStream({
  limit: 50000, // defaults to 45k
  lastmodDateOnly: false, // set to true to print the date only, without the time
  // SitemapAndIndexStream will call this user provided function every time
  // it needs to create a new sitemap file. You merely need to return a stream
  // for it to write the sitemap urls to and the expected url where that sitemap will be hosted
  getSitemapStream: (i) => {
    const sitemapStream = new SitemapStream({ hostname: 'https://example.com' });
    // if your server automatically serves sitemap.xml.gz when requesting sitemap.xml leave this line be
    // otherwise you will need to add .gz here and remove it a couple lines below so that both the index 
    // and the actual file have a .gz extension
    const path = `./sitemap-${i}.xml`; 

    const ws = sitemapStream
      .pipe(createGzip()) // compress the output of the sitemap
      .pipe(createWriteStream(resolve(path + '.gz'))); // write it to sitemap-NUMBER.xml.gz

    return [new URL(path, 'https://example.com/subdir/').toString(), sitemapStream, ws];
  },
});

// when reading from a file
lineSeparatedURLsToSitemapOptions(
  createReadStream('./your-data.json.txt')
)
.pipe(sms)
.pipe(createGzip())
.pipe(createWriteStream(resolve('./sitemap-index.xml.gz')));

// or reading straight from an in-memory array
sms
.pipe(createGzip())
.pipe(createWriteStream(resolve('./sitemap-index.xml.gz')));

const arrayOfSitemapItems = [{ url: '/page-1/', changefreq: 'daily' } /* , ... */]
Readable.from(arrayOfSitemapItems).pipe(sms) // available as of node 10.17.0
// or
arrayOfSitemapItems.forEach(item => sms.write(item))
sms.end() // necessary to let it know you've got nothing else to write
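The first element returned from `getSitemapStream` is the URL that ends up in the index, so it helps to see how `new URL(path, base)` resolves a relative path. The short sketch below uses only the standard `URL` class; the base URLs are placeholders. Note how a missing trailing slash on the base drops the last path segment.

```javascript
// How a relative sitemap path resolves against different base URLs.
const path = './sitemap-0.xml';

// Base ends with a slash: the file lands inside /subdir/.
const withSlash = new URL(path, 'https://example.com/subdir/').toString();

// Base has no trailing slash: "subdir" is treated as a file name and replaced.
const withoutSlash = new URL(path, 'https://example.com/subdir').toString();

console.log(withSlash);    // https://example.com/subdir/sitemap-0.xml
console.log(withoutSlash); // https://example.com/sitemap-0.xml
```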

Options you can pass

const { SitemapStream, streamToPromise } = require('sitemap');
const smStream = new SitemapStream({
  hostname: 'http://www.mywebsite.com',
  xslUrl: "https://example.com/style.xsl",
  lastmodDateOnly: false, // set to true to print the date only, without the time
  xmlns: { // trim the xml namespace
    news: true, // flip to false to omit the xml namespace for news
    xhtml: true,
    image: true,
    video: true,
    custom: [
      'xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"',
      'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"',
    ],
  }
})
// coalesce stream to value
// alternatively you can pipe to another stream
streamToPromise(smStream).then(console.log)

smStream.write({
  url: '/page1',
  changefreq: 'weekly',
  priority: 0.8, // A hint to the crawler that it should prioritize this over items less than 0.8
})

// each sitemap entry supports many options
// See [Sitemap Item Options](./api.md#sitemap-item-options) below for details
smStream.write({
  url: 'http://test.com/page-1/',
  img: [
    {
      url: 'http://test.com/img1.jpg',
      caption: 'An image',
      title: 'The Title of Image One',
      geoLocation: 'London, United Kingdom',
      license: 'https://creativecommons.org/licenses/by/4.0/'
    },
    {
      url: 'http://test.com/img2.jpg',
      caption: 'Another image',
      title: 'The Title of Image Two',
      geoLocation: 'London, United Kingdom',
      license: 'https://creativecommons.org/licenses/by/4.0/'
    }
  ],
  video: [
    {
      thumbnail_loc: 'http://test.com/tmbn1.jpg',
      title: 'A video title',
      description: 'This is a video'
    },
    {
      thumbnail_loc: 'http://test.com/tmbn2.jpg',
      title: 'A video with an attribute',
      description: 'This is another video',
      'player_loc': 'http://www.example.com/videoplayer.mp4?video=123',
      'player_loc:autoplay': 'ap=1',
      'player_loc:allow_embed': 'yes'
    }
  ],
  links: [
    { lang: 'en', url: 'http://test.com/page-1/' },
    { lang: 'ja', url: 'http://test.com/page-1/ja/' }
  ],
  androidLink: 'android-app://com.company.test/page-1/',
  news: {
    publication: {
      name: 'The Example Times',
      language: 'en'
    },
    genres: 'PressRelease, Blog',
    publication_date: '2008-12-23',
    title: 'Companies A, B in Merger Talks',
    keywords: 'business, merger, acquisition, A, B',
    stock_tickers: 'NASDAQ:A, NASDAQ:B'
  }
})
// indicate there is nothing left to write
smStream.end()
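Because each entry accepts many fields, it can be useful to sanity-check the two most common ones before writing them to the stream. The sketch below checks `changefreq` and `priority` against the values defined by the sitemaps.org protocol (seven frequency keywords, priority between 0.0 and 1.0). `checkItem` is a hypothetical helper, not an export of the sitemap package, and the package may apply its own validation.

```javascript
// Allowed <changefreq> values per the sitemaps.org protocol.
const CHANGE_FREQUENCIES = new Set([
  'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never',
]);

// Return a list of human-readable problems; an empty list means the item passed.
function checkItem(item) {
  const problems = [];
  if (typeof item.url !== 'string' || item.url.length === 0) {
    problems.push('url must be a non-empty string');
  }
  if (item.changefreq !== undefined && !CHANGE_FREQUENCIES.has(item.changefreq)) {
    problems.push(`changefreq "${item.changefreq}" is not a protocol value`);
  }
  if (item.priority !== undefined) {
    const p = item.priority;
    if (typeof p !== 'number' || Number.isNaN(p) || p < 0 || p > 1) {
      problems.push('priority must be a number between 0.0 and 1.0');
    }
  }
  return problems;
}

console.log(checkItem({ url: '/page1', changefreq: 'weekly', priority: 0.8 })); // []
console.log(checkItem({ url: '/page2', changefreq: 'sometimes', priority: 2 }));
```

Returning a list of problems rather than throwing lets a caller decide whether to skip the entry, log it, or abort the whole run.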

Examples

For more examples see the examples directory

API

Full API docs can be found here

License

See LICENSE file.
