
gatsby-plugin-robots-txt

Create robots.txt for your Gatsby site on build.

Install

yarn add gatsby-plugin-robots-txt

or

npm install --save gatsby-plugin-robots-txt

How To Use

gatsby-config.js

module.exports = {
  siteMetadata: {
    siteUrl: 'https://www.example.com'
  },
  plugins: ['gatsby-plugin-robots-txt']
};
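
With this minimal configuration, the build should emit a public/robots.txt roughly like the following (a sketch assuming the defaults listed under Options below; the exact content and ordering are determined by generate-robotstxt):

Sitemap: https://www.example.com/sitemap-index.xml
Host: https://www.example.com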

Options

This plugin uses generate-robotstxt to generate the content of robots.txt, and it supports the following options:

Name       | Type              | Default                                   | Description
host       | String            | ${siteMetadata.siteUrl}                   | Host of your site
sitemap    | String / String[] | ${siteMetadata.siteUrl}/sitemap-index.xml | Path(s) to sitemap.xml
policy     | Policy[]          | []                                        | List of Policy rules
configFile | String            | undefined                                 | Path to an external config file
output     | String            | /robots.txt                               | Path where robots.txt is created
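
Each Policy entry is a plain object in the shape generate-robotstxt expects. A sketch of a fuller rule (crawlDelay and cleanParam come from generate-robotstxt's documented policy options, not from the table above):

{
  userAgent: 'Googlebot',        // crawler this rule applies to
  allow: '/',                    // paths to allow
  disallow: ['/search'],         // paths to block
  crawlDelay: 2,                 // seconds between requests
  cleanParam: 'ref /articles/'   // query parameter to ignore (Yandex-specific)
}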

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        host: 'https://www.example.com',
        sitemap: 'https://www.example.com/sitemap.xml',
        policy: [{userAgent: '*', allow: '/'}]
      }
    }
  ]
};
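
For reference, this configuration should produce a robots.txt along these lines (a sketch; exact ordering is up to generate-robotstxt):

User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml
Host: https://www.example.com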

env-option

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        host: 'https://www.example.com',
        sitemap: 'https://www.example.com/sitemap.xml',
        env: {
          development: {
            policy: [{userAgent: '*', disallow: ['/']}]
          },
          production: {
            policy: [{userAgent: '*', allow: '/'}]
          }
        }
      }
    }
  ]
};

The env key will be taken from process.env.GATSBY_ACTIVE_ENV first (see Gatsby Environment Variables for more information on this variable), falling back to process.env.NODE_ENV. If neither is available, it defaults to development.
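
In other words, the resolution order can be thought of as follows (a sketch of the described behavior, not the plugin's actual source):

// Resolution order for the env key, per the description above
const envKey =
  process.env.GATSBY_ACTIVE_ENV || // checked first
  process.env.NODE_ENV ||          // fallback
  'development';                   // default when neither is set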

You can resolve the env key by using the resolveEnv function:

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        host: 'https://www.example.com',
        sitemap: 'https://www.example.com/sitemap.xml',
        resolveEnv: () => process.env.GATSBY_ENV,
        env: {
          development: {
            policy: [{userAgent: '*', disallow: ['/']}]
          },
          production: {
            policy: [{userAgent: '*', allow: '/'}]
          }
        }
      }
    }
  ]
};

configFile-option

You can use the configFile option to set a specific external configuration:

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        configFile: 'robots-txt.config.js'
      }
    }
  ]
};

robots-txt.config.js

module.exports = {
  host: 'https://www.example.com',
  sitemap: 'https://www.example.com/sitemap.xml',
  policy: [{userAgent: '*'}]
};

Netlify

If you would like to disable crawlers for deploy previews, you can use the following snippet:

gatsby-config.js

const {
  NODE_ENV,
  URL: NETLIFY_SITE_URL = 'https://www.example.com',
  DEPLOY_PRIME_URL: NETLIFY_DEPLOY_URL = NETLIFY_SITE_URL,
  CONTEXT: NETLIFY_ENV = NODE_ENV
} = process.env;
const isNetlifyProduction = NETLIFY_ENV === 'production';
const siteUrl = isNetlifyProduction ? NETLIFY_SITE_URL : NETLIFY_DEPLOY_URL;

module.exports = {
  siteMetadata: {
    siteUrl
  },
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        resolveEnv: () => NETLIFY_ENV,
        env: {
          production: {
            policy: [{userAgent: '*'}]
          },
          'branch-deploy': {
            policy: [{userAgent: '*', disallow: ['/']}],
            sitemap: null,
            host: null
          },
          'deploy-preview': {
            policy: [{userAgent: '*', disallow: ['/']}],
            sitemap: null,
            host: null
          }
        }
      }
    }
  ]
};
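
On a branch deploy or deploy preview, this yields a robots.txt that blocks all crawlers, roughly:

User-agent: *
Disallow: /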

query-option

By default, the site URL comes from the Gatsby node site.siteMetadata.siteUrl. As in Gatsby's sitemap plugin, an optional GraphQL query can be used to provide a different value from another data source, as long as it returns the same shape:

gatsby-config.js

module.exports = {
  plugins: [
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: {
        query: `{
          site: MyCustomDataSource {
            siteMetadata {
              siteUrl
            }
          }
        }`
      }
    }
  ]
};
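
Whatever the data source, the query must resolve to the same shape as the default (MyCustomDataSource above is a placeholder, not a real field):

{
  site: {
    siteMetadata: {
      siteUrl: 'https://www.example.com'
    }
  }
}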

License

MIT