url-metadata
Request an http(s) url and scrape its html metadata. Includes Open Graph Protocol (og:) and Twitter Card meta tags.
JSON-LD is also supported.
Under the hood, this package does some post-request processing on top of the JavaScript-native fetch API.
To report a bug or request a feature, please open an issue or pull request on GitHub.
Usage
Works with Node.js version >=18.0.0, or in the browser when bundled (with browserify or webpack, for example).
If your target environment does not have access to the JavaScript-native fetch API, use the previous version 2.5.0 instead, which relies on the (now-deprecated) request module.
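A feature check is one way to tell whether a target environment meets the fetch requirement. The sketch below is only an illustration and is not part of url-metadata; `hasNativeFetch` is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of url-metadata): reports whether the
// given global object exposes a fetch function.
function hasNativeFetch (globalObject = globalThis) {
  return typeof globalObject.fetch === 'function'
}

if (!hasNativeFetch()) {
  console.warn('No native fetch found; consider url-metadata 2.5.0 instead.')
}
```

Passing the global object as a parameter keeps the check easy to exercise with a plain object in tests.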
Install:
$ npm install url-metadata --save
In your project file:
const urlMetadata = require('url-metadata')

urlMetadata('https://www.npmjs.com/package/url-metadata')
  .then((metadata) => {
    console.log(metadata)
    // do stuff with the metadata
  },
  (err) => {
    console.log(err)
  })
To override the default options (see below), pass in a second argument:
const urlMetadata = require('url-metadata')

urlMetadata('https://www.npmjs.com/package/url-metadata', {
  requestHeaders: {
    'User-Agent': 'foo',
    'From': '[email protected]'
  }
}).then((metadata) => {
  console.log(metadata)
  // do stuff with the metadata
}).catch((err) => {
  console.log(err)
})
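The same call can also be written with async/await. The sketch below assumes only what the examples above show: the exported function takes a url and an optional options object and returns a promise. `fetchMetadataSafely` is a hypothetical helper, and the url-metadata function is passed in as an argument so the helper can be exercised with a stub.

```javascript
// Hypothetical helper (not part of url-metadata). `urlMetadata` is the
// function exported by the package; it is injected so a stub can replace it.
async function fetchMetadataSafely (urlMetadata, url, options = {}) {
  try {
    const metadata = await urlMetadata(url, options)
    return { ok: true, metadata }
  } catch (err) {
    // any rejection from the returned promise lands here
    return { ok: false, error: err }
  }
}

// Usage, assuming the package is installed:
// const urlMetadata = require('url-metadata')
// fetchMetadataSafely(urlMetadata, 'https://www.npmjs.com/package/url-metadata', {
//   requestHeaders: { 'User-Agent': 'foo' }
// }).then((result) => console.log(result))
```

Returning a result object instead of throwing is a design choice for callers that process many urls and want to keep going after a failure.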
Options & Defaults
The module's default options are listed below; any of them can be overridden:
{
  // custom request headers
  requestHeaders: {
    'User-Agent': 'url-metadata/3.0 (npm module)',
    'From': '[email protected]'
  },
  // `fetch` API cache setting for request
  cache: 'no-cache',
  // `fetch` API mode (ex: `cors`, `no-cors`, `same-origin`, etc)
  mode: 'cors',
  // timeout in milliseconds, default is 10 seconds
  timeout: 10000,
  // number of characters to truncate description to
  descriptionLength: 750,
  // force image urls in selected tags to use https,
  // valid for 'image', 'og:image' and 'og:image:secure_url' tags
  ensureSecureImageRequest: true,
  // return raw response body as string
  includeResponseBody: false
}
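Because options are passed as a plain object, a misspelled key is easy to miss. How the package treats unrecognized keys is not stated here, so the sketch below checks an overrides object against the option names listed above before it is passed in. `KNOWN_OPTIONS` and `findUnknownOptions` are hypothetical names used for illustration only.

```javascript
// Option names taken from the defaults listed above.
const KNOWN_OPTIONS = [
  'requestHeaders', 'cache', 'mode', 'timeout',
  'descriptionLength', 'ensureSecureImageRequest', 'includeResponseBody'
]

// Hypothetical helper (not part of url-metadata): returns the keys of
// `overrides` that do not match any documented option name.
function findUnknownOptions (overrides = {}) {
  return Object.keys(overrides).filter((key) => !KNOWN_OPTIONS.includes(key))
}

const overrides = { timeout: 5000, descriptionlength: 200 } // note the typo
console.log(findUnknownOptions(overrides)) // prints [ 'descriptionlength' ]
```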
Returns
Returns a promise that resolves with an object if the response is successful. Note that the url field returned will be the last hop in the request chain, so if you pass in a url that was generated by a url shortener, you'll get back the final destination as the url.
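The last-hop behavior can be used to expand shortened links. The sketch below relies only on the url field described above; `resolveFinalUrl` is a hypothetical helper, the url-metadata function is injected as an argument, and the short url in the usage comment is a placeholder.

```javascript
// Hypothetical helper (not part of url-metadata). `urlMetadata` is the
// function exported by the package, injected so a stub can stand in for it.
async function resolveFinalUrl (urlMetadata, inputUrl) {
  const metadata = await urlMetadata(inputUrl)
  return {
    requested: inputUrl,
    final: metadata.url, // last hop in the request chain
    redirected: metadata.url !== inputUrl
  }
}

// Usage, assuming the package is installed (placeholder short url):
// const urlMetadata = require('url-metadata')
// resolveFinalUrl(urlMetadata, 'https://short.example/abc')
//   .then((result) => console.log(result.final))
```

The `redirected` flag is a plain string comparison, so a difference such as a trailing slash would also count as a change.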