• Stars
    star
    650
  • Rank 69,267 (Top 2 %)
  • Language
    TypeScript
  • License
    MIT License
  • Created about 8 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

⛓ Extract web links information: title, description, images, videos, etc. [via OpenGraph], runs on mobiles and node.

Repo no longer maintained

Sorry! No energy or time. Feel free to fork it and publish your own version.

Link Preview JS

npm i link-preview-js

Before creating an issue

It's more than likely there is nothing wrong with the library:

  • It's very simple; fetch html, parse html, look for OpenGraph html tags.
  • Unless HTML or the OpenGraph standard change, the library will not break
  • If the target website you are trying to preview redirects you to a login page the preview will fail, because it will parse the login page
  • If the target website does not have OpenGraph tags the preview will most likely fail, there are some fallbacks but in general it will not work
  • You cannot preview (fetch) another web page from YOUR web page. This is an intentional security feature of browsers called CORS

If you use this library and find it useful please consider sponsoring me, open source takes a lot of time and effort.

Link Preview

Allows you to extract information from a HTTP url/link (or parse a HTML string) and retrieve meta information such as title, description, images, videos, etc. via OpenGraph tags.

GOTCHAs

  • You cannot request a different domain from your web app (Browsers block cross-origin-requests). If you don't know how same-origin-policy works, here is a good intro, therefore this library works on node (back-end environments) and certain mobile run-times (cordova or react-native).
  • This library acts as if the user would visit the page, sites might re-direct you to sign-up pages, consent screens, etc. You can try to change the user-agent header (try with google-bot or with Twitterbot), but you need to work around these issues yourself.

API

getLinkPreview: you have to pass a string, doesn't matter if it is just a URL or a piece of text that contains a URL, the library will take care of parsing it and returning the info of first valid HTTP(S) URL info it finds.

getPreviewFromContent: useful for passing a pre-fetched Response object from an existing async/etc. call. Refer to example below for required object values.

import { getLinkPreview, getPreviewFromContent } from "link-preview-js";

// pass the link directly
getLinkPreview("https://www.youtube.com/watch?v=MejbOFk7H6c").then((data) =>
  console.debug(data)
);

////////////////////////// OR //////////////////////////

// pass a chunk of text
getLinkPreview(
  "This is a text supposed to be parsed and the first link displayed https://www.youtube.com/watch?v=MejbOFk7H6c"
).then((data) => console.debug(data));

////////////////////////// OR //////////////////////////

// pass a pre-fetched response object
// The passed response object should include, at minimum:
// {
//   data: '<!DOCTYPE...><html>...',     // response content
//   headers: {
//     ...
//     // should include content-type
//     content-type: "text/html; charset=ISO-8859-1",
//     ...
//   },
//   url: 'https://domain.com/'          // resolved url
// }
yourAjaxCall(url, (response) => {
  getPreviewFromContent(response).then((data) => console.debug(data));
});

Options

Additionally you can pass an options object which should add more functionality to the parsing of the link

Property Name Result
imagesPropertyType (optional) (ex: 'og') Fetches images only with the specified property, meta[property='${imagesPropertyType}:image']
headers (optional) (ex: { 'user-agent': 'googlebot', 'Accept-Language': 'en-US' }) Add request headers to fetch call
timeout (optional) (ex: 1000) Timeout for the request to fail
followRedirects (optional) (default 'error') For security reasons, the library does not automatically follow redirects ('error' value), a malicious agent can exploit redirects to steal data, posible values: ('error', 'follow', 'manual')
handleRedirects (optional) (with followRedirects 'manual') When followRedirects is set to 'manual' you need to pass a function that validates if the redirectinon is secure, below you can find an example
resolveDNSHost (optional) Function that resolves the final address of the detected/parsed URL to prevent SSRF attacks
getLinkPreview("https://www.youtube.com/watch?v=MejbOFk7H6c", {
  imagesPropertyType: "og", // fetches only open-graph images
  headers: {
    "user-agent": "googlebot" // fetches with googlebot crawler user agent
    "Accept-Language": "fr-CA", // fetches site for French language
    // ...other optional HTTP request headers
  },
  timeout: 1000
}).then(data => console.debug(data));

SSRF Concerns

Doing requests on behalf of your users or using user provided URLs is dangerous. One of such attacks is a trying to fetch a domain which redirects to localhost and so the users getting the contents of your server (doesn't affect mobile runtimes). In order to mittigate this attack you can use the resolveDNSHost option:

// example how to use node's dns resolver
const dns = require("node:dns");
getLinkPreview("http://maliciousLocalHostRedirection.com", {
  resolveDNSHost: async (url: string) => {
    return new Promise((resolve, reject) => {
      const hostname = new URL(url).hostname;
      dns.lookup(hostname, (err, address, family) => {
        if (err) {
          reject(err);
          return;
        }

        resolve(address); // if address resolves to localhost or '127.0.0.1' library will throw an error
      });
    });
  },
}).catch((e) => {
  // will throw a detected redirection to localhost
});

This might add some latency to your request but prevents loopback attacks.

Redirections

Same as SSRF, following redirections is dangerous, the library errors by default when the response tries to redirect the user. There are however some simple redirections which are valid (e.g. http to https) and you might want to allow, you can do it via:

await getLinkPreview(`http://google.com/`, {
  followRedirects: `manual`,
  handleRedirects: (baseURL: string, forwardedURL: string) => {
    const urlObj = new URL(baseURL);
    const forwardedURLObj = new URL(forwardedURL);
    if (
      forwardedURLObj.hostname === urlObj.hostname ||
      forwardedURLObj.hostname === "www." + urlObj.hostname ||
      "www." + forwardedURLObj.hostname === urlObj.hostname
    ) {
      return true;
    } else {
      return false;
    }
  },
});

Response

Returns a Promise that resolves with an object describing the provided link. The info object returned varies depending on the content type (MIME type) returned in the HTTP response (see below for variations of response). Rejects with an error if response can not be parsed or if there was no URL in the text provided.

Text/HTML URL

{
  url: "https://www.youtube.com/watch?v=MejbOFk7H6c",
  title: "OK Go - Needing/Getting - Official Video - YouTube",
  siteName: "YouTube",
  description: "Buy the video on iTunes: https://itunes.apple.com/us/album/needing-getting-bundle-ep/id508124847 See more about the guitars at: http://www.gretschguitars.com...",
  images: ["https://i.ytimg.com/vi/MejbOFk7H6c/maxresdefault.jpg"],
  mediaType: "video.other",
  contentType: "text/html; charset=utf-8",
  videos: [],
  favicons:["https://www.youtube.com/yts/img/favicon_32-vflOogEID.png","https://www.youtube.com/yts/img/favicon_48-vflVjB_Qk.png","https://www.youtube.com/yts/img/favicon_96-vflW9Ec0w.png","https://www.youtube.com/yts/img/favicon_144-vfliLAfaB.png","https://s.ytimg.com/yts/img/favicon-vfl8qSV2F.ico"]
}

Image URL

{
  url: "https://media.npr.org/assets/img/2018/04/27/gettyimages-656523922nunes-4bb9a194ab2986834622983bb2f8fe57728a9e5f-s1100-c15.jpg",
  mediaType: "image",
  contentType: "image/jpeg",
  favicons: [ "https://media.npr.org/favicon.ico" ]
}

Audio URL

{
  url: "https://ondemand.npr.org/anon.npr-mp3/npr/atc/2007/12/20071231_atc_13.mp3",
  mediaType: "audio",
  contentType: "audio/mpeg",
  favicons: [ "https://ondemand.npr.org/favicon.ico" ]
}

Video URL

{
  url: "https://www.w3schools.com/html/mov_bbb.mp4",
  mediaType: "video",
  contentType: "video/mp4",
  favicons: [ "https://www.w3schools.com/favicon.ico" ]
}

Application URL

{
  url: "https://assets.curtmfg.com/masterlibrary/56282/installsheet/CME_56282_INS.pdf",
  mediaType: "application",
  contentType: "application/pdf",
  favicons: [ "https://assets.curtmfg.com/favicon.ico" ]
}

License

MIT license

More Repositories

1

sol

MacOS launcher & command palette
TypeScript
1,921
star
2

react-native-quick-sqlite

Fast SQLite for react-native.
477
star
3

turbo-secure-storage

Secure Storage Turbo Module for React Native
Java
102
star
4

react-native-macos-menubar-template

A template project for a macOS menu bar/tray app with react-native-mac-os
Ruby
84
star
5

react-native-bump-version

Small script I use to bump my react-native releases (`yarn bump`)
Shell
42
star
6

react-native-jsi-template

Template for react-native jsi module
Java
34
star
7

osp-toolkit

TypeScript
10
star
8

jsi_benchmark

Java
10
star
9

bin

Personal utility scripts
Shell
8
star
10

messer

Messer WAS a native macOS menu bar app to quickly do simple image manipulations: resize, convert, pad, etc
Swift
6
star
11

generative_rust

Rust
5
star
12

jsi-cpr-test

Java
5
star
13

RNAppClip

Demo of app clip with the new arch enabled
Swift
3
star
14

raycast_google_translate

Translate on raycast via google translate
TypeScript
2
star
15

cidemon

👹 MacOS menu bar app to monitor your CI jobs/deployments
TypeScript
2
star
16

google_translate_supported_languages

A list of supported languages in Google Translate
1
star
17

cidemon_issues

Repo containing CI Demon issues and request
1
star
18

messer_site

HTML
1
star
19

site_sol

JavaScript
1
star
20

expo-sqlite-benchmark

TypeScript
1
star
21

awesome_turbo_module

Just a repo trying to get turbo modules + codegen (typescript) to work
Java
1
star
22

libcprtest

Testing how to compile/link libcpr into an iOS/RN project
CMake
1
star
23

osp-haskell

Algorithms/Competitive Programming in haskell
Haskell
1
star
24

ospfranco.github.io

My personal site, uses Jekyll
HTML
1
star
25

advanced-algorithms

UMSS Advanced Algorithms class, mostly competitive exercises
Java
1
star