Linkify

Linkify is a Rust library to find links such as URLs and email addresses in plain text. It's smart about where a link ends, such as with trailing punctuation.

Introduction

Your reaction might be: "Do I need a library for this? Why not a regex?" Let's look at a few cases:

  • In http://example.com/. the link should not include the trailing dot
  • http://example.com/, should not include the trailing comma
  • (http://example.com/) should not include the parens

Seems simple enough. But then we also have these cases:

  • https://en.wikipedia.org/wiki/Link_(The_Legend_of_Zelda) should include the trailing paren
  • http://üñîçøðé.com/ä should also work for Unicode (including Emoji and Punycode)
  • <http://example.com/> should not include angle brackets

This library behaves as you'd expect in the above cases and many more. It uses a simple scan with linear runtime.
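
To make the cases above concrete, here is a minimal sketch of a single-pass scan that decides where a link ends. It is not linkify's actual implementation: `trim_link_end` is a hypothetical helper written for illustration, it assumes the candidate already starts at the beginning of the URL, and it only knows about the punctuation listed in its comments.

```rust
/// Hypothetical helper (not part of linkify): given text that starts at a
/// URL and may run into surrounding punctuation, return the part that should
/// be treated as the link. One pass over the characters, so linear runtime.
fn trim_link_end(candidate: &str) -> &str {
    let mut end = 0; // byte offset just past the last character kept
    let mut open_parens = 0usize; // '(' seen inside the link and not yet closed

    for (i, c) in candidate.char_indices() {
        let next = i + c.len_utf8();
        match c {
            // Angle brackets and whitespace always terminate the link.
            '<' | '>' => break,
            c if c.is_whitespace() => break,
            // An opening paren is only kept if something follows it.
            '(' => {
                open_parens += 1;
            }
            // A closing paren belongs to the link only if it closes a paren
            // that was opened inside the link.
            ')' => {
                if open_parens == 0 {
                    break;
                }
                open_parens -= 1;
                end = next;
            }
            // Allowed in the middle of a link, but never as its last character.
            '.' | ',' | ':' | ';' | '?' | '!' => {}
            // Everything else, including non-ASCII, extends the link.
            _ => end = next,
        }
    }
    &candidate[..end]
}
```

With this sketch, `trim_link_end("http://example.com/.")` yields `http://example.com/`, while the Wikipedia URL above keeps its closing paren because the matching `(` was seen earlier in the same link.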

In addition to URLs, it can also find email addresses.

Demo 🧑‍🔬

Try it out online on the demo playground (Rust compiled to WebAssembly): https://robinst.github.io/linkify/

If you want to use it on the command line, try lychee. It uses linkify to extract all links and checks if they're valid, but it can also just print them like this:

$ echo 'Test https://example.org (and https://example.com)' | lychee --dump -
https://example.org/
https://example.com/

Usage

Basic usage:

use linkify::{LinkFinder, LinkKind};

let input = "Have you seen http://example.com?";
let finder = LinkFinder::new();
let links: Vec<_> = finder.links(input).collect();

assert_eq!(1, links.len());
let link = &links[0];

assert_eq!("http://example.com", link.as_str());
assert_eq!(14, link.start());
assert_eq!(32, link.end());
assert_eq!(&LinkKind::Url, link.kind());
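
The start and end offsets can be used to rewrite the input, for example to turn each link into an HTML anchor. The sketch below uses only the standard library: `wrap_spans` is a hypothetical helper, not part of linkify, and it assumes the offsets are byte indices into the input, as the values 14 and 32 in the example above suggest. It does no HTML escaping, which real code would need.

```rust
/// Hypothetical helper (not part of linkify): wrap each `(start, end)` byte
/// span of `input` in an HTML anchor. Spans are assumed to be sorted,
/// non-overlapping, and to fall on character boundaries.
fn wrap_spans(input: &str, spans: &[(usize, usize)]) -> String {
    let mut out = String::with_capacity(input.len());
    let mut last = 0; // end of the previous span
    for &(start, end) in spans {
        let url = &input[start..end];
        // Copy the plain text between the previous link and this one.
        out.push_str(&input[last..start]);
        out.push_str("<a href=\"");
        out.push_str(url);
        out.push_str("\">");
        out.push_str(url);
        out.push_str("</a>");
        last = end;
    }
    // Copy whatever follows the last link.
    out.push_str(&input[last..]);
    out
}
```

The spans could be collected from the finder with something like `finder.links(input).map(|l| (l.start(), l.end()))`.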

Option to allow URLs without schemes:

use linkify::LinkFinder;

let input = "Look, no scheme: example.org/foo";
let mut finder = LinkFinder::new();

// true by default
finder.url_must_have_scheme(false);

let links: Vec<_> = finder.links(input).collect();
assert_eq!(links[0].as_str(), "example.org/foo");

Restrict the kinds of links:

use linkify::{LinkFinder, LinkKind};

let input = "http://example.com and foo@example.com";
let mut finder = LinkFinder::new();
finder.kinds(&[LinkKind::Email]);
let links: Vec<_> = finder.links(input).collect();

assert_eq!(1, links.len());
let link = &links[0];
assert_eq!("foo@example.com", link.as_str());
assert_eq!(&LinkKind::Email, link.kind());

See full documentation on docs.rs.

Conformance

This crate makes an effort to respect the relevant standards for URLs and email addresses.

At the same time, it does not guarantee that the returned links are valid. If in doubt, it returns a link rather than skipping it.

If you need to validate URLs, e.g. for checking TLDs, use another library on the returned links.
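
As an illustration of such a post-filtering step, the sketch below checks the top-level domain of a returned link against an allow-list. `has_allowed_tld` is a hypothetical function, not part of linkify, and its host extraction is deliberately naive (it ignores userinfo and IPv6 literals, for instance). For real validation a proper URL parsing library is the better choice.

```rust
/// Hypothetical check (not part of linkify): does the host of `link` end in
/// one of the `allowed` top-level domains? The host extraction is naive.
fn has_allowed_tld(link: &str, allowed: &[&str]) -> bool {
    // Drop the scheme, if there is one.
    let rest = link.split_once("://").map_or(link, |(_, rest)| rest);
    // The host ends at the first path, query, or fragment delimiter.
    let host = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Ignore a port suffix such as ":8080".
    let host = host.split(':').next().unwrap_or("");
    // The TLD is whatever follows the last dot; hosts without a dot fail.
    match host.rsplit_once('.') {
        Some((_, tld)) => allowed.iter().any(|a| a.eq_ignore_ascii_case(tld)),
        None => false,
    }
}
```

Applied to the strings returned by `link.as_str()`, a filter like this would keep `http://example.com/` for the allow-list `["com", "org"]` and drop `http://example.invalid/x`.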

Contributing

Pull requests, issues and comments welcome! Make sure to add tests for new features and bug fixes.

License

Linkify is distributed under the terms of both the MIT license and the Apache License (Version 2.0). See LICENSE-APACHE and LICENSE-MIT for details. Opening a pull request is assumed to signal agreement with these licensing terms.
