Public Suffix for Ruby

PublicSuffix is a Ruby domain name parser based on the Public Suffix List.

Requirements

PublicSuffix requires Ruby >= 2.6. For older versions of Ruby, use a previous release.

Installation

You can install the gem manually:

gem install public_suffix

Or use Bundler and define it as a dependency in your Gemfile:

gem 'public_suffix'

If you are upgrading to 2.0, see 2.0-Upgrade.md.

Usage

Extract the domain from a name:

PublicSuffix.domain("google.com")
# => "google.com"
PublicSuffix.domain("www.google.com")
# => "google.com"
PublicSuffix.domain("www.google.co.uk")
# => "google.co.uk"

Parse a domain without subdomains:

domain = PublicSuffix.parse("google.com")
# => #<PublicSuffix::Domain>
domain.tld
# => "com"
domain.sld
# => "google"
domain.trd
# => nil
domain.domain
# => "google.com"
domain.subdomain
# => nil

Parse a domain with subdomains:

domain = PublicSuffix.parse("www.google.com")
# => #<PublicSuffix::Domain>
domain.tld
# => "com"
domain.sld
# => "google"
domain.trd
# => "www"
domain.domain
# => "google.com"
domain.subdomain
# => "www.google.com"

Simple validation example:

PublicSuffix.valid?("google.com")
# => true

PublicSuffix.valid?("www.google.com")
# => true

# Explicitly forbidden, it is listed as a private domain
PublicSuffix.valid?("blogspot.com")
# => false

# Unknown/not-listed TLD domains are valid by default
PublicSuffix.valid?("example.tldnotlisted")
# => true

Strict validation (without applying the default * rule):

PublicSuffix.valid?("example.tldnotlisted", default_rule: nil)
# => false

Fully Qualified Domain Names

This library automatically recognizes Fully Qualified Domain Names. An FQDN is a domain name that ends with a trailing dot.

# Parse a standard domain name
PublicSuffix.domain("www.google.com")
# => "google.com"

# Parse a fully qualified domain name
PublicSuffix.domain("www.google.com.")
# => "google.com"
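
As a rough illustration of why both forms above yield the same result, a trailing dot can be dropped before the name is split into labels. The helper name `strip_root_dot` is hypothetical and is not part of PublicSuffix; this is a minimal sketch of the idea, not the library's implementation.

```ruby
# Hypothetical helper (not part of PublicSuffix): remove a single trailing
# dot so that an FQDN and its relative form compare equal.
def strip_root_dot(name)
  # String#chomp with an argument removes at most one trailing occurrence.
  name.chomp(".")
end

strip_root_dot("www.google.com.")
# => "www.google.com"
strip_root_dot("www.google.com")
# => "www.google.com"
```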

Private domains

This library supports switching off private (non-ICANN) domains.

# Extract a domain including private domains (by default)
PublicSuffix.domain("something.blogspot.com")
# => "something.blogspot.com"

# Extract a domain excluding private domains
PublicSuffix.domain("something.blogspot.com", ignore_private: true)
# => "blogspot.com"

# It also works for #parse and #valid?
PublicSuffix.parse("something.blogspot.com", ignore_private: true)
PublicSuffix.valid?("something.blogspot.com", ignore_private: true)

If you don't care about private domains at all, it's more efficient to exclude them when the list is parsed:

# Disable support for private TLDs
PublicSuffix::List.default = PublicSuffix::List.parse(File.read(PublicSuffix::List::DEFAULT_LIST_PATH), private_domains: false)
# => #<PublicSuffix::List>
PublicSuffix.domain("something.blogspot.com")
# => "blogspot.com"

Add domain to list

If you want to manually add a domain to the list, run:

PublicSuffix::List.default << PublicSuffix::Rule.factory('onmicrosoft.com')

What is the Public Suffix List?

The Public Suffix List is a cross-vendor initiative to provide an accurate list of domain name suffixes.

The Public Suffix List is an initiative of the Mozilla Project, but is maintained as a community resource. It is available for use in any software, but was originally created to meet the needs of browser manufacturers.

A "public suffix" is one under which Internet users can directly register names. Some examples of public suffixes are ".com", ".co.uk" and "pvt.k12.wy.us". The Public Suffix List is a list of all known public suffixes.

Why is the Public Suffix List better than any available regular expression parser?

Previously, browsers used an algorithm which basically only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for co.uk which will be passed onto every website registered under co.uk.

Clearly, this was a security risk as it allowed websites other than the one setting the cookie to read it, and therefore potentially extract sensitive information.

Since there is no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list of all top-level domains and the level at which domains can be registered. This is the aim of the effective TLD list.

As well as being used to prevent cookies from being set where they shouldn't be, the list can also potentially be used for other applications where the registry controlled and privately controlled parts of a domain name need to be known, for example when grouping by top-level domains.

Source: https://wiki.mozilla.org/Public_Suffix_List
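
To make the list-based approach concrete, here is a minimal sketch of suffix matching against a tiny hard-coded sample of rules. It is not how PublicSuffix is implemented. The constant `SAMPLE_RULES` and the methods `rule_matches?`, `public_suffix_for` and `registrable_domain` are hypothetical names chosen for illustration. The sketch assumes the usual list conventions: the longest matching rule wins, `*` matches any single label, a rule starting with `!` is an exception, and an unlisted name falls back to the default `*` rule.

```ruby
# Tiny hard-coded sample; the real list contains thousands of rules.
SAMPLE_RULES = ["com", "uk", "co.uk", "*.ck", "!www.ck"].freeze

# True when the rule's labels match the rightmost labels of the name.
def rule_matches?(rule_labels, name_labels)
  return false if rule_labels.size > name_labels.size

  tail = name_labels.last(rule_labels.size)
  rule_labels.zip(tail).all? { |rule, label| rule == "*" || rule == label }
end

# Returns the public suffix of a name according to the given rules.
def public_suffix_for(name, rules = SAMPLE_RULES)
  labels = name.downcase.chomp(".").split(".")
  exceptions, normal = rules.partition { |rule| rule.start_with?("!") }

  # An exception rule wins outright; its suffix drops the leftmost label.
  exceptions.each do |rule|
    rule_labels = rule.delete_prefix("!").split(".")
    return rule_labels.drop(1).join(".") if rule_matches?(rule_labels, labels)
  end

  # Otherwise the longest matching rule wins. With no match at all,
  # fall back to the default "*" rule (a one-label suffix).
  best = normal.map { |rule| rule.split(".") }
               .select { |rule_labels| rule_matches?(rule_labels, labels) }
               .max_by(&:size)
  labels.last(best ? best.size : 1).join(".")
end

# Returns the public suffix plus one more label, or nil if the name
# is itself a public suffix.
def registrable_domain(name, rules = SAMPLE_RULES)
  labels = name.downcase.chomp(".").split(".")
  suffix_size = public_suffix_for(name, rules).split(".").size
  return nil if labels.size <= suffix_size

  labels.last(suffix_size + 1).join(".")
end

registrable_domain("www.google.com")
# => "google.com"
registrable_domain("www.google.co.uk")
# => "google.co.uk"
registrable_domain("co.uk")
# => nil

# Grouping hostnames by the privately controlled part of the name.
hosts = ["www.google.com", "mail.google.com", "www.google.co.uk"]
hosts.group_by { |host| registrable_domain(host) }
# => {"google.com"=>["www.google.com", "mail.google.com"], "google.co.uk"=>["www.google.co.uk"]}
```

A regular expression that takes "the last two labels" would report co.uk for the third host, which is exactly the failure the list is meant to prevent.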

Not convinced yet? Check out this real world example.

Does PublicSuffix make requests to the Public Suffix List website?

No. PublicSuffix comes with a bundled list. It does not make any HTTP requests to parse or validate a domain.

Support

Library documentation is auto-generated from the README and the source code, and it's available at https://rubydoc.info/gems/public_suffix.

Consider subscribing to Tidelift which provides Enterprise support for this project as part of the Tidelift Subscription. Tidelift subscriptions also help the maintainers by funding the project, which in turn allows us to ship releases, bugfixes, and security updates more often.

Security and Vulnerability Reporting

For full information and a description of our security policy, please see SECURITY.md.

Changelog

See the CHANGELOG.md file for details.

License

Copyright (c) 2009-2023 Simone Carletti. This is Free Software distributed under the MIT license.

The Public Suffix List source is subject to the terms of the Mozilla Public License, v. 2.0.

Definitions

tld = Top-level domain, the last segment of a domain name, i.e. the part after the final "dot". For example, in mozilla.org, the .org portion is the tld.

sld = Second level domain, a domain that is directly below a top-level domain. For example, in https://www.mozilla.org/en-US/, mozilla is the second-level domain of the .org tld.

trd = Transit routing domain, also known as a subdomain. This is the part of the domain that comes before the sld or root domain. For example, in https://www.mozilla.org/en-US/, www is the trd.

FQDN = Fully Qualified Domain Name, a domain name that is written with the hostname and the domain name and includes the top-level domain. The format looks like [hostname].[domain].[tld]. For example: [www].[mozilla].[org].
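
The relationship between these parts can be sketched in a few lines of plain Ruby. The method `split_name` below is a hypothetical helper, not part of PublicSuffix. It assumes the caller already knows the public suffix and that the name really ends with it.

```ruby
# Hypothetical helper (not part of PublicSuffix): split a name into
# [trd, sld, tld], given a public suffix the name is known to end with.
def split_name(name, suffix)
  labels = name.chomp(".").split(".")
  suffix_size = suffix.split(".").size

  # Everything to the left of the suffix.
  rest = labels[0...(labels.size - suffix_size)]
  sld = rest.pop                            # label directly left of the suffix
  trd = rest.empty? ? nil : rest.join(".")  # anything further left, if present
  [trd, sld, suffix]
end

split_name("www.google.com", "com")
# => ["www", "google", "com"]
split_name("google.com", "com")
# => [nil, "google", "com"]
split_name("a.b.google.co.uk", "co.uk")
# => ["a.b", "google", "co.uk"]
```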
