Domain name parser for Go based on the Public Suffix List.

Public Suffix for Go

The package publicsuffix provides a Go domain name parser based on the Public Suffix List.

Currently, publicsuffix-go requires Go version 1.9 or greater. We do our best not to break older versions of Go if we don't have to, but due to tooling constraints, we don't always test older versions.

Getting started

Clone the repository in your workspace and move into it:

mkdir -p $GOPATH/src/github.com/weppos && cd $_
git clone git@github.com:weppos/publicsuffix-go.git
cd publicsuffix-go

Fetch the dependencies:

go get ./...

Run the test suite:

go test ./...

Testing

The following command runs the entire test suite.

go test ./...

There are 3 different test suites built into this library:

  • Acceptance: the acceptance test suite contains some high level tests to ensure the library behaves as expected
  • PSL: the PSL test suite runs the library against the official Public Suffix test cases
  • Unit: the unit test suite stresses the various single components of this package

Installation

go get github.com/weppos/publicsuffix-go

Usage

This is a simple example that demonstrates how to use the package with the default options and the default Public Suffix list packaged with the library.

package main

import (
    "fmt"

    "github.com/weppos/publicsuffix-go/publicsuffix"
)

func main() {
    // Extract the domain from a string
    // using the default list
    fmt.Println(publicsuffix.Domain("example.com"))             // example.com
    fmt.Println(publicsuffix.Domain("www.example.com"))         // example.com
    fmt.Println(publicsuffix.Domain("example.co.uk"))           // example.co.uk
    fmt.Println(publicsuffix.Domain("www.example.co.uk"))       // example.co.uk

    // Parse the domain from a string
    // using the default list
    fmt.Println(publicsuffix.Parse("example.com"))             // &DomainName{"com", "example", ""}
    fmt.Println(publicsuffix.Parse("www.example.com"))         // &DomainName{"com", "example", "www"}
    fmt.Println(publicsuffix.Parse("example.co.uk"))           // &DomainName{"co.uk", "example", ""}
    fmt.Println(publicsuffix.Parse("www.example.co.uk"))       // &DomainName{"co.uk", "example", "www"}
}

Ignoring Private Domains

The PSL is composed of two lists of suffixes: IANA suffixes and Private Domains.

Private domains are submitted by private organizations. By default, private domains are not ignored. If you want to ignore them and query only the IANA suffixes, you have two options:

  1. Ignore the domains at runtime
  2. Create a custom list without the private domains

In the first case, the private domains are ignored at runtime: they will still be included in the lists but the lookup will skip them when found.

publicsuffix.DomainFromListWithOptions(publicsuffix.DefaultList(), "google.blogspot.com", nil)
// google.blogspot.com

publicsuffix.DomainFromListWithOptions(publicsuffix.DefaultList(), "google.blogspot.com", &publicsuffix.FindOptions{IgnorePrivate: true})
// blogspot.com

// Note that the DefaultFindOptions includes the private domains by default
publicsuffix.DomainFromListWithOptions(publicsuffix.DefaultList(), "google.blogspot.com", publicsuffix.DefaultFindOptions)
// google.blogspot.com

This solution is easy, but slower. If you find yourself ignoring the private domains in all cases (or in most cases), you may want to create a custom list without the private domains.

list, err := publicsuffix.NewListFromFile("path/to/list.txt", &publicsuffix.ParserOption{PrivateDomains: false})
if err != nil {
    // handle the error (e.g. the file is missing or unreadable)
}
publicsuffix.DomainFromListWithOptions(list, "google.blogspot.com", nil)
// blogspot.com
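
To make the runtime option more concrete, the following toy lookup shows what "skipping private rules when found" means. It is a simplified sketch and not the library's implementation: the `rule` type, the `findSuffix` and `registrable` functions, and the two-rule list are all hypothetical, and wildcard and exception rules are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical, simplified suffix rule (no wildcards or exceptions).
type rule struct {
	suffix  string
	private bool
}

// findSuffix returns the longest rule suffix that matches name.
// When ignorePrivate is true, private rules stay in the list but are skipped.
func findSuffix(rules []rule, name string, ignorePrivate bool) string {
	best := ""
	for _, r := range rules {
		if ignorePrivate && r.private {
			continue
		}
		if name == r.suffix || strings.HasSuffix(name, "."+r.suffix) {
			if len(r.suffix) > len(best) {
				best = r.suffix
			}
		}
	}
	return best
}

// registrable returns the matched suffix plus the one label to its left,
// or an empty string when no such label exists.
func registrable(rules []rule, name string, ignorePrivate bool) string {
	suffix := findSuffix(rules, name, ignorePrivate)
	if suffix == "" || name == suffix {
		return ""
	}
	rest := strings.TrimSuffix(name, "."+suffix)
	labels := strings.Split(rest, ".")
	return labels[len(labels)-1] + "." + suffix
}

func main() {
	rules := []rule{
		{suffix: "com", private: false},
		{suffix: "blogspot.com", private: true},
	}
	fmt.Println(registrable(rules, "google.blogspot.com", false)) // google.blogspot.com
	fmt.Println(registrable(rules, "google.blogspot.com", true))  // blogspot.com
}
```

In this sketch every lookup with `ignorePrivate` set still walks past the private rules, which hints at why a list parsed without them can be the better choice when private domains are never needed.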

IDN domains, A-labels and U-labels

A-label and U-label are two different ways to represent IDN domain names. The two encodings are also known as ASCII or Punycode (A-label) and Unicode (U-label). Conversions between U-labels and A-labels are performed according to the "Punycode" specification, adding or removing the ACE prefix as needed.

IDNA-aware applications generally use the A-label form for storing and manipulating data, whereas the U-labels can appear in presentation and user interface forms.

Although the PSL list has been traditionally U-label encoded, this library follows the common industry standards and stores the rules in their A-label form. Therefore, unless explicitly mentioned, any method call, comparison or internal representation is expected to be ASCII-compatible encoded (ACE).

Passing Unicode names to the library may result in either an error or unexpected behavior.

If you are interested in the details of this decision, you can read the full discussion here.
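
Because names are expected in A-label form, one defensive option is to reject non-ASCII input before it reaches the library. The following is a minimal sketch using only the standard library; `isASCII` is a hypothetical helper and not part of this package. Converting U-labels to A-labels is left to an IDNA library of your choice.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCII reports whether name contains only ASCII bytes,
// i.e. whether it could already be in A-label form.
// It does not validate that the name is a well-formed domain.
func isASCII(name string) bool {
	for i := 0; i < len(name); i++ {
		if name[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCII("xn--mnchen-3ya.de")) // true
	fmt.Println(isASCII("münchen.de"))        // false
}
```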

Differences with golang.org/x/net/publicsuffix

golang.org/x/net/publicsuffix is part of the Go x/net module and provides a public suffix list implementation.

The main difference is that the x/net package is optimized for speed, but it's less flexible. The list is compiled and embedded into the package itself, which is also its main downside: the embedded list is not frequently refreshed, hence the results may be inaccurate, in particular if you rely heavily on the Private Domains section of the list. Changes in the IANA section are less frequent, whereas changes in the Private Domains section happen weekly.

This package provides the following extra features:

  • Ability to load an arbitrary list at runtime (e.g. you can feed your own list, or create multiple lists)
  • Ability to create multiple lists
  • Ability to parse a domain using a previously defined list
  • Ability to add custom rules to an existing list, or merge/load rules from other lists (provided as file or string)
  • Advanced access to the list rules
  • Ability to ignore private domains at runtime, or when the list is parsed

This package also aims for 100% compatibility with the x/net package. A special adapter is provided as a drop-in replacement. Simply change the import statement from

import (
    "golang.org/x/net/publicsuffix"
)

to

import (
    "github.com/weppos/publicsuffix-go/net/publicsuffix"
)

The github.com/weppos/publicsuffix-go/net/publicsuffix package defines the same methods defined in golang.org/x/net/publicsuffix, but these methods are implemented using the github.com/weppos/publicsuffix-go/publicsuffix package.

Note that the adapter doesn't offer the flexibility of github.com/weppos/publicsuffix-go/publicsuffix, such as the ability to use multiple lists or disable private domains at runtime.

cookiejar.PublicSuffixList interface

This package implements the cookiejar.PublicSuffixList interface, so it can be used as the value of the PublicSuffixList option when creating a net/http/cookiejar.Jar.

import (
    "net/http/cookiejar"
    "github.com/weppos/publicsuffix-go/publicsuffix"
)

// cookiejar.New returns both the jar and an error value
deliciousJar, err := cookiejar.New(&cookiejar.Options{PublicSuffixList: publicsuffix.CookieJarList})

License

Copyright (c) 2016-2022 Simone Carletti. This is Free Software distributed under the MIT license.
