
HTML sanitizer for Elixir

HtmlSanitizeEx

html_sanitize_ex provides a fast and straightforward HTML sanitizer written in Elixir, which lets you include HTML authored by third parties in your web application while protecting against XSS.

It is the first Hex package to come out of the elixirstatus.com project, where it will be used to sanitize user announcements from the Elixir community.

What can it do?

html_sanitize_ex parses a given HTML string and, depending on the Scrubber used, either strips all HTML tags from it or sanitizes it by allowing only certain HTML elements and attributes to remain.

NOTE: The one thing missing at this moment is support for styles. To add this, we have to implement a Scrubber for CSS, to prevent nasty CSS hacks using <style> tags and attributes.

Otherwise html_sanitize_ex is a full-featured HTML sanitizer.

Installation

Add html_sanitize_ex as a dependency in your mix.exs file.

defp deps do
  [{:html_sanitize_ex, "~> 1.4"}]
end

Once you have added it, run mix deps.get in your shell to fetch the new dependency.

The only dependency of html_sanitize_ex is mochiweb, which is used to parse HTML.

Usage

Depending on the scrubber you select, it can strip all tags from the given string:

text = "<a href=\"javascript:alert('XSS');\">text here</a>"
HtmlSanitizeEx.strip_tags(text)
# => "text here"

Or allow certain basic HTML elements to remain:

text = "<h1>Hello <script>World!</script></h1>"
HtmlSanitizeEx.basic_html(text)
# => "<h1>Hello World!</h1>"

There are built-in scrubbers that cover common use cases, but you can also easily define custom scrubbers (see the next section).

The following default scrubbing options exist:

HtmlSanitizeEx.basic_html(html)
HtmlSanitizeEx.html5(html)
HtmlSanitizeEx.markdown_html(html)
HtmlSanitizeEx.strip_tags(html)

There is also one scrubber primarily used for testing:

HtmlSanitizeEx.noscrub(html)

Before using a built-in scrubber, you should verify that it behaves the way you expect. The built-in scrubbers are located in /lib/html_sanitize_ex/scrubber.
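
One way to do that check is to run a sample input through each built-in scrubber and compare the results. The following is a minimal sketch, not part of the library: it assumes Elixir 1.12 or later (for Mix.install/1), network access to fetch the package, and a made-up sample string. The file name in the comment is hypothetical as well.

```elixir
# Standalone script: save as compare_scrubbers.exs (hypothetical name)
# and run it with `elixir compare_scrubbers.exs`.
Mix.install([{:html_sanitize_ex, "~> 1.4"}])

# Made-up input mixing harmless markup with things a sanitizer should catch.
sample = ~s|<h1 onclick="alert(1)">Title</h1><p>Hi <script>alert(2)</script></p>|

# The four default scrubbing functions listed above.
scrubbers = [
  basic_html: &HtmlSanitizeEx.basic_html/1,
  html5: &HtmlSanitizeEx.html5/1,
  markdown_html: &HtmlSanitizeEx.markdown_html/1,
  strip_tags: &HtmlSanitizeEx.strip_tags/1
]

# Run the same input through every scrubber.
results =
  for {name, scrub} <- scrubbers do
    {name, scrub.(sample)}
  end

# Print the outputs one per line so they can be compared by eye.
for {name, output} <- results do
  IO.puts("#{name}: #{inspect(output)}")
end
```

Reading the printed lines next to each other shows quickly which scrubber keeps which elements and attributes for the kind of input you expect to receive.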

Custom Scrubbers

A custom scrubber has the advantage of allowing you to support only the minimum functionality needed for your use case.

With a custom scrubber, you define which tags, attributes, and URI schemes (e.g. https, mailto, javascript) are allowed. Anything not allowed can then be stripped out.

There are also utility functions for removing CDATA sections and comments, which you will generally want to include.

Here is an example of a custom scrubber which allows only p, h1, and a tags, and restricts the href attribute to only the https and mailto URI schemes. It also removes CDATA sections and comments.

Note that the scrubber should include Meta.strip_everything_not_covered() at the end.

defmodule MyProject.MyScrubber do
  require HtmlSanitizeEx.Scrubber.Meta
  alias HtmlSanitizeEx.Scrubber.Meta

  Meta.remove_cdata_sections_before_scrub()
  Meta.strip_comments()

  Meta.allow_tag_with_these_attributes("p", [])
  Meta.allow_tag_with_these_attributes("h1", [])
  Meta.allow_tag_with_uri_attributes("a", ["href"], ["https", "mailto"])

  Meta.strip_everything_not_covered()
end

Then, you can use the scrubber in your project by giving it as the second argument to Scrubber.scrub/2:

defmodule MyProject.MyModule do
  alias HtmlSanitizeEx.Scrubber
  alias MyProject.MyScrubber

  def sanitize_html(html) do
    Scrubber.scrub(html, MyScrubber)
  end
end
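
To get a feel for what such a scrubber lets through, the sketch below runs a hostile snippet through a custom scrubber and prints two loose checks on the result. It is an illustration under assumptions: the module name Demo.LinkScrubber and the sample input are hypothetical, Mix.install/1 needs Elixir 1.12 or later, and the checks only look for substrings rather than asserting an exact output.

```elixir
# Standalone script; run with `elixir <file>.exs`.
Mix.install([{:html_sanitize_ex, "~> 1.4"}])

defmodule Demo.LinkScrubber do
  # Hypothetical scrubber, built only from the Meta macros shown above.
  require HtmlSanitizeEx.Scrubber.Meta
  alias HtmlSanitizeEx.Scrubber.Meta

  Meta.remove_cdata_sections_before_scrub()
  Meta.strip_comments()

  Meta.allow_tag_with_these_attributes("p", [])
  Meta.allow_tag_with_uri_attributes("a", ["href"], ["https", "mailto"])

  Meta.strip_everything_not_covered()
end

# Made-up input with a javascript: link and a script tag.
input = ~s|<p>See <a href="javascript:alert(1)">this</a> <script>steal()</script></p>|

output = HtmlSanitizeEx.Scrubber.scrub(input, Demo.LinkScrubber)
IO.inspect(output, label: "scrubbed")

# Loose sanity checks: print whether the script tag or the javascript: URI survived.
IO.inspect(String.contains?(output, "<script"), label: "contains <script")
IO.inspect(String.contains?(output, "javascript:"), label: "contains javascript:")
```

Checks like these can be moved into a project's test suite so that later changes to the scrubber do not silently widen what it allows.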

A great way to make a custom scrubber is to use the built-in scrubber closest to your use case as a template. The built-in scrubbers are located in /lib/html_sanitize_ex/scrubber.

Contributing

  1. Fork it!
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

Author

René Föhring (@rrrene)

License

html_sanitize_ex is released under the MIT License. See the LICENSE file for further details.
