HTMLEntities
Summary
A pure Swift utility for encoding and decoding HTML.
Includes support for HTML5 named character references. You can find the list of all 2231 HTML5 named character references here.
HTMLEntities can escape ALL non-ASCII characters as well as the characters <, >, &, ", and ', as these five characters are part of the HTML tag and HTML attribute syntaxes.
In addition, HTMLEntities can unescape encoded HTML text that contains decimal, hexadecimal, or HTML5 named character references.
API Documentation
API documentation for HTMLEntities is located here.
Features
- Supports HTML5 named character references (&NegativeMediumSpace; etc.)
- HTML5 spec-compliant; strict parse mode recognizes parse errors
- Supports decimal and hexadecimal escapes for all characters
- Simple to use, as the functions are added by extending the standard String type
- Minimal dependencies; implementation is completely self-contained
Version Info
The latest release of HTMLEntities requires Swift 4.0 or higher.
Installation
Via Swift Package Manager
Add HTMLEntities to your Package.swift:
import PackageDescription

let package = Package(
    name: "<package-name>",
    ...
    dependencies: [
        .package(url: "https://github.com/Kitura/swift-html-entities.git", from: "3.0.0")
    ]
    // Also, make sure to add HTMLEntities to your package target's dependencies
)
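The comment about target dependencies can be sketched as a complete manifest. This is only an illustration: the package and target name `MyApp` and the tools version are assumptions, not taken from the project. With newer tools versions (5.2 and later), the target dependency may need to be written as `.product(name: "HTMLEntities", package: "swift-html-entities")` instead of the by-name form shown here.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "MyApp", // hypothetical package name
    dependencies: [
        // Same dependency declaration as in the snippet above
        .package(url: "https://github.com/Kitura/swift-html-entities.git", from: "3.0.0")
    ],
    targets: [
        .target(
            name: "MyApp", // hypothetical target name
            // Makes `import HTMLEntities` available in this target's sources
            dependencies: ["HTMLEntities"]
        )
    ]
)
```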
Via CocoaPods
Add HTMLEntities to your Podfile:
target '<project-name>' do
  pod 'HTMLEntities', :git => 'https://github.com/Kitura/swift-html-entities.git'
end
Via Carthage
Add HTMLEntities to your Cartfile:
github "Kitura/swift-html-entities"
Usage
import HTMLEntities
// encode example
let html = "<script>alert(\"abc\")</script>"
print(html.htmlEscape())
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"
// decode example
let htmlencoded = "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"
print(htmlencoded.htmlUnescape())
// Prints "<script>alert(\"abc\")</script>"
Advanced Options
HTMLEntities supports various options when escaping and unescaping HTML characters.
Escape Options
allowUnsafeSymbols
Defaults to false. Specifies whether unsafe ASCII characters (<, >, &, ", and ') should be left unescaped.
import HTMLEntities
let html = "<p>\"café\"</p>"
print(html.htmlEscape())
// Prints "&#x3C;p&#x3E;&#x22;caf&#xE9;&#x22;&#x3C;/p&#x3E;"
print(html.htmlEscape(allowUnsafeSymbols: true))
// Prints "<p>\"caf&#xE9;\"</p>"
decimal
Defaults to false. Specifies whether decimal character escapes should be used instead of hexadecimal character escapes whenever a numeric character escape is used (i.e., it does not affect named character reference escapes). The use of hexadecimal character escapes is recommended.
import HTMLEntities
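// 한 as three conjoining jamo, 한 precomposed, ế precomposed, e with two combining marks, and the 🇺🇸 flag
let text = "\u{1112}\u{1161}\u{11AB}, \u{D55C}, \u{1EBF}, e\u{302}\u{301}, \u{1F1FA}\u{1F1F8}"
print(text.htmlEscape())
// Prints "&#x1112;&#x1161;&#x11AB;, &#xD55C;, &#x1EBF;, e&#x302;&#x301;, &#x1F1FA;&#x1F1F8;"
print(text.htmlEscape(decimal: true))
// Prints "&#4370;&#4449;&#4523;, &#54620;, &#7871;, e&#770;&#769;, &#127482;&#127480;"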
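The relationship between the hexadecimal and decimal forms can be illustrated without the library. The sketch below is not how HTMLEntities is implemented; `numericReference(for:decimal:)` is a hypothetical helper that formats a single Unicode scalar as a numeric character reference, assuming uppercase hexadecimal digits.

```swift
/// Hypothetical helper, not part of HTMLEntities: formats one Unicode scalar
/// as a hexadecimal (default) or decimal numeric character reference.
func numericReference(for scalar: Unicode.Scalar, decimal: Bool = false) -> String {
    if decimal {
        // Decimal form: "&#" + code point in base 10 + ";"
        return "&#\(scalar.value);"
    }
    // Hexadecimal form: "&#x" + code point in base 16 + ";"
    return "&#x" + String(scalar.value, radix: 16, uppercase: true) + ";"
}

let precomposed: Unicode.Scalar = "\u{D55C}"
print(numericReference(for: precomposed))                 // &#xD55C;
print(numericReference(for: precomposed, decimal: true))  // &#54620;

// A flag emoji consists of two scalars, so it yields two references.
let flag = "\u{1F1FA}\u{1F1F8}"
print(flag.unicodeScalars.map { numericReference(for: $0) }.joined())
// &#x1F1FA;&#x1F1F8;
```

Working per scalar rather than per character is why a single visible character, such as a flag or a letter with combining marks, can expand into several references.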
encodeEverything
Defaults to false. Specifies if all characters should be escaped, even if some characters are safe. If true, overrides the setting for allowUnsafeSymbols.
import HTMLEntities
let text = "A quick brown fox jumps over the lazy dog"
print(text.htmlEscape())
// Prints "A quick brown fox jumps over the lazy dog"
print(text.htmlEscape(encodeEverything: true))
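// Prints "&#x41;&#x20;&#x71;&#x75;&#x69;&#x63;&#x6B;&#x20;&#x62;&#x72;&#x6F;&#x77;&#x6E;&#x20;&#x66;&#x6F;&#x78;&#x20;&#x6A;&#x75;&#x6D;&#x70;&#x73;&#x20;&#x6F;&#x76;&#x65;&#x72;&#x20;&#x74;&#x68;&#x65;&#x20;&#x6C;&#x61;&#x7A;&#x79;&#x20;&#x64;&#x6F;&#x67;"
// `encodeEverything` overrides `allowUnsafeSymbols`
print(text.htmlEscape(allowUnsafeSymbols: true, encodeEverything: true))
// Prints "&#x41;&#x20;&#x71;&#x75;&#x69;&#x63;&#x6B;&#x20;&#x62;&#x72;&#x6F;&#x77;&#x6E;&#x20;&#x66;&#x6F;&#x78;&#x20;&#x6A;&#x75;&#x6D;&#x70;&#x73;&#x20;&#x6F;&#x76;&#x65;&#x72;&#x20;&#x74;&#x68;&#x65;&#x20;&#x6C;&#x61;&#x7A;&#x79;&#x20;&#x64;&#x6F;&#x67;"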
useNamedReferences
Defaults to false. Specifies if named character references should be used whenever possible. Set to false to always use numeric character references, e.g., for compatibility with older browsers that do not recognize named character references.
import HTMLEntities
let html = "<script>alert(\"abc\")</script>"
print(html.htmlEscape())
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"
print(html.htmlEscape(useNamedReferences: true))
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
Set Escape Options Globally
HTML escape options can be set globally so that you don't have to set them every time you want to escape a string. The options are managed in the String.HTMLEscapeOptions struct.
import HTMLEntities
// set `useNamedReferences` to `true` globally
String.HTMLEscapeOptions.useNamedReferences = true
let html = "<script>alert(\"abc\")</script>"
// Now, the default behavior of `htmlEscape()` is to use named character references
print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
// And you can still go back to using numeric character references only
print(html.htmlEscape(useNamedReferences: false))
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"
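Because the global options are shared state, it can be useful to change one only for a limited scope and then restore it. The following is a minimal sketch of that idea; `withNamedReferences` is a hypothetical helper, and it assumes that `String.HTMLEscapeOptions.useNamedReferences` can be read as well as assigned. It makes no attempt at thread safety.

```swift
import HTMLEntities

/// Hypothetical helper, not part of HTMLEntities: runs `body` with named
/// character references enabled globally, then restores the previous setting.
func withNamedReferences<T>(_ body: () -> T) -> T {
    // Assumption: the global option is a readable and writable static property
    let previous = String.HTMLEscapeOptions.useNamedReferences
    String.HTMLEscapeOptions.useNamedReferences = true
    // `defer` restores the old value when the function exits
    defer { String.HTMLEscapeOptions.useNamedReferences = previous }
    return body()
}

let escaped = withNamedReferences { "<p>".htmlEscape() }
print(escaped)
```

Passing the option directly, as in `htmlEscape(useNamedReferences: true)`, avoids touching global state and is likely the simpler choice when only a few call sites are involved.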
Unescape Options
strict
Defaults to false. Specifies if HTML5 parse errors should be thrown or simply passed over.
Note: htmlUnescape() is a throwing function if strict is passed as a call argument (whether it is set to true or false); htmlUnescape() is NOT a throwing function if no argument is provided.
import HTMLEntities
let text = "&#4370&#4449&#4523"
print(text.htmlUnescape())
// Prints "한"
print(try text.htmlUnescape(strict: true))
// Throws a `ParseError.MissingSemicolon` instance
// a throwing function because `strict` is passed in argument
// but no error is thrown because `strict: false`
print(try text.htmlUnescape(strict: false))
// Prints "한"
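In application code, the throwing form is usually wrapped in `do`/`catch`. The sketch below shows one possible pattern under that assumption: try a strict decode first and fall back to the lenient, non-throwing call if a parse error is reported. `decodeLeniently` is a hypothetical helper, and the error is caught generically rather than matched against specific `ParseError` cases.

```swift
import HTMLEntities

/// Hypothetical helper, not part of HTMLEntities: tries a strict decode first
/// and falls back to the lenient decode if the strict one throws.
func decodeLeniently(_ input: String) -> String {
    do {
        // Throws if the input contains an HTML5 parse error
        return try input.htmlUnescape(strict: true)
    } catch {
        // `error` is whatever the library threw; here it is only logged
        print("Strict decode failed: \(error)")
        // The argument-free form does not throw
        return input.htmlUnescape()
    }
}

print(decodeLeniently("&lt;p&gt;"))
// Prints "<p>"
```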
Acknowledgments
HTMLEntities was designed to support some of the same options as he, a popular JavaScript HTML encoder/decoder.
License
Apache 2.0