String dedent
Champions: @jridgewell, @hemanth
Author: @mmkal
Status: Stage 2
Problem
When trying to embed formatted text (for instance, Markdown contents, or the source text of a JS program) in JS code, developers are forced to make awkward concessions for readability of the code or output. For instance, to make the embedded text look consistent with the surrounding code, we'd write:
class MyClass {
  print() {
    console.log(`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}
This outputs (using ^ to mark the beginning of a line and · to mark a leading space):
^
^······create table student(
^········id int primary key,
^········name text
^······)
^····
In order for the output to look sensible, our code becomes illegible:
class MyClass {
  print() {
    console.log(`create table student(
  id int primary key,
  name text
)`);
  }
}
This outputs a sensible:
create table student(
  id int primary key,
  name text
)
With a library
It's possible to write sensible code and have sensible output with the help of libraries.
import dedent from 'dedent';

class MyClass {
  print() {
    console.log(dedent`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}
This outputs the sensible:
create table student(
  id int primary key,
  name text
)
However, these libraries incur a runtime cost, and are subtly inconsistent in
the way they perform "dedenting". The most popular package is stagnant without
bug fixes and has problematic handling of the Template Object's .raw
array, and none are able to pass the dedented text to tag template functions.
pythonInterpreter`
  print('Hello Python World')
`; // IndentationError: unexpected indent

const dedented = dedent`
  print('Hello Python World')
`;
pythonInterpreter`${dedented}`; // <- this doesn't work right.
Additionally, even if a userland library were to support passing to tagged
templates, the array would not be a true Template Object in proposals like
Array.isTemplateObject.
This harms the ability of tagged template functions to differentiate dedented
templates that exist in the actual program source text (and to which they can
ascribe a higher trust level) from dynamically generated strings (which may
contain a user-generated exploit string).
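To see why wrapping an already dedented string does not help, it can be useful to look at what a tag function actually receives. The sketch below uses a hypothetical inspectTag function (it is not part of any library or of this proposal) and a hard-coded string standing in for the result of a userland dedent call:

```javascript
// Hypothetical tag that only reports what it was called with.
function inspectTag(strings, ...values) {
  return { strings: [...strings], raw: [...strings.raw], values };
}

// Stand-in for the result of a userland dedent call (assumed value).
const dedented = "print('Hello Python World')";

// Tagging the literal directly: the tag sees the indented source text.
const direct = inspectTag`
  print('Hello Python World')
`;

// Tagging a wrapper: the tag sees two empty chunks and one runtime value.
const wrapped = inspectTag`${dedented}`;

console.log(direct.strings);  // a single chunk, indentation still present
console.log(wrapped.strings); // two empty strings
console.log(wrapped.values);  // the dedented text, as an interpolated value
```

In the wrapped call the dedented text arrives as an interpolated value rather than as literal template text, so a tag that treats values as untrusted input (or escapes them) cannot tell it apart from any other runtime string.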
Proposed solution
Implement a String.dedent tag template function, for a tagged template
literal behaving almost the same as a regular single backticked template
literal, with a few key differences:
- The opening line (everything immediately right of the opening `) must contain only a literal newline char.
  - The opening line's literal newline is removed.
- The closing line (everything immediately to the left of the closing `) may contain whitespace, but the whitespace is removed.
  - The closing line's preceding literal newline char is removed.
- Lines which only contain whitespace are emptied.
- The "common indentation" of all non-empty content lines (lines that are not the opening or closing) is calculated.
- That common indentation is removed from the start of every line.
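The rules above can be sketched for a plain string with no expressions. This is an illustration only, not the proposal's specified algorithm: dedentText is a hypothetical helper name, and treating "common indentation" as the longest shared prefix of leading spaces and tabs is an assumption of this sketch.

```javascript
// Hypothetical helper: applies the listed rules to a plain string.
function dedentText(text) {
  const lines = text.split('\n');
  const blank = /^[ \t]*$/;

  // The opening line must contain only a newline.
  if (lines.length < 2 || lines[0] !== '') {
    throw new TypeError('opening line must contain only a newline');
  }
  // The closing line may contain only whitespace.
  if (!blank.test(lines[lines.length - 1])) {
    throw new TypeError('closing line must contain only whitespace');
  }

  // Dropping the first and last entries removes the opening newline, the
  // closing whitespace, and the newline before the closing line.
  // Whitespace-only content lines are emptied.
  const content = lines.slice(1, -1).map((l) => (blank.test(l) ? '' : l));

  // Common indentation: longest shared leading-whitespace prefix
  // across the non-empty content lines (assumption of this sketch).
  let common = null;
  for (const line of content) {
    if (line === '') continue;
    const indent = line.match(/^[ \t]*/)[0];
    if (common === null) {
      common = indent;
      continue;
    }
    let i = 0;
    while (i < common.length && i < indent.length && common[i] === indent[i]) i++;
    common = common.slice(0, i);
  }

  // Remove the common indentation from the start of every line.
  const width = common === null ? 0 : common.length;
  return content.map((l) => l.slice(width)).join('\n');
}

const sql = dedentText(
  '\n      create table student(\n        id int primary key,\n        name text\n      )\n    '
);
console.log(sql);
```

The input string here mirrors the template contents of the MyClass example, so the logged text is the four-line create table statement with only its relative indentation left.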
Play around with a REPL implementation.
The examples above would be solved like this:
class MyClass {
  print() {
    console.log(String.dedent`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}
This outputs the sensible:
create table student(
  id int primary key,
  name text
)
Expressions can be directly supported, as well as composition with another tagged template function:
const message = 'Hello Python World';
String.dedent(pythonInterpreter)`
  print('${message}')
`;
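One way to picture the composition is a userland wrapper that dedents the literal chunks and then calls the inner tag. The sketch below is only an approximation under stated assumptions: dedentTag, dedentChunks, sharedPrefix and showSource are hypothetical names, cooked and raw chunks are dedented independently (which diverges from a raw-first approach when the template contains escape sequences), and the array handed to the inner tag is an ordinary array rather than a true Template Object.

```javascript
// Longest common prefix of two strings.
function sharedPrefix(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return a.slice(0, i);
}

// Hypothetical helper: dedents the literal chunks of a template.
// Line 0 of chunk 0 is the opening line; line 0 of any later chunk
// continues a line that an expression interrupted.
function dedentChunks(chunks) {
  const split = chunks.map((chunk) => chunk.split('\n'));
  const last = split.length - 1;
  const blank = /^[ \t]*$/;

  if (split[0].length < 2 || split[0][0] !== '') {
    throw new TypeError('opening line must contain only a newline');
  }
  const tail = split[last];
  if (tail.length < 2 || !blank.test(tail[tail.length - 1])) {
    throw new TypeError('closing line must contain only whitespace');
  }

  let common = null;
  split.forEach((lines, i) => {
    for (let j = 1; j < lines.length; j++) {
      // A line that runs into an expression counts as having content.
      const beforeExpr = j === lines.length - 1 && i < last;
      if (!beforeExpr && blank.test(lines[j])) {
        lines[j] = ''; // whitespace-only line, including the closing line
        continue;
      }
      const indent = lines[j].match(/^[ \t]*/)[0];
      common = common === null ? indent : sharedPrefix(common, indent);
    }
  });

  const width = common === null ? 0 : common.length;
  split.forEach((lines) => {
    for (let j = 1; j < lines.length; j++) lines[j] = lines[j].slice(width);
  });

  split[0].shift(); // opening line and its newline
  tail.pop();       // closing line and the newline before it
  return split.map((lines) => lines.join('\n'));
}

// Hypothetical userland stand-in for String.dedent(tag).
function dedentTag(tag) {
  return (strings, ...values) => {
    const cooked = dedentChunks([...strings]);
    cooked.raw = dedentChunks([...strings.raw]);
    return tag(cooked, ...values);
  };
}

// Stand-in for pythonInterpreter: reassembles the text it was given.
const showSource = (strings, ...values) => String.raw({ raw: strings }, ...values);

const message = 'Hello Python World';
const source = dedentTag(showSource)`
  print('${message}')
`;
console.log(source); // print('Hello Python World')
```

Because the wrapper builds a fresh array on every call, the inner tag cannot use array identity for caching and cannot verify that the text came from program source, which is the gap a built-in String.dedent is meant to close.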
In other languages
- CoffeeScript - block strings using ''' and """ triple-quotes.
- Java - text blocks using triple-quotes.
- Kotlin - raw strings using triple-quotes and .trimIndent().
- Scala - multiline strings using triple-quotes and .stripMargin.
- Python - multiline strings using triple-quotes to avoid escaping and textwrap.dedent.
- Jsonnet - text blocks with ||| as a delimiter.
- Bash - <<- Heredocs.
- Ruby - <<~ Heredocs.
- Swift - multiline string literals using triple-quotes - strips margin based on whitespace before closing delimiter.
- PHP - <<< heredoc/nowdoc. The indentation of the closing marker dictates the amount of whitespace to strip from each line.
Q&A
Why not use a library?
To summarise the problem section above:
- avoid a dependency for the desired behaviour of the vast majority of multiline strings (dedent has millions of downloads per week).
- avoid inconsistencies between the multiple current implementations.
- improve performance.
- improve discoverability - the feature can be documented publicly, and used in code samples which wouldn't otherwise rely on a package like dedent.
- give code generators a way to output readable code with correct indentation properties (e.g. jest inline snapshots).
- support "dedenting" tagged template literal functions with customized expression parameter behaviour (e.g. slonik).
- allow formatters/linters to safely enforce code style without needing to be coupled to the runtime behaviour of multiple libraries in combination.