tree-sitter-comment
Tree-sitter grammar for comment tags like TODO:
, FIXME(user):
, etc.
Useful to be embedded inside comments.
Check the playground at https://stsewd.dev/tree-sitter-comment/.
Syntax
Since comment tags aren't a programming language or have a standard, I have chosen to follow popular conventions for the syntax.
Comment tags
- Comment tags can contain:
- Upper case ascii letters
- Numbers (can't start with one)
-
,_
(they can't start or end with these characters)
- Optionally can have an user linked to the tag inside parentheses
()
- The name must be followed by
:
and a whitespace
URIs
- http and https links are recognized
If you think there are other popular conventions this syntax doesn't cover, feel free to open a issue.
Examples
TODO: something needs to be done
TODO(stsewd): something needs to be done by @stsewd
XXX: fix something else.
XXX: extra white spaces.
(NOTE: this works too).
NOTE-BUG (stsewd): tags can be separated by `-`
NOTE_BUG: or by `_`.
This will be recognized as a URI
https://github.com/stsewd/
FAQ
Can I match a tag that doesn't end in :
, like TODO
?
This grammar doesn't provide a specific token for it, but you can match it with this query:
("text" @todo
(#eq? @todo "TODO"))
Can I highlight references to issues, PRs, MRs, like #10
or !10
?
This grammar doesn't provide a specific token for it, but you can match it with this query:
("text" @issue
(#match? @issue "^#[0-9]+$"))
;; NOTE: This matches `!10` and `! 10`.
("text" @symbol . "text" @issue
(#eq? @symbol "!")
(#match? @issue "^[0-9]+$"))
I'm using Neovim and don't see all tags highlighted
To avoid false positives, Neovim doesn't highlight all tags,
but a list of specific ones,
see the list at queries/comment/highlights.scm
.
If you want your tag highlighted, you can extend the query locally, see :h treesitter-query
.
Or if you think it's very common, you can suggest it upstream.
Why C?
Tree-sitter is a LR parser for context-free grammars, that means it works great for grammars that don't require backtracking, or to keep a state for whitespaces (like indentation). For these reasons, parsing languages that need to keep a state or falling back to a general token, it requires some manual parsing in C.
Projects using this grammar
- nvim-treesitter
- helix
- Yours?
Other grammars
- tree-sitter-rst: reStructuredText grammar.