  • Language
    Ruby
  • License
    MIT License
  • Created over 7 years ago
  • Updated 2 months ago

Up-to-date Emoji Regex in Ruby 💥

Unicode::Emoji

Provides Unicode Emoji data and regexes, incorporating the latest Unicode and Emoji standards.

Also includes a categorized list of recommended Emoji.

Emoji version: 15.0 (September 2022)

CLDR version (used for sub-region flags): 43 (April 2023)

Supported Rubies: 3.2, 3.1, 3.0

No longer supported Rubies, but might still work: 2.7, 2.6, 2.5, 2.4, 2.3

If you are stuck on an older Ruby version, check out the latest 0.9 version of this gem.

Gemfile

gem "unicode-emoji"

Usage

Regex

The gem includes several Emoji regexes, which are compiled from various Unicode Emoji data sources.

require "unicode/emoji"

string = "String which contains all kinds of emoji:

- Singleton Emoji: 😴
- Textual singleton Emoji with Emoji variation: ▶️
- Emoji with skin tone modifier: 🛌🏽
- Region flag: 🇵🇹
- Sub-Region flag: 🏴󠁧󠁢󠁳󠁣󠁴󠁿
- Keycap sequence: 2️⃣
- Sequence using ZWJ (zero width joiner): 🤾🏽‍♀️

"

string.scan(Unicode::Emoji::REGEX) # => ["😴", "▶️", "🛌🏽", "🇵🇹", "🏴󠁧󠁢󠁳󠁣󠁴󠁿", "2️⃣", "🤾🏽‍♀️"]
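
To see what the different kinds of Emoji listed above consist of, it can help to look at their codepoints. The following is a minimal sketch that uses only Ruby's built-in String#codepoints; the helper name codepoints_of is hypothetical and not part of this gem.

```ruby
# Hypothetical helper (not part of unicode-emoji):
# formats every codepoint of a string as U+XXXX.
def codepoints_of(string)
  string.codepoints.map { |cp| format("U+%04X", cp) }
end

# Emoji with skin tone modifier: base Emoji followed by a modifier
codepoints_of("\u{1F6CC}\u{1F3FD}") # => ["U+1F6CC", "U+1F3FD"]

# Region flag: two regional indicator symbols
codepoints_of("\u{1F1F5}\u{1F1F9}") # => ["U+1F1F5", "U+1F1F9"]

# Keycap sequence: digit, emoji variation selector, combining enclosing keycap
codepoints_of("2\u{FE0F}\u{20E3}") # => ["U+0032", "U+FE0F", "U+20E3"]
```

A sequence that looks like one symbol on screen can span several codepoints, which is why a plain single-character match is not enough to find complete Emoji.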

Main Regexes

Matches (non-textual) Emoji of all kinds:

| Regex | Description | Example Matches | Example Non-Matches |
|---|---|---|---|
| Unicode::Emoji::REGEX | Use this if unsure! Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kinds of recommended Emoji sequences | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🤾🏽‍♀️ | 😴︎, ▶, 🏻, 🇵🇵, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤠‍🤢 |
| Unicode::Emoji::REGEX_VALID | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kinds of valid Emoji sequences | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 | 😴︎, ▶, 🏻, 🇵🇵 |
| Unicode::Emoji::REGEX_WELL_FORMED | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kinds of well-formed Emoji sequences | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢, 🇵🇵 | 😴︎, ▶, 🏻 |

Picking the Right Emoji Regex

  • Usually you just want REGEX (RGI set)
  • If you want broader matching (e.g. more sub-regions), choose REGEX_VALID
  • If you also want to match invalid sequences, use REGEX_WELL_FORMED

Please see the standard for details.

| Property | REGEX (RGI / Recommended) | REGEX_VALID (Valid) | REGEX_WELL_FORMED (Well-formed) |
|---|---|---|---|
| Region "🇵🇹" | Yes | Yes | Yes |
| Region "🇵🇵" | No | No | Yes |
| Tag Sequence "🏴󠁧󠁢󠁳󠁣󠁴󠁿" | Yes | Yes | Yes |
| Tag Sequence "🏴󠁧󠁢󠁡󠁧󠁢󠁿" | No | Yes | Yes |
| Tag Sequence "😴󠁧󠁢󠁡󠁡󠁡󠁿" | No | No | Yes |
| ZWJ Sequence "🤾🏽‍♀️" | Yes | Yes | Yes |
| ZWJ Sequence "🤠‍🤢" | No | Yes | Yes |

More info about valid vs. recommended Emoji in this blog article on Emojipedia.
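
The difference between well-formed and valid can be illustrated with region flags. The sketch below is not how this gem builds its regexes; it only assumes that any pair of regional indicator symbols is well-formed, while a valid flag additionally needs a known region code. The names WELL_FORMED_FLAG, KNOWN_REGIONS, region_code, and classify_flag are hypothetical, and the region list is a deliberately tiny sample.

```ruby
# Any two regional indicator symbols (U+1F1E6..U+1F1FF) form a well-formed flag.
WELL_FORMED_FLAG = /[\u{1F1E6}-\u{1F1FF}]{2}/

# Hypothetical, deliberately tiny sample of region codes (illustration only).
KNOWN_REGIONS = %w[PT DE JP].freeze

# Maps each regional indicator symbol back to its ASCII letter.
def region_code(flag)
  flag.each_char.map { |ri| (ri.ord - 0x1F1E6 + "A".ord).chr }.join
end

def classify_flag(flag)
  return :not_a_flag unless flag.match?(/\A#{WELL_FORMED_FLAG}\z/)

  KNOWN_REGIONS.include?(region_code(flag)) ? :valid : :well_formed_only
end

classify_flag("\u{1F1F5}\u{1F1F9}") # => :valid            (region code "PT")
classify_flag("\u{1F1F5}\u{1F1F5}") # => :well_formed_only (region code "PP")
```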

Singleton Regexes

Matches only simple one-codepoint (+ optional variation selector) Emoji:

| Regex | Description | Example Matches | Example Non-Matches |
|---|---|---|---|
| Unicode::Emoji::REGEX_BASIC | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji), but no sequences at all | 😴, ▶️ | 😴︎, ▶, 🏻, 🛌🏽, 🇵🇹, 🇵🇵, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 |
| Unicode::Emoji::REGEX_TEXT | Matches only textual singleton Emoji (except for singleton components, like digit 1) | 😴︎, ▶ | 😴, ▶️, 🏻, 🛌🏽, 🇵🇹, 🇵🇵, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 |

Include Textual Emoji

By default, the regexes do not match textual Emoji (Emoji characters followed by a text variation selector, or those that have a default text presentation). To match them as well, use the regex variants with the _INCLUDE_TEXT suffix:

| Regex | Description | Example Matches | Example Non-Matches |
|---|---|---|---|
| Unicode::Emoji::REGEX_INCLUDE_TEXT | REGEX + REGEX_TEXT | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🤾🏽‍♀️, 😴︎, ▶ | 🏻, 🇵🇵, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤠‍🤢 |
| Unicode::Emoji::REGEX_VALID_INCLUDE_TEXT | REGEX_VALID + REGEX_TEXT | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢, 😴︎, ▶ | 🏻, 🇵🇵 |
| Unicode::Emoji::REGEX_WELL_FORMED_INCLUDE_TEXT | REGEX_WELL_FORMED + REGEX_TEXT | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢, 🇵🇵, 😴︎, ▶ | 🏻 |
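
Whether a character is requested in textual or Emoji presentation is signalled by a trailing variation selector: U+FE0E for text and U+FE0F for Emoji. The following sketch only inspects that selector with plain Ruby and does not reproduce the gem's matching logic; the constant and method names are hypothetical.

```ruby
TEXT_VARIATION_SELECTOR  = "\u{FE0E}" # VS15, requests text presentation
EMOJI_VARIATION_SELECTOR = "\u{FE0F}" # VS16, requests Emoji presentation

# Hypothetical helper: reports which presentation a string explicitly requests.
def requested_presentation(string)
  if string.end_with?(TEXT_VARIATION_SELECTOR)
    :text
  elsif string.end_with?(EMOJI_VARIATION_SELECTOR)
    :emoji
  else
    :default # no selector, the character's default presentation applies
  end
end

requested_presentation("\u{1F634}\u{FE0E}") # => :text
requested_presentation("\u{25B6}\u{FE0F}")  # => :emoji
requested_presentation("\u{25B6}")          # => :default
```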

Extended Pictographic Regex

Unicode::Emoji::REGEX_PICTO matches single codepoints with the Extended_Pictographic property. For example, it will match BLACK SAFETY SCISSORS.

Unicode::Emoji::REGEX_PICTO_NO_EMOJI matches single codepoints with the Extended_Pictographic property, but excludes Emoji characters.

See character.construction/picto for a list of all non-Emoji pictographic characters.

Partial Regexes

Matches potential Emoji parts (often, this is not what you want):

| Regex | Description | Example Matches | Example Non-Matches |
|---|---|---|---|
| Unicode::Emoji::REGEX_ANY | Matches any Emoji-related codepoint (but no variation selectors, tags, or zero-width joiners). Please note that this will match Emoji parts rather than complete Emoji, for example, single digits! | 😴, ▶, 🏻, 🛌, 🏽, 🇵, 🇹, 2, 🏴, 🤾, ♀, 🤠, 🤢 | - |

List

Use Unicode::Emoji::LIST or the list method to get a grouped (and ordered) list of Emoji:

Unicode::Emoji.list.keys
# => ["Smileys & Emotion", "People & Body", "Component", "Animals & Nature", "Food & Drink", "Travel & Places", "Activities", "Objects", "Symbols", "Flags"]

Unicode::Emoji.list("Food & Drink").keys
# => ["food-fruit", "food-vegetable", "food-prepared", "food-asian", "food-marine", "food-sweet", "drink", "dishware"]

Unicode::Emoji.list("Food & Drink", "food-asian")
# => ["🍱", "🍘", "🍙", "🍚", "🍛", "🍜", "🍝", "🍠", "🍢", "🍣", "🍤", "🍥", "🥮", "🍡", "🥟", "🥠", "🥡"]

Please note that categories might change with future versions of the Emoji standard. This gem will issue warnings when attempting to retrieve old categories using the #list method.

A list of all Emoji can be found at character.construction.

Properties

Allows you to access the codepoint data from Unicode's emoji-data.txt file:

require "unicode/emoji"

Unicode::Emoji.properties "☝" # => ["Emoji", "Emoji_Modifier_Base"]

Also See

MIT
