Community Library Entry

Regular Expression
Created·2023-09-26 02:02
Flavor·ECMAScript (JavaScript)

/(?:\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*(?:\p{Join_Control}\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*)*|\s|.)\p{M}*/guy

Description

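This RegExp matches each Unicode symbol, character, or emoji in sequence, keeping combined sequences (a base character with its combining marks, an emoji with its modifiers, or emoji linked by joiners) together as a single match.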
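As a minimal sketch of how the pattern can be used to split a string, the snippet below wraps it in a helper. The name splitSymbols is hypothetical and not part of this entry; the pattern and its g, u, and y flags are copied unchanged.

```javascript
// The pattern from this entry, with the same g, u, and y flags.
const SYMBOL = /(?:\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*(?:\p{Join_Control}\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*)*|\s|.)\p{M}*/guy;

// splitSymbols is a hypothetical helper name used only for illustration.
function splitSymbols(text) {
  // With the g flag, String.prototype.match returns every match,
  // or null when there is none (for example, for an empty string).
  return text.match(SYMBOL) || [];
}

// "a" followed by U+0301 (combining acute accent) stays together.
console.log(splitSymbols("a\u0301b").length);

// Man + ZWJ + woman + ZWJ + girl is kept as a single match.
console.log(splitSymbols("\u{1F468}\u200D\u{1F469}\u200D\u{1F467}").length);
```

Because the pattern falls back to `\s|.`, every code point is consumed by some match, so the sticky flag should not cause the split to stop early.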

Note: in some edge cases it does not combine symbols that should count as one symbol according to the default grapheme cluster boundary rules of UAX #29. For Unicode code point information, see codepoints.net.
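One way to look for such edge cases is to compare the pattern's output with the runtime's own grapheme segmentation. The sketch below assumes an environment that provides Intl.Segmenter, which this entry does not mention; the helper names segmentGraphemes and differsFromSegmenter are hypothetical.

```javascript
// The pattern from this entry, with the same g, u, and y flags.
const SYMBOL_RE = /(?:\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*(?:\p{Join_Control}\p{Extended_Pictographic}[\p{Emoji_Modifier}\p{M}]*)*|\s|.)\p{M}*/guy;

// Grapheme clusters as reported by the runtime (assumes Intl.Segmenter exists).
function segmentGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (part) => part.segment);
}

// Returns true when the pattern and Intl.Segmenter split the text differently.
function differsFromSegmenter(text) {
  const viaRegex = text.match(SYMBOL_RE) || [];
  const viaSegmenter = segmentGraphemes(text);
  return (
    viaRegex.length !== viaSegmenter.length ||
    viaRegex.some((symbol, i) => symbol !== viaSegmenter[i])
  );
}

// A CRLF pair: the pattern matches "\r" and "\n" separately,
// while UAX #29 keeps CR followed by LF in one cluster.
console.log(differsFromSegmenter("\r\n"));

// A base letter with a combining mark is handled the same way by both.
console.log(differsFromSegmenter("a\u0301b"));
```

The CRLF pair is one edge case that follows directly from the pattern: `\s` consumes a single character, so the two line terminators end up in separate matches.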

See also this Stack Overflow question on splitting JavaScript strings into Unicode symbols, which inspired me to write this RegExp (I also needed it for a project).

Submitted by MAZ01001