Regular Expressions 101

/^(?'Username'[-\w\d\.]+?)(?:\s+at\s+|\s+\[at\]\s+|\s*@\s*|\s*(?:[\[\]@]){3}\s*)(?'Domain'[-\w\d\.]*?)\s*(?:dot|\.|(?:[\[\]dot\.]){3,5})\s*(?'TLD'\w+)$/gm
^ asserts position at start of a line
Named Capture Group Username
(?'Username'[-\w\d\.]+?)
Match a single character present in the list below
[-\w\d\.]
+? matches the previous token between one and unlimited times, as few times as possible, expanding as needed (lazy)
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
This hyphen sits at the start of the class, so it is treated literally. Escaping it (\-) would make that explicit for other readers.
\w matches any word character (equivalent to [a-zA-Z0-9_])
\d matches a digit (equivalent to [0-9])
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
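As a quick check of the character class described above, the following Python sketch compares [-\w\d\.] with a shorter class, [-\w.]. The shorter class and all variable names are illustrative assumptions, not part of the original pattern. re.ASCII is set so that \w mirrors the [a-zA-Z0-9_] equivalence quoted above.

```python
import re

# The class exactly as it appears in the pattern, plus a shorter candidate.
# re.ASCII keeps \w and \d limited to ASCII, as in the explanation above.
original_class = re.compile(r"[-\w\d\.]", re.ASCII)
shorter_class = re.compile(r"[-\w.]", re.ASCII)  # hypothetical simplification

ascii_chars = [chr(code) for code in range(128)]
accepted_by_original = [c for c in ascii_chars if original_class.fullmatch(c)]
accepted_by_shorter = [c for c in ascii_chars if shorter_class.fullmatch(c)]

# Both lists hold the same characters: hyphen, dot, digits, letters, underscore.
print("".join(accepted_by_original))
print(accepted_by_original == accepted_by_shorter)
```

Since \w already includes the digits, the \d inside the class adds nothing, and the dot needs no escape inside a class.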
Non-capturing group
(?:\s+at\s+|\s+\[at\]\s+|\s*@\s*|\s*(?:[\[\]@]){3}\s*)
1st Alternative
\s+at\s+
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
at matches the characters at literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative
\s+\[at\]\s+
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\[ matches the character [ with index 91₁₀ (5B₁₆ or 133₈) literally (case sensitive)
at matches the characters at literally (case sensitive)
\] matches the character ] with index 93₁₀ (5D₁₆ or 135₈) literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
3rd Alternative
\s*@\s*
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
@ matches the character @ with index 64₁₀ (40₁₆ or 100₈) literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
4th Alternative
\s*(?:[\[\]@]){3}\s*
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group
(?:[\[\]@]){3}
{3} matches the previous token exactly 3 times
Match a single character present in the list below
[\[\]@]
\[ matches the character [ with index 91₁₀ (5B₁₆ or 133₈) literally (case sensitive)
\] matches the character ] with index 93₁₀ (5D₁₆ or 135₈) literally (case sensitive)
@ matches the character @ with index 64₁₀ (40₁₆ or 100₈) literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
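To see which separators the four alternatives above accept, the group can be tried in isolation. The Python sketch below copies the group verbatim; the sample strings and the names AT_SEPARATOR, samples and results are assumptions chosen for illustration.

```python
import re

# Separator group copied from the pattern above (non-capturing, four alternatives).
AT_SEPARATOR = re.compile(
    r"(?:\s+at\s+|\s+\[at\]\s+|\s*@\s*|\s*(?:[\[\]@]){3}\s*)"
)

# Hypothetical samples: each must be consumed entirely by the group to count.
samples = [" at ", " [at] ", "@", " @ ", "[@]", "]]]", "[at]", "at"]
results = {s: AT_SEPARATOR.fullmatch(s) is not None for s in samples}

for text, matched in results.items():
    print(repr(text), matched)
```

Two things stand out. The 4th alternative is a character class repeated three times, so it accepts any three of [, ] and @ in any order, such as "]]]". And "[at]" or "at" without surrounding whitespace is rejected, because the 1st and 2nd alternatives require \s+ on both sides.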
Named Capture Group Domain
(?'Domain'[-\w\d\.]*?)
Match a single character present in the list below
[-\w\d\.]
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
This hyphen sits at the start of the class, so it is treated literally. Escaping it (\-) would make that explicit for other readers.
\w matches any word character (equivalent to [a-zA-Z0-9_])
\d matches a digit (equivalent to [0-9])
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
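The lazy quantifier on the Domain group is easier to follow on a reduced pattern. The Python sketch below uses simplified, hypothetical patterns (a Domain-like class, a literal dot and a TLD-like \w+), not the full expression, to show how lazy and greedy matching differ and how a trailing $ changes the outcome.

```python
import re

text = "mail.example.com"

# Lazy, no end anchor: the first group stops at the first dot it can.
lazy_unanchored = re.match(r"([-\w.]*?)\.(\w+)", text, re.ASCII)

# Greedy, no end anchor: the first group takes everything, then gives back
# characters until a dot followed by word characters is found.
greedy_unanchored = re.match(r"([-\w.]*)\.(\w+)", text, re.ASCII)

# Lazy with $: the first group keeps expanding until the rest can reach
# the end of the string.
lazy_anchored = re.match(r"([-\w.]*?)\.(\w+)$", text, re.ASCII)

print(lazy_unanchored.groups())
print(greedy_unanchored.groups())
print(lazy_anchored.groups())
```

With the $ anchor in place, the lazy and greedy forms end up with the same groups for this input; the quantifier mainly affects the order in which candidates are tried.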
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Non-capturing group
(?:dot|\.|(?:[\[\]dot\.]){3,5})
1st Alternative
dot
dot matches the characters dot literally (case sensitive)
2nd Alternative
\.
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
3rd Alternative
(?:[\[\]dot\.]){3,5}
Non-capturing group
(?:[\[\]dot\.]){3,5}
{3,5} matches the previous token between 3 and 5 times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[\[\]dot\.]
\[ matches the character [ with index 91₁₀ (5B₁₆ or 133₈) literally (case sensitive)
\] matches the character ] with index 93₁₀ (5D₁₆ or 135₈) literally (case sensitive)
dot matches a single character in the list dot (case sensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
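The 3rd alternative above is a character class, not the literal text "[dot]", so it accepts more than its author may have intended. The Python sketch below tests the dot-separator group on its own; the sample strings and the names DOT_SEPARATOR, samples and results are assumptions for illustration.

```python
import re

# Dot-separator group copied from the pattern above.
DOT_SEPARATOR = re.compile(r"(?:dot|\.|(?:[\[\]dot\.]){3,5})")

# Hypothetical samples: each must be consumed entirely by the group to count.
samples = ["dot", ".", "[dot]", "[.]", "todo", "]]]", "(dot)", "[dot]]"]
results = {s: DOT_SEPARATOR.fullmatch(s) is not None for s in samples}

for text, matched in results.items():
    print(repr(text), matched)
```

Any run of three to five characters drawn from [, ], d, o, t and . is accepted, which is why "todo" and "]]]" pass while "(dot)" (parentheses are not in the class) and the six-character "[dot]]" do not.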
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Named Capture Group TLD
(?'TLD'\w+)
\w matches any word character (equivalent to [a-zA-Z0-9_])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string)
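The whole expression can be tried outside the tool. The sketch below is a Python adaptation under stated assumptions: Python's re module spells named groups (?P<Name>...) instead of (?'Name'...), re.MULTILINE stands in for the m flag, finditer stands in for the g flag, and re.ASCII keeps \w and \d to the ASCII ranges quoted above. The sample lines and the names EMAIL_LIKE, sample and matches are hypothetical.

```python
import re

# Python adaptation of the pattern above; only the named-group syntax changes.
EMAIL_LIKE = re.compile(
    r"^(?P<Username>[-\w\d\.]+?)"
    r"(?:\s+at\s+|\s+\[at\]\s+|\s*@\s*|\s*(?:[\[\]@]){3}\s*)"
    r"(?P<Domain>[-\w\d\.]*?)"
    r"\s*(?:dot|\.|(?:[\[\]dot\.]){3,5})\s*"
    r"(?P<TLD>\w+)$",
    re.MULTILINE | re.ASCII,
)

# Hypothetical test lines, one candidate per line.
sample = "\n".join([
    "john.doe@example.com",
    "jane at example dot org",
    "bob [at] mail [dot] net",
    "not an address",
    "user@todo.com",
])

# finditer returns every match, like the g flag in the tool.
matches = [m.groupdict() for m in EMAIL_LIKE.finditer(sample)]
for groups in matches:
    print(groups)
```

The last sample shows a side effect of combining the lazy `*?` on Domain with the loose dot-separator class: for "user@todo.com" the separator consumes "todo." and Domain is captured as an empty string. Requiring at least one character in Domain (`+?` instead of `*?`) is one possible way to avoid that, at the cost of changing the original pattern.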
Your regular expression does not match the subject string. Try launching the debugger to find out why.
