Regular Expressions 101

"
(?:(?<=^)|(?<=\D))((00|\+)?55(\s|\.|-)*)?((\()?0?\d{2}(?(5)\)|)(\s|\.|-)*)?(9(\s|\.|-)*)?\d{4}(\s|\.|-)*\d{4}(?=\D|$)
"
gm
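As a rough sketch of how the expression above might be used outside the tester, the Python snippet below compiles it with `re.MULTILINE` (standing in for the `m` flag) and walks every match with `finditer` (standing in for `g`). The sample text and the names `PHONE_RE`, `sample`, and `found` are made up for illustration and are not part of the original page.

```python
import re

# The expression exactly as shown above, split over three raw strings
# for readability; the m flag maps to re.MULTILINE.
PHONE_RE = re.compile(
    r"(?:(?<=^)|(?<=\D))((00|\+)?55(\s|\.|-)*)?"
    r"((\()?0?\d{2}(?(5)\)|)(\s|\.|-)*)?"
    r"(9(\s|\.|-)*)?\d{4}(\s|\.|-)*\d{4}(?=\D|$)",
    re.MULTILINE,
)

# Hypothetical sample input, one number per line.
sample = "11 91234-5678\n(21) 3456-7890\n+55 (31) 98765 4321"

# finditer keeps going after the first match, like the g flag.
# group(0) is the whole match; groups 1 to 9 are the pieces explained below.
found = [m.group(0) for m in PHONE_RE.finditer(sample)]
print(found)
```

Note that the separator groups use `\s`, which also matches a newline, so with other input a single match could span two lines.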
Non-capturing group
(?:(?<=^)|(?<=\D))
1st Alternative
(?<=^)
Positive Lookbehind
(?<=^)
Assert that the Regex below matches
^ asserts position at start of a line
2nd Alternative
(?<=\D)
Positive Lookbehind
(?<=\D)
Assert that the Regex below matches
\D matches any character that's not a digit (equivalent to [^0-9])
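To see what this leading guard buys, the Python sketch below pairs it with a simplified body (`\d{4}`, chosen for brevity and not taken from the page) and compares the result with the same body unguarded. The variable names and the input string are illustrative assumptions.

```python
import re

text = "a12345678"

# Guarded: a match may start only at a line start or right after a non-digit.
guarded = re.findall(r"(?:(?<=^)|(?<=\D))\d{4}", text, re.MULTILINE)

# Unguarded: a second match can start in the middle of the digit run.
unguarded = re.findall(r"\d{4}", text)

print(guarded, unguarded)
```

The two lookbehinds are written as separate alternatives inside a non-capturing group, so each lookbehind has a fixed width on its own.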
1st Capturing Group
((00|\+)?55(\s|\.|-)*)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group
(00|\+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
1st Alternative
00
00 matches the characters 00 literally (case sensitive)
2nd Alternative
\+
\+ matches the character + with index 43₁₀ (2B₁₆ or 53₈) literally (case sensitive)
55 matches the characters 55 literally (case sensitive)
3rd Capturing Group
(\s|\.|-)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
1st Alternative
\s
\s matches any whitespace character (equivalent to [\r\n\t\f\v  ])
2nd Alternative
\.
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
3rd Alternative
-
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
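The warning about repeated capturing groups can be checked with a short Python sketch. It compares the separator group as written in the expression with a character-class variant wrapped in a single group; the variant and the variable names are illustrative assumptions, not something the page proposes.

```python
import re

separators = " .-"

# Repeated capturing group, as in the expression: only the last iteration is kept.
repeated = re.match(r"(\s|\.|-)*", separators)
last_only = repeated.group(1)

# Character class inside one group: the whole run of separators is captured.
whole_run = re.match(r"([\s.-]*)", separators).group(1)

print(repr(last_only), repr(whole_run))
```

If the separators are never read back, a non-capturing form such as `(?:\s|\.|-)*` or `[\s.-]*` avoids the extra group altogether.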
4th Capturing Group
((\()?0?\d{2}(?(5)\)|)(\s|\.|-)*)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
5th Capturing Group
(\()?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\( matches the character ( with index 40₁₀ (28₁₆ or 50₈) literally (case sensitive)
0 matches the character 0 with index 48₁₀ (30₁₆ or 60₈) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
{2} matches the previous token exactly 2 times
Conditional
(?(5)\)|)
Conditionally matches one of two options depending on whether the 5th capturing group matched
If condition is met, match the following regex
\)
\) matches the character ) with index 41₁₀ (29₁₆ or 51₈) literally (case sensitive)
Else match the following regex: null (the empty pattern), which matches at any position
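The conditional can be tried in isolation with a reduced Python pattern. In this sketch the optional opening parenthesis is group 1 rather than group 5, because the surrounding groups have been dropped; the name `area` and the test inputs are assumptions made for illustration.

```python
import re

# Group 1 is the optional "("; the conditional demands ")" only if it matched.
area = re.compile(r"(\()?\d{2}(?(1)\)|)")

balanced = area.fullmatch("(11)") is not None     # "(" opened and closed
bare = area.fullmatch("11") is not None           # no parentheses at all
unclosed = area.fullmatch("(11") is not None      # "(" without ")"
stray_close = area.fullmatch("11)") is not None   # ")" without "("

print(balanced, bare, unclosed, stray_close)
```

`fullmatch` is used here so that unbalanced input is rejected outright; with `search` or `finditer`, as in the full expression, the engine can still find a match that simply starts after an unclosed parenthesis.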
6th Capturing Group
(\s|\.|-)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
1st Alternative
\s
\s matches any whitespace character (equivalent to [\r\n\t\f\v  ])
2nd Alternative
\.
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
3rd Alternative
-
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
7th Capturing Group
(9(\s|\.|-)*)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
9 matches the character 9 with index 57₁₀ (39₁₆ or 71₈) literally (case sensitive)
8th Capturing Group
(\s|\.|-)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
1st Alternative
\s
\s matches any whitespace character (equivalent to [\r\n\t\f\v  ])
2nd Alternative
\.
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
3rd Alternative
-
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
{4} matches the previous token exactly 4 times
9th Capturing Group
(\s|\.|-)*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
1st Alternative
\s
\s matches any whitespace character (equivalent to [\r\n\t\f\v  ])
2nd Alternative
\.
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
3rd Alternative
-
- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
{4} matches the previous token exactly 4 times
Positive Lookahead
(?=\D|$)
Assert that the Regex below matches
1st Alternative
\D
\D matches any character that's not a digit (equivalent to [^0-9])
2nd Alternative
$
$ asserts position at the end of a line
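A small Python sketch of the trailing lookahead, again with a simplified body (`\d{4}`, an assumption made for brevity). It shows that the lookahead inspects the next character without consuming it, and that on its own it does not stop a match from starting inside a longer digit run.

```python
import re

tail = re.compile(r"\d{4}(?=\D|$)")

# The "-" satisfies the lookahead but is not part of the match.
m = tail.search("1234-")
consumed = m.group(0)
end_pos = m.end()

# With no leading guard, the search slides forward inside a longer run.
slid = tail.search("12345").group(0)

print(consumed, end_pos, slid)
```

This is why the full expression pairs the lookahead with the lookbehind at its start: the pair rejects digit runs that extend past either end of the match.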
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
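Python's `re` module has no `g` flag; the choice between `re.search` (first match only) and `re.findall` or `re.finditer` (all matches) plays that role, while `m` corresponds to `re.MULTILINE`. The sketch below uses a deliberately small pattern and made-up input to show the effect of each.

```python
import re

lines = "1234\n5678"

# Without the m flag, ^ and $ anchor only to the whole string.
single = re.findall(r"^\d{4}$", lines)

# With re.MULTILINE, ^ and $ also match at each line boundary.
multi = re.findall(r"^\d{4}$", lines, re.MULTILINE)

# re.search returns after the first match, i.e. the behaviour without g.
first = re.search(r"^\d{4}$", lines, re.MULTILINE).group(0)

print(single, multi, first)
```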