Regular Expressions 101

"
((@[\w\d][\w\d_]+)\s+)?((\.[a-zA-z][a-zA-z\d_]+)\s+)?(\'.+\'|\".+\"|[\da-fbohx.\s\-+:\'\",]+)(//.+|$|\s)+
"
gim
1st Capturing Group
((@[\w\d][\w\d_]+)\s+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group
(@[\w\d][\w\d_]+)
@ matches the character @ with index 64₁₀ (40₁₆ or 100₈) literally (case insensitive)
Match a single character present in the list below
[\w\d]
\w matches any word character (equivalent to [a-zA-Z0-9_])
\d matches a digit (equivalent to [0-9])
Match a single character present in the list below
[\w\d_]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\w matches any word character (equivalent to [a-zA-Z0-9_])
\d matches a digit (equivalent to [0-9])
_ matches the character _ with index 95₁₀ (5F₁₆ or 137₈) literally (case insensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
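The behaviour of this optional group can be checked with a short Python sketch. The sample inputs ("@ab rest", "@a rest") are made up for illustration and are not from the original page. The sketch shows two things: the classes [\w\d] and [\w\d_] accept the same ASCII characters as \w alone, and the group needs at least two characters after the @ before it participates in the match.

```python
import re

# First (optional) group of the pattern, copied as written.
label = re.compile(r"((@[\w\d][\w\d_]+)\s+)?")

# \w already covers digits and the underscore, so for ASCII input
# [\w\d_] accepts exactly the same characters as \w.
ascii_chars = [chr(c) for c in range(128)]
same_as_w = all(
    bool(re.fullmatch(r"[\w\d_]", ch)) == bool(re.fullmatch(r"\w", ch))
    for ch in ascii_chars
)

# Hypothetical inputs: one with two characters after "@", one with only one.
two_chars = label.match("@ab rest")
one_char = label.match("@a rest")

print(same_as_w)            # whether the classes agree on ASCII
print(two_chars.group(2))   # inner capture for the two-character label
print(one_char.group(2))    # None when the optional group is skipped
```

Because the whole group is optional, a one-character label does not make the match fail. The group is skipped and the match is empty at that position.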
3rd Capturing Group
((\.[a-zA-z][a-zA-z\d_]+)\s+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
4th Capturing Group
(\.[a-zA-z][a-zA-z\d_]+)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
Match a single character present in the list below
[a-zA-z]
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
Match a single character present in the list below
[a-zA-z\d_]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
\d matches a digit (equivalent to [0-9])
_ matches the character _ with index 95₁₀ (5F₁₆ or 137₈) literally (case insensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
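The range A-z (index 65 to 122) is wider than the letters alone: it also covers the six characters that sit between Z (index 90) and a (index 97). The Python sketch below lists them. It assumes the author may have meant A-Z, which is a guess; the page itself does not say so.

```python
import re

# The class as written in the pattern, and a possibly intended variant
# (the A-Z version is an assumption, not something the page states).
as_written = re.compile(r"[a-zA-z]")
letters_only = re.compile(r"[a-zA-Z]")

# Code points 91..96 lie between 'Z' (90) and 'a' (97).
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]

extra = [ch for ch in between if as_written.fullmatch(ch)]
still_extra = [ch for ch in between if letters_only.fullmatch(ch)]

print(extra)        # characters accepted only because of the A-z range
print(still_extra)  # empty when the range is A-Z
```

With the i flag set, a-z alone already covers both cases, so the A-z range adds nothing except these extra punctuation characters.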
5th Capturing Group
(\'.+\'|\".+\"|[\da-fbohx.\s\-+:\'\",]+)
1st Alternative
\'.+\'
\' matches the character ' with index 39₁₀ (27₁₆ or 47₈) literally (case insensitive)
. matches any character (except for line terminators)
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\' matches the character ' with index 39₁₀ (27₁₆ or 47₈) literally (case insensitive)
2nd Alternative
\".+\"
\" matches the character " with index 34₁₀ (22₁₆ or 42₈) literally (case insensitive)
. matches any character (except for line terminators)
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\" matches the character " with index 34₁₀ (22₁₆ or 42₈) literally (case insensitive)
3rd Alternative
[\da-fbohx.\s\-+:\'\",]+
Match a single character present in the list below
[\da-fbohx.\s\-+:\'\",]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
a-f matches a single character in the range between a (index 97) and f (index 102) (case insensitive)
bohx. matches a single character in the list bohx. (case insensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
\- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case insensitive)
+: matches a single character in the list +: (case insensitive)
\' matches the character ' with index 39₁₀ (27₁₆ or 47₈) literally (case insensitive)
\" matches the character " with index 34₁₀ (22₁₆ or 42₈) literally (case insensitive)
, matches the character , with index 44₁₀ (2C₁₆ or 54₈) literally (case insensitive)
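The three alternatives of the 5th group can be exercised in isolation. The Python sketch below uses made-up inputs (not from the original page) to show that the greedy .+ in the quoted alternatives runs from the first quote to the last quote on the line, and that the third alternative accepts a run of number-like characters. The lazy variant '.+?' is shown only for comparison and is not part of the original pattern.

```python
import re

# 5th capturing group, copied as written, with the i flag.
value = re.compile(
    r"""(\'.+\'|\".+\"|[\da-fbohx.\s\-+:\'\",]+)""",
    re.IGNORECASE,
)

# Hypothetical line with two quoted items followed by a comment.
line = "'a', 'b' // note"

greedy = value.match(line).group(1)

# Comparison only: a lazy quantifier stops at the first closing quote.
lazy = re.match(r"'.+?'", line).group(0)

# Third alternative: digits, hex letters, base markers, separators.
numeric = value.match("0x1F, 10").group(1)

print(greedy)
print(lazy)
print(numeric)
```

If each quoted item should be captured separately, the lazy form or a negated class such as '[^']+' is the usual choice.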
6th Capturing Group
(//.+|$|\s)+
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
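The note about repeated capturing groups can be illustrated with a simplified Python sketch. It drops the $ alternative from the 6th group so that zero-width iterations do not affect the result, and the input string is invented for the example.

```python
import re

# Hypothetical input: two spaces followed by a comment.
text = "  // c"

# Simplified form of the 6th group (the $ alternative is left out).
# Each iteration overwrites the capture, so only the last one survives.
repeated = re.compile(r"(//.+|\s)+")
last_only = repeated.match(text).group(1)

# Wrapping the repeated part in an outer group captures every iteration;
# the inner group is made non-capturing since its data is not needed.
wrapped = re.compile(r"((?://.+|\s)+)")
all_iterations = wrapped.match(text).group(1)

print(repr(last_only))
print(repr(all_iterations))
```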
1st Alternative
//.+
// matches the characters // literally (case insensitive)
. matches any character (except for line terminators)
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
2nd Alternative
$
$ asserts position at the end of a line
3rd Alternative
\s
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
Global pattern flags
g modifier: global. All matches (don't return after first match)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
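In Python the three flags map roughly as follows: g corresponds to using re.findall or re.finditer instead of re.search, i to re.IGNORECASE, and m to re.MULTILINE. The sketch below uses a small made-up pattern and text, chosen only to make the effect of each flag visible.

```python
import re

# Hypothetical three-line input.
text = "Alpha\nbeta\nALPHA"

# No flags: ^ and $ anchor only at the ends of the whole string,
# and the comparison is case sensitive.
no_flags = re.findall(r"^alpha$", text)

# i + m: case is ignored and ^/$ anchor at every line.
# findall returns every match, which plays the role of the g flag.
with_flags = re.findall(r"^alpha$", text, re.IGNORECASE | re.MULTILINE)

# Without "g" behaviour: search stops at the first match.
first_only = re.search(r"^alpha$", text, re.IGNORECASE | re.MULTILINE).group(0)

print(no_flags)
print(with_flags)
print(first_only)
```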
Your regular expression does not match the subject string.
