Regular Expressions 101

/^(https?:\/\/)?(www.)?([\da-zA-z\_\-]+)\.(com|(|[\da-zA-Z]{2,6}))([\/\w \.\-\#\&\?\%\_]*)?([^\/| |\s])$/gmi
^ asserts position at start of a line
1st Capturing Group
(https?:\/\/)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
http matches the characters http literally (case insensitive)
s matches the character s with index 115₁₀ (73₁₆ or 163₈) literally (case insensitive)
: matches the character : with index 58₁₀ (3A₁₆ or 72₈) literally (case insensitive)
\/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case insensitive)
\/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case insensitive)
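The behavior of the 1st capturing group can be checked in isolation. The following is a minimal sketch using Python's `re` module (an assumption; the page itself does not say which flavor is selected). The names `SCHEME` and `scheme_of` are illustrative only.

```python
import re

# Group 1 on its own: an optional "http://" or "https://" prefix.
# re.IGNORECASE stands in for the "i" flag.
SCHEME = re.compile(r"^(https?:\/\/)?", re.IGNORECASE)

def scheme_of(text):
    """Return the captured scheme prefix, or None when the group did not participate."""
    # The group is optional, so match() always succeeds (possibly with an empty match).
    return SCHEME.match(text).group(1)

for sample in ["http://example.com", "HTTPS://example.com", "example.com"]:
    print(sample, "->", scheme_of(sample))
```

Because the whole group is wrapped in `?`, a string without a scheme still matches; the group simply captures nothing.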
2nd Capturing Group
(www.)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
www matches the characters www literally (case insensitive)
. matches any character (except for line terminators)
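Since the dot in the 2nd group is unescaped, it accepts any character after www, not only a literal period. A small sketch in Python's `re` module (an assumed flavor) compares the group as written with a hypothetical stricter variant that escapes the dot; `LOOSE`, `STRICT`, and `www_prefix` are illustrative names.

```python
import re

LOOSE = re.compile(r"^(www.)?", re.IGNORECASE)    # as written in the pattern
STRICT = re.compile(r"^(www\.)?", re.IGNORECASE)  # hypothetical variant with an escaped dot

def www_prefix(pattern, text):
    """Return what the optional www group captured, or None if it was skipped."""
    return pattern.match(text).group(1)

print(www_prefix(LOOSE, "www.example.com"))   # www.
print(www_prefix(LOOSE, "wwwXexample.com"))   # wwwX
print(www_prefix(STRICT, "wwwXexample.com"))  # None
```

Whether the looser behavior matters depends on the inputs being validated; the stricter form is shown only for comparison.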
3rd Capturing Group
([\da-zA-z\_\-]+)
Match a single character present in the list below
[\da-zA-z\_\-]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
\_ matches the character _ with index 95₁₀ (5F₁₆ or 137₈) literally (case insensitive)
\- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
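The range A-z in the 3rd group spans indexes 65 through 122, which covers more than letters: six punctuation characters sit between Z (90) and a (97). The sketch below, written for Python's `re` module (an assumed flavor), lists them by comparing against the narrower class `[A-Za-z]`. `WIDE`, `NARROW`, and `extras` are illustrative names.

```python
import re

WIDE = re.compile(r"[A-z]")       # the range used in the 3rd group
NARROW = re.compile(r"[A-Za-z]")  # letters only, for comparison

# Collect every character that A-z accepts but A-Za-z does not.
extras = [
    chr(code)
    for code in range(ord("A"), ord("z") + 1)
    if WIDE.fullmatch(chr(code)) and not NARROW.fullmatch(chr(code))
]

print(extras)  # ['[', '\\', ']', '^', '_', '`']
```

The underscore is also listed explicitly in the class, so in practice the range adds `[`, `\`, `]`, `^`, and the backtick to what the host-name part will accept.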
4th Capturing Group
(com|(|[\da-zA-Z]{2,6}))
1st Alternative
com matches the characters com literally (case insensitive)
2nd Alternative
(|[\da-zA-Z]{2,6})
5th Capturing Group
(|[\da-zA-Z]{2,6})
1st Alternative is empty (null): it matches at any position without consuming characters
2nd Alternative
[\da-zA-Z]{2,6}
Match a single character present in the list below
[\da-zA-Z]
{2,6} matches the previous token between 2 and 6 times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equivalent to [0-9])
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
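Because the 5th group has an empty first alternative, the 4th group can succeed without consuming any characters at all. The following sketch isolates that group in Python's `re` module (an assumed flavor). Note that in this isolated pattern the groups are renumbered: group 1 here corresponds to the 4th group above and group 2 to the 5th. `TLD` and `tld_groups` are illustrative names.

```python
import re

# The 4th group on its own; the nested group is the 5th group of the full pattern.
TLD = re.compile(r"(com|(|[\da-zA-Z]{2,6}))", re.IGNORECASE)

def tld_groups(text):
    """Return (outer, inner) captures for a full match, or None when there is none."""
    m = TLD.fullmatch(text)
    return None if m is None else (m.group(1), m.group(2))

print(tld_groups("com"))  # ('com', None): the literal alternative wins
print(tld_groups("org"))  # ('org', 'org'): matched by the {2,6} class
print(tld_groups(""))     # ('', ''): the empty alternative matches nothing
print(tld_groups("x"))    # None: one character is too short for {2,6}
```

In the full pattern this means the part after the literal dot is effectively optional, and later groups can absorb those characters instead.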
6th Capturing Group
([\/\w \.\-\#\&\?\%\_]*)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[\/\w \.\-\#\&\?\%\_]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case insensitive)
\w matches any word character (equivalent to [a-zA-Z0-9_])
(space) matches the space character with index 32₁₀ (20₁₆ or 40₈) literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
\- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case insensitive)
\# matches the character # with index 35₁₀ (23₁₆ or 43₈) literally (case insensitive)
\& matches the character & with index 38₁₀ (26₁₆ or 46₈) literally (case insensitive)
\? matches the character ? with index 63₁₀ (3F₁₆ or 77₈) literally (case insensitive)
\% matches the character % with index 37₁₀ (25₁₆ or 45₈) literally (case insensitive)
\_ matches the character _ with index 95₁₀ (5F₁₆ or 137₈) literally (case insensitive)
7th Capturing Group
([^\/| |\s])
Match a single character not present in the list below
[^\/| |\s]
\/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case insensitive)
| | matches a single character in the list: | or a space (case insensitive). Inside a character class, | is a literal pipe, not an alternation operator
\s matches any whitespace character (equivalent to [\r\n\t\f\v \u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff])
$ asserts position at the end of a line
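Putting the groups together, the 7th group requires the line to end with one character that is not a slash, a pipe, or whitespace. The sketch below runs the full pattern through Python's `re` module (an assumed flavor; the escapes used here happen to be accepted by it) and reports what the last group captured. `URL` and `last_char` are illustrative names.

```python
import re

URL = re.compile(
    r"^(https?:\/\/)?(www.)?([\da-zA-z\_\-]+)\.(com|(|[\da-zA-Z]{2,6}))"
    r"([\/\w \.\-\#\&\?\%\_]*)?([^\/| |\s])$",
    re.IGNORECASE | re.MULTILINE,  # stand-ins for the "i" and "m" flags
)

def last_char(text):
    """Return the character captured by the 7th group, or None if nothing matches."""
    m = URL.search(text)
    return None if m is None else m.group(7)

print(last_char("http://example.com/path"))  # h
print(last_char("example.com"))              # m
print(last_char("http://example.com/"))      # None (trailing slash is rejected)
```

One consequence worth noting: the final group always takes the last character, so earlier groups may capture less than expected (for `example.com`, the trailing `m` belongs to the 7th group rather than to the `com` alternative).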
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
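The three flags map onto Python's `re` module roughly as follows (assuming that flavor): `g` corresponds to iterating over all matches with `re.finditer` or `re.findall`, `m` to `re.MULTILINE`, and `i` to `re.IGNORECASE`. The sketch uses a deliberately simplified, hypothetical pattern and test text so the effect of each flag is easy to see; `PATTERN`, `TEXT`, and `find_all` are illustrative names.

```python
import re

PATTERN = r"^https?:\/\/\S+$"  # hypothetical simplified pattern, not the one above
TEXT = "HTTP://a.example\nskip me\nhttps://b.example"

def find_all(flags=0):
    """Return every match in TEXT (the 'g' behavior) under the given flags."""
    return [m.group(0) for m in re.finditer(PATTERN, TEXT, flags)]

print(find_all())                               # []
print(find_all(re.IGNORECASE))                  # []
print(find_all(re.MULTILINE))                   # ['https://b.example']
print(find_all(re.MULTILINE | re.IGNORECASE))   # ['HTTP://a.example', 'https://b.example']
```

Without `re.MULTILINE`, `^` and `$` anchor only to the start and end of the whole string, so a multi-line test string yields no matches even when case is ignored.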