Regular Expressions 101

"
^(?:(http|https)://|//)?(?P<url>www.(?P<resource>facebook.com).*?/video.php\?.*?(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+)).*)
"
gm
^ asserts position at start of a line
Non-capturing group
(?:(http|https)://|//)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
1st Alternative
(http|https)://
1st Capturing Group
(http|https)
1st Alternative
http
matches the characters http literally (case sensitive)
2nd Alternative
https
matches the characters https literally (case sensitive)
://
matches the characters :// literally (case sensitive)
2nd Alternative
//
matches the characters // literally (case sensitive)
Named Capture Group url
(?P<url>www.(?P<resource>facebook.com).*?/video.php\?.*?(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+)).*)
www
matches the characters www literally (case sensitive)
. matches any character (except for line terminators)
Named Capture Group resource
(?P<resource>facebook.com)
facebook
matches the characters facebook literally (case sensitive)
. matches any character (except for line terminators)
com
matches the characters com literally (case sensitive)
.
matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
/video
matches the characters /video literally (case sensitive)
. matches any character (except for line terminators)
php
matches the characters php literally (case sensitive)
\? matches the character ? with index 63 in decimal (3F in hexadecimal or 77 in octal) literally (case sensitive)
.
matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Non-capturing group
(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+))
Non-capturing group
(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)
1st Alternative
videos%2F(vb\.[0-9]+%2F)?
videos%2F
matches the characters videos%2F literally (case sensitive)
4th Capturing Group
(vb\.[0-9]+%2F)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
vb
matches the characters vb literally (case sensitive)
\. matches the character . with index 46 in decimal (2E in hexadecimal or 56 in octal) literally (case sensitive)
Match a single character present in the list below
[0-9]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
%2F
matches the characters %2F literally (case sensitive)
2nd Alternative
v%3D
matches the characters v%3D literally (case sensitive)
Named Capture Group id
(?P<id>[0-9]+)
Match a single character present in the list below
[0-9]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
.
matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
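The breakdown above can be exercised with a minimal Python sketch. The pattern is copied verbatim from this page; the helper name describe, the sample URLs, and the account name someuser are hypothetical and chosen only for illustration. The last sample shows a consequence of the unescaped dots in www., facebook.com and video.php: each one accepts any character, not only a literal dot.

```python
import re

# Pattern copied verbatim from the page; split across lines only for readability.
PATTERN = re.compile(
    r"^(?:(http|https)://|//)?"
    r"(?P<url>www.(?P<resource>facebook.com).*?/video.php\?.*?"
    r"(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+)).*)",
    re.MULTILINE,
)


def describe(subject):
    """Return the captured groups as a dict, or None when nothing matches.

    describe is a hypothetical helper, not part of regex101 or the re module.
    """
    m = PATTERN.search(subject)
    if m is None:
        return None
    return {
        "scheme": m.group(1),            # 1st capturing group: http or https
        "resource": m.group("resource"), # named group resource
        "vb": m.group(4),                # 4th capturing group, optional
        "id": m.group("id"),             # named group id
    }


# Hypothetical sample subjects (assumed shapes, not taken from the page).
embed = ("https://www.facebook.com/plugins/video.php?href="
         "https%3A%2F%2Fwww.facebook.com%2Fsomeuser%2Fvideos%2F1234567890%2F")
with_vb = ("//www.facebook.com/video.php?href="
           "%2Fsomeuser%2Fvideos%2Fvb.111%2F222%2F")
sloppy = "wwwXfacebookYcom/videoZphp?v%3D42"  # no literal dots at all

embed_info = describe(embed)
vb_info = describe(with_vb)
sloppy_info = describe(sloppy)

print(embed_info)
print(vb_info)
print(sloppy_info)
```

If only real facebook.com hosts should be accepted, escaping those dots as \. would be the usual tightening; that is a suggestion, not something the page itself states.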
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
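As a rough Python analogue of these two flags: iterating with re.finditer plays the role of g (all matches), and re.MULTILINE plays the role of m. The sketch below assumes a hypothetical three-line subject; the variable names and sample URLs are illustrative only.

```python
import re

# Pattern copied verbatim from the page; split across lines only for readability.
PATTERN_SRC = (
    r"^(?:(http|https)://|//)?"
    r"(?P<url>www.(?P<resource>facebook.com).*?/video.php\?.*?"
    r"(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+)).*)"
)

# Hypothetical subject with one candidate per line.
text = "\n".join([
    "https://www.facebook.com/video.php?v%3D111",
    "not a video link",
    "//www.facebook.com/video.php?v%3D222",
])

# With MULTILINE, ^ matches at the start of every line, so both URLs are found.
multi_ids = [m.group("id") for m in re.finditer(PATTERN_SRC, text, re.MULTILINE)]

# Without MULTILINE, ^ matches only at the start of the string,
# so only the first line can produce a match.
single_ids = [m.group("id") for m in re.finditer(PATTERN_SRC, text)]

print(multi_ids)
print(single_ids)
```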
Your regular expression does not match the subject string.
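The page does not show which subject was tested, so the cause of this result is unknown. One plausible reason for a miss with this pattern is that it looks for the percent-encoded markers videos%2F or v%3D, so a plain video URL containing v= is rejected while the same URL embedded in encoded form is accepted. The sketch below illustrates this under that assumption; extract_video_id and both sample URLs are hypothetical.

```python
import re
from urllib.parse import quote

# Pattern copied verbatim from the page; split across lines only for readability.
PATTERN = re.compile(
    r"^(?:(http|https)://|//)?"
    r"(?P<url>www.(?P<resource>facebook.com).*?/video.php\?.*?"
    r"(?:(?:videos%2F(vb\.[0-9]+%2F)?|v%3D)(?P<id>[0-9]+)).*)",
    re.MULTILINE,
)


def extract_video_id(subject):
    """Return the id group as a string, or None when there is no match.

    extract_video_id is a hypothetical helper used only for this illustration.
    """
    m = PATTERN.search(subject)
    return m.group("id") if m else None


# A plain URL: "v=123" is not percent-encoded, so neither marker is present.
plain = "https://www.facebook.com/video.php?v=123"

# The same URL percent-encoded and embedded as a query value (assumed shape).
encoded = quote(plain, safe="")  # "=" becomes %3D, "/" becomes %2F
embedded = "https://www.facebook.com/plugins/video.php?href=" + encoded

plain_id = extract_video_id(plain)
embedded_id = extract_video_id(embedded)
empty_id = extract_video_id("")

print(plain_id, embedded_id, empty_id)
```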
