Regular Expressions 101

`
(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?(?P<search>\?[^#\n]+)?(?P<hash>#.*)?
`
gm
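The pattern uses the `(?P<name>...)` named-group syntax, which Python's `re` module accepts, so it can be tried outside the tester. The following is a minimal sketch under that assumption; `split_url` and the sample URL are illustrative names that do not appear on the page.

```python
import re

# The pattern from the page, unchanged, split over lines for readability.
URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?"
    r"(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?"
    r"(?P<hash>#.*)?"
)


def split_url(url):
    """Return the named groups of the first match, or None if nothing matches."""
    match = URL_PATTERN.search(url)
    return match.groupdict() if match else None


parts = split_url("https://example.com:8080/docs/index.html?q=regex#top")
for name, value in parts.items():
    print(f"{name:<9}{value}")
```

Note that the `port`, `search` and `hash` groups keep their leading `:`, `?` and `#`, much like the same-named properties of a browser `URL` object.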
Named Capture Group origin
(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))
Named Capture Group protocol
(?P<protocol>http[s]?:)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
http matches the characters http literally (case sensitive)
Match a single character present in the list below
[s]
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
s matches the character s with index 115 decimal (73 hex or 163 octal) literally (case sensitive)
: matches the character : with index 58 decimal (3A hex or 72 octal) literally (case sensitive)
\/ matches the character / with index 47 decimal (2F hex or 57 octal) literally (case sensitive)
\/ matches the character / with index 47 decimal (2F hex or 57 octal) literally (case sensitive)
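Because the protocol group is optional while the two slashes are mandatory, the pattern also accepts protocol-relative URLs, and, since it is not anchored, it can start matching in the middle of a string. A small sketch of both effects, assuming Python's `re` module; the sample URLs are made up for illustration.

```python
import re

URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)

# Protocol-relative URL: the optional protocol group simply does not participate.
relative = URL_PATTERN.search("//cdn.example.org/lib.js")
print(relative.group("protocol"), relative.group("origin"))

# Non-HTTP scheme: "ftp:" is skipped and the match begins at the "//".
ftp = URL_PATTERN.search("ftp://files.example.com")
print(ftp.start(), ftp.group("origin"))

# No "//" anywhere means no match at all.
bare = URL_PATTERN.search("example.com/path")
print(bare)
```

If only complete `http`/`https` URLs should be accepted, anchoring the pattern or making the protocol mandatory would be one way to tighten it.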
Named Capture Group host
(?P<host>[a-z0-9A-Z-_.]+)
Match a single character present in the list below
[a-z0-9A-Z-_.]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
- matches the character - with index 45 decimal (2D hex or 55 octal) literally (case sensitive)
This hyphen is treated literally, which might be confusing for others. Consider escaping it or placing it at the start or end of the class!
_. matches a single character in the list _. (case sensitive)
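The warning about the hyphen can be checked directly: moving the hyphen to the end of the class should leave the set of accepted characters unchanged. A minimal sketch, assuming Python's `re` module; the reordered class `[a-zA-Z0-9_.-]` is a suggested rewrite, not something taken from the page.

```python
import re

ORIGINAL = re.compile(r"[a-z0-9A-Z-_.]")   # class as written in the pattern
REORDERED = re.compile(r"[a-zA-Z0-9_.-]")  # hyphen moved to the end

ascii_chars = [chr(code) for code in range(128)]
original_set = {c for c in ascii_chars if ORIGINAL.fullmatch(c)}
reordered_set = {c for c in ascii_chars if REORDERED.fullmatch(c)}

print(original_set == reordered_set)
print(sorted(c for c in original_set if not c.isalnum()))
```

Both classes accept the same 65 ASCII characters: letters, digits, `-`, `_` and `.`. The class is therefore more permissive than real host names, since it also admits underscores and consecutive dots.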
Named Capture Group port
(?P<port>:\d+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
: matches the character : with index 58 decimal (3A hex or 72 octal) literally (case sensitive)
\d matches a digit (equivalent to [0-9])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
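Since the colon sits inside the `port` group, the captured text is `:8080` rather than `8080`. A short sketch of turning it into a number, assuming Python's `re` module; `port_number` is a hypothetical helper introduced only for this example.

```python
import re

URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)


def port_number(url):
    """Return the port as an int, or None when the URL has no explicit port."""
    match = URL_PATTERN.search(url)
    if match is None or match.group("port") is None:
        return None
    return int(match.group("port")[1:])  # drop the leading ":"


print(port_number("http://localhost:3000/api"))
print(port_number("http://localhost/api"))
```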
Named Capture Group path
(?P<path>[\/a-zA-Z0-9-\.]+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[\/a-zA-Z0-9-\.]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\/ matches the character / with index 47 decimal (2F hex or 57 octal) literally (case sensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
- matches the character - with index 45 decimal (2D hex or 55 octal) literally (case sensitive)
This hyphen is treated literally, which might be confusing for others. Consider escaping it or placing it at the start or end of the class!
\. matches the character . with index 46 decimal (2E hex or 56 octal) literally (case sensitive)
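The path class only lists `/`, letters, digits, `-` and `.`, so the match stops at the first character outside that set, for example `_` or `%`. The sketch below illustrates this, assuming Python's `re` module; the URLs are made-up samples.

```python
import re

URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)

# The underscore is not in the path class, so the path is cut short and the
# query string that follows is never reached.
underscore = URL_PATTERN.search("https://example.com/a_b?x=1")
print(underscore.group(0), underscore.group("path"), underscore.group("search"))

# Percent-encoded characters end the path in the same way.
encoded = URL_PATTERN.search("https://example.com/my%20file.txt")
print(encoded.group("path"))
```

Widening the class, for instance by adding `_`, `%` and `~`, would be one possible fix if such paths need to be captured in full.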
Named Capture Group search
(?P<search>\?[^#\n]+)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\? matches the character ? with index 63 decimal (3F hex or 77 octal) literally (case sensitive)
Match a single character not present in the list below
[^#\n]
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
# matches the character # with index 35 decimal (23 hex or 43 octal) literally (case sensitive)
\n matches a line-feed (newline) character (ASCII 10)
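The `search` group takes everything after `?` up to the next `#` or line feed, and requires at least one character after the `?`. A minimal sketch, assuming Python's `re` and the standard `urllib.parse.parse_qs` for decoding the captured query; the sample URLs are illustrative.

```python
import re
from urllib.parse import parse_qs

URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)

match = URL_PATTERN.search("//example.com/find?q=a&b=c#frag")
search = match.group("search")       # keeps the leading "?"
params = parse_qs(search[1:])        # strip "?" before decoding
print(search, params, match.group("hash"))

# A bare "?" with nothing after it does not satisfy [^#\n]+, so the group is skipped.
empty_query = URL_PATTERN.search("//example.com/?")
print(empty_query.group(0), empty_query.group("search"))
```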
Named Capture Group hash
(?P<hash>#.*)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
# matches the character # with index 35 decimal (23 hex or 43 octal) literally (case sensitive)
. matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
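Because `.` does not cross line terminators, the `hash` group runs from `#` to the end of the current line and swallows anything in between, including a later `?`. A short sketch, assuming Python's `re` module (where `.` excludes `\n` unless `re.DOTALL` is set); the inputs are made-up samples.

```python
import re

URL_PATTERN = re.compile(
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)

# The fragment extends to the end of the line, spaces included.
two_lines = URL_PATTERN.search("//example.com/page#section one\nnext line")
print(repr(two_lines.group("hash")))

# A "?" that appears after "#" belongs to the fragment, not to the query.
late_query = URL_PATTERN.search("//example.com/p#a?b=1")
print(late_query.group("search"), late_query.group("hash"))
```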
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string)
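In Python there is no `g` flag; iterating with `re.finditer` gives the same "all matches" behaviour, and `re.MULTILINE` corresponds to `m`. Since this pattern contains neither `^` nor `$`, the `m` flag should make no difference here, which the sketch below checks. It assumes Python's `re` module, and the sample text is invented for illustration.

```python
import re

PATTERN = (
    r"(?P<origin>(?P<protocol>http[s]?:)?\/\/(?P<host>[a-z0-9A-Z-_.]+))"
    r"(?P<port>:\d+)?(?P<path>[\/a-zA-Z0-9-\.]+)?"
    r"(?P<search>\?[^#\n]+)?(?P<hash>#.*)?"
)

text = "see https://a.example/x and http://b.example:81/y?z=1\n//c.example/#h"

# "g": collect every match instead of stopping at the first one.
plain = [m.group("host") for m in re.finditer(PATTERN, text)]

# "gm": same iteration with MULTILINE switched on.
multiline = [m.group("host") for m in re.finditer(PATTERN, text, re.MULTILINE)]

print(plain)
print(plain == multiline)
```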
For a full regex reference for Golang, please visit: https://golang.org/pkg/regexp/