Regular Expressions 101

'
^(?P<ip>\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}) - - \[(?P<date>.*)\] "(?P<request>.*?)" (?P<status>\d*) (?P<size>\d*) "(?P<referer>.*?)" "(?P<user_agent>.*?)".*$
'
gm
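As a rough illustration, the pattern above can be tried in Python, whose `re` module accepts the `(?P<name>...)` group syntax. This is only a sketch: the sample line is a hypothetical entry in the common "combined" access-log layout and does not come from the original page, and the names `LOG_PATTERN`, `SAMPLE_LINE` and `fields` are made up for the example.

```python
import re

# Pattern copied from above; re.MULTILINE corresponds to the "m" flag.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}) - - \[(?P<date>.*)\] '
    r'"(?P<request>.*?)" (?P<status>\d*) (?P<size>\d*) '
    r'"(?P<referer>.*?)" "(?P<user_agent>.*?)".*$',
    re.MULTILINE,
)

# Hypothetical sample line (an assumption, not taken from the page).
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0 (X11; Linux x86_64)"'
)

match = LOG_PATTERN.search(SAMPLE_LINE)
# groupdict() returns the named groups as a plain dictionary.
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

Using `groupdict()` keeps the field names next to their values, which is usually easier to work with than positional group indexes.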
^ asserts position at start of a line
Named Capture Group ip
(?P<ip>\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})
\d
matches a digit (equivalent to [0-9])
{1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
. matches any character (except for line terminators)
\d
matches a digit (equivalent to [0-9])
{1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
. matches any character (except for line terminators)
\d
matches a digit (equivalent to [0-9])
{1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
. matches any character (except for line terminators)
\d
matches a digit (equivalent to [0-9])
{1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
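Because the `.` separators in the ip group are unescaped, they accept any character, not only a literal dot. The sketch below contrasts the group as written with an assumed stricter variant that uses `\.`; the names `LOOSE_IP` and `STRICT_IP` and the candidate strings are hypothetical. Neither variant checks that each number is in the range 0 to 255.

```python
import re

# As written above: "." is unescaped, so it matches any character.
LOOSE_IP = re.compile(r'^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$')
# Assumed stricter variant: "\." matches only a literal dot.
STRICT_IP = re.compile(r'^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$')

candidates = ["192.168.0.1", "192x168y0z1", "1234567"]

loose_hits = [s for s in candidates if LOOSE_IP.match(s)]
strict_hits = [s for s in candidates if STRICT_IP.match(s)]

print("loose: ", loose_hits)
print("strict:", strict_hits)
```

The loose pattern accepts all three candidates, including the plain digit string, while the escaped variant accepts only the dotted address.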
- -
matches a space, the characters - -, and a trailing space literally (case sensitive)
\[ matches the character [ with index 91₁₀ (5B₁₆ or 133₈) literally (case sensitive)
Named Capture Group date
(?P<date>.*)
.
matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\] matches the character ] with index 93₁₀ (5D₁₆ or 135₈) literally (case sensitive)
"
matches a space followed by " literally (case sensitive)
Named Capture Group request
(?P<request>.*?)
.
matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
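The date group uses a greedy `.*` while the request group uses a lazy `.*?`. The difference only shows when the closing delimiter occurs more than once on a line. A minimal sketch on a hypothetical string (not from the original page):

```python
import re

# Hypothetical text with two bracketed parts on one line.
text = '[first] middle [second]'

# Greedy: .* runs to the end of the line, then backs up to the LAST "]".
greedy = re.search(r'\[(?P<date>.*)\]', text).group('date')
# Lazy: .*? grows one character at a time and stops at the FIRST "]".
lazy = re.search(r'\[(?P<date>.*?)\]', text).group('date')

print(repr(greedy))
print(repr(lazy))
```

In the full pattern this means that a `]` appearing later in a log line could, in principle, extend the date capture, since the greedy group prefers the longest match that still lets the rest of the pattern succeed.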
"
matches " followed by a space literally (case sensitive)
Named Capture Group status
(?P<status>\d*)
\d
matches a digit (equivalent to [0-9])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(space) matches the space character with index 32₁₀ (20₁₆ or 40₈) literally (case sensitive)
Named Capture Group size
(?P<size>\d*)
\d
matches a digit (equivalent to [0-9])
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
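Since `\d*` allows zero digits, both status and size may capture an empty string, but neither accepts a non-digit placeholder. Some servers write `-` in the size field when no body is sent, and such a line would not match. The sketch below isolates the two fields; `FIELDS`, `FIELDS_DASH` and the sample fragments are hypothetical, and the `\d*|-` variant is an assumption, not part of the original pattern.

```python
import re

# Status and size exactly as in the pattern above, with the surrounding quotes.
FIELDS = re.compile(r'" (?P<status>\d*) (?P<size>\d*) "')
# Assumed variant that also accepts "-" as the size.
FIELDS_DASH = re.compile(r'" (?P<status>\d*) (?P<size>\d*|-) "')

with_size = '" 200 2326 "'
dash_size = '" 304 - "'
both_empty = '"   "'  # quote, three spaces, quote

normal = FIELDS.search(with_size).groupdict()
empty = FIELDS.search(both_empty).groupdict()
dash_original = FIELDS.search(dash_size)
dash_variant = FIELDS_DASH.search(dash_size).groupdict()

print(normal, empty, dash_original, dash_variant)
```

If empty captures are not wanted, `\d+` would be the usual tightening, at the cost of rejecting lines where the field is blank.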
"
matches a space followed by " literally (case sensitive)
Named Capture Group referer
(?P<referer>.*?)
.
matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
" "
matches the characters " " literally (case sensitive)
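Named Capture Group user_agent
(?P<user_agent>.*?)
.
matches any character (except for line terminators)
*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy)
" matches the character " with index 34₁₀ (22₁₆ or 42₈) literally (case sensitive)
.
matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)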
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
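Python has no separate `g` flag: `re.findall` and `re.finditer` already return every match, while `re.MULTILINE` plays the role of `m`. A small sketch on a hypothetical three-line string shows the effect of the multi-line flag on `^`:

```python
import re

# Hypothetical multi-line input.
text = "alpha 1\nbeta 2\ngamma 3"

# Without re.MULTILINE, ^ matches only at the start of the whole string.
single = re.findall(r'^\w+', text)
# With re.MULTILINE (the "m" flag), ^ matches at the start of every line.
multi = re.findall(r'^\w+', text, flags=re.MULTILINE)

print(single)
print(multi)
```

For a log file read as one string, the multi-line form is what lets the `^...$` pattern above match each line separately.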
Your regular expression does not match the subject string.
