Regular Expressions 101

Full URL dissection

Regular Expression
PCRE2 (PHP >=7.3)

/(?<url>(?:(?<scheme>[a-zA-Z]+:\/\/)?(?<hostname>(?:[-a-zA-Z0-9À-ÖØ-öø-ÿ@%_\+~#=]{1,256}\.){1,256}(?:[-a-zA-Z0-9À-ÖØ-öø-ÿ@%_\+~#=]{1,256})))(?::(?<port>[[:digit:]]+))?(?<path>(?:\/[-a-zA-Z0-9!$&'()*+,\\\/:;=@\[\]._~%]*)*)(?<query>(?:(?:\#|\?)[-a-zA-Z0-9!$&'()*+,\\\/:;=@\[\]._~]*)*))/gi

Description

Any URL that appears inside a text is recognized and split into the following named groups:

  • url (the full URL)
  • scheme
  • hostname (subdomain + domain + TLD)
  • port
  • path
  • query (GET parameters; a #fragment is captured here as well)

The only part required for a match is the hostname, so example.com on its own is already a match.
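
To see how the groups come out in practice, here is a minimal Python sketch. It is a port, not the submitted PCRE2 pattern itself, and it rests on a few assumptions: the named groups are rewritten to Python's (?P<name>...) syntax, [[:digit:]] is replaced by [0-9], the hostname character class is assumed to read 0-9À-ÖØ-öø-ÿ, and the helper name dissect is hypothetical.

```python
import re

# Character classes ported from the PCRE2 pattern. The hostname class is an
# assumed reading: digits 0-9 plus the Latin-1 letter ranges.
HOST = r"[-a-zA-Z0-9À-ÖØ-öø-ÿ@%_+~#=]"
PATH = r"[-a-zA-Z0-9!$&'()*+,\\/:;=@\[\]._~%]"
QUERY = r"[-a-zA-Z0-9!$&'()*+,\\/:;=@\[\]._~]"

URL_RE = re.compile(
    "".join([
        r"(?P<url>",
        r"(?:(?P<scheme>[a-zA-Z]+://)?",            # optional, e.g. "https://"
        r"(?P<hostname>(?:" + HOST + r"{1,256}\.){1,256}" + HOST + r"{1,256}))",
        r"(?::(?P<port>[0-9]+))?",                  # optional, digits only
        r"(?P<path>(?:/" + PATH + r"*)*)",          # may match the empty string
        r"(?P<query>(?:[#?]" + QUERY + r"*)*)",     # may match the empty string
        r")",
    ]),
    re.IGNORECASE,  # counterpart of the "i" flag
)


def dissect(text):
    """Return one dict of named groups per URL found in text.

    finditer plays the role of the "g" flag: it yields every match.
    """
    return [m.groupdict() for m in URL_RE.finditer(text)]


sample = "Docs at https://example.com:8080/a/b?x=1#top and example.org"
for parts in dissect(sample):
    print(parts)
```

On the sample text, the first match fills all six groups. The second match fills only url and hostname: scheme and port are None, while path and query are empty strings because those groups are allowed to match nothing.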

Submitted by ttschnz - 3 years ago