Regular Expressions 101

Community Patterns

url tokenizer

1

Regular Expression
PCRE (PHP <7.3)

/
^(?:(?'protocol'[a-z]{2,}(?=[:])):)?(?:\/\/)?(?:(?'subdomain'(?:[0-9a-z](?![\-\.]))+(?:[0-9a-z\-\.][0-9a-z]+)+?)\.(?'domain'(?:[0-9a-z](?![\-]))+(?:[0-9a-z\-][0-9a-z])\.(?'tld'[a-z]{2,}(?:\.[a-z]{2})?))|(?'ip'(?'segment'[01][0-9][0-9]|2[0-4][0-9]|25[0-5]|[0-9])\.(?&segment)\.(?&segment)\.(?&segment)))(?:\:(?'port'\d+))?(?'path'(?:\/[^\/\?]*?)*)?(?:\?(?'query'[^#]*))?(?:#!?(?'hash'.*))?$
/
gm

Description

break down a url to it's components

Submitted by johnrcui - 8 years ago