Regular Expressions 101

URL parser/validator

Regular Expression
PCRE (PHP <7.3)

/
^(https?:\/\/)?  # optional scheme
((?:[-a-z0-9._~!$&\'()*+,;=]|%[0-9a-f]{2})+  # optional username@,
(?::(?:[-a-z0-9._~!$&\'()*+,;=]|%[0-9a-f]{2})+)?@)?  # or username:password@
(?:((?:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\.){3}(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]))  # IPv4 address
|((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z][a-z0-9-]*[a-z0-9]))  # or dot-separated domain labels
(:\d+)?  # optional port number
((?:\/(?:[-a-z0-9._~!$&\'()*+,;=:@]|%[0-9a-f]{2})+)*\/?)  # path (possibly empty, may end in /, no double-// allowed)
(\?(?:[-a-z0-9._~!$&\'()*+,;=:@\/?]|%[0-9a-f]{2})*)?  # optional querystring
(\#(?:[-a-z0-9._~!$&\'()*+,;=:@\/?]|%[0-9a-f]{2})*)?$  # optional fragment
/ix

Description

Validates and parses absolute URLs:

  • Optional http:// or https:// at the start
  • IPv4 supported, IPv6 NOT supported
  • International (non-ASCII) characters NOT supported; internationalized domain names only match in their punycode (xn--) form
  • Extracts scheme, username, password, IPv4 address or domain, port, path, querystring and fragment
  • Sensible restrictions on domain and path
Submitted by doin - 8 years ago
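
To try the pattern outside a PCRE environment, here is a minimal sketch in Python. It is not part of the original submission. It assumes Python's `re` module handles these constructs the same way PCRE does, and it adds `re.ASCII` so that `\d` stays ASCII-only. The helper `parse_url` and the field names in `FIELDS` are hypothetical, chosen only for illustration. As the pattern is written, the username and password are captured together in one group (including the trailing `@`), and the port, querystring and fragment groups keep their leading `:`, `?` and `#`.

```python
import re

# Pattern copied from the listing above. Assumption: Python's `re` engine
# treats these constructs like PCRE; re.ASCII keeps \d limited to 0-9.
URL_RE = re.compile(
    r"""
    ^(https?:\/\/)?  # optional scheme
    ((?:[-a-z0-9._~!$&\'()*+,;=]|%[0-9a-f]{2})+  # optional username@,
    (?::(?:[-a-z0-9._~!$&\'()*+,;=]|%[0-9a-f]{2})+)?@)?  # or username:password@
    (?:((?:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\.){3}(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]))  # IPv4 address
    |((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z][a-z0-9-]*[a-z0-9]))  # or dot-separated domain labels
    (:\d+)?  # optional port number
    ((?:\/(?:[-a-z0-9._~!$&\'()*+,;=:@]|%[0-9a-f]{2})+)*\/?)  # path (possibly empty, may end in /, no double-// allowed)
    (\?(?:[-a-z0-9._~!$&\'()*+,;=:@\/?]|%[0-9a-f]{2})*)?  # optional querystring
    (\#(?:[-a-z0-9._~!$&\'()*+,;=:@\/?]|%[0-9a-f]{2})*)?$  # optional fragment
    """,
    re.IGNORECASE | re.VERBOSE | re.ASCII,
)

# Hypothetical field names, listed in capture-group order (groups 1 to 8).
FIELDS = ("scheme", "userinfo", "ipv4", "domain",
          "port", "path", "query", "fragment")


def parse_url(url):
    """Return a dict of the captured parts, or None if the URL is rejected."""
    match = URL_RE.match(url)
    if match is None:
        return None
    # Groups that did not take part in the match come back as None.
    return dict(zip(FIELDS, match.groups()))


if __name__ == "__main__":
    for candidate in (
        "https://user:pw@example.com:8080/a/b?x=1#top",
        "192.168.0.1/",
        "http://[::1]/",          # IPv6 literal: rejected
        "http://example.com//a",  # double slash in path: rejected
    ):
        print(candidate, "->", parse_url(candidate))
```

Exactly one of `ipv4` and `domain` is set for any accepted URL, so a caller can pick whichever is not `None` as the host.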