Regular Expressions 101

Community Library Entry

Regular Expression
PCRE (PHP <7.3)

/^(?:(?:(?<protocol>(?:http|https)):\/\/)?(?:(?<authority>(?:[A-Za-z](?:[A-Za-z\d\-]*[A-Za-z\d])?)(?:\.[A-Za-z][A-Za-z\d\-]*[A-Za-z\d])*)(?:\:(?<port>[0-9]+))?\/)(?:(?<path>[^\/][^\?\#\;]*\/))?)?(?<file>[^\?\#\/\\]*\.(?<extension>[Jj][Pp][Ee]?[Gg]|[Pp][Nn][Gg]|[Gg][Ii][Ff]))(?:\?(?<query>[^\#]*))?(?:\#(?<fragment>.*))?$/gm

Description

Follows the general URL structure laid out in the relevant RFC specifications, subject to the limitations listed below

Captures the key components of a URL as named groups:

  • Protocol
  • Authority
  • Port
  • Path
  • Query
  • Fragment

Because the pattern specifically targets image files, it also captures the full file name and the extension (jpeg, jpg, gif or png, in any letter case)

Limitations of this regex:

  • Only http or https protocols are allowed
  • Only named hosts can be used, not IP addresses
  • Does not support Unicode
  • Does not support escaped characters

Path and file name detection is still more permissive than standard URI syntax: both parts accept a wide range of characters
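
As a rough illustration of how the named groups can be read back and how the limitations show up in practice, the sketch below runs the pattern through Python's re module. This is an adaptation, not the entry's own usage: Python writes named groups as (?P<name>...), so the (?< prefixes are rewritten, and the g and m flags are dropped because one string is checked at a time. The function name parse_image_url and the example.com URLs are hypothetical, chosen only for the example.

```python
import re

# The entry's pattern, copied verbatim without the / delimiters (PCRE syntax).
PCRE_PATTERN = r"^(?:(?:(?<protocol>(?:http|https)):\/\/)?(?:(?<authority>(?:[A-Za-z](?:[A-Za-z\d\-]*[A-Za-z\d])?)(?:\.[A-Za-z][A-Za-z\d\-]*[A-Za-z\d])*)(?:\:(?<port>[0-9]+))?\/)(?:(?<path>[^\/][^\?\#\;]*\/))?)?(?<file>[^\?\#\/\\]*\.(?<extension>[Jj][Pp][Ee]?[Gg]|[Pp][Nn][Gg]|[Gg][Ii][Ff]))(?:\?(?<query>[^\#]*))?(?:\#(?<fragment>.*))?$"

# Python's re spells named groups (?P<name>...). The pattern contains no
# lookbehinds, so a plain string replace of the prefix is safe here.
IMAGE_URL = re.compile(PCRE_PATTERN.replace("(?<", "(?P<"))


def parse_image_url(url):
    """Return the named components as a dict, or None if the URL does not match."""
    match = IMAGE_URL.match(url)
    return match.groupdict() if match else None


# Hypothetical inputs: one full URL, one bare file name, and three that
# run into the limitations (IP host, non-http protocol, non-image file).
for candidate in [
    "https://example.com:8080/images/photos/cat.JPG?size=large#top",
    "logo.png",
    "http://192.168.0.1/a.png",
    "ftp://example.com/a.png",
    "https://example.com/readme.txt",
]:
    print(candidate, "->", parse_image_url(candidate))
```

For the first URL the groups come back as protocol "https", authority "example.com", port "8080", path "images/photos/", file "cat.JPG", extension "JPG", query "size=large" and fragment "top". The last three inputs return None, which matches the limitations listed above: an IP address host, a protocol other than http or https, and a file that is not an image.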

Submitted by anonymous - 6 years ago