Regular Expressions 101

Community Library Entry

Regular Expression
Python

r"
^(?P<scheme>http(?:s)?:\/\/)?(?P<www>www\d?\.)?(?P<domain>\b(?!www\d?\.)[a-z\d\-]+(?:\.[a-z\d\-]{2,})+)(?P<port>:\d{1,5})?(?P<path>\/(?:[-a-zA-Z0-9._~!$&\'()*+,;=:@]|%[0-9a-fA-F]{2})*)*(?P<query_fragm>[?#&][\/\_\-\&\%\?\#\+\=\.:a-zA-Z\d]*)*$
"
gm

Description

Matches most valid URLs in Python (some edge cases are excluded). Good enough for checking a URL before scraping it.
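
A minimal sketch of how the pattern might be used from Python's `re` module. The pattern is copied from the entry; `re.MULTILINE` stands in for the `m` flag and `finditer` for the `g` flag. The helper names `is_probable_url` and `find_urls` are hypothetical, chosen only for illustration.

```python
import re

# Pattern copied verbatim from the entry, split across lines for readability.
# re.MULTILINE corresponds to the "m" flag; there is no "g" flag in Python,
# finditer/findall play that role.
URL_RE = re.compile(
    r"^(?P<scheme>http(?:s)?:\/\/)?(?P<www>www\d?\.)?"
    r"(?P<domain>\b(?!www\d?\.)[a-z\d\-]+(?:\.[a-z\d\-]{2,})+)"
    r"(?P<port>:\d{1,5})?"
    r"(?P<path>\/(?:[-a-zA-Z0-9._~!$&\'()*+,;=:@]|%[0-9a-fA-F]{2})*)*"
    r"(?P<query_fragm>[?#&][\/\_\-\&\%\?\#\+\=\.:a-zA-Z\d]*)*$",
    re.MULTILINE,
)


def is_probable_url(candidate: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    # fullmatch rejects strings with extra lines, which ^/$ alone would
    # let through under re.MULTILINE.
    return URL_RE.fullmatch(candidate) is not None


def find_urls(text: str) -> list:
    """Hypothetical helper: return every line of text that is a URL."""
    return [m.group(0) for m in URL_RE.finditer(text)]


if __name__ == "__main__":
    m = URL_RE.fullmatch("https://www.example.com:8080/path/to/page?x=1#top")
    if m:
        # Repeated groups keep only their last repetition,
        # so "path" holds the final path segment.
        print(m.groupdict())

    for url in ["example.com", "http://localhost", "HTTPS://EXAMPLE.COM"]:
        print(url, is_probable_url(url))
```

Some of the excluded edge cases can be read off the pattern itself: the domain must contain at least one dot, so `http://localhost` is rejected, and the scheme and host are matched in lowercase only unless `re.IGNORECASE` is added. The nested quantifiers in the query group can also backtrack heavily on long non-matching input, so it may be worth capping input length before matching.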

Submitted by anonymous - a year ago (Last modified a year ago)