use strict;
use warnings;
my $str = 'https://www.subdomain.example.com/folder/folder
valid.domains.below
schools.k12
newTLD.clothing text that should not be matched
good.photography
x.a-b.com
x-y.a-b.net
x-y.a-b-c.co.uk
x.0ac.com
schools.k12
newTLD.clothing
good.photography
0-1-2.3-4.co
a-----b.com
hello he.llo-o.com/okayokay/
https://www.11737.se/hello/
http://www.11377.se/hello/
www.11773.se/hello/
invalid-.domains for fun
-domain.com
domain--.com
-domain-.-.com
domain.000
.domain.net
domain.net.
sub.-domain.com
sub.domain-.com
sub-.domain.com
-sub.domain.com
';
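# qr{...} is used so the slashes in the pattern do not terminate the regex early.
# \K replaces the variable-length lookbehind (?<=https?://), which Perl rejects before 5.30;
# the reported match still starts after the scheme.
# The "$" anchor is moved out of the character class, where it cannot act as end-of-line.
my $regex = qr{https?://\K(?:\w+\.)+(?<domain>\w+\.\w+)(?:[/\s]|$)}mp;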
if ( $str =~ /$regex/g ) {
print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
# print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
# print "Capture Group 2 is $2 ... and so on\n";
}
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
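
# A minimal sketch, not part of the generated sample: collecting every match
# instead of only the first, and reading the named group through $+{domain}.
# The names $sketch_text, $sketch_re and @sketch_domains are hypothetical and
# exist only for this illustration; the pattern assumes a scheme prefix is required.
my $sketch_text = "https://www.subdomain.example.com/folder/folder\n"
                . "http://www.11377.se/hello/\n"
                . "www.11773.se/hello/\n";    # no scheme, so this line is skipped
my $sketch_re = qr{https?://\K(?:\w+\.)+(?<domain>\w+\.\w+)(?=[/\s]|$)}m;
my @sketch_domains;
# In scalar context, /g resumes after the previous match on each iteration.
while ( $sketch_text =~ /$sketch_re/g ) {
    push @sketch_domains, $+{domain};
    printf "Domain %s found at offsets %d-%d\n", $+{domain}, $-[0], $+[0];
}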
# Please keep in mind that these code samples are automatically generated and are not guaranteed to work.
# If you find any syntax errors, feel free to submit a bug report.
# For a full regex reference for Perl, please visit: http://perldoc.perl.org/perlre.html