use strict;
use warnings;
my $str = '"http://www.website.com"
https://www.website.com
"ftp://www.website.com
ftps://www.website.com
<u>http://www.website.com</u>
This is a link http://www.website.com that is not linked ftp://www.website.com
This is a long link http://www.website.com/index.htm?foo=bar
<a href="http://www.website.com" target="_blank">http://website.com</a >
<a href="https://www.website.com">http://website.com</a>
<a href="http://www.website.com"><u>http://www.website.com</u></a>
<a href="ftp://www.website.com">ftp://www.website.com</a>
<img src="http://www.website.com" target="_blank"/>
<a href="http://www.website.com" target="_blank">
http://website.com
</a>
<a href="http://www.website.com">
http://www.website.com
http://www.website.com
</a>
Lorem ipsum Test dolor sit amet, consetetur sadhttp://url.comipscing elitr, sed diam <a href="http://Test.com/url">Test</a> eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam http://url.com et justo...
<a href="http://www.website.com">
http://www.website.com
<img src="http://www.website.com" target="_blank"/>
http://www.website.com
</a>';
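# Match ftp:// or ftps:// URLs (the http/https links in the test string are not matched).
# The negative lookahead rejects a URL that sits inside a tag (a '>' follows before any
# '<' or '>') or that is followed by '</a' with no '"' in between (i.e. link text).
# /i ignores case; /p makes ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} available.
my $regex = qr/\b(ftps?:\/\/[^"<\s]+)(?![^<>]*>|[^"]*?<\/a)/ip;
# Loop over every match; an `if` with /g would report only the first one.
while ( $str =~ /$regex/g ) {
    print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
    # print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
    # print "Capture Group 2 is $2 ... and so on\n";
}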
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
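
# A minimal sketch of the two features noted above. The sample string, the group
# name "url" and the variable names are hypothetical and not part of the original
# test data; the pattern is a simplified variant without the lookahead.
my $sample      = 'see ftp://example.com/file.txt for details';
my $named_regex = qr/\b(?<url>ftps?:\/\/[^"<\s]+)/ip;
my ( $url, $before, $after );
if ( $sample =~ $named_regex ) {
    # Copy the match variables right away; a later successful match overwrites them.
    ( $url, $before, $after ) = ( $+{url}, ${^PREMATCH}, ${^POSTMATCH} );
    print "Named group 'url' is $url\n";
    print "Text before the match is '$before' and text after it is '$after'\n";
}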
# Note: this code sample was generated automatically and is not guaranteed to work.
# For a full regex reference for Perl, see: http://perldoc.perl.org/perlre.html