use strict;
use warnings;
my $str = '
I want to extract the words between the two bracket "blocks" and also the word in first brackets (RUNNING or STOPPED).
Example (extract the bolded part):
[ RUNNING ] My First Application [Pid: 4194]
[ RUNNING ] Second app (some data) [Pid: 5248]
[ STOPPED ] Logger App
So, as you can see, the [Pid: X] part is optional. I can write the regex as follows:
\\[\\s+(RUNNING|STOPPED)\\s+\\]\\s+([^\\[]+).*
and it works. But it fails if the app name contains the \'[\' character. I tried the following, but it doesn\'t work:
\\[\\s+(RUNNING|STOPPED)\\s+\\]\\s+(?!\\[Pid)+.*
My idea was to match any words/characters that do not start with "[Pid", but I guess this instead matches any words that are not followed by "[Pid".
Is there a way to do exactly that, i.e. match everything up to the first occurrence of the "[Pid" substring?
';
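my $regex = qr/\[\ (RUNNING|STOPPED)\ \]\h+   # group 1: status in the first brackets
(.+?)\h*                                      # group 2: application name, without surrounding spaces
(?:\[Pid:\ (\d+)\]|$)/xmp;                    # group 3: PID if present, otherwise stop at end of line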
# Iterate over every match; a plain "if" with the g flag would report only the first one.
while ( $str =~ /$regex/g ) {
    print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
    print "Capture Group 1 (status) is $1\n";
    print "Capture Group 2 (name) is $2\n";
    print "Capture Group 3 (PID) is ", $3 // 'absent', "\n";
}
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
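
# A minimal sketch of the "match until the first [Pid" idea from the question,
# using a tempered dot: (?:(?!\h*\[Pid).)+ consumes one character at a time,
# but only while the text at that position does not start with "[Pid".
# It also shows named capture groups read back through %+.
# Assumptions (not in the original): the names @sample_lines, $tempered_re and
# @parsed are illustrative, and the second sample name contains a "[" on
# purpose to show that such names are still captured whole.
my @sample_lines = (
    '[ RUNNING ] My First Application [Pid: 4194]',
    '[ RUNNING ] Second [beta] app (some data) [Pid: 5248]',
    '[ STOPPED ] Logger App',
);
my $tempered_re = qr/
    ^\[\h+(?<state>RUNNING|STOPPED)\h+\]\h+   # status in the first brackets
    (?<name>(?:(?!\h*\[Pid).)+)               # name: stop before optional spaces plus "[Pid"
    (?:\h*\[Pid:\h*(?<pid>\d+)\])?            # optional "[Pid: N]" block
/x;
my @parsed;
for my $line (@sample_lines) {
    next unless $line =~ $tempered_re;
    # $+{pid} is undef when the optional Pid block did not take part in the match
    push @parsed, { state => $+{state}, name => $+{name}, pid => $+{pid} };
    printf "%-7s | %s | %s\n", $+{state}, $+{name}, $+{pid} // 'no pid';
}
# Design note: unlike [^\[]+, the tempered dot rejects only the exact "[Pid"
# prefix, so other brackets inside the application name do not end the capture.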