use strict;
my $str = '<div class=\'bg-transparent mb-3\'>
<p>Try this:</p>
<pre><code>string html;
string cleaned = new Regex("style=\\"[^\\"]*\\"").Replace(html, "");
string cleaned = new Regex("(?<=class=\\")([^\\"]*)\\\\babc\\\\w*\\\\b([^\\"]*)(?=\\")").Replace(cleaned, "$1$2");
</code></pre>
</div>
</div>
<div class="col-4"></div>
</div>
</div><!-- #comments -->
<div id="related-embeded" class="related-embeded-area"><div class="row"><div class="col-12"><div class="mt-3 border-bottom border-success"><h4 class="text-info"><i class=\'fa fa-check-circle text-info mr-3\'></i><span>Related Solutions</span></h4></div><div class="mt-3 mb-3 border-bottom"><h5><a href=\'https://itecnote.com/tecnote/java-remove-html-tags-from-a-string/\'>Java – Remove HTML tags from a String</a></h5></div><div class=\'bg-transparent mb-3\'>
<p>Use a HTML parser instead of regex. This is dead simple with <a href="http://jsoup.org" rel="noreferrer">Jsoup</a>.</p>
<pre><code>public static String html2text(String html) {
return Jsoup.parse(html).text();
}
</code></pre>
<p>Jsoup also <a href="https://jsoup.org/cookbook/cleaning-html/whitelist-sanitizer" rel="noreferrer">supports</a> removing HTML tags against a customizable whitelist, which is very useful if you want to allow only e.g. <code><b></code>, <code><i></code> and <code><u></code>.</p>
<h3>See also:</h3>
<ul>
<li><a href="https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags">RegEx match open tags except XHTML self-contained tags</a></li>
<li><a href="https://stackoverflow.com/questions/3152138/what-are-the-pros-and-cons-of-the-leading-java-html-parsers">What are the pros and cons of the leading Java HTML parsers?</a></li>
<li><a href="https://stackoverflow.com/questions/2658922/xss-prevention-in-jsp-servlet-web-application">XSS prevention in JSP/Servlet web application</a></li>
</ul>
</div><div class="mt-3 mb-3 border-bottom"><h5><a href=\'https://itecnote.com/tecnote/c-how-to-remove-all-non-alphanumeric-characters-from-a-string-except-dash/\'>C# – How to remove all non alphanumeric characters from a string except dash</a></h5></div><div class=\'bg-transparent mb-3\'>
<p>Replace <code>[^a-zA-Z0-9 -]</code> with an empty string.</p>
<pre><code>Regex rgx = new Regex("[^a-zA-Z0-9 -]");
str = rgx.Replace(str, "");
</code></pre>
</div></div></div></div> </div>
<div class="col-12 col-md-4">
<ins class="adsbygoogle"
style="display:block"
data-ad-client="ca-pub-1962880864575776"
data-ad-slot="5946106761"
data-ad-format="auto"
data-full-width-responsive="true"></ins>
<script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script>
<div class=\'mt-3 ml-4 border-bottom border-success\'><h6><span>Related Question</span></h6></div><ul class=\'list-group list-group-flush\'><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/html-css-pseudo-classes-with-inline-styles/\'>Html – CSS Pseudo-classes with inline styles</a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/c-efficient-way-to-remove-all-whitespace-from-string/\'>C# – Efficient way to remove ALL whitespace from String</a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/javascript-how-to-remove-all-line-breaks-from-a-string/\'>Javascript – How to remove all line breaks from a string</a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/c-how-to-remove-all-html-tags-from-a-string-without-knowing-which-tags-are-in-it/\'>C# – How to remove all HTML tags from a string without knowing which tags are in it</a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/html-how-to-override-bootstrap-css-styles/\'>Html – How to override Bootstrap CSS styles</a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/c-why-not-inherit-from-list/\'>C# – Why not inherit from List<T></a></li><li class="list-group-item"><a href=\'https://itecnote.com/tecnote/html-stylesheet-not-loaded-because-of-mime-type/\'>Html – Stylesheet not loaded because of MIME-type</a></li></ul> <ins class="adsbygoogle"
style="display:block"
data-ad-client="ca-pub-1962880864575776"
data-ad-slot="1527980765"
data-ad-format="auto"
data-full-width-responsive="true"></ins>
<script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script>
</div>';
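# Perl lookbehinds must have a bounded length, so the original
# (?<=<div.*?) and (?<!=\t*?"?\t*?) assertions fail to compile.
# Here \K drops the "<div..." prefix from the reported match, and the
# negative lookbehind is narrowed to the fixed-length cases (= or =" directly
# before the attribute name); tab-padded variants are not covered.
my $regex = qr/<div.*?\K(?<!=)(?<!=")(class|style)=".*?"/mp;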
if ( $str =~ /$regex/g ) {
print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
# print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
# print "Capture Group 2 is $2 ... and so on\n";
}
# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
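
# A minimal sketch of the two notes above. The sample string $sample, the
# group name "attr", and the variables $attr_name, $before and $after are
# illustrative assumptions, not part of the generated code.
my $sample = '<div class="row">';
my ( $attr_name, $before, $after );
if ( $sample =~ /(?<attr>class|style)="[^"]*"/p ) {
    # %+ holds named captures; /p makes ${^PREMATCH} and ${^POSTMATCH} available
    ( $attr_name, $before, $after ) = ( $+{attr}, ${^PREMATCH}, ${^POSTMATCH} );
    print "Attribute $attr_name found after '$before' and before '$after'\n";
}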
# Please keep in mind that these code samples are automatically generated and are not guaranteed to work.
# If you find any syntax errors, feel free to submit a bug report.
# For a full regex reference for Perl, please visit: http://perldoc.perl.org/perlre.html