re = /<div\b # Start of outer DIV start tag.
[^>]*? # Lazily match up to id attrib.
\bclass\s*+=\s*+ # id attribute name and =
([\'"]?+) # $1: Optional quote delimiter.
\basd\b # specific ID to be matched.
(?(1)\1) # If open quote, match same closing quote
[^>]*+> # remaining outer DIV start tag.
( # $2: DIV contents. (may be called recursively!)
(?: # Non-capture group for DIV contents alternatives.
# DIV contents option 1: All non-DIV, non-comment stuff...
[^<]++ # One or more non-tag, non-comment characters.
# DIV contents option 2: Start of a non-DIV tag...
| < # Match a "<", but only if it
(?! # is not the beginning of either
\/?div\b # a DIV start or end tag,
| !-- # or an HTML comment.
) # Ok, that < was not a DIV or comment.
# DIV contents Option 3: an HTML comment.
| <!--.*?--> # A non-SGML compliant HTML comment.
# DIV contents Option 4: a nested DIV element!
| <div\b[^>]*+> # Inner DIV element start tag.
(?2) # Recurse group 2 as a nested subroutine.
<\/div\s*> # Inner DIV element end tag.
)*+ # Zero or more of these contents alternatives.
) # End 2$: DIV contents.
<\/div\s*>/mix
str = '<div>
<div class="asd">
<div>
<div>
Hello
</div>
</div>
</div>
</div>'
# Print the match result
str.match(re) do |match|
puts match.to_s
end
Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for Ruby, please visit: http://ruby-doc.org/core-2.2.0/Regexp.html