Apply this pattern recursively to match HTML content.
This pattern matches an element such as <sometag my-data="1234567890" style=""> <nestedhtml>Hello World</nestedhtml> </sometag>, and the resulting match returns a groupdict containing the tag, body and attribute data.
You can then apply the same pattern to the body to parse the nested HTML tags as well.
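
A minimal sketch of the recursive approach described above, in Python. The pattern itself is not shown here, so `TAG_PATTERN`, the group names `tag`, `attributes` and `body`, and the helper `parse` are all assumptions made for illustration; the actual pattern may differ.

```python
import re

# Hypothetical stand-in for the pattern described in the text.
TAG_PATTERN = re.compile(
    r"<(?P<tag>[A-Za-z][\w-]*)(?=[\s>])"  # opening tag name, followed by space or '>'
    r"(?P<attributes>[^>]*)>"              # raw attribute text up to the closing '>'
    r"(?P<body>.*?)"                       # shortest body that reaches a closing tag
    r"</(?P=tag)\s*>",                     # closing tag with the same name
    re.DOTALL,
)


def parse(html):
    """Return one dict per element found in html, with nested children."""
    nodes = []
    for match in TAG_PATTERN.finditer(html):
        node = match.groupdict()
        # Re-apply the same pattern to the body to pick up nested tags.
        node["children"] = parse(node["body"])
        nodes.append(node)
    return nodes


sample = '<sometag my-data="1234567890" style=""> <nestedhtml>Hello World</nestedhtml> </sometag>'
tree = parse(sample)
print(tree[0]["tag"])                   # sometag
print(tree[0]["children"][0]["body"])   # Hello World
```

Because the body is matched lazily up to the first closing tag of the same name, this sketch mis-parses an element nested inside another element of the same name, and it ignores self-closing tags. For arbitrary HTML, a real parser such as the standard library's `html.parser` is likely the safer choice.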