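import re

# Lazily match from "<" to the nearest ">". [\s\S] matches any character,
# including newlines, so a single match can span several lines.
regex = re.compile(r"<[\s\S]+?>")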
test_str = ("covid sucks and I want to go outside <!--/* Font Definitions */@font-face{font-family:Wingdings;panose-1:5 0 0 0 0 0 0 0 0 0;}@font- \n"
"face{font-family:\"\"Cambria Math\"\";panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face{font-family:Calibri;panose- \n"
"1:2 15 5 2 2 2 4 3 2 4;}@font-face{font-family:\"\"Bradley Hand ITC\"\";panose-1:3 7 4 2 5 3 2 3 2 3;}/* \n"
"Style Definitions */p.MsoNormal, li.MsoNormal, div.MsoNormal{margin:0in;margin-bottom:.0001pt;font- \n"
"size:11.0pt;font-family:\"\"Calibri\"\",sans-serif;}p.MsoListParagraph, li.MsoListParagraph, \n"
"div.MsoListParagraph{m{margin-bottom:0in;}--> pop goes the peanut.\n\n"
"this is text before the junk starts <!--/* Font Definitions */@font-face{font-family:Wingdings;panose-1:5 0 0 0 0 0 0 0 0 0;}@font- \n"
"face{font-family:\"\"Camb\n\n"
"NEW LINE!\n"
"NEW LINE!\n\n"
"ria Math\"\";panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face{font-family:Calibri;panose- \n"
"1:2 15 5 2 2 2 4 3 2 4;}@font-face{font-family:\"\"Bradley Hand ITC\"\";panose-1:3 7 4 2 5 3 2 3 2 3;}/* \n"
"Style Definitions */p.MsoNormal, li.MsoNormal, div.MsoNormal{margin:0in;margin-bottom:.0001pt;font- \n"
"size:11.0pt;font-family:\"\"Calibri\"\",sans-serif;}p.MsoListParagraph, li.MsoListParagraph, \n"
"div.MsoListParagraph{m{margin-bottom:0in;}--> ...and this is the bit afterwards even if the junk has line returns.")
subst = ""  # replace each match with the empty string, i.e. delete it

result = regex.sub(subst, test_str)

if result:
    print(result)
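
# A stricter variant, sketched under the assumption that only HTML comments
# (<!-- ... -->) need removing. The pattern above also deletes any other
# "<...>" span, for example the middle of "1 < 2 and 3 > 2".
# COMMENT_RE and strip_html_comments are hypothetical names used for
# illustration; they are not part of the generated sample.
import re

# re.DOTALL lets "." match newlines, so a comment may span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", flags=re.DOTALL)


def strip_html_comments(text):
    """Return text with every <!-- ... --> block removed."""
    return COMMENT_RE.sub("", text)


print(strip_html_comments("before <!-- junk\nwith a line break --> after"))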
# Note: this code sample was automatically generated and is not guaranteed to work.
# If you find any syntax errors, feel free to submit a bug report.
# For a full regex reference for Python, see https://docs.python.org/3/library/re.html