Submitted by Alice Bevan-McGregor - 3 months ago
HTML Document or Fragment Heuristic
The Stack Overflow question "Check if a string is html or not" collects many examples and a few distinct approaches, some more expensive than others. Fully parsing the string and checking whether any DOM elements were constructed is sure-fire but slow, and it requires a browser with a parser and a DOM representation. Regular expression...
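
As a rough illustration of the parse-and-check idea described above, here is a minimal sketch that uses Python's standard-library `html.parser` as a stand-in for a browser DOM. This is not the heuristic from the submission itself; the names `_TagCounter` and `looks_like_html` are hypothetical, and the tokenizer used here is far more lenient than a real browser parser.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts start tags seen while tokenizing (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = 0

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <br/>, because the
        # default handle_startendtag() delegates to handle_starttag().
        self.tags += 1


def looks_like_html(text):
    """Return True if the tokenizer found at least one start tag.

    Assumption: "found a start tag" is treated as "is HTML". Unknown
    tag names are not rejected, so this is only a coarse check.
    """
    parser = _TagCounter()
    parser.feed(text)
    parser.close()
    return parser.tags > 0


print(looks_like_html("<p>Hello</p>"))  # True
print(looks_like_html("plain text"))    # False
```

A check like this trades accuracy for portability: it runs anywhere Python does, but it cannot tell whether a browser would actually build the same elements.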