Regular Expressions 101

Community Library Entry

Regular Expression
ECMAScript (JavaScript)

/((?<=>)[^<>\n]+?(?=<))|((?<=<.+=")[^\n"]+?(?="))/gm

Description

Overview

This regular expression matches any text between HTML tags, or between quotation marks inside the tags themselves. For example, in <HTML lang="en">, it matches the 'en' inside the quotes. Group 1 captures the text between tags, and group 2 captures the text inside the quotes. I recommend using only one half of this regular expression at a time: either the 'between tags' part or the 'inside quotes' part.
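As a rough sketch of the 'use only one half' recommendation, each alternative can be pulled out into its own regex. The names betweenTags and insideQuotes and the sample string are assumptions made for illustration, and the snippet assumes a JavaScript engine with lookbehind support (ES2018 or later).

```javascript
// Each half of the pattern as a standalone regex (names are illustrative).
const betweenTags = /(?<=>)[^<>\n]+?(?=<)/gm;
const insideQuotes = /(?<=<.+=")[^\n"]+?(?=")/gm;

// Hypothetical sample input.
const sample = '<html lang="en"><p class="note">Hello</p></html>';

// With the g flag, match() returns every match, or null when there is none.
const textMatches = sample.match(betweenTags) || [];
const quoteMatches = sample.match(insideQuotes) || [];

console.log(textMatches);  // ["Hello"]
console.log(quoteMatches); // ["en", "note"]
```

Since each half is used alone here, the outer capture groups are dropped and the whole match is the result.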

Explanation

The whole regex is built from two alternatives: the tags part and the quotes part. The tags part (the first alternative) starts with a lookbehind for the closing > of a tag, followed by a lazy character class that matches any character except <, >, and a newline, and ends with a lookahead for the opening < of the next tag, so the matched text sits between tags. The quotes part has a similar structure: its lookbehind checks for the opening < of a tag, then .+ (one or more characters), then an equals sign followed by a quotation mark. The middle section again just matches text, this time excluding quotation marks and newlines. The lookahead checks for a closing quotation mark, and that's it! Please comment with any questions or email me.
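To see which alternative produced a given match, one option is to check which capture group is defined. This is only a sketch: the helper name classifyMatches, the labels, and the sample input are hypothetical, and it assumes an engine with lookbehind and String.prototype.matchAll support.

```javascript
// The full pattern from this entry:
// group 1 = text between tags, group 2 = text inside quotes.
const pattern = /((?<=>)[^<>\n]+?(?=<))|((?<=<.+=")[^\n"]+?(?="))/gm;

// Hypothetical helper: label each match by the group that captured it.
function classifyMatches(input) {
  return Array.from(input.matchAll(pattern), (m) => ({
    kind: m[1] !== undefined ? "between-tags" : "inside-quotes",
    text: m[0],
    index: m.index,
  }));
}

console.log(classifyMatches('<p class="note">Hello</p>'));
// [
//   { kind: "inside-quotes", text: "note", index: 10 },
//   { kind: "between-tags", text: "Hello", index: 16 }
// ]
```

Matches come back in order of position in the input, so the attribute value is reported before the text that follows the tag.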

Submitted by daniel@sabian.pro - 7 days ago