Regular Expressions 101

Community Patterns

Community Library Entry

Regular Expression
Created·2026-05-20 09:51
Updated·2026-05-27 11:13
Flavor·.NET 7.0 (C#)

@"""
(?<Space>\s+)? # Space
((?<PITag><\?(?<PIType>[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*\?>) # <?Tag ?>
|(?<CommentTag><!--\s*(?<Comment>.*?)\s*-->) # <!-- -->
|(?<DTDTag><!\w+?\s*.*?\[(?<DTDContent>.*?)\]>) # <!Tag !>
|(?<CDATATag><!\[CDATA\[(?<CDATAContent>.*?)\]\]>) # <![CDATA []]>
|(?<XmlCloseTag><(?<TagName>([\w-]+:)?[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*?/>) # <Tag/>
|(?<XmlOpenTagEnd></(?<TagName>([\w-]+:)?[\w-]+)\s*>) # </Tag>
|(?<XmlOpenTagBegin><(?<TagName>([\w-]+:)?[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*?>) # <Tag>
|(?<PlainText>(?<=>\s*).*?(?=\s*<)) # PlainText
)
"""
gmsxn

Description

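This regular expression tokenizes XML content, using named capture groups to identify each major construct: processing instructions (PITag), comments (CommentTag), DTD blocks (DTDTag), CDATA sections (CDATATag), self-closing tags (XmlCloseTag), closing tags (XmlOpenTagEnd), opening tags (XmlOpenTagBegin), and plain text (PlainText). Note that the group names do not all match what they capture: XmlCloseTag holds self-closing tags such as <Tag/>, while XmlOpenTagEnd holds closing tags such as </Tag>. Optional whitespace before a token is captured in Space. The pattern is suitable for building lightweight XML lexers or for preprocessing XML before deeper parsing.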
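
As a rough illustration of how the pattern might be driven from C#, the sketch below maps the entry's flags to RegexOptions (m, s, x, n; the g flag is assumed to correspond to calling Regex.Matches) and reports which top-level group matched for each token. The names Pattern, Options, KindGroups, and Tokenize are hypothetical helpers introduced for this example and are not part of the submitted entry. It assumes C# 11 raw string literals, which are available with .NET 7.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// The entry's pattern, one alternative per line so that each "#" comment
// ends at its line break (needed when the x flag is active).
const string Pattern = """
    (?<Space>\s+)? # Space
    ((?<PITag><\?(?<PIType>[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*\?>) # <?Tag ?>
    |(?<CommentTag><!--\s*(?<Comment>.*?)\s*-->) # <!-- -->
    |(?<DTDTag><!\w+?\s*.*?\[(?<DTDContent>.*?)\]>) # <!Tag !>
    |(?<CDATATag><!\[CDATA\[(?<CDATAContent>.*?)\]\]>) # <![CDATA []]>
    |(?<XmlCloseTag><(?<TagName>([\w-]+:)?[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*?/>) # <Tag/>
    |(?<XmlOpenTagEnd></(?<TagName>([\w-]+:)?[\w-]+)\s*>) # </Tag>
    |(?<XmlOpenTagBegin><(?<TagName>([\w-]+:)?[\w-]+)(\s*(?<Attrs>(?<AttrName>([\w-]+:)?[\w-]+)\s*=\s*(?<quote>"|')(?<AttrValue>((?!\k<quote>).)*?)\k<quote>))*\s*?>) # <Tag>
    |(?<PlainText>(?<=>\s*).*?(?=\s*<)) # PlainText
    )
    """;

// Assumed mapping of the entry's flags: m, s, x, n.
const RegexOptions Options =
    RegexOptions.Multiline | RegexOptions.Singleline |
    RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture;

// Top-level groups, one per token kind, in the order the pattern tries them.
string[] KindGroups =
{
    "PITag", "CommentTag", "DTDTag", "CDATATag",
    "XmlCloseTag", "XmlOpenTagEnd", "XmlOpenTagBegin", "PlainText",
};

// Hypothetical helper: returns (group name, matched text) for every token.
List<(string Kind, string Text)> Tokenize(string xml)
{
    var tokens = new List<(string Kind, string Text)>();
    foreach (Match m in Regex.Matches(xml, Pattern, Options))
    {
        foreach (var kind in KindGroups)
        {
            Group g = m.Groups[kind];
            if (g.Success)
            {
                // The group value excludes any leading whitespace held in Space.
                tokens.Add((kind, g.Value));
                break;
            }
        }
    }
    return tokens;
}

// Illustrative input only; print each token with the group that matched it.
foreach (var (kind, text) in Tokenize("<root id=\"r1\"><!-- note --><item/>hello</root>"))
{
    Console.WriteLine($"{kind,-16} {text}");
}
```

Because the pattern reuses names such as AttrName and AttrValue across alternatives, a consumer that needs every attribute of a tag would probably read them through Group.Captures rather than Group.Value, which only exposes the last capture.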

Submitted by Flithor