Regular Expressions 101

  • A single character of: a, b or c
    [abc]
  • A character except: a, b or c
    [^abc]
  • A character in the range: a-z
    [a-z]
  • A character not in the range: a-z
    [^a-z]
  • A character in the range: a-z or A-Z
    [a-zA-Z]
  • Any single character
    .
  • Alternate - match either a or b
    a|b
  • Any whitespace character
    \s
  • Any non-whitespace character
    \S
  • Any digit
    \d
  • Any non-digit
    \D
  • Any word character
    \w
  • Any non-word character
    \W
  • Non-capturing group
    (?:...)
  • Capturing group
    (...)
  • Zero or one of a
    a?
  • Zero or more of a
    a*
  • One or more of a
    a+
  • Exactly 3 of a
    a{3}
  • 3 or more of a
    a{3,}
  • Between 3 and 6 of a
    a{3,6}
  • Start of string, or start of each line in multiline (m) mode
    ^
  • End of string, or end of each line in multiline (m) mode
    $
  • A word boundary
    \b
  • Non-word boundary
    \B
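
A few of the tokens listed above can be tried out in a short script. This is a minimal sketch using Python's built-in `re` module; the sample strings and variable names are made up for illustration, and other flavors may differ in details such as what `\d` or `\w` cover.

```python
import re

text = "Order 66 shipped to bay_7 on 2024-05-01."  # made-up sample string

# Character classes: [a-z] is one lowercase letter, [^a-z] is any other character.
lower = re.findall(r"[a-z]", "aB3")
not_lower = re.findall(r"[^a-z]", "aB3")

# Alternation: a|b matches either side.
either = re.findall(r"a|b", "abc")

# Meta sequences: \d is a digit; + (one or more) collects adjacent digits into runs.
numbers = re.findall(r"\d+", text)

# Quantifiers: {3} is exactly three, {3,6} is three to six repetitions.
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None
too_many = re.fullmatch(r"a{3,6}", "aaaaaaa") is not None  # seven a's

# Groups: (...) captures, (?:...) groups without capturing.
match = re.search(r"(?:bay)_(\d)", text)
bay = match.group(1) if match else None

# Word boundaries: \b keeps "cat" from matching inside "concat".
whole_words = re.findall(r"\bcat\b", "cat concat cat.")

print(lower, not_lower, either, numbers, exactly_three, too_many, bay, whole_words)
```

Only the captured digit is returned by `group(1)` because the `bay` prefix sits in a non-capturing group.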

Generated Code

#include <StringConstants.au3> ; to declare the Constants of StringRegExp
#include <Array.au3> ; UDF needed for _ArrayDisplay and _ArrayConcatenate

; Multiline pattern: one CVE ID per line, year 1999 or 2xxx, sequence number of 4 or more digits
Local $sRegex = "(?m)^CVE-(1999|2\d{3})-(0\d{2}[1-9]|[1-9]\d{3,})$"

; Test data; double quotes inside the strings are escaped by doubling them
Local $sString = "#This one is not valid but wasn't covered by the test cases provided by MITRE" & @CRLF & _
        "CVE-2001-0000" & @CRLF & _
        "" & @CRLF & _
        "# This file contains test data for implementations to verify that" & @CRLF & _
        "# CVE IDs are properly parsed and handled to conform with the" & @CRLF & _
        "# 2014 CVE ID Syntax change." & @CRLF & _
        "#" & @CRLF & _
        "# About this test data: README-tests.txt" & @CRLF & _
        "# More info: http://cve.mitre.org/cve/identifiers/syntaxchange.html" & @CRLF & _
        "#" & @CRLF & _
        "# ****** VALID SYNTAX ******" & @CRLF & _
        "#" & @CRLF & _
        "# Publicly-referenced IDs for the new syntax (formerly ""Option B"")" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-0001" & @CRLF & _
        "CVE-2014-0999" & @CRLF & _
        "CVE-2014-1234" & @CRLF & _
        "CVE-2014-3127" & @CRLF & _
        "CVE-2014-9999" & @CRLF & _
        "CVE-2014-10000" & @CRLF & _
        "CVE-2014-54321" & @CRLF & _
        "CVE-2014-99999" & @CRLF & _
        "CVE-2014-100000" & @CRLF & _
        "CVE-2014-123456" & @CRLF & _
        "CVE-2014-456132" & @CRLF & _
        "CVE-2014-999999" & @CRLF & _
        "CVE-2014-1000000" & @CRLF & _
        "CVE-2014-1234567" & @CRLF & _
        "CVE-2014-7654321" & @CRLF & _
        "CVE-2014-9999999" & @CRLF & _
        "#" & @CRLF & _
        "# Invalid ID. This is the only invalid ID in this file, and it's" & @CRLF & _
        "# intended to help spot incorrect tests that mistakenly accept all" & @CRLF & _
        "# inputs. See README." & @CRLF & _
        "#" & @CRLF & _
        "CVE-ABCD-EFGH" & @CRLF & _
        "#" & @CRLF & _
        "# These are valid but could cause problems if IDs are stored in bytes" & @CRLF & _
        "# due to numeric overflows (stranger things have happened)." & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-16385" & @CRLF & _
        "CVE-2014-32769" & @CRLF & _
        "CVE-2014-65537" & @CRLF & _
        "CVE-2014-131073" & @CRLF & _
        "#" & @CRLF & _
        "# unusually large number of trailing zeros" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-100000000" & @CRLF & _
        "#" & @CRLF & _
        "# storing CVE number portion as 32-bit signed integer (seen in at" & @CRLF & _
        "# least one real-world implementation)" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-2147483647" & @CRLF & _
        "CVE-2014-2147483648" & @CRLF & _
        "#" & @CRLF & _
        "# storing CVE number portion as 32-bit unsigned integer (possibly seen" & @CRLF & _
        "# in at least one real-world implementation)" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-4294967295" & @CRLF & _
        "CVE-2014-4294967296" & @CRLF & _
        "#" & @CRLF & _
        "# storing CVE ID string in a fixed-length 32-byte buffer, with or" & @CRLF & _
        "# without a required trailing '\0' character" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-1111111111111111111111" & @CRLF & _
        "CVE-2014-11111111111111111111111" & @CRLF & _
        "CVE-2014-111111111111111111111111" & @CRLF & _
        "" & @CRLF & _
        "####################################################################" & @CRLF & _
        "" & @CRLF & _
        "# This file contains test data for implementations to verify that" & @CRLF & _
        "# CVE IDs are properly parsed and handled to conform with the" & @CRLF & _
        "# 2014 CVE ID Syntax change." & @CRLF & _
        "#" & @CRLF & _
        "# About this test data: README-tests.txt" & @CRLF & _
        "# More info: http://cve.mitre.org/cve/identifiers/syntaxchange.html" & @CRLF & _
        "#" & @CRLF & _
        "#" & @CRLF & _
        "# ****** SYNTAX VIOLATIONS ******" & @CRLF & _
        "#" & @CRLF & _
        "# Option A syntax from early 2013 - option not chosen. These might look" & @CRLF & _
        "# good at first glance, but have leading 0's with more than 4 digits." & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-000001" & @CRLF & _
        "CVE-2014-009999" & @CRLF & _
        "CVE-2014-000001" & @CRLF & _
        "CVE-2014-000999" & @CRLF & _
        "CVE-2014-001234" & @CRLF & _
        "CVE-2014-009999" & @CRLF & _
        "CVE-2014-010000" & @CRLF & _
        "CVE-2014-054321" & @CRLF & _
        "CVE-2014-099999" & @CRLF & _
        "#" & @CRLF & _
        "# Option A' syntax - modified Option A for second vote - option not chosen." & @CRLF & _
        "# Similar to original Option A, there are leading 0's with more than 4 digits." & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-00000001" & @CRLF & _
        "CVE-2014-00000999" & @CRLF & _
        "CVE-2014-00001234" & @CRLF & _
        "CVE-2014-00009999" & @CRLF & _
        "CVE-2014-00010000" & @CRLF & _
        "CVE-2014-00123456" & @CRLF & _
        "CVE-2014-01234567" & @CRLF & _
        "#" & @CRLF & _
        "# Option C syntax from early 2013 - option not chosen" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-1-8" & @CRLF & _
        "CVE-2014-999-3" & @CRLF & _
        "CVE-2014-1234-3" & @CRLF & _
        "CVE-2014-9999-3" & @CRLF & _
        "CVE-2014-10000-8" & @CRLF & _
        "CVE-2014-54321-5" & @CRLF & _
        "CVE-2014-123456-5" & @CRLF & _
        "CVE-2014-999999-5" & @CRLF & _
        "CVE-2014-1234567-4" & @CRLF & _
        "#" & @CRLF & _
        "# Intentionally valid ID. This is the only valid ID in this file, and" & @CRLF & _
        "# it's intended to help spot incorrect tests that mistakenly reject" & @CRLF & _
        "# all inputs. See README." & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-1234" & @CRLF & _
        "#" & @CRLF & _
        "# Miscellaneous examples used during discussion of syntax" & @CRLF & _
        "#" & @CRLF & _
        "CVE-YYYY-NNNN" & @CRLF & _
        "CVE-YYYY-NNNNN" & @CRLF & _
        "CVE-YYYY-NNNNNN" & @CRLF & _
        "#" & @CRLF & _
        "# Loose extraction assuming only CVE prefix and two alphanumerics" & @CRLF & _
        "# separated by hyphens" & @CRLF & _
        "#" & @CRLF & _
        "CVE-SRC-OHA" & @CRLF & _
        "CVE-2AAA-3BBB" & @CRLF & _
        "#" & @CRLF & _
        "# Missing sequence number / invalid year" & @CRLF & _
        "#" & @CRLF & _
        "CVE-114" & @CRLF & _
        "CVE-73" & @CRLF & _
        "#" & @CRLF & _
        "# Malformed sequence number" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-789" & @CRLF & _
        "CVE-2014-" & @CRLF & _
        "CVE-2014-9" & @CRLF & _
        "CVE-2014-98" & @CRLF & _
        "#" & @CRLF & _
        "# leading 0's - prohibited except for 999 and less (i.e., ""0001""" & @CRLF & _
        "# through ""0999""" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2015-010000" & @CRLF & _
        "CVE-2015-09999" & @CRLF & _
        "CVE-2014-00001" & @CRLF & _
        "#" & @CRLF & _
        "# CR/LF in middle of ID" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-" & @CRLF & _
        "1234" & @CRLF & _
        "CVE-2014" & @CRLF & _
        "-1234" & @CRLF & _
        "CVE-201" & @CRLF & _
        "4-1235" & @CRLF & _
        "#" & @CRLF & _
        "# no year provided" & @CRLF & _
        "#" & @CRLF & _
        "CVE-3153" & @CRLF & _
        "#" & @CRLF & _
        "# position-oriented (assume columns 5 through 8 are year). The first one" & @CRLF & _
        "# is a real-world conversion error by CVE code (oops)." & @CRLF & _
        "#" & @CRLF & _
        "CVE- 14-1236" & @CRLF & _
        "CVE-AAAA-1237" & @CRLF & _
        "#" & @CRLF & _
        "# missing/invalid ""CVE-"" prefix" & @CRLF & _
        "#" & @CRLF & _
        "C-2014-1238" & @CRLF & _
        "2014-1240" & @CRLF & _
        "CVE:2014-1241" & @CRLF & _
        "CVE 2014 1242" & @CRLF & _
        "#" & @CRLF & _
        "# invalid year" & @CRLF & _
        "#" & @CRLF & _
        "CVE-201-0771" & @CRLF & _
        "CVE-14-1239" & @CRLF & _
        "CVE-20132-0169" & @CRLF & _
        "#" & @CRLF & _
        "# Odd stuff straight from CVE web logs (thanks, random anonymous" & @CRLF & _
        "# people!). Includes some real-world typos or, in some cases," & @CRLF & _
        "# security-related IDs that utilize portions of the CVE ID." & @CRLF & _
        "#" & @CRLF & _
        "2013" & @CRLF & _
        "0497" & @CRLF & _
        "2010-270" & @CRLF & _
        "2013-199" & @CRLF & _
        "2013-6XXX" & @CRLF & _
        "CVE2014-0591" & @CRLF & _
        "CVE:13-7108" & @CRLF & _
        "CVE-XXXX-XXXX" & @CRLF & _
        "CVE-TODO" & @CRLF & _
        "1421010/13" & @CRLF & _
        "CVE20076753" & @CRLF & _
        "CVE:2013-4547" & @CRLF & _
        "(CVE-2013-136" & @CRLF & _
        "CVE - 2006 - 0788" & @CRLF & _
        "CVE-2008-600" & @CRLF & _
        "199-0618" & @CRLF & _
        "CVE-199-0618" & @CRLF & _
        "CA-2003-16" & @CRLF & _
        "# URL-encoded" & @CRLF & _
        "+CVE+-+2006+-+0788" & @CRLF & _
        "CVE-2013%2D4345" & @CRLF & _
        "CVE -20093103" & @CRLF & _
        "CVE-'2014-1610" & @CRLF & _
        "CVE--2009-3555" & @CRLF & _
        "CVE-1999-077" & @CRLF & _
        "CVE-2006.1737" & @CRLF & _
        "CVE-20076-4704" & @CRLF & _
        "CVE-2010--0281" & @CRLF & _
        "CVE-2010-" & @CRLF & _
        "CVE-2013-*" & @CRLF & _
        "CVE-2013-167`" & @CRLF & _
        "CVE-2013-00XX" & @CRLF & _
        "CVE-2013--4339" & @CRLF & _
        "CVE-2013-****" & @CRLF & _
        "CVE-2013-3.893" & @CRLF & _
        "CVE-CVE:2013-4883" & @CRLF & _
        "CVE-CVE-2013-4883" & @CRLF & _
        "CVE2010-3333.J" & @CRLF & _
        "2013-A-0196" & @CRLF & _
        "CVE-2013-A-0196" & @CRLF & _
        "#" & @CRLF & _
        "# common shorthand for multiple IDs" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2007-{4352,5392,5393}" & @CRLF & _
        "CVE:2012-0013" & @CRLF & _
        "CVE_2013-7063" & @CRLF & _
        "E-2011-3192" & @CRLF & _
        "EXPLOIT-CVE2013-2465" & @CRLF & _
        "VE-2012-0158" & @CRLF & _
        "VE-2013-5875C" & @CRLF & _
        "ZDI-12-170" & @CRLF & _
        "CVE-YYYY-XXXX" & @CRLF & _
        "CVE-2012=1234" & @CRLF & _
        "#" & @CRLF & _
        "# these originated in late 1999/early 2000 era" & @CRLF & _
        "#" & @CRLF & _
        "GENERIC-MAP-NOMATCH" & @CRLF & _
        "CVE-MAP-NOMATCH" & @CRLF & _
        "CVE-NO-MATCH" & @CRLF & _
        "CVE-NO-NAME" & @CRLF & _
        "CVE-NONE-0662" & @CRLF & _
        "#" & @CRLF & _
        "# Arbitrary 13-character string" & @CRLF & _
        "#" & @CRLF & _
        "ABCDEFGHIJKLM" & @CRLF & _
        "#" & @CRLF & _
        "# NOCVE identifiers, e.g., http://cs.coresecurity.com/core-impact-pro/exploits?page=11" & @CRLF & _
        "#" & @CRLF & _
        "NOCVE-9999-54104" & @CRLF & _
        "NOCVE-9999-46110" & @CRLF & _
        "CVE-9999-1" & @CRLF & _
        "CVE-9999-11" & @CRLF & _
        "CVE-9999-111" & @CRLF & _
        "#" & @CRLF & _
        "# erroneous attempts to convert certain homoglyphs / Unicode to 7-bit" & @CRLF & _
        "# ASCII" & @CRLF & _
        "CVE?2014?0001" & @CRLF & _
        "#" & @CRLF & _
        "# mashups of CVEs and telephone numbers" & @CRLF & _
        "#" & @CRLF & _
        "CVE-555-1212" & @CRLF & _
        "CVE-800-555-1212" & @CRLF & _
        "CVE-1-800-555-1212" & @CRLF & _
        "#" & @CRLF & _
        "# mashups of CVEs and Jenny" & @CRLF & _
        "#" & @CRLF & _
        "CVE-867-5309" & @CRLF & _
        "CVE-867-5309(1981)" & @CRLF & _
        "#" & @CRLF & _
        "# extraneous spaces (very common in disclosures from multiple sources)" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014- 0001" & @CRLF & _
        "CVE- 2014-0001" & @CRLF & _
        "CVE- 2014- 0001" & @CRLF & _
        "CVE-2014- 13001" & @CRLF & _
        "CVE- 2014-13001" & @CRLF & _
        "CVE- 2014- 13001" & @CRLF & _
        "#" & @CRLF & _
        "# non-dash format (widely used by IBM ISS X-Force, e.g., the http://xforce.iss.net/xforce/xfdb/89235 page)" & @CRLF & _
        "#" & @CRLF & _
        "CVE20140001" & @CRLF & _
        "cve20140001" & @CRLF & _
        "CVE201413001" & @CRLF & _
        "cve201413001" & @CRLF & _
        "#" & @CRLF & _
        "# traditional VUPEN style - which happens to match CVE except for the" & @CRLF & _
        "# ""ADV-"" prefix instead of ""CVE-""" & @CRLF & _
        "#" & @CRLF & _
        "ADV-2006-0001" & @CRLF & _
        "#" & @CRLF & _
        "# exploit-db.com format" & @CRLF & _
        "#" & @CRLF & _
        "CVE: 2014-0001" & @CRLF & _
        "CVE: 2014-13001" & @CRLF & _
        "#" & @CRLF & _
        "# OSVDB format" & @CRLF & _
        "#" & @CRLF & _
        "CVE ID: 2014-0001" & @CRLF & _
        "CVE ID: 2014-13001" & @CRLF & _
        "2014-0001" & @CRLF & _
        "2014-13001" & @CRLF & _
        "#" & @CRLF & _
        "# results of bad global search/replace of CVE with CVE&reg;" & @CRLF & _
        "# (registered trademark symbol)" & @CRLF & _
        "#" & @CRLF & _
        "CVE&reg;-2014-0001" & @CRLF & _
        "#" & @CRLF & _
        "# attempts at XML conversion" & @CRLF & _
        "#" & @CRLF & _
        "<CVE>-2014-0001" & @CRLF & _
        "<CVE>2014-0001" & @CRLF & _
        "<CVE>2014-0001</CVE>" & @CRLF & _
        "<CVE>2014-0001</>" & @CRLF & _
        "#" & @CRLF & _
        "# attempts at JSON conversion" & @CRLF & _
        "#" & @CRLF & _
        """CVE"": ""2014-0001""" & @CRLF & _
        """cve"": ""2004-0001""" & @CRLF & _
        """CVE"":""2014-0001""" & @CRLF & _
        """cve"":""2004-0001""" & @CRLF & _
        "#" & @CRLF & _
        "# use of the letter 'O' instead of the number '0'" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-OOO1" & @CRLF & _
        "CVE-2O14-0001" & @CRLF & _
        "#" & @CRLF & _
        "# use of the letter 'l' instead of the number '1'" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-000l" & @CRLF & _
        "CVE-20l4-0001" & @CRLF & _
        "#" & @CRLF & _
        "# regular expressions or various other groupings" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-130[12]" & @CRLF & _
        "CVE-[0-9]{4}-[0-9]{4}" & @CRLF & _
        "CVE-[0-9]{4,}-[0-9]{4,}" & @CRLF & _
        "# ""sticky"" keyboards" & @CRLF & _
        "#" & @CRLF & _
        "CVEE-2014-0001" & @CRLF & _
        "CVEEEEEEE-2014-0001" & @CRLF & _
        "# attempts at plurals" & @CRLF & _
        "#" & @CRLF & _
        "CVEs-2014-0001 and 2014-0002" & @CRLF & _
        "# misplaced organizational specifiers" & @CRLF & _
        "#" & @CRLF & _
        "CVE[MITRE]-2014-0001" & @CRLF & _
        "CVE[Mitre]-2014-0001" & @CRLF & _
        "# confusion with National Vulnerability Database" & @CRLF & _
        "#" & @CRLF & _
        "NVD-2014-0001" & @CRLF & _
        "# confusion with Defense Vulnerability Database" & @CRLF & _
        "#" & @CRLF & _
        "DVD-2014-0001" & @CRLF & _
        "# confusion with other organizations" & @CRLF & _
        "#" & @CRLF & _
        "CERT-2014-0001" & @CRLF & _
        "JVN-2014-0001" & @CRLF & _
        "JVNDB-2014-000001" & @CRLF & _
        "# intraword footnotes" & @CRLF & _
        "#" & @CRLF & _
        "CVE[1]-2014-0001" & @CRLF & _
        "CVE*-2014-0001" & @CRLF & _
        "CVE**-2014-0001" & @CRLF & _
        "# Literal tab character." & @CRLF & _
        "#" & @CRLF & _
        "CVE 2014-0001" & @CRLF & _
        "# erroneous generation of a -1 value" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014--1" & @CRLF & _
        "# erroneous generation of a zero value" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-0" & @CRLF & _
        "# ordering confusion" & @CRLF & _
        "#" & @CRLF & _
        "2014-0001-CVE" & @CRLF & _
        "# this is technically valid syntax, but since the year can never be before" & @CRLF & _
        "# 1999, this could be rejected based on CVE ""business rules""." & @CRLF & _
        "CVE-0001-2014" & @CRLF & _
        "# wildcards or meta-expressions" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014-*" & @CRLF & _
        "CVE-2014-####" & @CRLF & _
        "CVE-2014-****" & @CRLF & _
        "CVE-2014-?" & @CRLF & _
        "CVE-2014-????" & @CRLF & _
        "CVE-2014*" & @CRLF & _
        "CVE-2014?" & @CRLF & _
        "# extraneous dashes" & @CRLF & _
        "#" & @CRLF & _
        "CVE-2014--0001" & @CRLF & _
        "CVE--2014-0001" & @CRLF & _
        "# typos of dash" & @CRLF & _
        "#" & @CRLF & _
        "CVE=2014=0001" & @CRLF & _
        "CVE0201400001" & @CRLF & _
        "# various uncategorized examples" & @CRLF & _
        "#" & @CRLF & _
        "CVE_2014_0001" & @CRLF & _
        "CVE-ID-2014-0001" & @CRLF & _
        "CVEID-2014-0001" & @CRLF & _
        "CVE#2014-0001" & @CRLF & _
        "CVE# 2014-0001" & @CRLF & _
        "CVEID#2014-0001" & @CRLF & _
        "CVEID# 2014-0001" & @CRLF & _
        "CVE-ID#2014-0001" & @CRLF & _
        "CVE-ID# 2014-0001" & @CRLF & _
        "CVE#2014-0001" & @CRLF & _
        "CVE# 2014-0001" & @CRLF & _
        "CEV-2014-0001" & @CRLF & _
        "VCE-2014-0001" & @CRLF & _
        "VEC-2014-0001" & @CRLF & _
        "CWE-2014-0001" & @CRLF & _
        "CPE-2014-0001" & @CRLF & _
        "CME-2014-0001" & @CRLF & _
        "CE-2014-0001" & @CRLF & _
        "VE-2014-0001" & @CRLF & _
        "E-2014-0001" & @CRLF & _
        "-2014-0001" & @CRLF & _
        "CVE-2014-000{1,2}" & @CRLF & _
        "CVE/MITRE-2014-0001" & @CRLF & _
        ""

; Collect every match; each element of $aArray is itself an array (full match plus groups)
Local $aArray = StringRegExp($sString, $sRegex, $STR_REGEXPARRAYGLOBALFULLMATCH)

; Flatten the per-match arrays into a single array for display
Local $aFullArray[0]
For $i = 0 To UBound($aArray) - 1
    _ArrayConcatenate($aFullArray, $aArray[$i])
Next
$aArray = $aFullArray

; Present the entire match result
_ArrayDisplay($aArray, "Result")

Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for AutoIt, please visit: https://www.autoitscript.com/autoit3/docs/functions/StringRegExp.htm
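
For a quick check of the pattern outside AutoIt, the following is a minimal sketch in Python's `re` module. The regex is copied from the generated sample and the IDs come from its test string, except `CVE-2014-0160`, which is added here for illustration; `ALT_PATTERN` and the variable names are hypothetical and not part of the original page. The lines are joined with `\n` because Python's `$` in multiline mode matches before `\n` only, so CRLF input would likely need normalizing first.

```python
import re

# Pattern copied verbatim from the generated code; (?m) lets ^ and $ match per line.
CVE_PATTERN = re.compile(r"(?m)^CVE-(1999|2\d{3})-(0\d{2}[1-9]|[1-9]\d{3,})$")

# A handful of lines taken from the test string, joined with "\n" for simplicity.
sample = "\n".join([
    "CVE-2001-0000",     # sequence 0000 is rejected
    "CVE-2014-0001",
    "CVE-2014-10000",
    "CVE-2014-000001",   # leading zeros with more than 4 digits
    "CVE-2014-789",      # too few digits
    "CVE-ABCD-EFGH",
])

matches = [m.group(0) for m in CVE_PATTERN.finditer(sample)]
print(matches)

# Observation: the 0\d{2}[1-9] branch also rejects four-digit IDs ending in 0.
rejects_trailing_zero = CVE_PATTERN.search("CVE-2014-0160") is None
print(rejects_trailing_zero)

# Hypothetical variant (not from the source): exclude only 0000 via a lookahead.
ALT_PATTERN = re.compile(r"(?m)^CVE-(1999|2\d{3})-(0(?!000)\d{3}|[1-9]\d{3,})$")
alt_matches = [m.group(0) for m in ALT_PATTERN.finditer(sample + "\nCVE-2014-0160")]
print(alt_matches)
```

If rejecting sequence numbers such as `0160` is not intended, the lookahead variant may be a closer fit; it is a suggestion and has not been run against the full test string.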