Regular Expressions 101

  • A single character of: a, b or c
    [abc]
  • A single character of: a, b, c or d
    [[ab][cd]]
  • A character except: a, b or c
    [^abc]
  • A character in the range: a-z
    [a-z]
  • A character not in the range: a-z
    [^a-z]
  • A character in the range: a-z or A-Z
    [a-zA-Z]
  • Character class intersection
    [\w&&[^\d]]
  • Any single character
    .
  • Alternate - match either a or b
    a|b
  • Any whitespace character
    \s
  • Any non-whitespace character
    \S
  • Any digit
    \d
  • Any non-digit
    \D
  • Any word character
    \w
  • Any non-word character
    \W
  • Non-capturing group
    (?:...)
  • Capturing group
    (...)
  • Zero or one of a
    a?
  • Zero or more of a
    a*
  • One or more of a
    a+
  • Exactly 3 of a
    a{3}
  • 3 or more of a
    a{3,}
  • Between 3 and 6 of a
    a{3,6}
  • Start of string
    ^
  • End of string
    $
  • A word boundary
    \b
  • Non-word boundary
    \B

Generated Code

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Example {
    public static void main(String[] args) {
        // Matches markdown links whose visible text looks like a domain ending in a listed TLD.
        // Group 1 captures the displayed domain.
        final String regex = "\\[(?:http://|https://)*(?:\\w+\\.)*(\\w+(?:\\.(?:com|org|net|edu|gov|info|biz|io|co|app|co|uk|de|jp|ca|dev|app|gg))+)]\\((?:http://|https://)(?:\\w+\\.)+\\w+(?:/\\w+)*\\)";

        // Test string from the page, including the author's notes and alternative patterns.
        final String string = "Normal links don't get caught:\n"
             + "[do not catch this](https://example.com)\n"
             + "orthis.com\n\n"
             + "Neither do links with full stops in the message:\n"
             + "(messages. with. full stops)[https://example.com]\n\n"
             + "even if they forget a space\n"
             + "[whoops.nospace](https://example.com)\n\n"
             + "because we catch based on tld:\n"
             + "[catchthis.com](https://malicious.link)\n"
             + "[catchthis.org](https://malicious.link)\n"
             + "[catchthis.net](https://malicious.link)\n"
             + "[catchthis.edu](https://malicious.link)\n"
             + "[catchthis.gov](https://malicious.link)\n"
             + "[catchthis.info](https://malicious.link)\n"
             + "[catchthis.biz](https://malicious.link)\n"
             + "[catchthis.io](https://malicious.link)\n"
             + "[catchthis.co](https://malicious.link)\n"
             + "[catchthis.uk](https://malicious.link)\n"
             + "[catchthis.de](https://malicious.link)\n"
             + "[catchthis.jp](https://malicious.link)\n\n"
             + "[www.catchthis.com](https://malicious.link)\n"
             + "[https://catchthis.com](https://malicious.link)\n"
             + "[http://catchthis.com](http://malicious.link)\n\n"
             + "any combination of the above also gets matched for multiple tld urls:\n"
             + "[link.co.jp.org.net](https://malicious.link)\n\n"
             + "This is perfect because we can block any malicious link with any tld or any number of subdomains, but have a controlled list of tlds that links with a fake url begin with. \n"
             + "Since most non-standard tlds are sketchy, we don't even need that many:\n\n"
             + "[link.com](http://any.malicious.li.nk/anything/at/all)\n\n"
             + "Any number of subdomains also get caught:\n"
             + "[auth.google.com](https://malicious.website.com)\n"
             + "[any.number.at.all.com](https://malicious.link)\n\n\n"
             + "This method of having a set tld list means almost zero false positives, with the drawback of people having to recognise sketchy urls themselves:\n\n"
             + "[linkwitha.sketchytld](https://malicious.link) // not caught\n\n"
             + "If you want a wider net with a higher chance of false positives, replace the subdomains with the word matcher wildcard (\\w+):\n\n"
             + "\\[(?:\\w+\\.)*(\\w+(?:\\.(?:\\w+))+)]\\((?:http://|https://)(?:\\w+\\.)+\\w+(?:/\\w+)*\\)\n\n"
             + "Or a much shorter one that doesn't catch http:// links but that is short enough for Discord: [discord already blocks \"fake\" links with https in the title but not ones without it]\n\n"
             + "\\[(\\w+\\.?)*]\\((https?://)(\\w+\\.?)*\\)\n\n"
             + "a longer method with subdomain denylisting is also short enough for Discord:\n\n"
             + "\\[(?:(?:www|auth|login)\\.)*(\\w+(?:\\.(?:com|org|net|edu|gov|info|biz|io|co|app|co|uk|de|jp|ca|dev|app|gg))+)]\\((?:http://|https://)(?:\\w+\\.)+\\w+(?:/\\w+)*\\)\n\n"
             + "Since this compiles to a shorter resulting regex (add more subdomains after auth to catch more. )";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // Print every match along with its capture groups.
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));

            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for Java, please visit: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
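
The trade-off described in the test string (a fixed TLD list versus the wider \w+ variant) can be compared side by side. The following is a hand-written sketch rather than generated output: the class MaskedLinkFinder, the constants STRICT and WIDE, and the helper displayedDomains are hypothetical names, and the two patterns are copied from the regex and the test string above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
class MaskedLinkFinder {
    // Pattern from the generated code: link text must end in one of the listed TLDs.
    public static final Pattern STRICT = Pattern.compile(
        "\\[(?:http://|https://)*(?:\\w+\\.)*"
        + "(\\w+(?:\\.(?:com|org|net|edu|gov|info|biz|io|co|app|co|uk|de|jp|ca|dev|app|gg))+)]"
        + "\\((?:http://|https://)(?:\\w+\\.)+\\w+(?:/\\w+)*\\)");

    // Wider variant quoted in the test string: any dotted word sequence counts as a domain.
    public static final Pattern WIDE = Pattern.compile(
        "\\[(?:\\w+\\.)*(\\w+(?:\\.(?:\\w+))+)]"
        + "\\((?:http://|https://)(?:\\w+\\.)+\\w+(?:/\\w+)*\\)");

    // Collects capture group 1 (the domain shown in the link text) for every match.
    public static List<String> displayedDomains(Pattern pattern, String text) {
        List<String> domains = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            domains.add(matcher.group(1));
        }
        return domains;
    }

    public static void main(String[] args) {
        String text = "[do not catch this](https://example.com)\n"
            + "[auth.google.com](https://malicious.website.com)\n"
            + "[linkwitha.sketchytld](https://malicious.link)";

        System.out.println(displayedDomains(STRICT, text)); // [google.com]
        System.out.println(displayedDomains(WIDE, text));   // [google.com, linkwitha.sketchytld]
    }
}
```

Group 1 holds google.com rather than auth.google.com because the leading (?:\w+\.)* is greedy and consumes as many subdomain labels as it can while still letting the rest of the pattern match.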