Regular Expressions 101

Generated Code

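import re

# Group 1 captures the "ark:" label with an optional leading slash, trailing slash, and space;
# group 2 captures the run of identifier characters that follows it.
regex = re.compile(r"(/?ark:/? ?)([-a-zA-Z0-9@:%_\+.~#?&//=]*)", flags=re.MULTILINE)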

test_str = ("https://www.example.com\n"
	"http://www.example.com\n"
	"www.example.com\n"
	"example.com\n"
	"http://blog.example.com\n"
	"http://www.example.com/product\n"
	"http://www.example.com/products?id=1&page=2\n"
	"http://www.example.com#up\n"
	"http://255.255.255.255\n"
	"255.255.255.255\n"
	"255.255.255.255/test\n"
	"http://invalid.com/perl.cgi?key= | http://web-site.com/cgi-bin/perl.cgi?key1=value1&key2\n"
	"http://www.site.com:8008\n"
	"http://www.site.com:8008 10.1016.12.31/naTUrE.S0735-1097(98)2000/12/31/34:7-7 http://myrepo.example.org/ark:/12345/bcd987\n"
	"http://n2t.net/ark:/12345/bcd987\n"
	"http://texashistory.unt.edu/ark:/67531/metapth346793\n"
	"http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff\n"
	"https://doi.org/10.3886/ICPSR06849 10.3886/ICPSR06849 https://www.icpsr.umich.edu/icpsrweb/NACJD/studies/6849/version/1\n"
	"doi.org/10.1175/1520-0485(2002)032<0870:CT>2.0.CO;2\n"
	"ark: 12025/654xz321/s3/f8.05v.tiff\n"
	"        self.crossref_dois = (\n"
	"            '10.2310/JIM.0b013e31820bab4c',\n"
	"            '10.1007/978-3-642-28108-2_19',\n"
	"            '10.1016/S0735-1097(98)00347-7',\n"
	"        )\n\n"
	"        self.hard_dois = (\n"
	"            '10.1175/1520-0485(2002)032<0870:CT>2.0.CO;2',\n"
	"            '10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S',\n"
	"            '10.1579/0044-7447(2006)35\\[89:RDUICP\\]2.0.CO;2',\n"
	"        )\n\n"
	"        self.currently_not_supported = (\n"
	"            '10.1007.10/978-3-642-28108-2_19',\n"
	"            '10.1000.10/123456',\n"
	"            '10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7',\n"
	"        )\n\n"
	"        self.crossref_dois = (\n"
	"            'doi.org/10.2310/JIM.0b013e31820bab4c',\n"
	"            'doi.org/10.1007/978-3-642-28108-2_19',\n"
	"            'doi.org/10.1016/S0735-1097(98)00347-7',\n"
	"        )\n\n"
	"        self.hard_dois = (\n"
	"            'doi.org/10.1175/1520-0485(2002)032<0870:CT>2.0.CO;2',\n"
	"            'doi.org/10.1002/(SICI)1522-2594(199911)42:5<952::AID-MRM16>3.0.CO;2-S',\n"
	"            'doi.org/10.1579/0044-7447(2006)35\\[89:RDUICP\\]2.0.CO;2',\n"
	"        )\n\n"
	"        self.currently_not_supported = (\n"
	"            'doi.org/10.1007.10/978-3-642-28108-2_19',\n"
	"            'doi.org/10.1000.10/123456',\n"
	"            'doi.org/10.1016.12.31/nature.S0735-1097(98)2000/12/31/34:7-7',")

# Iterate over every non-overlapping match in the test string.
matches = regex.finditer(test_str)

for match_num, match in enumerate(matches, start=1):
    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Report each capture group together with its span.
    for group_num, group in enumerate(match.groups(), start=1):
        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {group}")

Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for Python, please visit: https://docs.python.org/3/library/re.html
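
The generated sample prints every match across the whole test string, which makes it hard to see what the pattern captures for a single input. As a minimal sketch, the same pattern can be wrapped in a small helper that returns only the second capture group. The helper name `extract_arks` and the constant `ARK_PATTERN` are hypothetical and not part of the generated code; the sample inputs are lines taken from the test string.

```python
import re

# Same pattern as in the generated code; re.MULTILINE is omitted here
# because the pattern has no ^ or $ anchors for the flag to affect.
ARK_PATTERN = re.compile(r"(/?ark:/? ?)([-a-zA-Z0-9@:%_\+.~#?&//=]*)")


def extract_arks(text):
    """Return the group-2 text of every match (hypothetical helper)."""
    return [m.group(2) for m in ARK_PATTERN.finditer(text)]


samples = [
    "http://n2t.net/ark:/12345/bcd987",
    "ark: 12025/654xz321/s3/f8.05v.tiff",
    "https://doi.org/10.3886/ICPSR06849",
]

for sample in samples:
    print(sample, "->", extract_arks(sample))
```

Because the second group is quantified with `*`, a bare `ark:` with nothing after it still matches and yields an empty string, so callers may want to filter out empty results.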
