Regular Expressions 101

"
^(c\/|calle|carrer)?\W?(\w+\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*(?<!num))\W(|num|núm|num\.|núm\.|numero|número|n|n\.|no|no\.)?\W?(\d{1,3}\-?\d{1,3}?)\W?(bis|dup|mod|ant)?\W+(BJ)?\W\w+\W
"
ig
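To try the pattern outside the tool, a minimal Python sketch follows. It assumes Python's re module accepts the pattern unchanged, maps the i flag to re.IGNORECASE, and uses a made-up sample address ("calle Mayor 12, BJ 1A.") that does not appear on the original page.

```python
import re

# The pattern from the page, split across lines only for readability.
PATTERN = (
    r"^(c\/|calle|carrer)?\W?"
    r"(\w+\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*(?<!num))"
    r"\W(|num|núm|num\.|núm\.|numero|número|n|n\.|no|no\.)?"
    r"\W?(\d{1,3}\-?\d{1,3}?)"
    r"\W?(bis|dup|mod|ant)?"
    r"\W+(BJ)?\W\w+\W"
)

# The "i" flag becomes re.IGNORECASE; "g" has no flag equivalent in Python.
ADDRESS_RE = re.compile(PATTERN, re.IGNORECASE)

# Hypothetical test string, chosen for illustration only.
sample = "calle Mayor 12, BJ 1A."
m = ADDRESS_RE.search(sample)

if m:
    # Groups: street type, street name, number label, number, suffix, floor marker.
    print(m.groups())
else:
    print("no match")
```

Note that the pattern ends in \W\w+\W, so the sample needs a trailing non-word character (here the final full stop) to match at all.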
^ asserts position at start of the string
1st Capturing Group
(c\/|calle|carrer)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
1st Alternative
c\/
c matches the character c with index 99₁₀ (63₁₆ or 143₈) literally (case insensitive)
\/ matches the character / with index 47₁₀ (2F₁₆ or 57₈) literally (case insensitive)
2nd Alternative
calle
calle
matches the characters calle literally (case insensitive)
3rd Alternative
carrer
carrer
matches the characters carrer literally (case insensitive)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
2nd Capturing Group
(\w+\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*\W?[A-z]*(?<!num))
\w
matches any word character (equivalent to [a-zA-Z0-9_])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
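The range A-z spans indexes 65 to 122, which covers more than letters: the six ASCII characters between Z and a also fall inside it. The sketch below lists them. Whether the pattern's author intended [A-Za-z] instead is an assumption, not something the page states.

```python
import re

# ASCII characters that sit between 'Z' (90) and 'a' (97): [ \ ] ^ _ `
between = "".join(chr(code) for code in range(ord("Z") + 1, ord("a")))

loose = re.compile(r"[A-z]")      # the class used in the pattern
strict = re.compile(r"[A-Za-z]")  # letters only (assumed alternative)

loose_hits = [ch for ch in between if loose.fullmatch(ch)]
strict_hits = [ch for ch in between if strict.fullmatch(ch)]

print(loose_hits)   # all six punctuation characters
print(strict_hits)  # empty list
```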
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
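\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below
[A-z]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A-z matches a single character in the range between A (index 65) and z (index 122) (case insensitive)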
Negative Lookbehind
(?<!num)
Assert that the Regex below does not match
num
matches the characters num literally (case insensitive)
\W matches any non-word character (equivalent to [^a-zA-Z0-9_])
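The negative lookbehind (?<!num) rejects any position that is immediately preceded by the three characters num, so the street-name group cannot end in the number label. A minimal sketch with a simplified stand-in pattern (not the full expression) shows the effect:

```python
import re

# Simplified stand-in for the 2nd group: a word that must not end in "num",
# followed by one non-word character.
word_not_num = re.compile(r"\w+(?<!num)\W", re.IGNORECASE)

accepted = word_not_num.match("Mayor 12")  # name does not end in "num"
rejected = word_not_num.match("num 12")    # lookbehind blocks every attempt

print(accepted.group(0) if accepted else None)
print(rejected)
```

For "num 12" the engine also retries with the shorter prefixes "nu" and "n", but then the following \W has no non-word character to consume, so the overall match fails.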
3rd Capturing Group
(|num|núm|num\.|núm\.|numero|número|n|n\.|no|no\.)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
1st Alternative
matches the empty string (this alternative contains no characters)
2nd Alternative
num
num
matches the characters num literally (case insensitive)
3rd Alternative
núm
núm
matches the characters núm literally (case insensitive)
4th Alternative
num\.
num
matches the characters num literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
5th Alternative
núm\.
núm
matches the characters núm literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
6th Alternative
numero
numero
matches the characters numero literally (case insensitive)
7th Alternative
número
número
matches the characters número literally (case insensitive)
8th Alternative
n
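n matches the character n with index 110₁₀ (6E₁₆ or 156₈) literally (case insensitive)
9th Alternative
n\.
n matches the character n with index 110₁₀ (6E₁₆ or 156₈) literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)
10th Alternative
no
no
matches the characters no literally (case insensitive)
11th Alternative
no\.
no
matches the characters no literally (case insensitive)
\. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case insensitive)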
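Alternatives are tried from left to right, and the first one that lets the match proceed wins. The sketch below isolates the label alternatives to show what that ordering implies; the reordered variant is an illustrative assumption, not part of the original pattern.

```python
import re

# Same order as in the pattern: the short "n" is listed before "n\.", "no", "no\.".
ordered = re.compile(r"(n|n\.|no|no\.)", re.IGNORECASE)
first_hit = ordered.match("no.").group(1)

# An empty first alternative is always tried first, so on its own it wins.
with_empty = re.compile(r"(|num|núm)", re.IGNORECASE)
empty_hit = with_empty.match("num").group(1)

# Hypothetical reordering, longest alternative first.
longest_first = re.compile(r"(no\.|no|n\.|n)", re.IGNORECASE)
longest_hit = longest_first.match("no.").group(1)

print(repr(first_hit), repr(empty_hit), repr(longest_hit))
```

Inside the full expression the engine can still backtrack into later alternatives when the rest of the pattern fails, so the ordering mainly affects which text ends up in the 3rd capturing group.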
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
4th Capturing Group
(\d{1,3}\-?\d{1,3}?)
\d
matches a digit (equivalent to [0-9])
{1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
\-
matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case insensitive)
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\d
matches a digit (equivalent to [0-9])
{1,3}? matches the previous token between 1 and 3 times, as few times as possible, expanding as needed (lazy)
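Because both \d{1,3} and \d{1,3}? require at least one digit, the 4th group needs two digits or more, and the lazy second quantifier stops after a single digit unless a later token forces it to expand. A minimal sketch of the group in isolation (the helper name first_match is made up for illustration):

```python
import re

# The body of the 4th capturing group, taken on its own.
number = re.compile(r"\d{1,3}\-?\d{1,3}?")


def first_match(text):
    """Return the text matched at the start of `text`, or None."""
    m = number.match(text)
    return m.group(0) if m else None


for text in ["5", "12", "12-345", "1234"]:
    print(text, "->", first_match(text))
```

A single-digit house number such as "5" is therefore rejected by this group, which may be one reason a test string fails to match.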
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
5th Capturing Group
(bis|dup|mod|ant)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
1st Alternative
bis
bis
matches the characters bis literally (case insensitive)
2nd Alternative
dup
dup
matches the characters dup literally (case insensitive)
3rd Alternative
mod
mod
matches the characters mod literally (case insensitive)
4th Alternative
ant
ant
matches the characters ant literally (case insensitive)
\W
matches any non-word character (equivalent to [^a-zA-Z0-9_])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
6th Capturing Group
(BJ)?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
BJ
matches the characters BJ literally (case insensitive)
\W matches any non-word character (equivalent to [^a-zA-Z0-9_])
\w
matches any word character (equivalent to [a-zA-Z0-9_])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\W matches any non-word character (equivalent to [^a-zA-Z0-9_])
Global pattern flags
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
g modifier: global. All matches (don't return after first match)
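In Python the two flags map differently: i corresponds to re.IGNORECASE, while g has no flag and is expressed by calling finditer or findall. The sketch below uses a simplified, hypothetical prefix pattern and sample text to show that a pattern anchored with ^ (and no multiline flag) yields at most one match even when all matches are requested.

```python
import re

text = "Calle Mayor, calle Menor"  # made-up sample

# Anchored, as in the original pattern: only position 0 can match.
anchored = re.compile(r"^(c/|calle|carrer)", re.IGNORECASE)
anchored_hits = [m.group(0) for m in anchored.finditer(text)]

# Without the anchor, the "global" search finds every occurrence.
unanchored = re.compile(r"(c/|calle|carrer)", re.IGNORECASE)
unanchored_hits = unanchored.findall(text)

print(anchored_hits)
print(unanchored_hits)
```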
Your regular expression does not match the subject string.
