Regular Expressions 101

[Wikipedia] Article descriptive tokenizer

Regular Expression
PCRE (PHP <7.3)

/(([\w\s\d]+\s)(is the|is a|are the|are a|was)\s([\w\d\s\[\]\)\(,']+))/g

Description

Matches text of the form "{% sentence_prefix %} (is the|is a|are the|are a|was) {% description_of_prefix %}". It should almost always match part of the first sentence of a Wikipedia article.
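
As a rough illustration of how the capture groups line up with the description, here is a minimal sketch that ports the pattern to Python's re module. The port is an assumption, not part of the original submission: the pattern was written for PCRE, and re.ASCII is used here only to approximate PCRE's non-Unicode \w and \s. The helper name tokenize is hypothetical.

```python
import re

# Same pattern as above, ported from PCRE to Python's re module (an assumption;
# the original targets PCRE). re.ASCII keeps \w, \d and \s ASCII-only.
PATTERN = re.compile(
    r"(([\w\s\d]+\s)(is the|is a|are the|are a|was)\s([\w\d\s\[\]\)\(,']+))",
    re.ASCII,
)


def tokenize(sentence):
    """Return (prefix, verb, description) for the first match, or None."""
    m = PATTERN.search(sentence)
    if m is None:
        return None
    # Group 2 is the sentence prefix, group 3 the linking phrase,
    # group 4 the description; strip the whitespace captured around them.
    return m.group(2).strip(), m.group(3), m.group(4).strip()


print(tokenize("Paris is the capital and most populous city of France."))
# -> ('Paris', 'is the', 'capital and most populous city of France')
```

The description group stops at the first character outside its class, such as a period or a hyphen, so a sentence like "Python is a high-level language" yields only "high" as the description.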

Submitted by anonymous - 6 years ago