Regular Expressions 101

Community Patterns

identify duplicate consecutive words + word combinations to remove


Regular Expression
PCRE2 (PHP >=7.3)

/((\b\w+(\s+\w+)*(\s+\w+)*\b))(\s+\1)+/gm

Description

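Identifies duplicate words and word combinations that occur consecutively within a string so they can be removed. It is written with combinations of up to 3 words in mind (one \w+ followed by two optional (\s+\w+)* groups) and can be extended by adding further (\s+\w+)* groups to the first group's pattern. Note that each (\s+\w+)* group is already quantified with *, so longer combinations will match as written.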

Use with the substitution value $1.

Good for removing duplicates from strings such as "fred fred fred fred smith fred smith Michael J Fox Michael J Fox".
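As a rough sketch of how the pattern and the $1 substitution might be used outside regex101, the snippet below applies the same expression with Python's standard re module. Python's re is swapped in for PCRE2 here, \1 is Python's spelling of $1, and the helper names collapse_once and collapse_repeats are hypothetical. A single pass can still leave some repeats behind on input like the sample string, so the second helper simply reruns the substitution until nothing changes.

```python
import re

# Pattern copied from the entry above. Using Python's `re` instead of PCRE2 is
# an assumption of this sketch; the pattern only uses features both support.
# re.MULTILINE mirrors the "m" flag; the "g" flag corresponds to re.sub
# replacing every match by default.
DUPLICATE_RUN = re.compile(
    r"((\b\w+(\s+\w+)*(\s+\w+)*\b))(\s+\1)+",
    flags=re.MULTILINE,
)


def collapse_once(text: str) -> str:
    # One global substitution pass: each run of repeats is replaced by group 1.
    return DUPLICATE_RUN.sub(r"\1", text)


def collapse_repeats(text: str, max_passes: int = 10) -> str:
    # Hypothetical helper: rerun the substitution until the text stops
    # changing, with a small cap on passes as a safety net.
    for _ in range(max_passes):
        collapsed = collapse_once(text)
        if collapsed == text:
            break
        text = collapsed
    return text


if __name__ == "__main__":
    sample = "fred fred fred fred smith fred smith Michael J Fox Michael J Fox"
    print(collapse_once(sample))
    print(collapse_repeats(sample))
```

On the sample string, the looped version should settle on "fred smith Michael J Fox". Whether looping is wanted depends on the data; for simple inputs such as "fred fred" a single pass is enough.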

Submitted by GrantO - 3 years ago