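import re

# Lazily capture a stem of at least 4 word characters, then match one trailing
# suffix (-es, -s, -e or -x). With re.MULTILINE, ^ and $ anchor at every line,
# so only words that stand alone on a line are rewritten.
regex = re.compile(r"^(\w{4,}?)(?:es|s|e|x)$", flags=re.MULTILINE | re.UNICODE)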
test_str = ("I am trying to delete all word suffixes -es, -s, -e or -x of all words that have at least 4 characters after removing the suffix, using regex in Python.\n\n"
"There are some examples of desired output (in French):\n\n"
"technologiques\n"
"→ technologiqu\n"
"pares\n"
" → pare (the word is too small so it does not remove the \"es\", only the \"s\")\n"
"bas\n"
" → bas (the word is too small so it does not do anything)\n"
"matériaux\n"
" → materiau\n"
"sièges\n"
" → sieg\n"
"siege\n"
" → sieg\n"
"feuilletées\n"
" → feuilleté\n"
"dos\n"
" → dos\n")
subst = "\\1"  # keep only the captured stem, dropping the suffix
result = regex.sub(subst, test_str)
if result:
    print(result)
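The pattern above anchors on `^` and `$`, so it only rewrites words that sit alone on a line. For words inside running text, one possible variant is sketched below. It assumes word boundaries (`\b`) are an acceptable substitute for the line anchors; `WORD_SUFFIX` and `strip_suffix` are hypothetical names, not part of the original sample.

```python
import re

# Hypothetical variant: \b replaces ^ and $ so that words inside a sentence
# are matched too. The lazy \w{4,}? still guarantees a stem of 4+ characters.
WORD_SUFFIX = re.compile(r"\b(\w{4,}?)(?:es|s|e|x)\b")


def strip_suffix(text: str) -> str:
    """Remove one trailing -es, -s, -e or -x from each word whose stem keeps at least 4 characters."""
    return WORD_SUFFIX.sub(r"\1", text)


print(strip_suffix("technologiques pares bas matériaux dos"))
# → technologiqu pare bas matériau dos
```

Like the original pattern, this sketch leaves accents untouched and strips at most one suffix per word.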