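import re

# Match text that contains printf-style conversion specifiers. Literal text is
# printable ASCII except %, ' and backslash, unless one of those is doubled.
regex = re.compile(r"""
(
  (?:[ -$&(-[\]-~]|([%'\\])\2)*   # leading text, or a doubled %, ' or backslash
  (%(\d+\$)?[-+\s0#]?(\d+|\*)?(\.\d+)?[bt]?[diuoxXfeEgGcs]+)+   # one or more conversion specifiers
  (?:                             # optional trailing text, taken only when ...
    (?!(?:[ -$&(-[\]-~]|([%'\\])\7)*(?:%(?:\d+\$)?[-+\s0#]?(?:\d+|\*)?(?:\.\d+)?[bt]?[diuoxXfeEgGcs]+)+)   # ... no further specifier follows it
    (?:[ -$&(-[\]-~]|([%'\\])\8)*
  )?
)
""", flags=re.MULTILINE | re.IGNORECASE | re.VERBOSE)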
test_str = ("Color %s, we are looking for %%02droids %% number1 %d, number2 %05d, hex %#x, float %5.2f, unsigned value %u.\n\n\n\n"
"I wanted to also add the ability to capture \"any printable text characters besides %, ' and \\, unless these characters appear exactly twice\". This needs to be captured both before the initial % and after the conversion character.\n\n"
"any printable character: [ -~]\n"
"besides %, ' and \\: (?![\\\\%'])\n"
"these characters appear exactly twice: ( §§§§ |'{2}|\\\\{2}|%{2}) (§ = placeholder)\n"
"I am having a problem with the \"unless\", that is, getting the negative look-ahead to discard single occurrences but allow double occurrences of the specified characters.\n\n"
"Color %s, number1 %d, number2 %05d, hex %#x, float %5.2f, unsigned value %u.\n"
"| | | | | | |")
subst = "IMAGE REMOVED FROM CHROME\\r\\n"
result = regex.sub(subst, test_str)
if result:
    print(result)