import re
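# Group 1 captures the id attribute, group 2 the srcDocId attribute.
# The non-greedy .*? stops each capture at the first closing quote.
regex = re.compile(r"(id='.*?') (srcDocId='.*?')", flags=re.IGNORECASE)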
test_str = ("<document id='3316200' srcDocId='http://ecx.images-amazon.com/images/I/61A9A0fmN7L.jpg'</document>\n"
" <document id='3306829' srcDocId='http://ecx.images-amazon.com/images/I/71sQDUoJbmL.jpg'</document>\n"
" <document id='2406251' srcDocId='http://ecx.images-amazon.com/images/I/71j7ISxAOdL.jpg'</document>\n"
" <document id='2534144' srcDocId='http://ecx.images-amazon.com/images/I/71VXMXcrg2L.jpg'</document>\n"
" <document id='3417415' srcDocId='http://ecx.images-amazon.com/images/I/71Ymoo32gVL.jpg'</document>")
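# Python's re module writes backreferences as \1 (or \g<1>), not $1;
# a "$1" in the replacement string would be inserted literally.
subst = r"\1 \1"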
result = regex.sub(subst, test_str)
if result:
    print(result)