Regular Expressions 101

Extract title and page from OCR of PDF content, by carlleonhard

Regular Expression (Python flavor)

r"^\s*((?:[—‘’ \-\w]|(?<=\w)[.,:])+?[^ .])[., ]*([-\d]+)$"

Flags: gm

Description

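Extracts the title (capture group 1) and the page number (capture group 2) from each line of OCR text taken from PDF content. Dots, commas and spaces between the title and the page number are skipped.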
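
A minimal sketch of how the pattern might be used from Python, assuming the OCR text is already available as a string. The sample lines, the name TOC_LINE and the helper extract_entries are made up for illustration and are not part of the original entry. re.MULTILINE stands in for the "m" flag, and re.finditer for the "g" flag.

```python
import re

# Pattern copied from the entry above. re.MULTILINE corresponds to the "m" flag;
# iterating with finditer plays the role of the "g" (global) flag.
TOC_LINE = re.compile(
    r"^\s*((?:[—‘’ \-\w]|(?<=\w)[.,:])+?[^ .])[., ]*([-\d]+)$",
    re.MULTILINE,
)


def extract_entries(text):
    """Return (title, page) string pairs for every line the pattern matches."""
    return [(m.group(1), m.group(2)) for m in TOC_LINE.finditer(text)]


# Hypothetical OCR output, invented for this example.
sample = (
    "Introduction ........ 1\n"
    "The Early Years, 1900-1910 . . . 15\n"
    "Notes and Sources 203\n"
)

for title, page in extract_entries(sample):
    print(f"{page:>4}  {title}")
```

Group 2 is returned as a string because the pattern also accepts hyphens there, so a value such as a page range would not convert cleanly to an integer.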

Submitted by carlleonhard - 2 years ago