Regular Expressions 101

Library entries

0
python

Author cite

Get author names from shitty import
Submitted by anonymous - a day ago
0
java

Match Letters and Parentheses

Match Letters and Parentheses
Submitted by Daniel Gonzalez - 6 days ago
0
python

Session Start/Close

Capturing Start/Close in IRC-logs
Submitted by Corpset - 8 days ago
0
python

PLDI regex 3

for pldi answer
Submitted by PlaceReporter99 - 10 days ago
0
python

PLDI regex 2

for pldi answer
Submitted by PlaceReporter99 - 10 days ago
0
python

PLDI regex 1

For pldi answer
Submitted by PlaceReporter99 - 10 days ago
0
python

Dyno warning regex

Used to separate things in dyno warning
Submitted by anonymous - 10 days ago
0
python

Hesla ISJ

ISJ
Submitted by anonymous - 12 days ago
0
python

100-4300

Match 100, 200, 300, up to 4300
Submitted by anonymous - 12 days ago
0
java

Riedler's 2nd URL regex

why did I make this?
Submitted by Riedler - 17 days ago
1
python

Get <NIC>

Get NICs from string
Submitted by anonymous - 18 days ago
1
golang

Home

Dj dus er geen je een je aan je enige eerst een he we er
Submitted by Webmaster - 18 days ago

Distinguish torrent files (series vs movies)

Vote

81

Regular Expression
python

r"
^
# get the title of this movie or series
(?P<title>
  [-\w'"]+
  # match separator to later replace into correct title
  (?P<separator> [ .] )
  # note this *must* be lazy for the engine to work ltr not rtl
  (?: [-\w'"]+\2 )*?
)
# start of movie vs serie check
(?:
  # if this is an episode, lets match the season
  # number one way or another. if not, the year
  # of the movie
  (?:
    # series. can be a lot prettier if we used perl regex...
    # make sure this is not just a number in the title followed by our separator.
    # like, iron man 3 2013 or my.fictional.24.series
    (?! \d+ \2 )
    # now try to match the season number
    (?: s (?: eason \2? )? )?
    (?P<season> \d\d? )
    # needed to validate the last token is a dot, or whatever.
    (?: e\d\d? (?:-e?\d\d?)? | x\d\d? )?
  |
    # this is likely a movie, match the year
    (?P<year> [(\]]?\d{4}[)\]]? )
  )
  # make sure this ends with the separator, otherwise we
  # might be in the middle of something like "1080p"
  (?=\2)
|
  # if we get here, this is likely still a movie.
  # match until one of the keywords
  (?= BOXSET | XVID | DIVX | LIMITED | UNRATED | PROPER | DTS | AC3 | AAC | BLU[ -]?RAY | HD(?:TV|DVD) | (?:DVD|B[DR]|WEB)RIP | \d+p | [hx]\.?264 )
)
"
gimx
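As a rough illustration of how the pattern above might be used from Python, the sketch below compiles it with the `i`, `m` and `x` flags (Python has no `g` flag; `search` or `finditer` plays that role) and reads the `title`, `separator`, `season` and `year` groups. The names `TORRENT_RE` and `classify`, the returned dictionary shape, and the sample file names are assumptions made for this example and are not part of the original entry. The pattern is copied unchanged, with its comments trimmed for brevity.

```python
import re

# Pattern copied from the entry above, comments trimmed. A triple-quoted raw
# string is used because the character classes contain both ' and ".
TORRENT_RE = re.compile(
    r"""
    ^
    (?P<title>
        [-\w'"]+
        (?P<separator> [ .] )
        (?: [-\w'"]+\2 )*?          # lazy, so the title stops as early as possible
    )
    (?:
        (?:
            (?! \d+ \2 )            # not just a number inside the title
            (?: s (?: eason \2? )? )?
            (?P<season> \d\d? )
            (?: e\d\d? (?:-e?\d\d?)? | x\d\d? )?
        |
            (?P<year> [(\]]?\d{4}[)\]]? )   # kept exactly as in the source
        )
        (?=\2)
    |
        (?= BOXSET | XVID | DIVX | LIMITED | UNRATED | PROPER | DTS | AC3 | AAC
          | BLU[ -]?RAY | HD(?:TV|DVD) | (?:DVD|B[DR]|WEB)RIP | \d+p | [hx]\.?264 )
    )
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)


def classify(name):
    """Hypothetical helper: return a dict describing the release, or None."""
    m = TORRENT_RE.search(name)
    if m is None:
        return None
    # The captured separator is replaced with spaces to get a readable title.
    title = m.group("title").replace(m.group("separator"), " ").strip()
    if m.group("season") is not None:
        return {"kind": "series", "title": title, "season": int(m.group("season"))}
    return {"kind": "movie", "title": title, "year": m.group("year")}


if __name__ == "__main__":
    for sample in (
        "The.Big.Bang.Theory.S05E12.HDTV.x264",
        "Iron.Man.3.2013.1080p.BluRay.x264",
        "Some.Movie.DVDRip.XviD",
    ):
        print(sample, "->", classify(sample))
```

In this sketch a name counts as a series whenever the `season` group participates in the match; anything else that matches is treated as a movie, with `year` left as `None` when only a keyword such as `DVDRip` ended the title.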

Description

Submitted by Firas Dib - 9 years ago