Regular Expressions 101

Community Patterns

Get path from any text

Created·2023-01-31 14:38
Updated·2023-07-23 20:17
Flavor·PCRE2 (PHP)
Get a Windows-style path from any type of text (error message, e-mail body, ...), quoted or not.

THIS IS THE SINGLE-LINE VERSION! If you want to understand how it works or edit it, go to https://regex101.com/r/7o2fyy

  • Relative paths are not supported.
  • The goal is to catch what "looks like" a path. See the limitations.
  • UNC paths and prefixed paths like [//./], [//?/] or [//./UNC/] are allowed.
  • Some URL paths like [file:///C:/] or [file://] are allowed.
  • Paths quoted with ["] and ['] are caught, but the quotes are included in the catch.
  • Quoted paths are not affected by the limitations.

Limitations (unquoted paths only):

  • [dot] and [space] are allowed, but not in a row.
  • [dot+space] or [space+dot] at the end of a file name is not caught.

Inside a file name (or the last directory, if it is a path to a directory):

  • [comma] is not supported (it stops the catch).
  • After a first [dot], any [space] stops the catch.
  • After a [space], the catch stops if the next character is not a [letter], [digit] or [-], so a double [space] stops the catch.

Compatibility:

  • Compatible with PCRE and PCRE2.
  • AutoHotkey: don't forget to escape "%" as "`%".
  • /!\ PowerShell and .NET /!\ : this regex needs some modification to be interpreted by PowerShell. You have to replace each (?&CapturGroupName) with \k<CapturGroupName>. Use this PowerShell code to do the replacement:

$powershellRegex = @'
[Put here the regex to replace (?&CapturGroupName) with \k<CapturGroupName>]
'@ -replace '\(\?&(\w+)\)', '\k<$1>'

This example code must return: [Put here the regex to replace \k<CapturGroupName> with \k<CapturGroupName>]
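For readers who are not working in PowerShell, the same textual substitution can be sketched in Python. This is only an illustration of the replacement step, assuming the intended target form is `\k<GroupName>`; the function name `to_backreference_syntax` and the sample pattern are hypothetical and are not part of the published regex.

```python
import re


def to_backreference_syntax(pattern: str) -> str:
    """Rewrite every (?&Name) in a regex source string as \\k<Name>.

    Hypothetical helper: it only edits the pattern text and does not
    check that the result compiles in the target engine.
    """
    # Group 1 captures the name that follows "(?&".
    return re.sub(
        r"\(\?&(\w+)\)",
        lambda m: r"\k<" + m.group(1) + ">",
        pattern,
    )


# Made-up sample pattern, used only to show the substitution.
sample = r"(?<Drive>[A-Z]:)(?&Drive)"
print(to_backreference_syntax(sample))  # prints (?<Drive>[A-Z]:)\k<Drive>
```

Note that in .NET `\k<Name>` is a backreference to the text the group captured, whereas `(?&Name)` in PCRE re-runs the group's subpattern, so the converted regex may not behave identically in every case and is worth testing on your own inputs.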
Submitted by nitrateag

Community Library Entry

Regular Expression
Created·2023-01-09 15:47
Updated·2023-01-09 15:48
Flavor·Java

"
^
# get the title of this movie or series
(?<title>
  [-\w'\"]+
  # match separator to later replace into correct title
  (?<separator> [\s.] )
  # note this must be lazy for the engine to work ltr not rtl
  (?: [-\w'\"]+\2 )*?
)(?:
  # if this is an episode, lets match the season
  # number one way or another. if not, the year of the movie
  # make sure this is not just a number in the title followed by our separator.
  # like, iron man 3 2013 or my.fictional.24.series
  (?! \d+ \2 )
  # now try to match the season number
  (?: s (?: \2? )? )?
  (?<season> \d\d? )
  (?: e|x (?:\2? )? )
  (?<episode> \d\d? )
  # needed to validate the last token is a dot, or whatever.
  (?: e\d\d? (?:-e?\d\d?)? | x\d\d? )?
  |
  # this is likely a movie, match the year
  [(\[]?(?<year>\d{4})[)\]]?
)
|
# optional release name
(?:(?<release> PROPER | REPACK | LIMITED | EXTENDED | INTERNAL | NEW(?:\ SOURCE)? | NUKED | UNRATED | .*?\ EDITION | HC))
|
# optional resolution group
(?<resolution> \d{3,4}\ ?p)
|
# optional quality group
(?<quality> HDTV | WEB[-.]?DL | HDDVD | DVDRip | DVD | B[DR]Rip | Blu[-.\ ]?Ray | HDRip | WEBRIP )
|
# optional codec group
(?<codec> XviD | X26[45] | h26[45] | hevc )
|
# optional audio group
(?<audio> AC3 | AAC | DTS | DD5\.1)
|
# optional team group with hyphen prefix
(?:-(?<team>.*?))?
# optional extension group with . prefix
(?:\.(?<extension>mkv|avi|mp4|srt))?
$
"
gmix
Description

Analyze whether a torrent name is a Movie or a TV Episode.

Inspired by https://regex101.com/library/yP4bY4. There are two versions; see the differences at the bottom.

Groups:

  • Title (of the Movie/of the TV Series)
  • Season (if TV Episode)
  • Episode (if TV Episode)
  • Year (if Movie)
  • Name (only in v1, should match the TV Episode title if present)
  • Release name (PROPER, REPACK, LIMITED, etc.)
  • Resolution (720p, 1080p, etc.)
  • Quality (HDTV, BluRay, WebRip, etc.)
  • Codec (Xvid, x265, x264, etc.)
  • Audio coding (AAC, AC3, DTS, etc.)
  • Team (torrent group/team)
  • Extension (mkv, avi, mp4, etc.)
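
To show how the title, season, episode and year groups might be consumed, here is a minimal Python sketch. It is not the regex from this entry: it is a much smaller, hypothetical adaptation that keeps only those groups, uses Python's `(?P<name>...)` and `(?P=name)` syntax instead of the Java-flavor `(?<name>...)` and `\2`, and ignores release, resolution, quality, codec, audio, team and extension. The names `RELEASE_RE` and `parse_release` are assumptions made for this example.

```python
import re

# Simplified, hypothetical pattern (NOT the full regex from this entry).
RELEASE_RE = re.compile(
    r"""
    ^
    (?P<title>
        [-\w']+
        (?P<separator> [\s.] )            # separator reused between words
        (?: [-\w']+ (?P=separator) )*?    # lazy, so the title stops early
    )
    (?:
        (?! \d+ (?P=separator) )          # not a bare number inside the title
        s? (?P<season> \d\d? )
        [ex] (?P<episode> \d\d? )
      |
        [(\[]? (?P<year> \d{4} ) [)\]]?   # otherwise treat it as a movie year
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def parse_release(name):
    """Return a dict describing an episode or a movie, or None if no match."""
    m = RELEASE_RE.match(name)
    if m is None:
        return None
    # Replace the captured separator to recover a readable title.
    title = m.group("title").replace(m.group("separator"), " ").strip()
    if m.group("season") is not None:
        return {
            "kind": "episode",
            "title": title,
            "season": int(m.group("season")),
            "episode": int(m.group("episode")),
        }
    return {"kind": "movie", "title": title, "year": int(m.group("year"))}


print(parse_release("The.Walking.Dead.S05E03.720p.HDTV.x264-ASAP.mkv"))
print(parse_release("Iron.Man.3.2013.1080p.BluRay.x264-TEAM.mkv"))
```

The first call is classified as an episode (season 5, episode 3) and the second as a movie from 2013 titled "Iron Man 3", because the negative lookahead rejects "3." as a season number.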

Versions:

  • v1 includes the TV Episode title and has all the groups in a single match, but it will misinterpret the "name" and "team" groups if the expected pattern is not respected (see the test strings/regex rules).
  • v2 does not include the "name" group but is more reliable and will ignore unexpected patterns.
Submitted by Hot Priest