Regular Expressions 101

Community Patterns

21

Get path from any text

Created·2023-01-31 14:38
Updated·2023-07-23 20:17
Flavor·PCRE2 (PHP)
Extracts Windows-style paths from any kind of text (error messages, e-mail bodies, ...), quoted or not.

THIS IS THE SINGLE-LINE VERSION! To understand how it works or to edit it, go to https://regex101.com/r/7o2fyy

- Relative paths are not supported.
- The goal is to match whatever "looks like" a path. See the limitations below.
- UNC paths and prefixed paths like [//./], [//?/] or [//./UNC/] are allowed.
- Some URL paths like [file:///C:/] or [file://] are allowed.
- Paths quoted with ["] or ['] are matched, but the quotes are included in the match.
- Quoted paths are not subject to the limitations.

Limitations (unquoted paths only):
- [dot] and [space] are allowed, but not in a row: [dot+space] or [space+dot] at the end of a file name is not matched.
- INSIDE A FILE NAME (or the last directory, if the path points to a directory):
  - [comma] is not supported (it stops the match).
  - After the first [dot], any [space] stops the match.
  - After a [space], the match stops if the next character is not a [letter], [digit] or [-], so a double [space] stops the match.

Compatibility:
- Compatible with PCRE and PCRE2.
- AutoHotkey: don't forget to escape "%" as "`%".
- /!\ PowerShell and .NET /!\ : this regex needs some modification to be interpreted by PowerShell. You have to replace each (?&CapturGroupName) with \k<CapturGroupName>. Use this PowerShell code to do the replacement:

    $powershellRegex = @'
    [Put here the regex to replace (?&CapturGroupName) with \k<CapturGroupName>]
    '@ -replace '\(\?&(\w+)\)', '\k<$1>'

  This example code must return: [Put here the regex to replace \k<CapturGroupName> with \k<CapturGroupName>]
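The PowerShell replacement and the AutoHotkey escaping described above can also be sketched in Python. This is a minimal illustration and not part of the library entry: the function names `to_dotnet_style` and `escape_for_autohotkey` and the group name `Drive` are hypothetical, and the sketch assumes the .NET target syntax is the named form `\k<Name>`.

```python
import re


def to_dotnet_style(pcre_pattern: str) -> str:
    # Rewrite every (?&Name) as \k<Name>, using the same search regex as
    # the PowerShell -replace call: \(\?&(\w+)\)
    return re.sub(
        r"\(\?&(\w+)\)",
        lambda m: "\\k<" + m.group(1) + ">",
        pcre_pattern,
    )


def escape_for_autohotkey(pattern: str) -> str:
    # AutoHotkey needs each percent sign written as backtick + percent.
    return pattern.replace("%", "`%")


# Hypothetical input: a named group followed by a (?&Drive) reference.
print(to_dotnet_style(r"(?<Drive>[A-Za-z]:)(?&Drive)"))
# (?<Drive>[A-Za-z]:)\k<Drive>
```

In .NET, `\k<Name>` is a backreference to the text the group already matched, whereas in PCRE `(?&Name)` runs the group's subpattern again, so it is worth testing the converted pattern against real samples.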
Submitted by nitrateag

Community Library Entry

1

Regular Expression
Created·2016-06-23 20:08
Flavor·Python

r"
#Modifiers << NOT GOING TO DO THESE HERE.
#∂ Before After Until in?Between
#∂ onwards
#Time-Time << Don't need as both times will match
#Lonely HOURS List: 10,time or 12 <not going to do this

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# - - - - - - - - - - - BIG TIME REGEX - - - - - - - - - - - -

#LONELY HOURS
#These guys are just \d\d. So we identify them through proximity to known times
(?:
    #Boundary Lookaround
    (?<=[ ]|^|[^\d\.:\r\n$£-])
    #hour - you always have hours
    (?:1[0-9]|2[0-4]|0?[0-9])
    (?:
        #Hour [or / to - , and]
        #People don't say "12pm or 3?" much. Build it if it comes up.
        (?:[ ]? (?:or|[,/-]|to|and) [ ]?)
    )
){0,3}

#TIME BASICS:
#Basic times that are definitely times. Have to come last so longer strings match first.
#Boundary Lookaround (BREAK THIS FOR THE LONELY NUMBERS)
# (?<=[ ]|^|[^\d\.:\r\n$£-])
#hour - you always have hours
(
    (?:1[0-9]|2[0-4]|0?[0-9])
)
#Clarifier. Minutes, AM|PM, a TimeZone or a combination.
(
    #Minutes
    (?:[ ]?[\:\. ][ ]?)
    ([0-5][0-9])
    #AM|PM
    | (?:[ ]?[ap]m|[ ]?o\'?[ ]?clock)
    #TimeZones
    | (?:[ ]?
        (?:PS?T|GMT|∆USA|∆US|ET|BST|∆UK
        |UK[ ]?[Tt]ime|[Ee]astern[ ]?[Tt]ime|[Pp]acific[ ]?[Tt]ime|[Cc]entral[ ]?[Tt]ime
        |∆UTC|ACDT|ACST|ACT|ACT|ADT|AEDT|AEST|AFT|AKDT|AKST|AMST|AMT|AMT|ART|AST|AST|AWDT
        |AWST|AZOST|AZT|BDT|BDT|BIOT|BRST|BRT|BST|BST|BST|BTT|CCT|CDT|CDT|CEDT|CEST|CET
        |CHADT|CHAST|CHOT|ChST|CHUT|CIST|CIT|CKT|CLST|CLT|COT|CST|CST|CST|CST|CST|CT|CVT
        |CWST|CXT|DAVT|DDUT|DFT|EASST|ECT|ECT|EDT|EEDT|EEST|EET|EGST|EGT|EIT|EST|EST|FET
        |FJT|FKST|FKST|FKT|FNT|GALT|GAMT|GFT|GILT|GIT|GMT|GST|GST|GYT|HADT|HAEC|HAST|HKT
        |HMT|HOVT|HST|IBST|ICT|IDT|IRDT|IRKT|IRST|IST|IST|IST|JST|KGT|KOST|KRAT|KST|LHST
        |LHST|LINT|MAGT|MART|MAWT|MDT|MET|MEST|MHT|MIST|MMT|MSK|MST|MST|MST|MUT|MVT|MYT|NCT
        |NDT|NFT|NPT|NST|NT|NUT|NZDT|NZST|OMST|ORAT|PDT|PETT|PGT|PHOT|PKT|PMDT|PMST|PONT
        |PST|PST|PYST|PYT|RET|ROTT|SAKT|SAMT|SAST|SBT|SCT|SGT|SLST|SRET|SRT|SST|SST|SYOT
        |TAHT|THA|TFT|TJT|TKT|TLT|TMT|TOT|TVT|ULAT|USZ1|UYST|UYT|UZT|VET|VLAT|VOLT|VOST
        |VUT|WAKT|WAST|WAT|WEDT|WEST|WET|WST|YAKT)\b
      )
    #Now enable the combo. Note ideally we would say "use one of each" e.g., =/10:234210
){1,3}

#Lonely HOURS: List of Military times <Here we have enough info to allow no ":".
|(
    (?:
        #Boundary Lookaround
        (?<=[ ]|^|[^\d\.:\r\n$£-])
        #Must open with a military time, no "445" must be "0445"
        (?:1[0-9]|2[0-4]|0[0-9])
        (?:[ ]?[0-5][0-9][ ]?)
        #or and
        (?:[ ]? (?:or|[,/-]|to|and) [ ]?)
    #2-4 military times, including the closing one
    ){1,3}
    #Must also end with a military time
    (?:1[0-9]|2[0-4]|0[0-9])
    (?:[ ]?[0-5][0-9][ ]?)
)
#Post rationalise the boundary criteria
(?:\b|$|,\b)
"
gmix
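As a rough guide to using these flags from Python code: `m`, `i` and `x` correspond to `re.MULTILINE`, `re.IGNORECASE` and `re.VERBOSE`, and `g` corresponds to iterating with `finditer` (or `findall`). The sketch below is a cut-down illustration of the "TIME BASICS" branch only, not the full entry: the name `find_times` and the shortened time-zone list are assumptions. Python's built-in `re` module only accepts fixed-width lookbehinds, so the sketch splits the boundary check into separate alternatives instead of one lookbehind with mixed-width branches.

```python
import re

# Sketch only: reduced "TIME BASICS" branch with a shortened zone list.
TIME_BASICS = re.compile(
    r"""
    # Boundary, split into fixed-width pieces for the re module
    (?: (?<=[ ]) | ^ | (?<=[^\d.:\r\n$\xa3-]) )   # \xa3 is the pound sign
    # Hour: 0-24, leading zero optional
    ( 1[0-9] | 2[0-4] | 0?[0-9] )
    # Clarifier: minutes, am/pm or a zone, one to three of them in a row
    (
        (?: [ ]? [:. ] [ ]? ) ( [0-5][0-9] )
      | (?: [ ]? [ap]m | [ ]? o'? [ ]? clock )
      | (?: [ ]? (?: PS?T | GMT | ET | BST | UTC ) \b )
    ){1,3}
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)


def find_times(text: str) -> list[str]:
    # finditer plays the role of the "g" flag: every non-overlapping match.
    return [m.group(0) for m in TIME_BASICS.finditer(text)]


print(find_times("Call at 10:30 am or 5pm GMT"))
# ['10:30 am', '5pm GMT']
```

In verbose mode a `#` outside a character class starts a comment that runs to the end of the line, so the pattern has to keep its line breaks to behave as written.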

Description

no description available

Submitted by anonymous