Regular Expressions 101

Community Patterns

22

Get path from any text

Created·2023-01-31 14:38
Updated·2023-07-23 20:17
Flavor·PCRE2 (PHP)
Recommended·
Get path (windows style) from any type of text (error message, e-mail corps ...), quoted or not. THIS IS THE SINGLE LINE VERSION ! If you want understand how it work or edit it, go https://regex101.com/r/7o2fyy Relative path are not supported The goal is to catch what "Look like" a path. See the limitations UNC path and prefix path like //./], [//?/] or [//./UNC/] are allowed some url path like [file:///C:/] or [file://] are allowed Catch path quoted with ["] and [']. But these quotes are include with the catch Quoted path is not concerned by limitations Limitations : (only unquoted path) [dot] and [space] is allowed, but not in a row [dot+space] or [space+dot at end of file name isn't catched INSIDE A NAME FILE (or last directory if it is a path to a directory) : [comma] is not supported (it stop the catch) after a first [dot], any [space] stop the catch after a [space], catch is stoped if next character is not a [letter], [digit] or [-] so, double [space] stop the catch Compatibility compatible PCRE, PCRE2 AutoHotkey : don't forget to escape "%" in "`%" /!\ Powershell and .Net /!\\ : this regex need some modification to be interpreted by powershell. You have to replace each (?&CapturGroupName) by \k. Use this powershell code to do this replacement : ` $powershellRegex = @' [Put here the regex to replace (?&CapturGroupName) with \k] '@ -replace '\(\?&(\w+)\)', '\k' ` This example code must return : [Put here the regex to replace \k with \k]
Submitted by nitrateag

Community Library Entry

0

Regular Expression
Created·2023-05-19 20:35
Flavor·PCRE2 (PHP)

/
(?<CountryCodes>^\b(AF|AX|AL|DZ|AS|AD|AO|AI|AQ|AG|AR|AM|AW|AU|AT|AZ|BS|BH|BD|BB|BY|BE|BZ|BJ|BM|BT|BO|BQ|BA|BW|BV|BR|IO|BN|BG|BF|BI|KH|CM|CA|CV|KY|CF|TD|CL|CN|CX|CC|CO|KM|CG|CD|CK|CR|CI|HR|CU|CW|CY|CZ|DK|DJ|DM|DO|EC|EG|SV|GQ|ER|EE|ET|EU|FK|FO|FJ|FI|FR|GF|PF|TF|GA|GM|GE|DE|GH|GI|GR|GL|GD|GP|GU|GT|GG|GN|GW|GY|HT|HM|VA|HN|HK|HU|IS|IN|ID|IR|IQ|IE|IM|IL|IT|JM|JP|JE|JO|KZ|KE|KI|KP|KR|KW|KG|LA|LV|LB|LS|LR|LY|LI|LT|LU|MO|MK|MG|MW|MY|MV|ML|MT|MH|MQ|MR|MU|YT|MX|FM|MD|MC|MN|ME|MS|MA|MZ|MM|NA|NR|NP|NL|NC|NZ|NI|NE|NG|NU|NF|MP|NO|OM|PK|PW|PS|PA|PG|PY|PE|PH|PN|PL|PT|PR|QA|RE|RO|RU|RW|BL|SH|KN|LC|MF|PM|VC|WS|SM|ST|SA|SN|RS|SC|SL|SG|SX|SK|SI|SB|SO|ZA|GS|SS|ES|LK|SD|SR|SJ|SZ|SE|CH|SY|TW|TJ|TZ|TH|TL|TG|TK|TO|TT|TN|TR|TM|TC|TV|UG|UA|AE|GB|US|UM|UY|UZ|VU|VE|VN|VG|VI|WF|EH|YE|ZM|ZW|AFG|ALB|DZA|ASM|AND|AGO|AIA|ATA|ATG|ARG|ARM|ABW|AUS|AUT|AZE|BHS|BHR|BGD|BRB|BLR|BEL|BLZ|BEN|BMU|BTN|BOL|BIH|BWA|BVT|BRA|IOT|VGB|BRN|BGR|BFA|BDI|KHM|CMR|CAN|CPV|CYM|CAF|TCD|CHL|CHN|CXR|CCK|COL|COM|COD|COG|COK|CRI|CIV|CUB|CYP|CZE|DNK|DJI|DMA|DOM|ECU|EGY|SLV|GNQ|ERI|EST|ETH|FRO|FLK|FJI|FIN|FRA|GUF|PYF|ATF|GAB|GMB|GEO|DEU|GHA|GIB|GRC|GRL|GRD|GLP|GUM|GTM|GIN|GNB|GUY|HTI|HMD|VAT|HND|HKG|HRV|HUN|ISL|IND|IDN|IRN|IRQ|IRL|ISR|ITA|JAM|JPN|JOR|KAZ|KEN|KIR|PRK|KOR|KWT|KGZ|LAO|LVA|LBN|LSO|LBR|LBY|LIE|LTU|LUX|MAC|MKD|MDG|MWI|MYS|MDV|MLI|MLT|MHL|MTQ|MRT|MUS|MYT|MEX|FSM|MDA|MCO|MNG|MSR|MAR|MOZ|MMR|NAM|NRU|NPL|ANT|NLD|NCL|NZL|NIC|NER|NGA|NIU|NFK|MNP|NOR|OMN|PAK|PLW|PSE|PAN|PNG|PRY|PER|PHL|PCN|POL|PRT|PRI|QAT|REU|ROU|RUS|RWA|SHN|KNA|LCA|SPM|VCT|WSM|SMR|STP|SAU|SEN|SCG|SYC|SLE|SGP|SVK|SVN|SLB|SOM|ZAF|SGS|ESP|LKA|SDN|SUR|SJM|SWZ|SWE|CHE|SYR|TWN|TJK|TZA|THA|TLS|TGO|TKL|TON|TTO|TUN|TUR|TKM|TCA|TUV|VIR|UGA|UKR|ARE|GBR|UMI|USA|URY|UZB|VUT|VEN|VNM|WLF|ESH|YEM|ZMB|ZWE)\b)
/
gm
Open regex in editor

Description

Country Code Classifier

ISO 3166-1 alpha-2 codes are two-letter country codes defined in ISO 3166-1, part of the ISO 3166 standard published by the International Organization for Standardization (ISO), to represent countries, dependent territories, and special areas of geographical interest. They are the most widely used of the country codes published by ISO (the others being alpha-3 and numeric), and are used most prominently for the Internet's country code top-level domains (with a few exceptions). They are also used as country identifiers extending the postal code when appropriate within the international postal system for paper mail, and have replaced the previous one consisting one-letter codes. They were first included as part of the ISO 3166 standard in its first edition in 1974.

side note: if applicable, limit the length of pattern matching to 2 or 3 to have better results of using for data classification.

feel free to reach out: loganmcampbell@hotmail.com

Submitted by loganmcampbell