Community Patterns

1

Strong Matcher for ISO 8601 / RFC 3339 Date Times; rejects bad TZ offsets, illegal times (Museum of Bad Data)

Created·2023-05-14 07:19
Flavor·ECMAScript (JavaScript)
Handles many nuanced cases around time zone offsets, leap seconds, and leap days.

References:
- List of UTC Offsets
- RFC 3339, the stricter rules that most systems use in practice
- ISO 8601, the widely known name for this format
- Leap Year
- Leap Second

Caveats: Rejects the -00:00 time zone offset, rejects all future leap seconds, and accepts leap seconds only with a 'Z' offset.

Pattern

    (?=
      (?:^
        (?:                                                          # All non-leap-second YYYY-MM-DD parts:
          \d\d\d\d-(?:0[1-9]|10|11|12)-(?:0[1-9]|1[0-9]|2[0-8])      # Days 01-28
        | \d\d\d\d-(?:0[13-9]|10|11|12)-(?:29|30)                    # Days 29+30
        | \d\d\d\d-(?:0[13578]|10|12)-31                             # Day 31
        | (?:\d\d[2468][048]|\d\d0[48]|\d\d[13579][26])-02-29        # leap years not divisible by 100
        | (?:[02468][048]00|[13579][26]00)-02-29                     # leap years divisible by 400
        )
        T
        (?:(?:0[0-9]|1[0-9]|2[0-3]):(?:[0-5][0-9]):(?:[0-5][0-9]))   # time part
        (?:\.\d\d\d)?                                                # ms part (optional)
        (?:Z|[\+\-](?:0\d|1[012]):00|\+0[34569]:30|\+10:30|-0[39]:30|\+1[34]:00|\+0[58]:45|\+12:45)
      $)
      |^(?:1972|198[1235]|199[2347]|2012|2015)-06-30T23:59:60Z$           # all june leapsecs
      |^(?:197[2-9]|1987|1989|199[058]|2005|2008|2016)-12-31T23:59:60Z$   # all december leapsecs
    )
    (?!.*-00:00$)   # if given a -00:00 offset reject unconditionally, psychhhh
    # OK! Since only valid times are now possible, we can use a loose pattern match to parse.
    ^(\d\d\d\d)-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(?:\.(\d\d\d))?((Z)|([\+\-])(\d\d):(\d\d))$

Test cases

// accept: Exemplars
2008-02-03T04:05:06.007Z
2008-02-03T04:05:06Z
0000-01-01T00:00:00.000Z
9999-12-31T23:59:59.999Z
9999-12-31T23:59:59Z
2008-02-03T04:05:06.007+12:45
2008-02-03T04:05:06+03:30
0000-01-01T00:00:00.007-09:30
0000-02-29T04:05:06.007+14:00
9999-12-31T23:59:59.999+12:00
9999-12-31T23:59:59-12:00

// accept: Leap day, year is multiple of four
0004-02-29T04:05:06Z
0096-02-29T04:05:06Z
1560-02-29T04:05:06Z
2004-02-29T04:05:06Z
2020-02-29T04:05:06Z
2032-02-29T04:05:06Z
9996-02-29T04:05:06Z

// accept: Leap day, year is multiple of 400
0000-02-29T04:05:06Z
1200-02-29T04:05:06Z
1600-02-29T04:05:06Z
2000-02-29T04:05:06Z
3600-02-29T04:05:06Z
8000-02-29T04:05:06Z
9600-02-29T04:05:06Z

// accept: Day in range for month
2008-01-30T04:05:06Z
2008-03-30T04:05:06Z
2008-04-30T04:05:06Z
2008-05-30T04:05:06Z
2008-06-30T04:05:06Z
2008-07-30T04:05:06Z
2008-08-30T04:05:06Z
2008-09-30T04:05:06Z
2008-10-30T04:05:06Z
2008-11-30T04:05:06Z
2008-12-30T04:05:06Z
2008-01-31T04:05:06Z
2008-03-31T04:05:06Z
2008-05-31T04:05:06Z
2008-07-31T04:05:06Z
2008-08-31T04:05:06Z
2008-10-31T04:05:06Z
2008-12-31T04:05:06Z

// accept: leap second
1972-06-30T23:59:60Z
1981-06-30T23:59:60Z
1982-06-30T23:59:60Z
1983-06-30T23:59:60Z
1985-06-30T23:59:60Z
1992-06-30T23:59:60Z
1993-06-30T23:59:60Z
1994-06-30T23:59:60Z
1997-06-30T23:59:60Z
2012-06-30T23:59:60Z
2015-06-30T23:59:60Z
1972-12-31T23:59:60Z
1973-12-31T23:59:60Z
1974-12-31T23:59:60Z
1975-12-31T23:59:60Z
1976-12-31T23:59:60Z
1977-12-31T23:59:60Z
1978-12-31T23:59:60Z
1979-12-31T23:59:60Z
1987-12-31T23:59:60Z
1989-12-31T23:59:60Z
1990-12-31T23:59:60Z
1995-12-31T23:59:60Z
1998-12-31T23:59:60Z
2005-12-31T23:59:60Z
2008-12-31T23:59:60Z
2016-12-31T23:59:60Z

// REJECT: Out of range
10000-02-29T04:05:06Z
2008-00-30T04:05:06Z
2008-13-30T04:05:06Z
2008-02-30T04:05:06Z
2008-02-31T04:05:06Z
2008-04-31T04:05:06Z
2008-06-31T04:05:06Z
2008-09-31T04:05:06Z
2008-11-31T04:05:06Z
2008-12-32T04:05:06Z
2008-12-99T04:05:06Z
2008-12-00T04:05:06Z

// REJECT: Hour/min/sec out of range
2008-12-08T60:05:06Z
2008-12-08T04:60:06Z
2008-12-08T04:99:06Z
9999-12-31T59:59:61.999Z
2008-02-03T04:05:61.999Z
2008-02-03T04:05:61Z

// REJECT: Negative dates not accepted
-2000-02-29T04:05:06Z

// REJECT: Malformed
2008-02-0304:05:06Z
20080203T040506Z
2008-02-03T04:05:06007Z
2008-02-03T04:05:06.7Z
2008-02-03T04:05:06.07Z
2008-02-03T04:05:06.0007Z
2008-02-03T04:05:06
2008-02-03T04:05:06.Z

// REJECT: no leap second
1978-06-30T23:59:60Z

// REJECT: assumes no future leap seconds
2024-12-31T23:59:60Z
2099-12-31T23:59:60Z
9999-12-31T23:59:60Z

// REJECT: leap seconds only in UTC format
2005-12-31T23:59:60+00:00
2008-12-31T23:59:60+00:00

// REJECT: Not a leap day, not a multiple of four
2003-02-29T04:05:06Z
2037-02-29T04:05:06Z
2038-02-29T04:05:06Z
2039-02-29T04:05:06Z
9997-02-29T04:05:06Z
9998-02-29T04:05:06Z
9999-02-29T04:05:06Z

// REJECT: Not a leap day, mult of 100
0100-02-29T04:05:06Z
1000-02-29T04:05:06Z
2100-02-29T04:05:06Z
2200-02-29T04:05:06Z
3500-02-29T04:05:06Z
3700-02-29T04:05:06Z
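The pattern above is written with free-spacing comments, which JavaScript regular expressions do not accept, so one way to try it is to assemble a compact equivalent from string fragments. The following is a minimal sketch under stated assumptions: the names `DATE_PART`, `TIME_PART`, `OFFSET_PART`, `STRICT_DATE_TIME`, and `parseDateTime` are illustrative and not part of the entry, and the character classes follow the reading of the pattern implied by its comments and test cases.

```javascript
// Validating lookahead pieces (names are hypothetical, chosen for this sketch).
const DATE_PART = [
  String.raw`\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|1\d|2[0-8])`,        // days 01-28, any month
  String.raw`\d{4}-(?:0[13-9]|1[0-2])-(?:29|30)`,                   // days 29-30, every month but February
  String.raw`\d{4}-(?:0[13578]|10|12)-31`,                          // day 31, long months only
  String.raw`(?:\d\d[2468][048]|\d\d0[48]|\d\d[13579][26])-02-29`,  // leap years not divisible by 100
  String.raw`(?:[02468][048]00|[13579][26]00)-02-29`,               // leap years divisible by 400
].join('|');
const TIME_PART = String.raw`(?:[01]\d|2[0-3]):[0-5]\d:[0-5]\d(?:\.\d{3})?`;
const OFFSET_PART = String.raw`(?:Z|[+-](?:0\d|1[0-2]):00|\+0[34569]:30|\+10:30|-0[39]:30|\+1[34]:00|\+0[58]:45|\+12:45)`;
const JUNE_LEAP = String.raw`(?:1972|198[1235]|199[2347]|2012|2015)-06-30T23:59:60Z`;
const DEC_LEAP = String.raw`(?:197[2-9]|1987|1989|199[058]|2005|2008|2016)-12-31T23:59:60Z`;

const STRICT_DATE_TIME = new RegExp(
  // 1. positive lookahead: the whole string must be a valid date-time or a known leap second
  `^(?=(?:(?:${DATE_PART})T${TIME_PART}${OFFSET_PART}|${JUNE_LEAP}|${DEC_LEAP})$)` +
  // 2. negative lookahead: never accept a -00:00 offset
  String.raw`(?!.*-00:00$)` +
  // 3. loose capture of the pieces, safe because steps 1 and 2 already validated them
  String.raw`(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)(?:\.(\d{3}))?((Z)|([+-])(\d\d):(\d\d))$`
);

// Returns the parsed fields, or null when the input is not a valid date-time.
function parseDateTime(text) {
  const m = STRICT_DATE_TIME.exec(text);
  if (m === null) return null;
  const [, year, month, day, hour, minute, second, ms, offset] = m;
  return {
    year: Number(year),
    month: Number(month),
    day: Number(day),
    hour: Number(hour),
    minute: Number(minute),
    second: Number(second),
    millisecond: ms === undefined ? 0 : Number(ms),
    offset, // 'Z' or a signed hh:mm string such as '+12:45'
  };
}

console.log(parseDateTime('2008-02-03T04:05:06.007+12:45'));
console.log(parseDateTime('2100-02-29T04:05:06Z')); // null: 2100 is not a leap year
```

Splitting validation (the lookaheads) from capture (the loose tail) keeps the capture groups simple and numbered the same way regardless of which validating branch matched.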
Submitted by Philip Flip Kromer (@mrflip)
1

Time Zone UTC Offsets in actual use for ISO 8601 / RFC 3339 Date Times (Museum of Bad Data)

Created·2023-05-14 05:03
Flavor·ECMAScript (JavaScript)
https://regex101.com/library/F21Glr

Matches only (and every) UTC offset that is in actual use and valid under both ISO 8601 and RFC 3339. (Y'know, the format that looks like '2008-08-08T08:08:08+05:00'; the offset is the plus/minus sign and what follows it, or the 'Z' for UTC.)

This regex works in all versions of JavaScript, past and present. The expanded version with comments does not.

Edge cases:
- Rejects -00:00, which is valid RFC 3339 but invalid ISO 8601.
- Accepts +13:00, +14:00, -03:30, and other yes-those-are-valid offsets.

References:
- List of UTC Offsets
- RFC 3339, the stricter rules that most systems use in practice
- ISO 8601, the widely known name for this format

Test cases:

// Time Zone UTC Offsets in actual use for ISO 8601 / RFC 3339 (Museum of Bad Data)

// Accept: UTC indicator
Z

// Accept: Valid +xx:00
+00:00
+01:00
+02:00
+03:00
+04:00
+05:00
+06:00
+07:00
+08:00
+09:00
+10:00
+11:00
+12:00
+13:00
+14:00

// Accept: Valid -xx:00
-01:00
-02:00
-03:00
-04:00
-05:00
-06:00
-07:00
-08:00
-09:00
-10:00
-11:00
-12:00

// Accept: Valid +xx:30
+03:30
+04:30
+05:30
+06:30
+09:30
+10:30

// Accept: Valid +xx:45
+05:45
+08:45
+12:45

// Accept: Valid -xx:30
-03:30
-09:30

// Accept: Valid :30 offsets
+03:30
+04:30
+05:30
+06:30
+09:30
+10:30
-03:30
-09:30

// Accept: Valid :45 offsets
+05:45
+08:45
+12:45

// Reject: valid RFC 3339, invalid ISO 8601
-00:00

// Reject: no such UTC offset in use
-13:00
-14:00
+00:01
+00:03
+00:99
+20:00
+0:00

// Reject: no such UTC offset in use
+01:30
+07:30
+08:30
+02:30
+11:30
+12:30
+13:30
+14:30
-01:30
-02:30
-04:30
-05:30
-06:30
-07:30
-08:30
-10:30
-11:30
-12:30
-13:30
-14:30

// Reject: Unused :45 offsets
+01:45
+02:45
+03:45
+04:45
+06:45
+07:45
+09:45
+10:45
+11:45
+13:45
+14:45
-01:45
-02:45
-03:45
-04:45
-05:45
-06:45
-07:45
-08:45
-09:45
-10:45
-11:45
-12:45
-13:45
-14:45
+01:15
+02:15
+03:15
+04:15
+05:15
+06:15
+07:15
+08:15
+09:15
+10:15
+11:15
+12:15
+13:15
+14:15
-01:15
-02:15
-03:15
-04:15
-05:15
-06:15
-07:15
-08:15
-09:15
-10:15
-11:15
-12:15
-13:15
-14:15

// Reject: Z stands alone
Z00:00
Z00
Z0

// Reject: hyphen required
0100
+0100
-0100

// Reject: colon required
+0100

// Reject: No extra characters
2001-02-03T04:05:06.007+0800
+08:00 +08:00

Expanded Pattern:

    ^(?!-00:00)(?=^(?:Z|[\+\-](?:0\d|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([\+\-])(\d\d):(\d\d))$

    ^                                              # use zero-width assertions to capture all the special cases:
    (?!-00:00)                                     # not -00:00
    (?=^(?:
      Z                                            # Z alone works,
      |[\+\-](?:0\d|1[012]):00                     # and all other +/- xx:00s,
      |\+0[34569]:30|\+10:30|-03:30|-09:30         # the +/- xx:30s.
      |\+13:00|\+14:00|\+05:45|\+08:45|\+12:45     # and these special cases
    ))
    # Now that we've forced only positive matches, let's capture the pieces:
    ^(                                             # G1: the whole offset
      (Z) |                                        # G2: UTC indicator or nil
      ([\+\-])                                     # G3: +/- direction
      (\d\d)                                       # G4: "hours" part of offset
      :
      (\d\d)                                       # G5: "minutes" part of offset
    )$
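As a usage illustration, the compact form of the pattern can be dropped into a JavaScript regex literal and its capture groups (G2 through G5 in the commented version) turned into a signed minute count. This is a small sketch, not the author's code: `UTC_OFFSET` and `offsetToMinutes` are hypothetical names, and the xx:00 branch is written as `[+-](?:0\d|1[012]):00`, matching the offsets the test cases accept.

```javascript
// Compact pattern: reject -00:00, require a known offset, then capture the pieces.
const UTC_OFFSET = /^(?!-00:00)(?=^(?:Z|[+-](?:0\d|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([+-])(\d\d):(\d\d))$/;

// Converts an offset string to minutes east of UTC, or null if it is not an offset in use.
function offsetToMinutes(text) {
  const m = UTC_OFFSET.exec(text);
  if (m === null) return null;
  if (m[2] === 'Z') return 0;                 // G2: UTC indicator
  const sign = m[3] === '-' ? -1 : 1;         // G3: direction
  return sign * (Number(m[4]) * 60 + Number(m[5])); // G4 hours, G5 minutes
}

console.log(offsetToMinutes('+05:45')); // 345
console.log(offsetToMinutes('-09:30')); // -570
console.log(offsetToMinutes('-00:00')); // null
```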
Submitted by Philip Flip Kromer
1

Date - Extract & Validate - Fully tested - Format YYYY-MM-DD (dynamic parts separator / can use a different separator)

Created·2020-11-20 19:31
Flavor·ECMAScript (JavaScript)
A fully tested regex that extracts and validates date parts using named capturing groups.

Validations:
- Year must be preceded by nothing or a non-digit character
- Year must have 4 digits
- Month must be between 01 and 12
- Month must have 2 digits
- Day must be between 01 and the maximum number of days for the month (e.g. February can't have more than 29 days)
- Day must have 2 digits
- Day must be followed by nothing or a non-digit character
- Separator must be any single character that is not a space or an alphanumeric character
- Separator must be the same between each date part

Capturing groups:

| # | Name  | Description                         |
|:-:|:-----:|-------------------------------------|
| 1 | year  | 4 digits of the year                |
| 2 | sep   | Date parts separator                |
| 3 | month | 2 digits of the month               |
| 4 | day   | 2 digits of the date (day of month) |

Example usage:

    // `regex` is this entry's pattern
    const match = regex.exec('2020-11-22')
    console.log('year: %s, month: %s, day: %s', match.groups.year, match.groups.month, match.groups.day)
    // year: 2020, month: 11, day: 22

Compatibility (updated 2020-11-20):
- Chrome >= 64
- Edge >= 79
- Firefox >= 78
- IE incompatible (lookbehind assertions & named capture groups not supported)
- Opera >= 51
- Safari incompatible (lookbehind assertions not supported)
- NodeJS >= 10.0.0

See regex compatibility table.

Note: does not validate leap years, so February 29 is accepted for every year. (Leap-year validation is possible in a regex, but only with a much longer pattern.)
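The entry's own pattern is not reproduced in this listing, so the sketch below uses a hypothetical pattern written from the validation list above, with the same four named groups in the same order. It also shows one way to cover the leap-year gap in ordinary code after the match. `DATE_RE`, `isLeapYear`, and `extractDate` are names invented for this example, and the separator class is assumed to be ASCII-only.

```javascript
// Hypothetical pattern reconstructed from the listed validations (not the entry's regex).
// The lookahead validates month/day combinations; the tail captures month and day.
const DATE_RE = new RegExp(
  String.raw`(?<!\d)(?<year>\d{4})(?<sep>[^\sA-Za-z0-9])` +
  String.raw`(?=(?:02\k<sep>(?:0[1-9]|1\d|2\d)` +                // February: 01-29
  String.raw`|(?:0[469]|11)\k<sep>(?:0[1-9]|[12]\d|30)` +        // 30-day months
  String.raw`|(?:0[13578]|1[02])\k<sep>(?:0[1-9]|[12]\d|3[01])` + // 31-day months
  String.raw`)(?!\d))` +                                          // day not followed by a digit
  String.raw`(?<month>\d\d)\k<sep>(?<day>\d\d)`
);

// Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
function isLeapYear(year) {
  return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
}

// Returns the named date parts of the first match, or null if there is no valid date.
function extractDate(text) {
  const match = DATE_RE.exec(text);
  if (match === null) return null;
  const { year, sep, month, day } = match.groups;
  // The regex allows 02-29 in any year, so check the year separately.
  if (month === '02' && day === '29' && !isLeapYear(Number(year))) return null;
  return { year, sep, month, day };
}

console.log(extractDate('released 2020/02/29.')); // year '2020', sep '/', month '02', day '29'
console.log(extractDate('2021-02-29'));           // null: 2021 is not a leap year
```

Keeping the leap-year rule in code rather than in the pattern leaves the regex short; the trade-off is that validation then lives in two places.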
Submitted by Elie Grenon (DrunkenPoney) <elie.grenon.1@gmail.com>