Regular Expressions 101

Community Patterns

Validate Thai Initial Consonants, Final Consonants, Vowels, and Tone Marks

Created·2026-01-22 01:36
Updated·2026-01-23 12:42
Flavor·JavaScript
Checks the initial consonant (required), checks the final consonant for vowels that require one, and checks the placement of Thai vowels and tone marks.

Note: final consonants are hard to validate in Thai because the language is written continuously, with no clear word boundaries, so the reader has to rely on meaning to decide where words break. For example, "ตากลม" can be read as "ตาก-ลม" (exposed to the wind) or as "ตา-กลม" (round eyes). A regex can therefore help only up to a point and will sometimes get it wrong, but it still works as a supplementary checking tool that covers roughly 80% of the possible cases. I hope this addition proves at least somewhat useful.
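The submitted pattern itself is not reproduced in this listing. To illustrate one of the checks described (tone-mark placement), here is a minimal sketch that is not the author's regex. It assumes a simplified rule: a tone mark (U+0E48 to U+0E4B) must directly follow a consonant (U+0E01 to U+0E2E), optionally with one above or below vowel sign (U+0E31, U+0E34 to U+0E39) in between. The names `MISPLACED_TONE_MARK` and `hasMisplacedToneMark` are hypothetical, and the lookbehind requires an ES2018 engine.

```javascript
// Assumption (simplified rule, not the submitted pattern): a tone mark is
// misplaced unless it is preceded by a consonant, optionally followed by
// one above/below vowel sign.
const MISPLACED_TONE_MARK =
  /(?<![\u0E01-\u0E2E][\u0E31\u0E34-\u0E39]?)[\u0E48-\u0E4B]/;

// Hypothetical helper: true when `text` contains a tone mark that breaks the rule.
function hasMisplacedToneMark(text) {
  return MISPLACED_TONE_MARK.test(text);
}

console.log(hasMisplacedToneMark("\u0E17\u0E35\u0E48")); // false: ที่ (consonant, vowel sign, tone mark)
console.log(hasMisplacedToneMark("\u0E19\u0E49\u0E33")); // false: น้ำ (consonant, tone mark, vowel)
console.log(hasMisplacedToneMark("\u0E40\u0E48\u0E01")); // true: tone mark right after the leading vowel เ
```

A check like this only looks at character order; it cannot resolve the word-boundary ambiguity described in the note.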
Submitted by อธิปัตย์ ล้อวงศ์งาม

Regex for Matching Documentation Websites

Created·2024-11-24 01:45
Flavor·JavaScript
This repository contains a regular expression designed to match URLs that commonly point to documentation-related websites. The regex favors flexibility, covering a range of keywords and URL patterns.

Regex Pattern

^.*(?:\.|\/)(docs|documentation|help|guide|manual|reference|api|kb|support|resources|wiki|developer|how-to|tutorials|examples|learn|instructions)(?:\.|\/)?.*$

Purpose

This regex is intended to identify URLs that contain keywords associated with documentation or support websites. It handles common patterns in subdomains, directories, and file paths.

Explanation

  • ^.*: Matches any characters at the beginning of the URL (any prefix).
  • (?:\.|\/): Matches either a period (.) or a forward slash (/) preceding the keyword.
  • (docs|documentation|help|guide|manual|...): Matches any of the keywords listed in the group.
  • (?:\.|\/)?: Allows an optional period (.) or forward slash (/) following the keyword.
  • .*$: Matches any characters following the keyword (any suffix).

Examples

Positive Examples

The following URLs should match the regex:

  • https://example.com/docs
  • http://docs.example.com
  • https://example.com/documentation
  • https://sub.domain.com/docs/index.html
  • https://example.com/help
  • https://api.example.com/docs
  • http://example.com/manual/index.html
  • https://wiki.example.com
  • http://developer.example.com/guide
  • https://example.com/tutorials/docs/page
  • https://kb.example.com/docs/tutorial.html
  • https://example.com/resources/documentation/tutorial.html
  • http://example.com/reference/help/documentation.html
  • https://developer.example.com/docs/tutorials/index.html
  • http://support.example.com/documentation/overview
  • https://resources.example.com/docs/v1/tutorial
  • https://example.com/how-to/documentation
  • http://example.com/api/reference/docs
  • https://example.com/reference/v2/index.html
  • http://example.com/docs/resources/api.html

Negative Examples

The following URLs are intended not to match the regex. Because the separator after the keyword is optional, the pattern as written still matches several of them (for example http://helpful.example.com, where "/help" is simply followed by "ful"), so the list shows the intended behavior rather than the actual result:

  • https://example.com/documentary
  • http://helpful.example.com
  • https://manuals.example.com
  • http://example.com/references
  • https://example.com/resourceful
  • http://example.com/wiki-books
  • https://apiary.example.com
  • http://example.com/documents
  • http://example.com/documentable
  • https://help-center.example.com
  • http://manual.example.com/docsystem
  • https://example.com/resourcesful
  • http://api.example.comary
  • https://example.net/instructions-v1
  • http://example.org/learned-tutorial
  • http://example.com/support-center

Author: Jeremy Georges-Filteau
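To check the example lists against the pattern rather than take them on trust, a small helper can split URLs into matches and non-matches. This is a minimal sketch: the pattern is written with the `.*` prefix and suffix that the Explanation describes, and the names `DOCS_URL` and `partitionUrls` are hypothetical.

```javascript
// Pattern as described in the Explanation (any prefix, separator, keyword,
// optional separator, any suffix). No g flag, so test() keeps no state.
const DOCS_URL =
  /^.*(?:\.|\/)(docs|documentation|help|guide|manual|reference|api|kb|support|resources|wiki|developer|how-to|tutorials|examples|learn|instructions)(?:\.|\/)?.*$/;

// Hypothetical helper: split URLs into those the pattern matches and those it does not.
function partitionUrls(urls) {
  const matched = [];
  const unmatched = [];
  for (const url of urls) {
    (DOCS_URL.test(url) ? matched : unmatched).push(url);
  }
  return { matched, unmatched };
}

const result = partitionUrls([
  "https://example.com/docs",
  "http://docs.example.com",
  "https://example.com/documentary",
  "http://helpful.example.com",
]);
console.log(result.matched);
// [ 'https://example.com/docs', 'http://docs.example.com', 'http://helpful.example.com' ]
console.log(result.unmatched);
// [ 'https://example.com/documentary' ]
```

Running the full negative list through `partitionUrls` is a quick way to see which entries the pattern accepts in practice.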
Submitted by jgeofil

Community Library Entry

Regular Expression
Created·2023-05-14 05:03
Flavor·JavaScript

/^(?!-00:00)(?=^(?:Z|[\+\-](?:0[0-9]|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([\+\-])(\d\d):(\d\d))$/gm

Description

Time Zone UTC Offsets in actual use for ISO 8601 / RFC 3339 Date Times (Museum of Bad Data)

https://regex101.com/library/F21Glr

Matches only (and every) UTC offset that is in actual use and valid under both ISO 8601 and RFC 3339. (Y'know, the '2008-08-08T08:08:08+05:00' looking format: this pattern covers the plus/minus sign and what follows it, or the 'Z' for UTC.)

This regex will work in all versions of JavaScript, past and present. The expanded version with the comments will not, because JavaScript regular expressions have no free-spacing (x) mode.

Edge cases:

  • Rejects -00:00, which is valid RFC 3339 but invalid ISO 8601.
  • Accepts +13:00, +14:00, -03:30, and other yes-those-are-valid offsets.
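The edge cases above can be checked directly. The following is a minimal sketch: the names `UTC_OFFSET_IN_USE` and `isOffsetInUse` are chosen here for illustration and are not part of the entry, and the g and m flags are dropped because only one string is tested at a time.

```javascript
// The pattern from this entry, without the g/m flags (one string at a time).
const UTC_OFFSET_IN_USE =
  /^(?!-00:00)(?=^(?:Z|[\+\-](?:0[0-9]|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([\+\-])(\d\d):(\d\d))$/;

// Hypothetical helper: true when `offset` is a UTC offset the pattern accepts.
function isOffsetInUse(offset) {
  return UTC_OFFSET_IN_USE.test(offset);
}

console.log(isOffsetInUse("-00:00")); // false (rejected by the negative lookahead)
console.log(isOffsetInUse("+13:00")); // true
console.log(isOffsetInUse("-03:30")); // true
console.log(isOffsetInUse("-13:00")); // false (only +13:00 is listed)
```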

Test cases:


// Time Zone UTC Offsets in actual use for ISO 8601 / RFC 3339 (Museum of Bad Data)
// Accept: UTC indicator
Z
// Accept: Valid +xx:00
+00:00
+01:00
+02:00
+03:00
+04:00
+05:00
+06:00
+07:00
+08:00
+09:00
+10:00
+11:00
+12:00
+13:00
+14:00
// Accept: Valid -xx:00
-01:00
-02:00
-03:00
-04:00
-05:00
-06:00
-07:00
-08:00
-09:00
-10:00
-11:00
-12:00

// Accept: Valid +xx:30
+03:30
+04:30
+05:30
+06:30
+09:30
+10:30

// Accept: Valid +xx:45
+05:45
+08:45
+12:45

// Accept: Valid -xx:30
-03:30
-09:30

// Reject: valid RFC 3339, invalid ISO 8601

-00:00

// Reject: no such UTC offset in use
-13:00
-14:00
+00:01
+00:03
+00:99
+20:00
+0:00
// Reject: Unused :30 offsets
+01:30
+07:30
+08:30
+02:30
+11:30
+12:30
+13:30
+14:30
-01:30
-02:30
-04:30
-05:30
-06:30
-07:30
-08:30
-10:30
-11:30
-12:30
-13:30
-14:30

// Reject: Unused :45 offsets
+01:45
+02:45
+03:45
+04:45
+06:45
+07:45
+09:45
+10:45
+11:45
+13:45
+14:45
-01:45
-02:45
-03:45
-04:45
-05:45
-06:45
-07:45
-08:45
-09:45
-10:45
-11:45
-12:45
-13:45
-14:45

// Reject: Unused :15 offsets
+01:15
+02:15
+03:15
+04:15
+05:15
+06:15
+07:15
+08:15
+09:15
+10:15
+11:15
+12:15
+13:15
+14:15
-01:15
-02:15
-03:15
-04:15
-05:15
-06:15
-07:15
-08:15
-09:15
-10:15
-11:15
-12:15
-13:15
-14:15

// Reject: Z stands alone
Z00:00
Z00
Z0
// Reject: sign required
0100
// Reject: colon required
+0100
-0100
// Reject: No extra characters
2001-02-03T04:05:06.007+0800
+08:00 
 +08:00
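With the g and m flags, the pattern can scan a multi-line block like the test cases above and keep only the accepted lines. The following is a minimal sketch: `UTC_OFFSET_LINES` and `acceptedOffsets` are hypothetical names, and the sample is a short excerpt rather than the full list.

```javascript
// Same pattern, with g (find all) and m (^ and $ match at line boundaries).
const UTC_OFFSET_LINES =
  /^(?!-00:00)(?=^(?:Z|[\+\-](?:0[0-9]|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([\+\-])(\d\d):(\d\d))$/gm;

// Hypothetical helper: every line of `text` that is an accepted offset.
function acceptedOffsets(text) {
  return text.match(UTC_OFFSET_LINES) ?? [];
}

const sample = [
  "// Accept",
  "Z",
  "+05:45",
  "// Reject",
  "-00:00",
  "+0100",
  "+08:00 ", // trailing space
  " +08:00", // leading space
  "-09:30",
].join("\n");

console.log(acceptedOffsets(sample)); // [ 'Z', '+05:45', '-09:30' ]
```

Comment lines and padded values drop out because each match must span a whole line.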

Expanded Pattern:

^(?!-00:00)(?=^(?:Z|[\+\-](?:0[0-9]|1[012]):00|\+0[34569]:30|\+10:30|-03:30|-09:30|\+13:00|\+14:00|\+05:45|\+08:45|\+12:45))^((Z)|([\+\-])(\d\d):(\d\d))$
^ # use zero-width assertions to capture all the special cases:
(?!-00:00)                                    # not -00:00
(?=^(?:
   Z                                          # Z alone works,
   |[\+\-](?:0[0-9]|1[012]):00                # and all other +/- xx:00s,
   |\+0[34569]:30|\+10:30|-03:30|-09:30       # the +/- xx:30s.
   |\+13:00|\+14:00|\+05:45|\+08:45|\+12:45   # and these special cases
)) # Now that we've forced only positive matches, let's capture the pieces:
^(               # G1: the whole offset
  (Z) |             # G2: UTC indicator or nil
  ([\+\-])          # G3: +/- direction
  (\d\d)            # G4: "hours" part of offset
  :
  (\d\d)            # G5: "minutes" part of offset
)$
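To use the commented pattern in JavaScript, the comments and whitespace have to be stripped before compiling. The following is a minimal sketch of that, plus a use of capture groups G2 to G5 to turn an offset into minutes. The names `compileExpanded`, `OFFSET_FROM_EXPANDED` and `offsetMinutes` are hypothetical, and the stripping assumes the pattern contains no literal `#` or whitespace, which holds for this one.

```javascript
// The commented pattern, kept verbatim as a raw string (backslashes preserved).
const EXPANDED = String.raw`
^ # use zero-width assertions to capture all the special cases:
(?!-00:00)                                    # not -00:00
(?=^(?:
   Z                                          # Z alone works,
   |[\+\-](?:0[0-9]|1[012]):00                # and all other +/- xx:00s,
   |\+0[34569]:30|\+10:30|-03:30|-09:30       # the +/- xx:30s.
   |\+13:00|\+14:00|\+05:45|\+08:45|\+12:45   # and these special cases
)) # Now that we've forced only positive matches, let's capture the pieces:
^(               # G1: the whole offset
  (Z) |             # G2: UTC indicator or nil
  ([\+\-])          # G3: +/- direction
  (\d\d)            # G4: "hours" part of offset
  :
  (\d\d)            # G5: "minutes" part of offset
)$
`;

// Hypothetical helper: drop "# ..." comments and all whitespace, then compile.
function compileExpanded(source, flags = "") {
  const compact = source
    .split("\n")
    .map((line) => line.replace(/#.*$/, ""))
    .join("")
    .replace(/\s+/g, "");
  return new RegExp(compact, flags);
}

const OFFSET_FROM_EXPANDED = compileExpanded(EXPANDED);

// Hypothetical helper: offset in minutes east of UTC, or null if not accepted.
function offsetMinutes(offset) {
  const m = OFFSET_FROM_EXPANDED.exec(offset);
  if (m === null) return null;
  if (m[2] === "Z") return 0; // G2: UTC indicator
  const sign = m[3] === "-" ? -1 : 1; // G3: direction
  return sign * (Number(m[4]) * 60 + Number(m[5])); // G4 hours, G5 minutes
}

console.log(offsetMinutes("+05:45")); // 345
console.log(offsetMinutes("-09:30")); // -570
console.log(offsetMinutes("Z")); // 0
console.log(offsetMinutes("+01:30")); // null
```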
Submitted by Philip Flip Kromer