Regular Expressions 101

Community Patterns

Validating Thai initial consonants, final consonants, vowels, and tone marks

Created·2026-01-22 01:36
Updated·2026-01-23 12:42
Flavor·JavaScript
Checks the initial consonant (required), checks the final consonant for vowels that require one, and checks the placement of Thai vowels and tone marks.
Note: final consonants are hard to validate in Thai because the language is written continuously, with no explicit word boundaries, so a reader has to rely on meaning to decide where one word ends and the next begins. For example, "ตากลม" can be read as "ตาก-ลม" or as "ตา-กลม". A regex can therefore only help up to a point: it will sometimes be right and sometimes wrong, but it is still useful as a supplementary check that covers, say, about 80% of the possible cases. I hope this addition is of some use.
Submitted by อธิปัตย์ ล้อวงศ์งาม
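
The submitter's pattern is not shown in this listing, so the following is only a minimal sketch of one of the checks the description mentions: tone-mark placement. It assumes the Unicode Thai block ranges noted in the comments and the simplified rule that a tone mark must directly follow a consonant or an above/below vowel. The names MISPLACED_TONE and hasMisplacedToneMark are hypothetical.

```javascript
// Unicode Thai block ranges used here:
//   U+0E01-U+0E2E  consonants
//   U+0E31, U+0E34-U+0E39  vowels written above or below a consonant
//   U+0E48-U+0E4B  tone marks
// Assumed rule (a simplification, not the submitter's regex): a tone mark is
// misplaced when it starts the text or follows anything other than a
// consonant or an above/below vowel.
const MISPLACED_TONE = /(?:^|[^\u0E01-\u0E2E\u0E31\u0E34-\u0E39])[\u0E48-\u0E4B]/;

function hasMisplacedToneMark(text) {
  return MISPLACED_TONE.test(text);
}

// Tone mark after a consonant: accepted.
console.log(hasMisplacedToneMark("\u0E19\u0E49\u0E33")); // false
// Tone mark after an above vowel: accepted.
console.log(hasMisplacedToneMark("\u0E01\u0E35\u0E48")); // false
// Tone mark at the start of the text: flagged.
console.log(hasMisplacedToneMark("\u0E48\u0E01")); // true
// Two tone marks in a row: flagged.
console.log(hasMisplacedToneMark("\u0E01\u0E49\u0E49")); // true
```

A check like this looks only at adjacent characters. It cannot resolve the word-boundary ambiguity described in the note, which depends on meaning rather than on character order.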

Regex for Matching Documentation Websites

Created·2024-11-24 01:45
Flavor·JavaScript
This repository contains a regular expression designed to match URLs that commonly point to documentation-related websites. The regex favors flexibility, covering a range of terms and URL patterns.

Regex Pattern

^.*(?:\.|\/)(docs|documentation|help|guide|manual|reference|api|kb|support|resources|wiki|developer|how-to|tutorials|examples|learn|instructions)(?:\.|\/)?.*$

Purpose

This regex is intended to identify URLs that contain keywords associated with documentation or support websites. It handles common patterns in subdomains, directories, and file paths.

Explanation

^.*: Matches any characters at the beginning of the URL (any prefix).
(?:\.|\/): Matches either a period (.) or a forward slash (/) preceding the keyword.
(docs|documentation|help|guide|manual|...): Matches any of the keywords listed in the group.
(?:\.|\/)?: Allows an optional period (.) or forward slash (/) following the keyword.
.*$: Matches any characters following the keyword (any suffix).

Examples

Positive Examples

The following URLs should match the regex:

https://example.com/docs
http://docs.example.com
https://example.com/documentation
https://sub.domain.com/docs/index.html
https://example.com/help
https://api.example.com/docs
http://example.com/manual/index.html
https://wiki.example.com
http://developer.example.com/guide
https://example.com/tutorials/docs/page
https://kb.example.com/docs/tutorial.html
https://example.com/resources/documentation/tutorial.html
http://example.com/reference/help/documentation.html
https://developer.example.com/docs/tutorials/index.html
http://support.example.com/documentation/overview
https://resources.example.com/docs/v1/tutorial
https://example.com/how-to/documentation
http://example.com/api/reference/docs
https://example.com/reference/v2/index.html
http://example.com/docs/resources/api.html

Negative Examples

The following URLs are intended not to match the regex:

https://example.com/documentary
http://helpful.example.com
https://manuals.example.com
http://example.com/references
https://example.com/resourceful
http://example.com/wiki-books
https://apiary.example.com
http://example.com/documents
http://example.com/documentable
https://help-center.example.com
http://manual.example.com/docsystem
https://example.com/resourcesful
http://api.example.comary
https://example.net/instructions-v1
http://example.org/learned-tutorial
http://example.com/support-center

Because the separator after the keyword is optional, the pattern as written still matches several of these. For example, http://helpful.example.com matches, since the "/" before "help" is enough.

Author: Jeremy Georges-Filteau
Submitted by jgeofil
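
To see how the pattern behaves on the listed examples, it can be run in JavaScript, the flavor this entry declares. The sketch below rebuilds the entry's pattern and compares it with a stricter variant that requires a separator or the end of the URL after the keyword. The stricter variant, the choice to also accept "?" and "#" as separators, and the names KEYWORDS, docsUrl and strictDocsUrl are assumptions made for illustration, not part of the original submission.

```javascript
// Keyword list copied from the entry's pattern.
const KEYWORDS =
  "docs|documentation|help|guide|manual|reference|api|kb|support|resources|" +
  "wiki|developer|how-to|tutorials|examples|learn|instructions";

// The entry's pattern: the separator after the keyword is optional.
const docsUrl = new RegExp(`^.*(?:\\.|\\/)(${KEYWORDS})(?:\\.|\\/)?.*$`);

// Hypothetical stricter variant: the keyword must be followed by a
// separator (., /, ?, #) or by the end of the URL.
const strictDocsUrl = new RegExp(`(?:\\.|\\/)(?:${KEYWORDS})(?:[.\\/?#]|$)`);

const samples = [
  "https://example.com/docs",        // listed as positive
  "http://docs.example.com",         // listed as positive
  "https://example.com/documentary", // listed as negative
  "http://helpful.example.com",      // listed as negative
];

for (const url of samples) {
  console.log(url, docsUrl.test(url), strictDocsUrl.test(url));
}
// https://example.com/docs        true  true
// http://docs.example.com         true  true
// https://example.com/documentary false false
// http://helpful.example.com      true  false
```

Even the stricter variant still matches a host such as manual.example.com from the negative list, because there the keyword is followed by a real separator. Rejecting that URL would need a rule about which part of the URL the keyword sits in.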

Community Library Entry

Regular Expression
Created·2022-04-24 13:21
Flavor·PCRE (Legacy)

/
(?P<Element><(?P<TagName>[:_A-z][-.0-9:_A-z\xB7]*)(?:[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]*=[\x09\x0A\x0D\x20]*(?:"(?:[^<&"]|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*"|'(?:[^<&']|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*'))*[\x09\x0A\x0D\x20]*(?:>(?:(?:[^<&\]]|](?!]>))*(?:(?:(?P>Element)|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));|<!\[CDATA\[(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x5D]|](?!]>))*]]>|<\?[:_A-z][-.0-9:_A-z\xB7]*(?<!(?i:\?xml))(?:[\x09\x0A\x0D\x20]+(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x3F]|\?(?!>))*)?\?>|<!--(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x2D]|-(?!-))*-->)(?:[^<&\]]|](?!]>))*)*)<\/(?P=TagName)[\x09\x0A\x0D\x20]*|\/)>)|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));|<!\[CDATA\[(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x5D]|](?!]>))*]]>|<\?[:_A-z][-.0-9:_A-z\xB7]*(?<!(?i:\?xml))(?:[\x09\x0A\x0D\x20]+(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x3F]|\?(?!>))*)?\?>|<!--(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x2D]|-(?!-))*-->|<!DOCTYPE[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*(?:[\x09\x0A\x0D\x20]+(?:SYSTEM[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')|PUBLIC[\x09\x0A\x0D\x20]+(?:"[\x0A\x0D\x20\x21\x23-\x25\x27-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*"|'[\x0A\x0D\x20\x21\x23-\x25\x28-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*')[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')))?[\x09\x0A\x0D\x20]*(?:\[(?:(?:<!ELEMENT[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]+(?:EMPTY|ANY|\([\x09\x0A\x0D\x20]*#PCDATA(?:(?:[\x09\x0A\x0D\x20]*\|[\x09\x0A\x0D\x20]*[:_A-z][-.0-9:_A-z\xB7]*)*[\x09\x0A\x0D\x20]*\)\*|[\x09\x0A\x0D\x20]*\))|(?:(?P<choice>\([\x09\x0A\x0D\x20]*(?:[:_A-z][-.0-9:_A-z\xB7]*|(?P>choice)|(?P>seq))[?*+]?(?:[\x09\x0A\x0D\x20]*\|[\x09\x0A\x0D\x20]*(?:[:_A-z][-.0-9:_A-z\xB7]*|(?P>choice)|(?P>seq))[?*+]?)+[\x09\x0A\x0D\x20]*\))|(?P<seq>\([\x09\x0A\x0D\x20]*(?:[:_A-z][-.0-9:_A-z\xB7]*|(?P>choice)|(?P>seq))[?*+]?(?:[\x09\x0A\x0D\x20]*,[\x09\x0A\x0D\x20]*(?:[:_A-z][-.0-9:_A-z\xB7]*|(?P>choice)|(?P>seq))[?*+]?)*[\x09\x0A\x0D\x20]*\)))[?*+]?)[\x09\x0A\x0D\x20]*>|<!ATTLIST[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*(?:[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]+(?:CDATA|(?:ID(?:REFS?)?|ENTIT(?:Y|IES)|NMTOKENS?)|(?:NOTATION[\x09\x0A\x0D\x20]+\([\x09\x0A\x0D\x20]*[:_A-z][-.0-9:_A-z\xB7]*(?:[\x09\x0A\x0D\x20]*\|[\x09\x0A\x0D\x20]*[:_A-z][-.0-9:_A-z\xB7]*)*[\x09\x0A\x0D\x20]*\)|\([\x09\x0A\x0D\x20]*(?:[-.0-9:_A-z\xB7])+(?:[\x09\x0A\x0D\x20]*\|[\x09\x0A\x0D\x20]*(?:[-.0-9:_A-z\xB7])+)*[\x09\x0A\x0D\x20]*\)))[\x09\x0A\x0D\x20]+(?:#(?:REQUIRED|IMPLIED)|(?:#FIXED[\x09\x0A\x0D\x20]+)?(?:"(?:[^<&"]|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*"|'(?:[^<&']|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*')))*[\x09\x0A\x0D\x20]*>|(?:<!ENTITY[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]+(?:(?:"(?:[^%&"]|%[:_A-z][-.0-9:_A-z\xB7]*;|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*"|'(?:[^%&']|%[:_A-z][-.0-9:_A-z\xB7]*;|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*')|(?:SYSTEM[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')|PUBLIC[\x09\x0A\x0D\x20]+(?:"[\x0A\x0D\x20\x21\x23-\x25\x27-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*"|'[\x0A\x0D\x20\x21\x23-\x25\x28-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*')[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*'))(?:[\x09\x0A\x0D\x20]+NDATA[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*)?)[\x09\x0A\x0D\x20]*>|<!ENTITY[\x09\x0A\x0D\x20]+%[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]+(?:(?:"(?:[^%&"]|%[:_A-z][-.0-9:_A-z\xB7]*;|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*"|'(?:[^%&']|%[:_A-z][-.0-9:_A-z\xB7]*;|&(?:[:_A-z][-.0-9:_A-z\xB7]*|#(?:[0-9]+|x[0-9a-fA-F]+));)*')|(?:SYSTEM[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')|PUBLIC[\x09\x0A\x0D\x20]+(?:"[\x0A\x0D\x20\x21\x23-\x25\x27-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*"|'[\x0A\x0D\x20\x21\x23-\x25\x28-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*')[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')))[\x09\x0A\x0D\x20]*>)|<!NOTATION[\x09\x0A\x0D\x20]+[:_A-z][-.0-9:_A-z\xB7]*[\x09\x0A\x0D\x20]+(?:(?:SYSTEM[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*')|PUBLIC[\x09\x0A\x0D\x20]+(?:"[\x0A\x0D\x20\x21\x23-\x25\x27-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*"|'[\x0A\x0D\x20\x21\x23-\x25\x28-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*')[\x09\x0A\x0D\x20]+(?:"[^"]*"|'[^']*'))|PUBLIC[\x09\x0A\x0D\x20]+(?:"[\x0A\x0D\x20\x21\x23-\x25\x27-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*"|'[\x0A\x0D\x20\x21\x23-\x25\x28-\x2F\x3A\x3B\x3D\x3F\x40_0-9A-z]*'))[\x09\x0A\x0D\x20]*>|<\?[:_A-z][-.0-9:_A-z\xB7]*(?<!(?i:\?xml))(?:[\x09\x0A\x0D\x20]+(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x3F]|\?(?!>))*)?\?>|<!--(?:[^\x01-\x08\x0B\x0C\x0E-\x1F\x2D]|-(?!-))*-->)|(?:%[:_A-z][-.0-9:_A-z\xB7]*;|[\x09\x0A\x0D\x20]+))*][\x09\x0A\x0D\x20]*)?>
/
gm
Description

I created this regexp from the XML specification (https://www.w3.org/TR/xml), but in a simplified form (for example, it allows only a narrowed set of tag-name characters). It is a recursive regexp with a backreference. I have not tested it extensively with a regex-directed engine, so it is not optimized for one. I used it to check a text-directed regular expression engine that I developed (Windows users can try this engine at https://www.regex.hu).

Submitted by GyRos
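
The pattern above combines a named backreference, (?P=TagName), with recursion, (?P>Element). JavaScript regular expressions support named backreferences but have no recursion, so the sketch below only illustrates the two ideas separately: a backreference that ties a closing tag to its opening tag, and an explicit stack standing in for the recursive call. It assumes a much smaller grammar than the entry's (no attributes, entities, comments, CDATA or DOCTYPE), and the names FLAT_ELEMENT, TAG and isBalanced are hypothetical.

```javascript
// Backreference: \k<TagName> must repeat whatever the opening tag captured.
// Only handles one non-nested element with text content.
const FLAT_ELEMENT =
  /<(?<TagName>[A-Za-z_:][-.0-9A-Za-z_:]*)>[^<]*<\/\k<TagName>>/;

console.log(FLAT_ELEMENT.test("<a>text</a>")); // true
console.log(FLAT_ELEMENT.test("<a>text</b>")); // false

// Nesting: a stack of open tag names replaces the recursive subpattern call.
const TAG = /<(\/?)([A-Za-z_:][-.0-9A-Za-z_:]*)\s*(\/?)>/g;

function isBalanced(xml) {
  const stack = [];
  for (const match of xml.matchAll(TAG)) {
    const [, closing, name, selfClosing] = match;
    if (selfClosing) {
      // "</x/>" is malformed; "<x/>" opens and closes in one step.
      if (closing) return false;
      continue;
    }
    if (!closing) {
      stack.push(name);
    } else if (stack.pop() !== name) {
      // Closing tag does not match the most recently opened tag.
      return false;
    }
  }
  return stack.length === 0;
}

console.log(isBalanced("<a><b/><c>t</c></a>")); // true
console.log(isBalanced("<a><b></a></b>"));      // false
console.log(isBalanced("<a>"));                 // false
```

In a flavor with recursion, such as the PCRE flavor this entry targets, the stack is implicit: each nested element is matched by re-entering the Element group.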