Regular Expressions 101

Community Patterns


Validate Thai Initial Consonants, Final Consonants, Vowels, and Tone Marks

Created·2026-01-22 01:36
Updated·2026-01-23 12:42
Flavor·ECMAScript (JavaScript)
Checks the initial consonant (required), checks the final consonant for vowels that require one, and checks the placement of Thai vowels and tone marks.

Note: Final consonants are hard to validate in Thai because the language is written continuously, with no clear word boundaries, so readers rely on the meaning of the words to decide where to split them. For example, "ตากลม" can be read as "ตาก-ลม" (exposed to the wind) or as "ตา-กลม" (round eyes). A regex can therefore help only up to a point: it will sometimes be wrong and sometimes right, but it still works as a supplementary checking tool that covers roughly 80% of the possible cases. I hope this addition proves at least somewhat useful.
Submitted by อธิปัตย์ ล้อวงศ์งาม

Regex for Matching Documentation Websites

Created·2024-11-24 01:45
Flavor·ECMAScript (JavaScript)
This repository contains a regular expression designed to match URLs that commonly point to documentation-related websites. It covers a range of keywords and URL patterns.

Regex Pattern

^.*(?:\.|\/)(docs|documentation|help|guide|manual|reference|api|kb|support|resources|wiki|developer|how-to|tutorials|examples|learn|instructions)(?:\.|\/)?.*$

Purpose

This regex is intended to identify URLs that contain keywords associated with documentation or support websites. It handles common patterns in subdomains, directories, and file paths.

Explanation

• ^.*: Matches any characters at the beginning of the URL (any prefix).
• (?:\.|\/): Matches either a period (.) or a forward slash (/) preceding the keyword.
• (docs|documentation|help|guide|manual|...): Matches any of the keywords listed in the group.
• (?:\.|\/)?: Allows an optional period (.) or forward slash (/) following the keyword.
• .*$: Matches any characters following the keyword (any suffix).

Examples

Positive Examples

The following URLs should match the regex:

https://example.com/docs
http://docs.example.com
https://example.com/documentation
https://sub.domain.com/docs/index.html
https://example.com/help
https://api.example.com/docs
http://example.com/manual/index.html
https://wiki.example.com
http://developer.example.com/guide
https://example.com/tutorials/docs/page
https://kb.example.com/docs/tutorial.html
https://example.com/resources/documentation/tutorial.html
http://example.com/reference/help/documentation.html
https://developer.example.com/docs/tutorials/index.html
http://support.example.com/documentation/overview
https://resources.example.com/docs/v1/tutorial
https://example.com/how-to/documentation
http://example.com/api/reference/docs
https://example.com/reference/v2/index.html
http://example.com/docs/resources/api.html

Negative Examples

The following URLs are intended not to match the regex:

https://example.com/documentary
http://helpful.example.com
https://manuals.example.com
http://example.com/references
https://example.com/resourceful
http://example.com/wiki-books
https://apiary.example.com
http://example.com/documents
http://example.com/documentable
https://help-center.example.com
http://manual.example.com/docsystem
https://example.com/resourcesful
http://api.example.comary
https://example.net/instructions-v1
http://example.org/learned-tutorial
http://example.com/support-center

Note: because the separator after the keyword is optional, the pattern as written still matches several of these, for example http://helpful.example.com and http://example.com/support-center.

Author: Jeremy Georges-Filteau
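To try the pattern against the example URLs, a minimal Python sketch might look like the following. The pattern is reconstructed from the entry's own Explanation section (leading `^.*`, trailing `.*$`). The function names and the `STRICT_DOC_URL` variant are assumptions added for illustration and are not part of the original entry.

```python
import re

KEYWORDS = (
    "docs|documentation|help|guide|manual|reference|api|kb|support|resources|"
    "wiki|developer|how-to|tutorials|examples|learn|instructions"
)

# Pattern as described in the entry: optional "." or "/" after the keyword.
DOC_URL = re.compile(r"^.*(?:\.|\/)(" + KEYWORDS + r")(?:\.|\/)?.*$")

# Hypothetical stricter variant (an assumption, not the author's pattern):
# the keyword must be followed by ".", "/", or the end of the string.
STRICT_DOC_URL = re.compile(r"^.*[./](?:" + KEYWORDS + r")(?=[./]|$).*$")


def is_doc_url(url: str) -> bool:
    """True if the URL matches the pattern as described in the entry."""
    return DOC_URL.match(url) is not None


def is_doc_url_strict(url: str) -> bool:
    """True if the URL matches the hypothetical stricter variant."""
    return STRICT_DOC_URL.match(url) is not None
```

With the original pattern, `is_doc_url("http://helpful.example.com")` is true because nothing forces a boundary after `help`. The stricter variant rejects that URL, though it would still accept `http://manual.example.com/docsystem`, where `manual` is a complete subdomain label.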
Submitted by jgeofil

Community Library Entry


Regular Expression
Created·2020-01-27 07:27
Flavor·PCRE (Legacy)

/^(?!settlement-id($|\t.*$)).*/gm

Description

This was necessary for cases such as the Amazon Settlements reports. To import these, we concatenate many files into one. The header in the concatenated file is repeated as many times as files were concatenated. We have seen that if there are no sales with a promotion in a given period, the file for that period does not contain the promotion-id column or header.

• with promotion-id
"settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased promotion-id"

• without promotion-id
"settlement-id settlement-start-date settlement-end-date deposit-date total-amount currency transaction-type order-id merchant-order-id adjustment-id shipment-id marketplace-name amount-type amount-description amount fulfillment-id posted-date posted-date-time order-item-code merchant-order-item-id merchant-adjustment-item-id sku quantity-purchased"

We still want to import all data from the file without promotion-id, since no header column name has changed and the order of the columns has not changed. This is just an omission, since there is no data for this column in that period.

As long as these 2 facts are true:
• No header name was changed
• No header name is found in a different position in the header
then the header is considered valid and the import should proceed without a header error. This is in spite of the header only partially matching the stored header.

This list is also used to delete the headers, and partially matching instances of the header, from the files to be imported.

METHODOLOGY

To search the file for these instances of the headers, we search for any line that starts with the name of the first column, "settlement-id" in this case.

This method assumes the following:
• No value in the "settlement-id" column contains the text "settlement-id"
If this is true, then any line that begins with "settlement-id" is a header.

To search for this we had 2 options:

  1. epRegExReplace( Expression; Replacement; Target {; "Options" } ): search with regular expressions. Match the lines that do NOT begin with "settlement-id" and replace them with the empty string, then remove the empty lines.
  2. FilterList ( ListA ; Attribute ; ListB ; CaseSensitive )

In our test, epRegExReplace was almost 500 times faster.
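The regex-based approach can be sketched outside the original tool as well. The Python below is only an illustration under stated assumptions: it uses the standard `re` module in place of epRegExReplace, assumes tab-separated columns (as the `\t` in the pattern suggests), and the function names are hypothetical. `header_is_compatible` is one possible reading of the two validity rules above, treating a header as valid when its columns equal the leading columns of the stored header.

```python
import re
from typing import List

# The entry's pattern: a line matches unless it is exactly "settlement-id"
# or starts with "settlement-id" followed by a tab.
NON_HEADER = re.compile(r"^(?!settlement-id($|\t.*$)).*", re.MULTILINE)


def header_lines(text: str) -> List[str]:
    """Blank out every non-header line, then drop the empty lines."""
    blanked = NON_HEADER.sub("", text)
    return [line for line in blanked.split("\n") if line]


def data_lines(text: str) -> List[str]:
    """Keep only the non-empty lines that the pattern matches (non-headers)."""
    return [
        line for line in text.split("\n")
        if line and NON_HEADER.fullmatch(line)
    ]


def header_is_compatible(header_line: str, stored_header: List[str]) -> bool:
    """Assumed interpretation: no column renamed and none in a new position,
    so the file's columns must equal the first columns of the stored header."""
    columns = header_line.split("\t")
    return columns == stored_header[:len(columns)]
```

Applied to a concatenated file, `header_lines` returns every repeated header (with or without promotion-id) and `data_lines` returns the rows to import.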
Submitted by anonymous