Regular Expressions 101

Community Patterns

Validate Thai Initial Consonants, Final Consonants, Vowels, and Tone Marks

Created·2026-01-22 01:36
Updated·2026-01-23 12:42
Flavor·JavaScript
Checks for an initial consonant (required), checks for a final consonant after vowels that require one, and checks the placement of Thai vowels and tone marks. Note: final consonants are hard to validate in Thai because the language is written continuously, with no explicit word boundaries, so readers have to rely on the meaning of the words to decide how to segment the text. For example, "ตากลม" can be read as "ตาก-ลม" or as "ตา-กลม". A regex can therefore help only up to a point: it will sometimes be wrong and sometimes right, but it can still serve as a supplementary checking tool that covers roughly 80% of the possible cases. I hope this addition is of at least some use.
Submitted by อธิปัตย์ ล้อวงศ์งาม
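
The entry's own pattern is not reproduced on this page, so the following is only a rough sketch of one of the checks the description mentions: tone-mark placement. It assumes, as a simplification that does not come from the entry, that a tone mark is valid only directly after a consonant or after an above/below vowel sign. The names `MISPLACED_TONE` and `misplaced_tone_marks` are hypothetical.

```python
import re

# Assumed character ranges from the Unicode Thai block (a simplification).
CONSONANTS = "\u0e01-\u0e2e"          # ก .. ฮ
ABOVE_BELOW_VOWELS = "\u0e31\u0e34-\u0e39"  # mai han-akat, sara i .. sara uu
TONE_MARKS = "\u0e48-\u0e4b"          # mai ek .. mai chattawa

# A tone mark that is NOT directly preceded by a consonant or an
# above/below vowel sign is reported as misplaced.
MISPLACED_TONE = re.compile(
    f"(?<![{CONSONANTS}{ABOVE_BELOW_VOWELS}])[{TONE_MARKS}]"
)


def misplaced_tone_marks(text: str) -> list[int]:
    """Return the string indexes of tone marks that look misplaced."""
    return [m.start() for m in MISPLACED_TONE.finditer(text)]


if __name__ == "__main__":
    print(misplaced_tone_marks("\u0e19\u0e49\u0e33"))  # น้ำ: tone mark after a consonant
    print(misplaced_tone_marks("\u0e17\u0e35\u0e48"))  # ที่: tone mark after an above vowel
    print(misplaced_tone_marks("\u0e32\u0e48"))        # sara aa followed by a tone mark
```

A check like this only looks at adjacent characters. It cannot resolve segmentation ambiguities such as the "ตากลม" case described in the note, which is why the author frames the regex as a partial aid.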

Regex for Matching Documentation Websites

Created·2024-11-24 01:45
Flavor·JavaScript
This repository contains a regular expression designed to match URLs that commonly point to documentation-related websites. It covers a range of keywords and URL patterns.

Regex Pattern

^.*(?:\.|\/)(docs|documentation|help|guide|manual|reference|api|kb|support|resources|wiki|developer|how-to|tutorials|examples|learn|instructions)(?:\.|\/)?.*$

Purpose

This regex is intended to identify URLs that contain keywords associated with documentation or support websites. It handles common patterns in subdomains, directories, and file paths.

Explanation

  • ^.*: Matches any characters at the beginning of the URL (any prefix).
  • (?:\.|\/): Matches either a period (.) or a forward slash (/) preceding the keyword.
  • (docs|documentation|help|guide|manual|...): Matches any of the keywords listed in the group.
  • (?:\.|\/)?: Allows an optional period (.) or forward slash (/) following the keyword.
  • .*$: Matches any characters following the keyword (any suffix).

Examples

Positive Examples

The following URLs should match the regex:

  • https://example.com/docs
  • http://docs.example.com
  • https://example.com/documentation
  • https://sub.domain.com/docs/index.html
  • https://example.com/help
  • https://api.example.com/docs
  • http://example.com/manual/index.html
  • https://wiki.example.com
  • http://developer.example.com/guide
  • https://example.com/tutorials/docs/page
  • https://kb.example.com/docs/tutorial.html
  • https://example.com/resources/documentation/tutorial.html
  • http://example.com/reference/help/documentation.html
  • https://developer.example.com/docs/tutorials/index.html
  • http://support.example.com/documentation/overview
  • https://resources.example.com/docs/v1/tutorial
  • https://example.com/how-to/documentation
  • http://example.com/api/reference/docs
  • https://example.com/reference/v2/index.html
  • http://example.com/docs/resources/api.html

Negative Examples

The following URLs should not match the regex:

  • https://example.com/documentary
  • http://helpful.example.com
  • https://manuals.example.com
  • http://example.com/references
  • https://example.com/resourceful
  • http://example.com/wiki-books
  • https://apiary.example.com
  • http://example.com/documents
  • http://example.com/documentable
  • https://help-center.example.com
  • http://manual.example.com/docsystem
  • https://example.com/resourcesful
  • http://api.example.comary
  • https://example.net/instructions-v1
  • http://example.org/learned-tutorial
  • http://example.com/support-center

Author

Jeremy Georges-Filteau
Submitted by jgeofil
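
The behaviour described in the Explanation can be checked with a short script. The sketch below uses Python's `re` module rather than the entry's JavaScript flavor; the pattern uses no flavor-specific syntax, so the results should carry over. Because the delimiter after the keyword is optional, the pattern effectively accepts any URL that contains a period or slash followed by a keyword, so some listed negative examples (for instance `http://helpful.example.com`) do match when tested. `STRICT_DOCS_URL` is a hypothetical variant, not part of the entry, that requires a delimiter or the end of the string after the keyword.

```python
import re

KEYWORDS = (
    "docs|documentation|help|guide|manual|reference|api|kb|support|"
    "resources|wiki|developer|how-to|tutorials|examples|learn|instructions"
)

# The entry's pattern, with the ".*" wildcards its Explanation describes.
DOCS_URL = re.compile(rf"^.*(?:\.|\/)({KEYWORDS})(?:\.|\/)?.*$")

# Hypothetical stricter variant (an assumption, not from the entry):
# the keyword must be followed by ".", "/" or the end of the string.
STRICT_DOCS_URL = re.compile(rf"^.*(?:\.|\/)({KEYWORDS})(?=[./]|$).*$")


def matches(pattern: re.Pattern, url: str) -> bool:
    """Return True when the whole URL is accepted by the pattern."""
    return pattern.match(url) is not None


if __name__ == "__main__":
    for url in (
        "https://example.com/docs",
        "http://docs.example.com",
        "https://example.com/documentary",
        "http://helpful.example.com",
        "http://manual.example.com/docsystem",
    ):
        print(url, matches(DOCS_URL, url), matches(STRICT_DOCS_URL, url))
```

Even the stricter variant accepts `http://manual.example.com/docsystem`, since `manual` is a complete subdomain label there, so treating that URL as a negative example would need a rule beyond keyword boundaries.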

Community Library Entry

Regular Expression
Created·2022-10-29 15:30
Flavor·Python

r"
(?P<field_with_params>[^{ }]+(?P<parenth_open>\()(?P<keyword>[^\(\s]\w+)\s?(?P<colon>\:)\s?(?P<value>(?P<list_open>\[?).+?(?P<list_close>\]?))\s?(?P<parenth_close>\)))|(?P<BEGIN>^\{)|(?P<open>\{)|(?P<END>\}$)|(?P<close>\})|(?P<fragment>(?P<dots>\.{3})\s*?(?P<on>on)\s+?(?P<frag_field>\w+))|(?P<parent_field>(?!(?P=field_with_params)|(?P=fragment))\w+)(?=\s*?\{)|(?P<child_field>(?!(?P=field_with_params)|(?P=fragment))\w+)
"
gm

Description

Parse GraphQL queries.

Changelog (versions):

  1. Initial
  2. Version 2 is much improved and intended to integrate well with customized highlighting with the help of the awesome Rich library.
  3. Bug fixes and an added Rich example.
  4. Bug fixes; the regex now actually recognizes the different parts of field_with_params, with an improved example to go with that.

Todo (maybe):

  • Improve the beginning, opening, closing and ending curly bracket recognition with better support for malformed queries and make sure there's only one BEGIN and one END in situations where newlines are present.
  • Also differentiate between arbitrary text and actual GraphQL query language content. For example, the string "asdf" is currently recognized and categorized as a child_field when it shouldn't be recognized at all.
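
As a minimal sketch of how the pattern behaves, the snippet below runs it over a few inputs with Python's `re` module and labels each match with its outermost named group. `tokenize` and `GQL_PATTERN` are hypothetical names, and `re.MULTILINE` is assumed in order to mirror the `gm` flags listed with the entry. The last two inputs illustrate the two Todo items: plain text is reported as a child_field, and a closing brace that ends a line is reported as END when newlines are present.

```python
import re

# The named-group alternatives from the entry, joined with "|".
GQL_PATTERN = re.compile(
    "|".join([
        r'(?P<field_with_params>[^{ }]+'
        r'(?P<parenth_open>\()'
        r'(?P<keyword>[^\(\s]\w+)\s?'
        r'(?P<colon>\:)\s?'
        r'(?P<value>(?P<list_open>\[?).+?(?P<list_close>\]?))\s?'
        r'(?P<parenth_close>\)))',
        r'(?P<BEGIN>^\{)|(?P<open>\{)|(?P<END>\}$)|(?P<close>\})',
        r'(?P<fragment>(?P<dots>\.{3})\s*?(?P<on>on)\s+?(?P<frag_field>\w+))',
        r'(?P<parent_field>(?!(?P=field_with_params)|(?P=fragment))\w+)(?=\s*?\{)',
        r'(?P<child_field>(?!(?P=field_with_params)|(?P=fragment))\w+)',
    ]),
    re.MULTILINE,  # assumption: mirrors the "m" in the entry's "gm" flags
)


def tokenize(query: str) -> list[tuple[str, str]]:
    """Return (group name, matched text) for every match in the query."""
    return [(m.lastgroup, m.group()) for m in GQL_PATTERN.finditer(query)]


if __name__ == "__main__":
    for kind, text in tokenize("{ user(id: 4) { name friends { name } } }"):
        print(f"{kind:18} {text}")

    # Todo item 2: arbitrary text is still categorized as a child_field.
    print(tokenize("asdf"))

    # Todo item 1: with newlines present, more than one END can be reported.
    print(tokenize("{\nuser {\nname\n}\n}"))
```

Using `lastgroup` keeps the sketch short, since the outermost named group of each alternative is the last one to close. The Version 4 example instead reads every group from `groupdict()`, which also exposes the nested parts such as `keyword` and `value`.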

Version 4

def pretty_query(query_string: str) -> str:
    """Indent a GraphQL query, print it with Rich highlighting, and return the indented text."""
    import re

    from rich.console import Console
    from rich.highlighter import JSONHighlighter
    from rich.theme import Theme, DEFAULT_STYLES

    # adjacent raw-string literals are concatenated, so this is one string, not a tuple
    field_with_params_rgx = (
        r'(?P<field_with_params>[^{ }]+' # fields with parameters
        r'(?P<parenth_open>\()'          # opening parenthesis
        r'(?P<keyword>[^\(\s]\w+)\s?'    # parameter keyword
        r'(?P<colon>\:)\s?'              # colon after the keyword
        r'(?P<value>'                    # parameter values
        r'(?P<list_open>\[?).+?'         # optional opening square bracket, then the value
        r'(?P<list_close>\]?))\s?'       # optional closing square bracket
        r'(?P<parenth_close>\)))'        # closing parenthesis
    )

    regexes = [
        field_with_params_rgx,
        r'(?P<BEGIN>^\{)|(?P<open>\{)|(?P<END>\}$)|(?P<close>\})',                  # curly brackets
        r'(?P<fragment>(?P<dots>\.{3})\s*?(?P<on>on)\s+?(?P<frag_field>\w+))',      # fragments
        r'(?P<parent_field>(?!(?P=field_with_params)|(?P=fragment))\w+)(?=\s*?\{)', # parent fields
        r'(?P<child_field>(?!(?P=field_with_params)|(?P=fragment))\w+)',            # child fields
    ]

    indent = 2
    count = 0
    parts = []

    for x in re.finditer('|'.join(regexes), query_string):
        data = x.groupdict()
        string = ''

        BEGIN = data['BEGIN']
        open_brace = data['open']  # renamed to avoid shadowing the built-in open()
        parent_field = data['field_with_params'] or data['parent_field']
        child_field = data['child_field']
        fragment = data['fragment']
        close = data['close']
        END = data['END']

        if BEGIN:
            string += BEGIN
            count += indent

        if open_brace:
            count += indent
            continue

        if parent_field:
            string += '{0: >{fill}}{value} {{'.format(' ', value=parent_field, fill=count)

        if child_field:
            string += '{0: >{fill}}{value}'.format(' ', value=child_field, fill=count)

        if fragment:
            string += '{0: >{fill}}{value} {{'.format(' ', value=fragment, fill=count)

        if close:
            count -= indent
            string += '{0: >{fill}}{value}'.format(' ', value=close, fill=count)

        if END:
            string += END
            count -= indent
        parts.append(string)

    class GQLHighlighter(JSONHighlighter):
        base_style = "gql."
        highlights = ['|'.join(regexes)] + JSONHighlighter.highlights

    theme = Theme({
        **{
            f'gql.{k}': DEFAULT_STYLES[f'json.{k}']
            for k in ['brace', 'bool_true', 'bool_false', 'null', 'number', 'str', 'key']
        },
        'gql.BEGIN': 'bold green',
        'gql.END': 'bold green',
        'gql.open': 'bold yellow',
        'gql.close': 'bold blue',
        'gql.field_with_params': 'bold blue',
        'gql.parent_field': 'bold white',
        'gql.child_field': 'italic green',
        'gql.fragment': 'bold yellow',
        'gql.dots': 'bold white',
        'gql.on': 'green',
        'gql.frag_field': 'bold magenta italic',
        'gql.parenth_open': 'blue',
        'gql.parenth_close': 'green',
        'gql.keyword': 'yellow',
        'gql.colon': 'blue',
        'gql.list_open': 'bold yellow',
        'gql.list_close': 'bold yellow',
        'gql.value': 'cyan',
    })

    console = Console(highlighter=GQLHighlighter(), theme=theme)

    console.print('\n'.join(parts))

    return '\n'.join(parts)
Submitted by iwconfig