Regular Expressions 101


Chinese Digits


Regular Expression
PCRE (PHP <7.3)

/
^ (?<integer> (?<n1> (?<n1nz>[一二三四五六七八九壹贰叁肆伍陆柒捌玖]) |(?<n1wz>[〇零]) ) |(?<n2> (?<h2nz> (?<n2nz>(?<n2z1>(?&n1nz)[十拾])(?&n1nz)) |[十拾](?&n1nz) ) |(?<h2wz> (?<n2wz>(?&n2z1)) |[十拾] ) ) |(?<n3> (?<n3nz>(?<n3z2>(?<h1nz>(?&n1nz)|)[百佰])((?&n2nz)|(?&n1wz)(?&n1nz))) |(?<n3wz>(?&n3z2)(?&n2wz)?) ) |(?<n4> (?<n4nz>(?<n4z3>(?&h1nz)[千仟])((?&n3nz)|(?&n1wz)((?&n2nz)|(?&n1nz)))) |(?<n4wz> (?&n4z3)(?&n3wz)? |(?&n4z3)(?&n1wz)(?&n2wz) ) ) |(?<n5_7> (?<n5_7nz> (?<n5_7z4>(?<h1_3nz>(?&h1nz)|(?&h2nz)|(?&n3nz))[万萬])((?&n4nz)|(?&n1wz)((?&n3nz)|(?&n2nz)|(?&n1nz))) |(?<n6_7z5_7>(?<h2_3wz>(?&h2wz)|(?&n3wz))[万萬])((?&n1wz)((?&n4nz)|(?&n3nz)|(?&n2nz)|(?&n1nz))) ) |(?<n5_7wz> (?&n5_7z4)((?&n4wz)|(?&n1wz)((?&n3wz)|(?&n2wz)))? |(?&n6_7z5_7)((?&n1wz)((?&n4wz)|(?&n3wz)|(?&n2wz)))? ) ) |(?<n8> (?<n8nz> (?<n8z4>(?&n4nz)[万萬])((?&n4nz)|(?&n1wz)((?&n3nz)|(?&n2nz)|(?&n1nz))) |(?<n8z5_7>(?&n4wz)[万萬])(?&n1wz)((?&n4nz)|(?&n3nz)|(?&n2nz)|(?&n1nz)) ) |(?<n8wz> (?&n8z4)((?&n4wz)|(?&n1wz)((?&n3wz)|(?&n2wz)))? |(?&n8z5_7)((?&n1wz)((?&n4wz)|(?&n3wz)|(?&n2wz)))? ) ) |(?<n9_16> ((?&h1_3nz)(?&n4nz)|(?&n5_7nz)(?&n8nz))[亿億]((?&n8)|(?&n1wz)((?&n5_7)|(?&n4)|(?&n3)|(?&n2)|(?&n1)))? |((?&h2_3wz)(?&n4wz)|(?&n5_7wz)(?&n8wz))[亿億]((?&n1wz)((?&n8)|(?&n5_7)|(?&n4)|(?&n3)|(?&n2)|(?&n1)))? ) ) (?<decimal> [〇一二三四五六七八九零壹贰叁肆伍陆柒捌玖]+ )? $
/
gmx
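The pattern is assembled from named groups that are reused through (?&name) calls. As a rough illustration of how the fragments for numbers below 10,000 (groups n1 to n4) fit together, here is a Python sketch. Python's built-in re module has no (?&name) calls, so the fragments are composed as strings instead. The variable names follow the group names in the pattern, but the function name is_small_chinese_number and the reading of "nz" as "ends in a non-zero digit" and "wz" as "ends in zero" are assumptions, and the sketch leaves out everything from 万 upward as well as the decimal part.

```python
import re

# Character classes copied from the pattern above.
NZ = "[一二三四五六七八九壹贰叁肆伍陆柒捌玖]"  # n1nz: a non-zero digit
ZERO = "[〇零]"                                # n1wz: a zero digit
TEN, HUNDRED, THOUSAND = "[十拾]", "[百佰]", "[千仟]"

n1 = f"(?:{NZ}|{ZERO})"
n2z1 = f"{NZ}{TEN}"                          # e.g. 二十
n2nz = f"{n2z1}{NZ}"                         # e.g. 二十六
n2wz = n2z1
n2 = f"(?:{n2nz}|{TEN}{NZ}|{n2wz}|{TEN})"    # also covers 十六 and 十
h1nz = f"(?:{NZ})?"                          # optional leading digit
n3z2 = f"{h1nz}{HUNDRED}"                    # e.g. 一百
n3nz = f"{n3z2}(?:{n2nz}|{ZERO}{NZ})"        # e.g. 一百二十三, 一百零三
n3wz = f"{n3z2}(?:{n2wz})?"                  # e.g. 一百, 一百二十
n4z3 = f"{h1nz}{THOUSAND}"                   # e.g. 一千
n4nz = f"{n4z3}(?:{n3nz}|{ZERO}(?:{n2nz}|{NZ}))"
n4wz = f"(?:{n4z3}(?:{n3wz})?|{n4z3}{ZERO}{n2wz})"

SMALL_NUMBER = re.compile(f"(?:{n1}|{n2}|{n3nz}|{n3wz}|{n4nz}|{n4wz})")


def is_small_chinese_number(text: str) -> bool:
    """Return True if text is a well-formed numeral below 10,000 (sketch only)."""
    return SMALL_NUMBER.fullmatch(text) is not None
```

With this sketch, is_small_chinese_number("一千零六") is true, while "一千零零六" is rejected because no fragment allows two zero characters in a row.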

Description

Matches Chinese numerals below 1×10^16, such as “一千两百三十四万”, “八萬点七六五”, and “玖仟玖佰玖拾玖万玖仟玖佰玖拾玖亿玖仟玖佰玖拾玖万玖仟玖佰玖拾玖点玖玖玖玖玖玖玖玖玖玖玖玖玖玖玖玖”. Everyday (小写) and formal financial (大写) Chinese numerals can be mixed, but Chinese numerals and Arabic digits cannot.

Malformed numerals are not matched. For example, “两十六” and “两十六万” are not matched, because “两” is not normally used together with “十” in Chinese; the correct forms are “二十六” and “二十六万”. “两千零零六” is not matched either, because consecutive “零” characters are not allowed in the integer part of a Chinese numeral; the correct form is “两千零六”.

The pattern requires a regex engine that can call the expression defined in a named capture group as a subroutine, such as "(?<letter>[a-z]+)\d+(?&letter)".
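In PCRE, (?&name) applies a named group's expression again at the current position, which is different from a backreference that must repeat the exact text captured earlier. The short Python sketch below illustrates the difference. Because Python's built-in re module does not support (?&name), the subroutine call is emulated by interpolating the same fragment twice; the names LETTER, subroutine_like, backreference, and matches are hypothetical.

```python
import re

LETTER = r"[a-z]+"  # the fragment that the named group defines

# Emulates (?<letter>[a-z]+)\d+(?&letter): the expression is applied a second time.
subroutine_like = re.compile(rf"(?P<letter>{LETTER})\d+(?:{LETTER})")

# A backreference instead requires the same text that the group captured.
backreference = re.compile(r"(?P<letter>[a-z]+)\d+(?P=letter)")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None
```

Here "abc123xyz" matches subroutine_like but not backreference, while "abc123abc" matches both.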

Submitted by anonymous - 4 months ago (Last modified 3 months ago)