Regular Expressions 101

Match Chinese numerals, for inverse text normalization

Regular Expression
Python

r"
([a-z]\s*)?
(
  (
    [零幺一二两三四五六七八九十百千万点比]
    |[零一二三四五六七八九十][ ]
    |(?<=[一二两三四五六七八九十])[年月日号]
    |(分之)
  )+
  (
    (?<=[一二两三四五六七八九十])[a-zA-Z年月日号个只分万亿秒]
    |(?<=[一二两三四五六七八九十]\s)[a-zA-Z]
  )?
  (?(1)
    |(?(5)
      |(
        [零幺一二两三四五六七八九十百千万亿点比]
        |(分之)
      )
    )+
  )
)
"
xig

Description

Most of the numbers that should be matched are matched.

Capture group 2 holds the matched content (it may include a trailing unit, which has to be removed manually).
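As a rough illustration of how capture group 2 might be used from Python, the sketch below compiles the pattern with `re.X | re.I` (standing in for the `x` and `i` flags) and uses `finditer` in place of the `g` flag. The names `CHINESE_NUMBER`, `TRAILING_UNIT`, `find_numbers`, and `strip_unit` are hypothetical and not part of the submission. The unit-stripping rule is only an assumption about what "removing the unit manually" could look like: it leaves 万 and 亿 alone because they are magnitudes rather than units.

```python
import re

# The submitted pattern, one alternative per line.
# re.X plays the role of the "x" flag and re.I of the "i" flag.
CHINESE_NUMBER = re.compile(
    r"""
    ([a-z]\s*)?
    (
      (
        [零幺一二两三四五六七八九十百千万点比]
        |[零一二三四五六七八九十][ ]
        |(?<=[一二两三四五六七八九十])[年月日号]
        |(分之)
      )+
      (
        (?<=[一二两三四五六七八九十])[a-zA-Z年月日号个只分万亿秒]
        |(?<=[一二两三四五六七八九十]\s)[a-zA-Z]
      )?
      (?(1)
        |(?(5)
          |(
            [零幺一二两三四五六七八九十百千万亿点比]
            |(分之)
          )
        )+
      )
    )
    """,
    re.X | re.I,
)

# Assumption: these trailing characters count as units to strip.
# 万 and 亿 are deliberately excluded because they change the value.
TRAILING_UNIT = re.compile(r"[a-zA-Z年月日号个只分秒]$")


def find_numbers(text):
    """Return capture group 2 of every match (number plus any trailing unit)."""
    return [m.group(2) for m in CHINESE_NUMBER.finditer(text)]


def strip_unit(number):
    """Drop a single trailing unit character, if there is one."""
    return TRAILING_UNIT.sub("", number)


print(find_numbers("我有三十五个苹果"))  # ['三十五个']
print(strip_unit("三十五个"))            # 三十五
```

In this sketch the measure word 个 comes back as part of group 2, which is the case the description warns about. A production normalizer would likely need a more careful rule for units than this one-character strip.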

Submitted by HaujetZhao - 10 months ago