<p>You could combine some regular expression logic with a function that converts the abbreviated numbers. Here is some example Python code:</p>
<pre><code># -*- coding: utf-8 -*-
import re
import locale
from locale import atof

# atof() needs an English locale so that "," is read as a thousands separator
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

string = """"$305,000 - $349,950"
"Mid $2M's Buyers"
"... Buyers Guide $1.29M+"
"...$485,000 and $510,000"
"""

def convert_number(number, unit):
    # scale the parsed number by its suffix: K = thousand, M = million
    if unit == "K":
        exp = 10**3
    elif unit == "M":
        exp = 10**6
    return atof(number) * exp

matches = []
rx = r"""
    \$(?P<value>\d+[\d,.]*)        # match a dollar sign
                                   # followed by numbers, dots and commas
                                   # make the first digit necessary (+)
    (?P<unit>M|K)?                 # match M or K and save it to a group
    (                              # opening parenthesis
        \s(?:-|and)\s              # match a whitespace, dash or "and"
        \$(?P<value1>\d+[\d,.]*)   # the same pattern as above
        (?P<unit1>M|K)?
    )?                             # closing parenthesis,
                                   # make the whole subpattern optional (?)
"""

for match in re.finditer(rx, string, re.VERBOSE):
    # first (or only) amount
    if match.group('unit') is not None:
        value1 = convert_number(match.group('value'), match.group('unit'))
    else:
        value1 = atof(match.group('value'))
    m = value1
    # optional second amount, present when the text is a range
    if match.group('value1') is not None:
        if match.group('unit1') is not None:
            value2 = convert_number(match.group('value1'), match.group('unit1'))
        else:
            value2 = atof(match.group('value1'))
        m = (value1, value2)
    matches.append(m)

print(matches)
# [(305000.0, 349950.0), 2000000.0, 1290000.0, (485000.0, 510000.0)]
</code></pre>
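<p>The code uses quite a bit of logic: it first imports the <code>locale</code> module for the <code>atof()</code> function, then defines a function <code>convert_number()</code>, and finally searches for single amounts and ranges with the regular expression explained in the code comments. Obviously, you could add other currency symbols such as <code>€$£</code>, but they were not in your original examples.</p>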
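<p>As a rough sketch of how other currency symbols might be supported, the dollar sign could be replaced by a character class. The names <code>CURRENCY_RX</code>, <code>UNIT_FACTORS</code> and <code>parse_amounts()</code> are made up for this illustration, and it assumes commas are always thousands separators, so it strips them by hand instead of relying on <code>locale.atof()</code>. It also only handles single amounts, not ranges:</p>

```python
import re

# Hypothetical pattern: accepts several currency symbols instead of only "$".
CURRENCY_RX = re.compile(r"""
    (?P<symbol>[€$£])               # one of the supported currency symbols
    (?P<value>\d[\d,]*(?:\.\d+)?)   # digits, optional thousands commas, optional decimals
    (?P<unit>[MK])?                 # optional K (thousand) or M (million) suffix
""", re.VERBOSE)

# Multiplier for each unit suffix; None means no suffix was present.
UNIT_FACTORS = {None: 1, "K": 10**3, "M": 10**6}

def parse_amounts(text):
    """Return a list of (symbol, amount) pairs found in text."""
    results = []
    for match in CURRENCY_RX.finditer(text):
        # Assumption: commas are thousands separators, so they can be dropped.
        number = float(match.group("value").replace(",", ""))
        factor = UNIT_FACTORS[match.group("unit")]
        results.append((match.group("symbol"), number * factor))
    return results

print(parse_amounts("€1.5M, £250K and $305,000"))
```

<p>Keeping the symbol in its own named group lets the caller decide later whether to convert between currencies or just group the amounts by symbol.</p>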