python排序负数和/或十进制字母数字字符串

2024-10-06 12:30:00 发布

您现在位置:Python中文网/ 问答频道 /正文

我在排序包含负数和/或十进制字母数字字符串的字符串列表时遇到问题。到目前为止,我得到的是:

import re

format_ids = ["synopsys_SS_2v_-40c_SS.lib",
              "synopsys_SS_1v_-40c_SS.lib",
              "synopsys_SS_1.2v_-40c_SS.lib", 
              "synopsys_SS_1.4v_-40c_SS.lib",
              "synopsys_SS_2v_-40c_TT.lib",
              "synopsys_FF_3v_25c_FF.lib",
              "synopsys_TT_4v_125c_TT.lib",
              "synopsys_TT_1v_85c_TT.lib",
              "synopsys_TT_10v_85c_TT.lib",
              "synopsys_FF_3v_-40c_SS.lib",
              "synopsys_FF_3v_-40c_TT.lib"]

selector = r'.*(FF|TT|SS)_([-\.\d]+v)_([-\.\d]+c)_(FF|TT|SS).*'
#key = [2,1,3]
key = 2
produce_groups = False

if isinstance(key, int):
    key = [key]

convert = lambda text: float(text) if text.isdigit() else text
alphanum_key = lambda k: [convert(c) for c in re.split('([-.\d]+)', k)]
split_list = lambda name: tuple(alphanum_key(re.findall(selector,name)[0][i]) for i in key)
format_ids.sort(key=split_list)

print "\n".join(format_ids)

我期待以下输出(按第三个键排序):

^{pr2}$

但我得到以下结果(所有负数都列在最后):

synopsys_FF_3v_25c_FF.lib
synopsys_TT_1v_85c_TT.lib
synopsys_TT_10v_85c_TT.lib
synopsys_TT_4v_125c_TT.lib
synopsys_SS_2v_-40c_SS.lib
synopsys_SS_1v_-40c_SS.lib
synopsys_SS_1.2v_-40c_SS.lib
synopsys_SS_1.4v_-40c_SS.lib
synopsys_SS_2v_-40c_TT.lib
synopsys_FF_3v_-40c_SS.lib
synopsys_FF_3v_-40c_TT.lib

现在,对于第二个键的小数(将键变量更改为1(key=1)),我得到:

synopsys_SS_1v_-40c_SS.lib
synopsys_TT_1v_85c_TT.lib
synopsys_SS_2v_-40c_SS.lib
synopsys_SS_2v_-40c_TT.lib
synopsys_FF_3v_25c_FF.lib
synopsys_FF_3v_-40c_SS.lib
synopsys_FF_3v_-40c_TT.lib
synopsys_TT_4v_125c_TT.lib
synopsys_TT_10v_85c_TT.lib
synopsys_SS_1.2v_-40c_SS.lib
synopsys_SS_1.4v_-40c_SS.lib

期望:

synopsys_SS_1v_-40c_SS.lib
synopsys_TT_1v_85c_TT.lib
synopsys_SS_1.2v_-40c_SS.lib
synopsys_SS_1.4v_-40c_SS.lib
synopsys_SS_2v_-40c_SS.lib
synopsys_SS_2v_-40c_TT.lib
synopsys_FF_3v_25c_FF.lib
synopsys_FF_3v_-40c_SS.lib
synopsys_FF_3v_-40c_TT.lib
synopsys_TT_4v_125c_TT.lib
synopsys_TT_10v_85c_TT.lib

如有任何建议,我们将不胜感激。在

编辑:我最后使用了@stephernauch描述的simpler method

import re
def sort_names(format_ids, selector, key=1):

    if isinstance(key, int):
        key = [key]

    SELECTOR_RE = re.compile(selector)

    def convert(x):
        try:
            return float(x[:-1])
        except ValueError:
            return x

    def sort_keys(key):
        def split_fid(x):
            x = SELECTOR_RE.split(x)
            return tuple([convert(x[i]) for i in key])
        return split_fid

    format_ids.sort(key=sort_keys(key))

format_ids = ["synopsys_SS_2v_-40c_SS.lib",
              "synopsys_SS_1v_-40c_SS.lib",
              "synopsys_SS_1.2v_-40c_SS.lib",
              "synopsys_SS_1.4v_-40c_SS.lib",
              "synopsys_SS_2v_-40c_TT.lib",
              "synopsys_FF_3v_25c_FF.lib",
              "synopsys_TT_4v_125c_TT.lib",
              "synopsys_TT_1v_85c_TT.lib",
              "synopsys_TT_10v_85c_TT.lib",
              "synopsys_FF_3v_-40c_SS.lib",
              "synopsys_FF_3v_-40c_TT.lib"]

selector = r'.*(FF|TT|SS)_([-\.\d]+v)_([-\.\d]+c)_(FF|TT|SS).*'
key = [2,1,3]

sort_names(format_ids,selector,key)

Tags: keytextreidsformatconvertlibdef
2条回答

需要对数字进行稍微不同的测试重新分割()被赋予了一个前导'',它正在抛出转换例程。在

固定代码:

key = [2,1,3]

def convert(x):
    try:
        return float(x)
    except ValueError:
        return x

alphanum_keys = lambda k: (convert(c) for c in re.split('([-.\d]+)', k))
alphanum_key = lambda k: [i for i in alphanum_keys(k) if i != ''][0]
split_list = lambda name: [
    alphanum_key(re.findall(selector, name)[0][i]) for i in key]
format_ids.sort(key=split_list)

替代(更简单)解决方案:

但是。。。所有这些lambdas和regex,比这个问题需要的复杂得多。不如就:

^{pr2}$

怎么做的?

sort_keys()返回函数f()。这是传递给sort()以计算排序顺序的一个参数的函数。函数f()将使用传递给sort_keys()的键的值,因为这些值是available at the time f() is defined。这称为closure。在

结果:

synopsys_SS_1v_-40c_SS.lib
synopsys_SS_1.2v_-40c_SS.lib
synopsys_SS_1.4v_-40c_SS.lib
synopsys_SS_2v_-40c_SS.lib
synopsys_SS_2v_-40c_TT.lib
synopsys_FF_3v_-40c_SS.lib
synopsys_FF_3v_-40c_TT.lib
synopsys_FF_3v_25c_FF.lib
synopsys_TT_1v_85c_TT.lib
synopsys_TT_10v_85c_TT.lib
synopsys_TT_4v_125c_TT.lib

问题的一个很大一部分是,只有实际的数字才被视为数字,而不是破折号和句点,因此在代码中诸如“-40”.isdigit()或“1.4”.isdigit()将为False,并保留为文本而不是转换为浮点。在

相关问题 更多 >