Snakemake: How can a shell command in a rule be run with different (integer) parameters?

user = '/home/.../BDT/'

nestimators = [1, 2]

rule all:
        input: user + 'AUC_score.pdf'

rule testing:
        output: user + 'AUC_score.csv'
        shell: 'python bdt.py --nestimators {}'.format(nestimators[i] for i in range(2))

rule plotting:
        input: user + 'AUC_score.csv'
        output: user + 'AUC_score.pdf'
        shell: 'python opti.py'

2 Answers

User

#1 · Edited 2024-05-17 14:54:05

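The error occurs because {} is replaced by a generator object. It is not replaced first by 1 and then by 2; instead it is replaced by the string form of a generator that iterates over nestimators.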
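The effect can be reproduced in plain Python, outside Snakemake. A minimal sketch that reuses the nestimators list from the question; the names broken and per_value are introduced here for illustration only:

```python
nestimators = [1, 2]

# A bare generator expression passed to format() is a single argument:
# the generator object itself, not the values it would yield.
broken = 'python bdt.py --nestimators {}'.format(nestimators[i] for i in range(2))
print(broken)  # the placeholder shows the generator's repr, not 1 or 2

# Getting one command per value needs an explicit loop over the values.
per_value = ['python bdt.py --nestimators {}'.format(n) for n in nestimators]
print(per_value)
```

The first string contains something like "<generator object ...>" where the number was expected, which is why the shell command fails.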

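Even if you correct the Python expression in rule testing, there may be a more fundamental problem, if I understand your goal correctly. Snakemake workflows are defined in terms of rules that describe how to create output files from input files. Rule testing has a single output file, so it will be executed only once, whereas you probably want it to be executed separately for each hyperparameter.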

The solution is to include the hyperparameter in the output file name. Something like this:

user = '/home/.../BDT/'

nestimators = [1, 2]

rule all:
        input: user + 'AUC_score.pdf'

rule testing:
        # {hyper} is a wildcard: the rule runs once for each requested value
        output: user + 'AUC_score_{hyper}.csv'
        shell: 'python bdt.py --nestimators {wildcards.hyper}'

rule plotting:
        input: expand(user + 'AUC_score_{hyper}.csv', hyper=nestimators)
        output: user + 'AUC_score.pdf'
        shell: 'python opti.py'
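To see why this yields one testing job per hyperparameter: expand() fills the pattern once per value, and each resulting file name is then matched against the output pattern of rule testing, which binds {hyper}. The sketch below imitates both steps in plain Python. expand_sketch and match_hyper are hypothetical helpers written for this illustration; they are not Snakemake's implementation.

```python
import re

user = '/home/.../BDT/'   # path elided exactly as in the question
nestimators = [1, 2]
pattern = user + 'AUC_score_{hyper}.csv'

def expand_sketch(pattern, hyper):
    """Fill the {hyper} placeholder once per value, as expand() does for one wildcard."""
    return [pattern.format(hyper=value) for value in hyper]

def match_hyper(pattern, filename):
    """Recover the wildcard value from a concrete file name, or None if it does not match."""
    prefix, suffix = pattern.split('{hyper}')
    found = re.fullmatch(re.escape(prefix) + '(.+)' + re.escape(suffix), filename)
    return found.group(1) if found else None

inputs = expand_sketch(pattern, nestimators)
print(inputs)
for name in inputs:
    # Wildcard values are strings; each one ends up in its own command.
    print('python bdt.py --nestimators {}'.format(match_hyper(pattern, name)))
```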

Finally, you don't need shell: to call a Python script. You can use script: directly, as described in the documentation: https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#external-scripts
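A rough sketch of what bdt.py could look like when the rule uses script: 'bdt.py' instead of shell:. Snakemake then injects a global object named snakemake into the script, exposing the job's wildcards and output paths. The function write_score and the CSV columns are assumptions made for this illustration, since the real contents of bdt.py are not shown in the question.

```python
import csv

def write_score(n_estimators, out_csv):
    """Hypothetical stand-in for the real training code: writes a header and one row."""
    with open(out_csv, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(['nestimators', 'auc'])
        writer.writerow([n_estimators, ''])  # AUC left blank: the real model is not shown

# The `snakemake` global exists only when Snakemake runs this file via `script:`.
if 'snakemake' in globals():
    # Wildcard values arrive as strings, so convert before use.
    write_score(int(snakemake.wildcards.hyper), snakemake.output[0])
```

With this approach the script writes to the exact path the rule declares as output, so no command-line parsing is needed.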

User

#2 · Edited 2024-05-17 14:54:05

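The problem in your code is that the expression nestimators[i] for i in range(2) is not a list (as you might think). It is a generator, and it does not produce any values until you explicitly consume it. For example, this code:

'python bdt.py --nestimators {}'.format(list(nestimators[i] for i in range(2)))

produces the result 'python bdt.py --nestimators [1, 2]'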

In fact, you don't need the generator at all, because this code produces exactly the same output:

'python bdt.py --nestimators {}'.format(nestimators)

This format may not be what your script expects. For example, if you want a command line like python bdt.py --nestimators 1,2, you can use this expression:

'python bdt.py --nestimators {}'.format(",".join(map(str, nestimators)))

If you can use f-strings, the last expression can be shortened:

f'python bdt.py --nestimators {",".join(map(str, nestimators))}'
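Whether 1,2 is the right format depends on how bdt.py reads its arguments, which the question does not show. As a sketch under the assumption that the script uses argparse, a comma-separated value could be parsed back into integers as follows; int_list is a hypothetical helper, not part of argparse:

```python
import argparse

def int_list(text):
    """Turn a string such as '1,2' into [1, 2]."""
    return [int(part) for part in text.split(',')]

parser = argparse.ArgumentParser()
parser.add_argument('--nestimators', type=int_list)

nestimators = [1, 2]
command = f'python bdt.py --nestimators {",".join(map(str, nestimators))}'

# Drop 'python' and 'bdt.py' to get what the script would see as its arguments.
argv = command.split()[2:]
args = parser.parse_args(argv)
print(args.nestimators)
```

Passing the list form '[1, 2]' instead would be split by the shell into two separate arguments, which is one reason the comma-joined form is easier to handle.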
