InputFunction error in Snakemake when using a lambda function to return a list of files

Posted 2024-09-29 02:22:51


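I am writing a Snakemake rule that takes its input values from a parsed YAML file and returns the files associated with that group label as a list, but I am getting a strange error.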

I made the function print its return value just before returning, and it does appear to return a list:

['/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl1_featureCounts_results.txt', '/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl4_featureCounts_results.txt', '/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl2_featureCounts_results.txt', '/SAN/vyplab/alb_projects/data/muscle/analysis/feature_counts/Ctrl3_featureCounts_results.txt']

However, I get an AttributeError, which is unexpected, especially because I adapted this directly from some earlier pipelines where the same function works fine:

InputFunctionException in line 26 of /SAN/vyplab/alb_projects/pipelines/rna_seq_snakemake/rules/deseq2_featureCounts.smk:
AttributeError: 'str' object has no attribute 'list'
Wildcards:
bse=control
contrast=ContrastvControl

The rule looks like this. I have left out the shell and params directives, since I don't think they are needed for debugging:

rule run_standard_deseq:
    input:
        base_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.bse),
        contrast_group = lambda wildcards: featurecounts_files_from_contrast(wildcards.contrast)
    output:
        os.path.join(DESEQ2_DIR,"{bse}_{contrast}" + "normed_counts.csv.gz")

The implementation of the helper function:

def featurecounts_files_from_contrast(grp):
    """
    given a contrast name or list of groups return a list of the files in that group
    """
    #reading in the samples
    samples = pd.read_csv(config['sampleCSVpath'])
    #there should be a column which allows you to exclude samples
    samples2 = samples.loc[samples.exclude_sample_downstream_analysis != 1]
    #read in the comparisons and make a dictionary of comparisons, comparisons needs to be in the config file
    compare_dict = load_comparisons()
    #go through the values of the dictionary and break when we find the right groups in that contrast
    grps, comparison_column = return_sample_names_group(grp)
    #take the sample names corresponding to those groups
    if comparison_column == "":
        return([""])
    grp_samples = list(set(list(samples2[samples2[comparison_column].isin(grps)].sample_name)))
    feature_counts_outdir = get_output_dir(config["project_top_level"], config["feature_counts_output_folder"])
    fc_suffix = "_featureCounts_results.txt"

    #build a list with the full path from those sample names
    fc_files = [os.path.join(feature_counts_outdir,x + fc_suffix) \
                   for x in grp_samples]
    fc_files = list(set(fc_files))
    print(fc_files)

    return(fc_files)
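
As a side note, `list(set(...))` does not guarantee any order, which is why the printed list is not sorted (Ctrl1, Ctrl4, Ctrl2, Ctrl3). A minimal sketch of the path-building step with a deterministic order is shown below. `build_fc_paths` is a hypothetical name, not part of the pipeline, and this change is not expected to affect the AttributeError.

```python
import os

FC_SUFFIX = "_featureCounts_results.txt"

def build_fc_paths(sample_names, outdir):
    """Hypothetical helper: build one featureCounts path per unique sample name.

    sorted(set(...)) removes duplicates and gives the same order on every run.
    """
    return [os.path.join(outdir, name + FC_SUFFIX)
            for name in sorted(set(sample_names))]

paths = build_fc_paths(["Ctrl2", "Ctrl1", "Ctrl2"], "feature_counts")
```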

The print call outputs the correct files, so I expected this to work.
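
The message above shows only the final error, not the line that raised it. One way to narrow it down, as a sketch, is to wrap the helper so that the full traceback is printed before the exception reaches Snakemake. `with_traceback` and `broken_helper` are hypothetical names. `broken_helper` only reproduces the same kind of AttributeError for illustration and is not the pipeline's code. Note also that the printed list contains only Ctrl samples, so the failure may come from the call with `wildcards.contrast` rather than the one with `wildcards.bse`.

```python
import functools
import traceback

def with_traceback(func):
    """Print the full traceback of any exception raised by func, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            traceback.print_exc()  # shows the exact line that failed
            raise
    return wrapper

@with_traceback
def broken_helper(grp):
    # Hypothetical stand-in: accessing .list on a str raises AttributeError
    return grp.list

try:
    broken_helper("control")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

In the Snakefile, the same decorator could be placed on `featurecounts_files_from_contrast` so that the traceback appears in the Snakemake log for whichever wildcard value triggers the error.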

