How to conditionally replace items in a sublist

3 Answers

User

#1 · edited 2024-05-20 21:37:30

If I understand correctly, try replacing the elif line with the following:

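else:
    # The conditional expression goes before the "for" clause, so every sublist is kept:
    # matching sublists with a shorter length are replaced, all others pass through unchanged.
    longest_UTR = [[locus, mir, gene, transcript, length_as_integer]
                   if x[:3] == [locus, mir, gene] and length_as_integer > int(x[4])
                   else x
                   for x in longest_UTR]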

This walks through all of your sublists, replacing the ones that match the condition and leaving the rest unchanged.
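To make the behaviour concrete, here is a minimal sketch of that comprehension on made-up data. The sample sublists and the new record are hypothetical and only assume the [locus, mir, gene, transcript, length] layout implied by the indices above.

```python
# Hypothetical sample data: each sublist is [locus, mir, gene, transcript, length]
longest_UTR = [
    ["L1", "mirA", "G1", "T1", 100],
    ["L2", "mirB", "G2", "T4", 300],
]

# Hypothetical record parsed from the next input line
locus, mir, gene, transcript, length_as_integer = "L1", "mirA", "G1", "T2", 250

# Replace a sublist only if its key matches and the new length is longer
longest_UTR = [
    [locus, mir, gene, transcript, length_as_integer]
    if x[:3] == [locus, mir, gene] and length_as_integer > int(x[4])
    else x
    for x in longest_UTR
]

print(longest_UTR)
```

Only the first sublist shares the (locus, mir, gene) key and has a shorter length, so it is the only one replaced. Note that a record whose key is not yet in the list is not added by this expression; that case still has to be handled separately.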

User

#2 · edited 2024-05-20 21:37:30

Since you are replacing the longest_UTR variable anyway, and to keep things well named, you could use a dictionary instead of a list:

longest_UTR = {}

# The with block closes the file once the loop is done
with open(file) as targets:
    for line in targets:
        chromosome, locus, mir, gene, transcript, UTR_length = line.strip("\n").split("\t")
        length_as_integer = int(UTR_length)

        # Your condition works for initializing the dictionary because of the default value.
        if length_as_integer > longest_UTR.get("Length", -1):
            longest_UTR["Chromosome"] = chromosome
            longest_UTR["Locus"] = locus
            longest_UTR["Mir"] = mir
            longest_UTR["Gene"] = gene
            longest_UTR["Transcript"] = transcript
            longest_UTR["Length"] = length_as_integer

print(longest_UTR)

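Edit: here is a version of the code that uses a list as well, in case you are interested in seeing the difference. Personally, I find the dictionary version cleaner to read.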

longest_UTR = [None, None, None, None, None, -1]

# The with block closes the file once the loop is done
with open(file) as targets:
    for line in targets:
        chromosome, locus, mir, gene, transcript, UTR_length = line.strip("\n").split("\t")
        length_as_integer = int(UTR_length)

        # Your condition works for initializing the list because of the default value.
        if length_as_integer > longest_UTR[5]:
            longest_UTR[0] = chromosome
            longest_UTR[1] = locus
            longest_UTR[2] = mir
            longest_UTR[3] = gene
            longest_UTR[4] = transcript
            longest_UTR[5] = length_as_integer

print(longest_UTR)

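User

#3 · edited 2024-05-20 21:37:30

So, your requirements went back and forth a bit, but my final understanding is this: you are looping over a dataset. Each target in the dataset has a locus, mir, and gene, plus a UTR_length attribute. For each unique combination of locus, mir, and gene, you are trying to find all of the targets that have the maximum UTR_length?

If you want to find maxima in a dataset, there are two approaches.
1) You can simply load the input file into a pandas DataFrame, group by the locus, mir, and gene values, and return all rows that have the maximum UTR_length. From an ease-of-implementation standpoint, this is probably your best option. However, pandas is not always the right tool and brings a lot of overhead, especially if you want to containerize your project.

2) If you want to stick to base Python, I suggest you take advantage of sets and dictionaries: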

list_of_targets = []
with open(file) as targets:
    for line in targets:
        chromosome, locus, mir, gene, transcript, UTR_length = line.strip("\n").split("\t")
        length_as_integer = int(UTR_length)

        # Store the integer length so that max() compares numbers, not strings
        list_of_targets.append((chromosome, locus, mir, gene, transcript, length_as_integer))

# Generate set of unique locus, mir, gene (lmg) combinations
set_of_locus_mir_gene = {(t[1], t[2], t[3]) for t in list_of_targets}

# Generate dictionary of maximum lengths for each distinct lmg combo
dict_of_max_lengths = {lmg: max(t[5] for t in list_of_targets if
                                (t[1], t[2], t[3]) == lmg) for
                       lmg in set_of_locus_mir_gene}

# Generate dictionary with lmg keys and all targets of that lmg combo that have the max length
final_output = {lmg: [t for t in list_of_targets if
                      (t[1], t[2], t[3]) == lmg and t[5] == max_length] for
                lmg, max_length in dict_of_max_lengths.items()}
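For comparison, the pandas route from option 1 might look roughly like the sketch below. The column names and the in-memory sample rows are my own assumptions (the input is taken to be tab-separated with no header row, in the same six-column order as above); it uses groupby with transform("max") to compare each row against the maximum of its own group.

```python
import io

import pandas as pd

# Hypothetical sample rows in the six-column, tab-separated layout used above
sample = io.StringIO(
    "chr1\tL1\tmirA\tG1\tT1\t100\n"
    "chr1\tL1\tmirA\tG1\tT2\t250\n"
    "chr1\tL1\tmirA\tG1\tT3\t250\n"
    "chr2\tL2\tmirB\tG2\tT4\t250\n"
    "chr2\tL2\tmirB\tG2\tT5\t80\n"
)

# Assumed column names; the file itself has no header row
columns = ["chromosome", "locus", "mir", "gene", "transcript", "UTR_length"]
df = pd.read_csv(sample, sep="\t", header=None, names=columns)

# Longest UTR within each (locus, mir, gene) group, broadcast back to every row
group_max = df.groupby(["locus", "mir", "gene"])["UTR_length"].transform("max")

# Keep every row that ties for the maximum of its own group
longest = df[df["UTR_length"] == group_max]

print(longest)
```

With a real file you would pass its path to pd.read_csv instead of the StringIO object. Ties are kept, so a group can contribute more than one row.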
