How do I compare clusters?

Cluster 0:
 Brucellaceae(10)
  Brucella(10)
   abortus(1)
   canis(1)
   ceti(1)
   inopinata(1)
   melitensis(1)
   microti(1)
   neotomae(1)
   ovis(1)
   pinnipedialis(1)
   suis(1)
Cluster 1:
 Streptomycetaceae(28)
  Streptomyces(28)
   achromogenes(1)
   albaduncus(1)
   anthocyanicus(1)
   etc.

3 Answers

User

Answer 1 · edited 2024-10-01 09:28:58

Given:

file1 = '''Cluster 0:
 giant(2)
  red(2)
   brick(1)
   apple(1)
Cluster 1:
 tiny(3)
  green(1)
   dot(1)
  blue(2)
   flower(1)
   candy(1)'''.split('\n')
file2 = '''Cluster 18:
 giant(2)
  red(2)
   brick(1)
   tomato(1)
Cluster 19:
 tiny(2)
  blue(2)
   flower(1)
   candy(1)'''.split('\n')

Is this what you need?

^{pr2}$
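The code that builds `differences` is not shown above. As a minimal sketch, and not necessarily the answerer's approach, the lists could be flattened into dotted paths such as `giant.red.brick` and compared as sets. The names `ENTRY`, `leaf_paths`, and `compare` are hypothetical, and the sketch assumes nesting is expressed purely by leading whitespace and that every entry ends in a `(count)` suffix, as in the sample data.

```python
import re

# Hypothetical helper names; not taken from the original answer.
ENTRY = re.compile(r'^(\s*)(.+?)\(\d+\)\s*$')  # indentation, name, "(count)"


def leaf_paths(lines):
    """Return the set of dotted paths for every entry with nothing nested under it."""
    entries = []  # (depth, name) pairs; cluster headers become (0, None)
    for line in lines:
        match = ENTRY.match(line)
        if match:
            entries.append((len(match.group(1)), match.group(2)))
        elif line.strip():
            # Assumed to be a "Cluster N:" header. Cluster numbers differ
            # between files, so the header only resets the hierarchy.
            entries.append((0, None))

    paths = set()
    stack = []  # (depth, name) of the current entry and its ancestors
    for i, (depth, name) in enumerate(entries):
        # Drop everything that is not an ancestor of the current line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if name is None:
            continue
        stack.append((depth, name))
        next_depth = entries[i + 1][0] if i + 1 < len(entries) else 0
        if next_depth <= depth:  # nothing nested underneath: a leaf
            paths.add('.'.join(n for _, n in stack))
    return paths


def compare(lines1, lines2):
    """Return (description, sorted items) pairs describing both files."""
    a, b = leaf_paths(lines1), leaf_paths(lines2)
    return [
        ('common elements', sorted(a & b)),
        ('missing from file2', sorted(a - b)),
        ('missing from file1', sorted(b - a)),
    ]
```

With the `file1` and `file2` lists defined earlier, `differences = compare(file1, file2)` gives a list of `(description, items)` pairs. Items are sorted here, so the order within a group may differ from the sample output.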

To see the differences:

for desc, items in differences:
    print(desc)
    print('')  # blank line; print('') behaves the same on Python 2 and 3
    for item in items:
        print('\t' + item)
    print('')

Prints:

common elements

    giant.red.brick
    tiny.blue.candy
    tiny.blue.flower

missing from file2

    tiny.green.dot
    giant.red.apple

missing from file1

    giant.red.tomato

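User

Answer 2 · edited 2024-10-01 09:28:58

I saw a lot of different suggestions in the comments, so to help you out I'll give you a very, very simple script that you can start from.

Note that this doesn't answer your whole question; like the comments, it only points you in a direction.

In general, if you have no experience, I'd recommend reading up on Python (I'd do that anyway, and I'll add some links at the bottom of the answer).

Have fun! :)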

class Cluster(object):
  '''
  This is a class that will contain your information about the Clusters.
  '''
  def __init__(self, number):
    '''
    This is what some languages call a constructor, but it's not.
    This method initializes the properties with values from the method call.
    '''
    self.cluster_number = number
    self.family_name = None
    self.bacteria_name = None
    self.bacteria = []

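# The part below isn't part of the class; this is the actual script.
with open('bacteria.txt', 'r') as file:
  cluster = None
  clusters = []
  for line in file:
    line = line.strip()  # drop indentation and the trailing newline
    if not line:
      continue  # skip blank lines
    if line.startswith('Cluster'):
      # 'Cluster 0:' -> 0, so the object holds the cluster number, not the line index
      cluster = Cluster(int(line[len('Cluster'):].rstrip(':')))
      clusters.append(cluster)
    elif cluster is None:
      continue  # ignore anything before the first cluster header
    elif not cluster.family_name:
      cluster.family_name = line
    elif not cluster.bacteria_name:
      cluster.bacteria_name = line
    else:
      cluster.bacteria.append(line)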

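I wrote this to be as dumb and oversimplified as possible, with nothing fancy, using Python 2.7.2. You can copy it into a .py file and run it directly from the command line with python bacteria.py.

Hope this helps a bit. If you have any questions, feel free to drop by our Python chat room! :)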
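Each stored line still carries its count suffix, for example `Brucella(10)`. A small helper can split the name from the count. This is a sketch under the assumption that every entry has the form `name(count)`; `split_count` is a hypothetical name and not part of the script above.

```python
import re

# Hypothetical helper; assumes entries look like "name(count)".
_ENTRY = re.compile(r'^(?P<name>.+?)\((?P<count>\d+)\)$')


def split_count(entry):
    """Split 'Brucella(10)' into ('Brucella', 10).

    Returns (entry, None) when the text has no trailing "(count)".
    """
    entry = entry.strip()
    match = _ENTRY.match(entry)
    if match is None:
        return entry, None
    return match.group('name'), int(match.group('count'))
```

For instance, `split_count(cluster.family_name)` would return the bare family name together with its count.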

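User

Answer 3 · edited 2024-10-01 09:28:58

You'll have to write some code to parse the file. If you ignore the clusters, you should be able to tell family, genus, and species apart by their indentation.

The simplest way to define a named tuple is: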

import collections
Bacterium = collections.namedtuple('Bacterium', ['family', 'genera', 'species'])

With an instance of this object, you can do:

^{pr2}$
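
The snippet for this step is not shown. As a sketch of what a named tuple instance typically allows, using values from Cluster 0 in the question as assumed sample data:

```python
import collections

Bacterium = collections.namedtuple('Bacterium', ['family', 'genera', 'species'])

# Sample values taken from the question's Cluster 0; purely illustrative.
b = Bacterium('Brucellaceae', 'Brucella', 'abortus')

print(b.family)               # access a field by name
print(b[2])                   # or by position, like a regular tuple
family, genera, species = b   # unpacking also works
```

Named tuples are immutable and hashable, so instances can also be stored in sets.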
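
The parser should read the file line by line and set the family and genus as it goes. When it finds a species, it should append a Bacterium to a list:

with open('cluster0.txt', 'r') as infile:
    lines = infile.readlines()

family = None
genera = None
bacteria = []

for line in lines:
    # Set family and genera when the line holds one of them.
    # If the line holds a species (skeleton: `species` must be extracted from the line):
    bacteria.append(Bacterium(family, genera, species))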
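
The parsing step described here could be filled in as follows. This is a sketch only: it assumes family, genus, and species sit at indentation depths 1, 2, and 3, and that each entry ends with a `(count)` suffix. Neither is confirmed for the real files. `ENTRY` and `parse_bacteria` are hypothetical names.

```python
import collections
import re

Bacterium = collections.namedtuple('Bacterium', ['family', 'genera', 'species'])

# Hypothetical pattern: indentation, name, "(count)".
ENTRY = re.compile(r'^(\s*)(.+?)\(\d+\)\s*$')


def parse_bacteria(lines):
    """Build a list of Bacterium tuples from indented lines.

    Assumes depth 1 = family, depth 2 = genus, deeper = species.
    "Cluster N:" headers and blank lines are skipped.
    """
    family = genera = None
    bacteria = []
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            continue  # cluster header or blank line
        depth, name = len(match.group(1)), match.group(2)
        if depth == 1:
            family, genera = name, None  # a new family resets the genus
        elif depth == 2:
            genera = name
        else:
            bacteria.append(Bacterium(family, genera, name))
    return bacteria
```

Because the result is a list of tuples, two files can be compared by turning each list into a `set` and using set operations.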

Once you have a list of all the bacteria in each file or cluster, you can select from them like this:

s = [b for b in bacteria if b.family == 'Streptomycetaceae']
