最长公共序列群

TOKYO-BLING.1 H02-AVAILABLE TOKYO-BLING.1 H02-MIDDLING TOKYO-BLING.1 H02-TOP TOKYO-BLING.2 H04-USED TOKYO-BLING.2 H04-AVAILABLE TOKYO-BLING.2 H04-CANCELLED WAY-VERING.1 H03-TOP WAY-VERING.2 H03-USED WAY-VERING.2 H03-AVAILABLE WAY-VERING.1 H03-CANCELLED

3条回答

网友

1楼 · 编辑于 2024-10-02 00:32:03

有一种方法：

对条目进行排序
确定每个条目之间公共前缀的长度
通过在公共前缀比前一个条目短的位置分隔列表，将条目分组

实施示例：

def common_count(t0, t1):
  "returns the length of the longest common prefix"
  for i, pair in enumerate(zip(t0, t1)):
    if pair[0] != pair[1]:
      return i
  return i

def group_by_longest_prefix(iterable):
  "given a sorted list of strings, group by longest common prefix"
  longest = 0
  out = []

  for t in iterable:
    if out: # if there are previous entries 

      # determine length of prefix in common with previous line
      common = common_count(t, out[-1])

      # if the current entry has a shorted prefix, output previous 
      # entries as a group then start a new group
      if common < longest:
        yield out
        longest = 0
        out = []
      # otherwise, just update the target prefix length
      else:
        longest = common

    # add the current entry to the group
    out.append(t)

  # return remaining entries as the last group
  if out:
    yield out

用法示例：

^{pr2}$

这会产生：

['TOKYO-BLING.1 H02-AVAILABLE', 'TOKYO-BLING.1 H02-MIDDLING', 'TOKYO-BLING.1 H02-TOP']
['TOKYO-BLING.2 H04-AVAILABLE', 'TOKYO-BLING.2 H04-CANCELLED', 'TOKYO-BLING.2 H04-USED']
['WAY-VERING.1 H03-CANCELLED', 'WAY-VERING.1 H03-TOP']
['WAY-VERING.2 H03-AVAILABLE', 'WAY-VERING.2 H03-USED']

在这里看到它的作用：http://ideone.com/1Da0S

网友

2楼 · 编辑于 2024-10-02 00:32:03

您可以用空格分隔每个字符串，然后生成一个dict。在

我是这样做的：

f = open( 'hotels.txt', 'r' )   # read the data
f = f.readlines()               # convert to a list of strings (with newlines)
f = [ i.strip() for i in f ]    # take off the newlines
h = [ i.split(' ') for i in f ] # split using whitespace
                                # now h is a list of lists of strings

keys = [ i[0] for i in h ]      # keys = ['TOKYO-BLING.1','TOKYO-BLING.1',...]
keys = list( set( keys ) )      # take out redundant elements

d = dict()                      # start a dict
for i in keys:                  # initialize dict with empty lists
    d[i] = list()               # (one for each key)

for i in h:                     # for each list in h, append a suffix
    d[i[0]].append(i[1])        # to the appropriate prefix (or key)

这会产生：

^{pr2}$

网友

3楼 · 编辑于 2024-10-02 00:32:03

一个Generalized Suffix tree可以工作，suffix array

相关问题更多 >

编程相关推荐

热门问题

热门文章

最长公共序列群

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >