最长子串（用于三胞胎序列）

网友

1楼 · 编辑于 2024-09-30 20:34:14

我想您正在寻找的是Longest Common Subsequence算法，它可以非常快速地找到这个序列（在O（n²）时间内）。该算法基于一个简单的动态规划递归，网上有很多例子可以说明如何实现该算法。在

直观地说，该算法使用以下递归分解来工作，该分解通过查看每个序列的第一个三元组来工作：

如果任一序列为空，则最长的公共子序列为空序列。在
否则：
- 如果每个序列的前三元组匹配，则LCS是紧跟着两个序列剩余部分的LCS的元素。在
- 如果不是，则LCS是以下两者中较长的一个：第一个序列的LCS和第二个序列中除第一个元素之外的所有元素，或者第二个序列的LCS和第一个序列中除第一个元素之外的所有元素。在

希望这有帮助！在

网友

2楼 · 编辑于 2024-09-30 20:34:14

生物信息学中常用的Sequence alignment算法可以在这里使用。它们主要用于对齐一个字符序列，但是可以修改它们以接受n个字符序列。Needleman–Wunsch algorithm是一个相当有效的方法。在

网友

3楼 · 编辑于 2024-09-30 20:34:14

首先，您至少可以通过计算set symmetric difference来消除不在两个序列中出现的任何三元组来减少这个问题。在

对于最长的子序列，该算法使用dynamic programming方法。对于每一个三元组，找出长度为2的最短子串。循环这些对，尝试通过组合这些对来扩展它们。一直延伸，直到每个三元组都有最长的延伸。选择其中最长的：

ABQACBBA
ZBABBA

Eliminate symmetric difference
ABABBA and BABBA


Start with the first A in ABABBA.
It is followed by B, giving the elements [0,1]
Check to see if AB is in BABBA, and record a match at [1,2]
So, the first pair is ((0,1), (1,2))

Next, try the first B in ABABBA.
It is followed by an A giving the elements [1,2]
Check to see if BA is in BABBA and record a match at [0,1]

Continue with the rest of the letters in ABABBA.

Then, try extensions.

The first pair AB at [0,1] and [1,2] can be extended from BA
to ABA [0,1,3] and [1,2,4].  Note, the latter group is all the
way to the right so it cannot be extended farther.  ABA cannot
be extended.

Continue until all sequences have extended are far as possible.
Keep only the best of those.

相关问题更多 >

编程相关推荐

热门问题

热门文章