请把这个问题移到Code Review -area。它更适合那里,因为我知道下面的代码是垃圾,我希望关键的反馈完成重写。我几乎是在重新发明轮子。
# Description: you are given a bitwise pattern and a string
# you need to find the number of times the pattern matches in the string.
# The pattern is determined by markov chain.
# For simplicity, suppose the ones and zeros as unbiased coin flipping
# that stops as it hits the pattern, below.
#
# Any one liner or simple pythonic solution?
import random
def matchIt(yourString, yourPattern):
"""find the number of times yourPattern occurs in yourString"""
count = 0
matchTimes = 0
# How can you simplify the for-if structures?
# THIS IS AN EXAMPLE HOW NOT TO DO IT, hence Code-Smell-label
# please, read clarifications in [Update]
for coin in yourString:
#return to base
if count == len(pattern):
matchTimes = matchTimes + 1
count = 0
#special case to return to 2, there could be more this type of conditions
#so this type of if-conditionals are screaming for a havoc
if count == 2 and pattern[count] == 1:
count = count - 1
#the work horse
#it could be simpler by breaking the intial string of lenght 'l'
#to blocks of pattern-length, the number of them is 'l - len(pattern)-1'
if coin == pattern[count]:
count=count+1
average = len(yourString)/matchTimes
return [average, matchTimes]
# Generates the list
myString =[]
for x in range(10000):
myString= myString + [int(random.random()*2)]
pattern = [1,0,0]
result = matchIt(myString, pattern)
print("The sample had "+str(result[1])+" matches and its size was "+str(len(myString))+".\n" +
"So it took "+str(result[0])+" steps in average.\n" +
"RESULT: "+str([a for a in "FAILURE" if result[0] != 8]))
# Sample Output
#
# The sample had 1656 matches and its size was 10000.
# So it took 6 steps in average.
# RESULT: ['F', 'A', 'I', 'L', 'U', 'R', 'E']
[更新]
我将在这里解释一些理论,也许,问题可以这样简化。上面的代码尝试用下面的转移矩阵A
构造马尔可夫链。你可以想象为抛硬币的模式100
与之对应。在
问题中的average
8
成为上面矩阵N=(I-Q)^-1
第一行的值之和,其中Q
。在
>>> (I-Q)**-1
matrix([[ 2., 4., 2.],
[ 0., 4., 2.],
[ 0., 2., 2.]])
>>> numpy.sum(((I-Q)**-1)[0])
8.0
现在,您可能会看到,这个显然唯一的模式匹配问题变成了一个马尔可夫链。我看不出为什么你不能用类似于矩阵或矩阵的东西来代替凌乱的while-if条件。我不知道如何实现它们,但迭代器可能是一种方法,可以进行研究,特别是对于需要分解的更多状态。在
但是Numpy出现了一个问题,-Inf
和{(I-Q)**-1
矩阵中检查它们应该收敛到的值。N
来自N=I+Q+Q^2+Q^3+...=\frac{I-Q^{n}}{I-Q}
。在
>>> (I-Q**99)/(I-Q)
matrix([[ 2.00000000e+00, 1.80853571e-09, -Inf],
[ NaN, 2.00000000e+00, 6.90799171e-10],
[ NaN, 6.90799171e-10, 1.00000000e+00]])
>>> (I-Q**10)/(I-Q)
matrix([[ 1.99804688, 0.27929688, -Inf],
[ NaN, 1.82617188, 0.10742188],
[ NaN, 0.10742188, 0.96679688]])
你可以使用下列物品吗?在
^{pr2}$在您的例子中,您可以将
myString
创建为10000个字符的实际字符串,pattern
也创建为一个字符串,然后用一种简单的python方法计算发生次数。在编辑
提供
text
(可以是字符串或列表)中pattern
出现次数(重叠)的一行代码可以如下所示:这还没准备好。
类似问题,但主要集中在图库here和类似问题,但在C#中,可能有用。
与这个问题相关的文件是
./networkx/generators/degree_seq.py
(997行,关于用给定的度数序列生成图形)和./networkx/algorithms/mixing.py (line 20, function degree_assortativity(G) about probability based graphs)
,还请注意,它的源代码引用了92个引用,不确定是否要重新发明轮子。对于igraph,请阅读文件convert.c
中关于加权边的第835行。您可以获得Networkx here的源代码和igraph here的源代码。请注意,前者是在BSD许可下使用Python完成的,而igraph是在GNU(GPL)下完成的,并且是用C完成的要开始使用Networkx,有一行关于从其jUnits
test_convert_scipy.py
-文件创建加权图的有用一行:因此,要创建马尔可夫链,请参考有向加权图here,可能是这样的:
^{pr2}$或者也许有一些现成的马尔可夫链生成工具,就像其他随机过程一样,更多here。找不到一个algorithm来分析带有异常值的图或用不同的集合进行试验,就像在你的例子中那样,也许没有,你必须坚持使用其他应答器的解决方案。在
好-标准(-ish)字符串搜索:
但你想对一个数字列表这么做?没问题;我们只需要用find()方法创建一个list类,如下所示:
^{pr2}$那么这个例子看起来像
你的代码变成:
相关问题 更多 >
编程相关推荐