<p>After reading lots and lots of research papers and books, the only place where I found the answer is this book (<a href="https://books.google.fr/books?id=eQs2i-R9-oYC&lpg=PR11&ots=atCPQJm3OJ&dq=%22Algebraic%20codes%20for%20data%20transmission%22%2C%20Blahut%2C%20Richard%20E.%2C%202003%2C%20Cambridge%20university%20press.&lr&hl=fr&pg=PA193#v=onepage&q=%22Algebraic%20codes%20for%20data%20transmission%22,%20Blahut,%20Richard%20E.,%202003,%20Cambridge%20university%20press.&f=false" rel="nofollow noreferrer">readable online on Google Books</a>, but not available as a PDF):</p>
<blockquote>
<p>"Algebraic codes for data transmission", Blahut, Richard E., 2003, Cambridge university press.</p>
</blockquote>
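<p>Here are some extracts from this book, which describe in detail the exact Berlekamp-Massey algorithm I implemented (except for the matrix/vectorized representation of the polynomial operations):</p>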
<p>And here is the errors-and-erasures Berlekamp-Massey algorithm for Reed-Solomon:</p>
<p><img src="https://i.stack.imgur.com/j1gYU.png" alt="Errors-and-erasures Berlekamp-Massey algorithm for Reed-Solomon"/></p>
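<p>As you can see, contrary to the usual description, it is not enough to <strong>initialize Lambda, the errata locator polynomial, with the value of the previously computed erasure locator polynomial</strong>: you also need to skip the first v iterations, where v is the number of erasures. Note that this is not equivalent to skipping the last v iterations: <strong>you need to skip the first v iterations</strong>, because r (the iteration counter, K in my implementation) is used not only to count iterations but also to generate the correct discrepancy factor Delta.</p>
<p>Here is the resulting code, modified to support erasures as well as errors up to <code>v+2*e <= (n-k)</code>:</p>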
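<p>To make the role of the iteration counter concrete, here is a minimal sketch. It is a toy model and not the decoder itself: it assumes the prime field GF(11) with alpha = 2 instead of GF(2^8), syndromes defined as power sums starting at exponent 0, and made-up erasure magnitudes. It only shows that, once the locator is initialized with the erasure locator, the discrepancy at step K is the coefficient of z^K in the product of the syndrome and the locator, and that this coefficient is only meaningful from K = v onwards.</p>

```python
# Toy model (assumption): prime field GF(11) with alpha = 2, not GF(2^8).
P = 11

def poly_mul(a, b):
    """Multiply two polynomials (lowest degree first), coefficients mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def discrepancy(synd, locator, K):
    """Coefficient of z^K in synd(z) * locator(z), i.e. Delta at step K."""
    return sum(locator[i] * synd[K - i]
               for i in range(len(locator)) if i <= K) % P

# Two erasures (v = 2): locators X_i = alpha^position, magnitudes Y_i.
X = [pow(2, 1, P), pow(2, 3, P)]   # positions 1 and 3
Y = [3, 5]                         # hypothetical erasure magnitudes
nsym = 6                           # n - k
synd = [sum(y * pow(x, j, P) for x, y in zip(X, Y)) % P for j in range(nsym)]

# Erasure locator: product of (1 - X_i * z).
erasures_loc = [1]
for x in X:
    erasures_loc = poly_mul(erasures_loc, [1, (-x) % P])

deltas = [discrepancy(synd, erasures_loc, K) for K in range(nsym)]
print(deltas)  # [8, 10, 0, 0, 0, 0]
```

<p>In this toy run the values at K = 0 and K = 1 are non-zero even though the locator is already correct for the two erasures, so running the update rule on them would spoil it. From K = v = 2 onwards the discrepancy is zero, as expected when there are no additional errors. Skipping the last v iterations instead would feed the wrong K into Delta.</p>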
<pre><code>def _berlekamp_massey(self, s, k=None, erasures_loc=None, erasures_eval=None, erasures_count=0):
    '''Computes and returns the errata (errors+erasures) locator polynomial (sigma) and the
    error evaluator polynomial (omega) at the same time.
    If the erasures locator is specified, we will return an errors-and-erasures locator polynomial and an errors-and-erasures evaluator polynomial, else it will compute only errors. With erasures in addition to errors, it can simultaneously decode up to v+2e <= (n-k) where v is the number of erasures and e the number of errors.
    Mathematically speaking, this is equivalent to a spectral analysis (see Blahut, "Algebraic Codes for Data Transmission", 2003, chapter 7.6 Decoding in Time Domain).
    The parameter s is the syndrome polynomial (syndromes encoded in a
    generator function) as returned by _syndromes.

    Notes:
    The error polynomial:
    E(x) = E_0 + E_1 x + ... + E_(n-1) x^(n-1)

    j_1, j_2, ..., j_s are the error positions. (There are at most s
    errors)

    Error location X_i is defined: X_i = α^(j_i)
    that is, the power of α (alpha) corresponding to the error location

    Error magnitude Y_i is defined: E_(j_i)
    that is, the coefficient in the error polynomial at position j_i

    Error locator polynomial:
    sigma(z) = Product( 1 - X_i * z, i=1..s )
    roots are the reciprocals of the error locations
    ( 1/X_1, 1/X_2, ...)

    Error evaluator polynomial omega(z) is here computed at the same time as sigma, but it can also be constructed afterwards using the syndrome and sigma (see _find_error_evaluator() method).

    It can be seen that the algorithm tries to iteratively solve for the error locator polynomial by
    solving one equation after another and updating the error locator polynomial. If it turns out that it
    cannot solve the equation at some step, then it computes the error and weights it by the last
    non-zero discriminant found, and delays the weighted result to increase the polynomial degree
    by 1. Ref: "Reed Solomon Decoder: TMS320C64x Implementation" by Jagadeesh Sankaran, December 2000, Application Report SPRA686

    The best paper I found describing the BM algorithm for errata (errors-and-erasures) evaluator computation is in "Algebraic Codes for Data Transmission", Richard E. Blahut, 2003.
    '''
    # For errors-and-erasures decoding, see: "Algebraic Codes for Data Transmission", Richard E. Blahut, 2003 and (but it's less complete): Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
    # also see: Blahut, Richard E. "A universal Reed-Solomon decoder." IBM Journal of Research and Development 28.2 (1984): 150-158. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.2084&rep=rep1&type=pdf
    # and another good alternative book with concrete programming examples: Jiang, Yuan. A practical guide to error-control coding using Matlab. Artech House, 2010.
    n = self.n
    if not k: k = self.k

    # Initialize, depending on if we include erasures or not:
    if erasures_loc:
        sigma = [ Polynomial(erasures_loc.coefficients) ] # copy erasures_loc by creating a new Polynomial, so that we initialize the errata locator polynomial with the erasures locator polynomial.
        B = [ Polynomial(erasures_loc.coefficients) ]
        omega = [ Polynomial(erasures_eval.coefficients) ] # to compute omega (the evaluator polynomial) at the same time, we also need to initialize it with the partial erasures evaluator polynomial
        A = [ Polynomial(erasures_eval.coefficients) ] # TODO: fix the initial value of the evaluator support polynomial, because currently the final omega is not correct (it contains higher order terms that should be removed by the end of BM)
    else:
        sigma = [ Polynomial([GF256int(1)]) ] # error locator polynomial. Also called Lambda in other notations.
        B = [ Polynomial([GF256int(1)]) ] # this is the error locator support/secondary polynomial, which is a funky way to say that it's just a temporary variable that will help us construct sigma, the error locator polynomial
        omega = [ Polynomial([GF256int(1)]) ] # error evaluator polynomial. We don't need to initialize it with erasures_loc, it will still work, because Delta is computed using sigma, which itself is correctly initialized with erasures if needed.
        A = [ Polynomial([GF256int(0)]) ] # this is the error evaluator support/secondary polynomial, to help us construct omega
    L = [ 0 ] # update flag: necessary variable to check when updating is necessary and to check bounds (to avoid wrongly eliminating the higher order terms). For more infos, see https://www.cs.duke.edu/courses/spring11/cps296.3/decoding_rs.pdf
    M = [ 0 ] # optional variable to check bounds (so that we do not mistakenly overwrite the higher order terms). This is not necessary, it's only an additional safe check. For more infos, see the presentation decoding_rs.pdf by Andrew Brown in the doc folder.

    # Fix the syndrome shifting: when computing the syndrome, some implementations may prepend a 0 coefficient for the lowest degree term (the constant). This is a case of syndrome shifting, thus the syndrome will be bigger than the number of ecc symbols (I don't know what purpose serves this shifting). If that's the case, then we need to account for the syndrome shifting when we use the syndrome such as inside BM, by skipping those prepended coefficients.
    # Another way to detect the shifting is to detect the 0 coefficients: by definition, a syndrome does not contain any 0 coefficient (except if there are no errors/erasures, in this case they are all 0). This however doesn't work with the modified Forney syndrome (that we do not use in this lib but it may be implemented in the future), which set to 0 the coefficients corresponding to erasures, leaving only the coefficients corresponding to errors.
    synd_shift = 0
    if len(s) > (n-k): synd_shift = len(s) - (n-k)

    # Polynomial constants:
    ONE = Polynomial(z0=GF256int(1))
    ZERO = Polynomial(z0=GF256int(0))
    Z = Polynomial(z1=GF256int(1)) # used to shift polynomials, simply multiply your poly * Z to shift

    # Precaching
    s2 = ONE + s

    # Iteratively compute the polynomials n-k-erasures_count times. The last ones will be correct (since the algorithm refines the error/errata locator polynomial iteratively depending on the discrepancy, which is kind of a difference-from-correctness measure).
    for l in xrange(0, n-k-erasures_count): # skip the first erasures_count iterations because we already computed the partial errata locator polynomial (by initializing with the erasures locator polynomial)
        K = erasures_count+l+synd_shift # skip the FIRST erasures_count iterations (not the last iterations, that's very important!)

        # Goal for each iteration: Compute sigma[l+1] and omega[l+1] such that
        # (1 + s)*sigma[l] == omega[l] in mod z^(K)

        # For this particular loop iteration, we have sigma[l] and omega[l],
        # and are computing sigma[l+1] and omega[l+1]

        # First find Delta, the non-zero coefficient of z^(K) in
        # (1 + s) * sigma[l]
        # Note that adding 1 to the syndrome s is not really necessary, you can do as well without.
        # This delta is valid for l (this iteration) only
        Delta = ( s2 * sigma[l] ).get_coefficient(K) # Delta is also known as the Discrepancy, and is always a scalar (not a polynomial).
        # Make it a polynomial of degree 0, just for ease of computation with polynomials sigma and omega.
        Delta = Polynomial(x0=Delta)

        # Can now compute sigma[l+1] and omega[l+1] from
        # sigma[l], omega[l], B[l], A[l], and Delta
        sigma.append( sigma[l] - Delta * Z * B[l] )
        omega.append( omega[l] - Delta * Z * A[l] )

        # Now compute the next support polynomials B and A
        # There are two ways to do this
        # This is based on a messy case analysis on the degrees of the four polynomials sigma, omega, A and B in order to minimize the degrees of A and B. For more infos, see https://www.cs.duke.edu/courses/spring10/cps296.3/decoding_rs_scribe.pdf
        # In fact it ensures that the degree of the final polynomials aren't too large.
        if Delta == ZERO or 2*L[l] > K+erasures_count \
            or (2*L[l] == K+erasures_count and M[l] == 0):
        #if Delta == ZERO or len(sigma[l+1]) <= len(sigma[l]): # another way to compute when to update, and it doesn't require to maintain the update flag L
            # Rule A
            B.append( Z * B[l] )
            A.append( Z * A[l] )
            L.append( L[l] )
            M.append( M[l] )

        elif (Delta != ZERO and 2*L[l] < K+erasures_count) \
            or (2*L[l] == K+erasures_count and M[l] != 0):
        # elif Delta != ZERO and len(sigma[l+1]) > len(sigma[l]): # another way to compute when to update, and it doesn't require to maintain the update flag L
            # Rule B
            B.append( sigma[l] // Delta )
            A.append( omega[l] // Delta )
            L.append( K - L[l] ) # the update flag L is tricky: in Blahut's schema, it's mandatory to use `L = K - L - erasures_count` (and indeed in a previous draft of this function, if you forgot to do `- erasures_count` it would lead to correcting only 2*(errors+erasures) <= (n-k) instead of 2*errors+erasures <= (n-k)), but in this latest draft, this will lead to a wrong decoding in some cases where it should correctly decode! Thus you should try with and without `- erasures_count` to update L on your own implementation and see which one works OK without producing wrong decoding failures.
            M.append( 1 - M[l] )

        else:
            raise Exception("Code shouldn't have gotten here")

    # Hack to fix the simultaneous computation of omega, the errata evaluator polynomial: because A (the errata evaluator support polynomial) is not correctly initialized (I could not find any info in academic papers). So at the end, we get the correct errata evaluator polynomial omega + some higher order terms that should not be present, but since we know that sigma is always correct and the maximum degree should be the same as omega, we can fix omega by truncating too high order terms.
    if omega[-1].degree > sigma[-1].degree: omega[-1] = Polynomial(omega[-1].coefficients[-(sigma[-1].degree+1):])

    # Return the last result of the iterations (since BM compute iteratively, the last iteration being correct - it may already be before, but we're not sure)
    return sigma[-1], omega[-1]

def _find_erasures_locator(self, erasures_pos):
    '''Compute the erasures locator polynomial from the erasures positions (the positions must be relative to the x coefficient, eg: "hello worldxxxxxxxxx" is tampered to "h_ll_ worldxxxxxxxxx" with xxxxxxxxx being the ecc of length n-k=9, here the string positions are [1, 4], but the coefficients are reversed since the ecc characters are placed as the first coefficients of the polynomial, thus the coefficients of the erased characters are n-1 - [1, 4] = [18, 15] = erasures_loc to be specified as an argument.'''
    # See: http://ocw.usu.edu/Electrical_and_Computer_Engineering/Error_Control_Coding/lecture7.pdf and Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
    erasures_loc = Polynomial([GF256int(1)]) # just to init because we will multiply, so it must be 1 so that the multiplication starts correctly without nulling any term
    # erasures_loc is very simple to compute: erasures_loc = prod(1 - x*alpha[j]**i) for i in erasures_pos and where alpha is the alpha chosen to evaluate polynomials (here in this library it's gf(3)). To generate c*x where c is a constant, we simply generate a Polynomial([c, 0]) where 0 is the constant and c is positionned to be the coefficient for x^1. See https://en.wikipedia.org/wiki/Forney_algorithm#Erasures
    for i in erasures_pos:
        erasures_loc = erasures_loc * (Polynomial([GF256int(1)]) - Polynomial([GF256int(self.generator)**i, 0]))
    return erasures_loc
</code></pre>
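<p><em>Note</em>: Sigma, Omega, A, B, L and M are all lists (of polynomials for the first four, of integers for L and M), so we keep the whole history of every intermediate value computed at each iteration. This can of course be optimized, because we only really need <code>Sigma[l]</code>, <code>Sigma[l-1]</code>, <code>Omega[l]</code>, <code>Omega[l-1]</code>, <code>A[l]</code>, <code>B[l]</code>, <code>L[l]</code> and <code>M[l]</code> (so only Sigma and Omega need to keep the previous iteration in memory, the other variables do not).</p>
<p><em>Note 2</em>: the update flag L is tricky: in some implementations, doing it exactly as in Blahut's schema leads to wrong decoding failures. In my past implementation, it was mandatory to use <code>L = K - L - erasures_count</code> to correctly decode both errors and erasures up to the Singleton bound, but in my latest implementation I had to use <code>L = K - L</code> (even when there are erasures) to avoid wrong decoding failures. You should try both on your own implementation and see which one does not produce any wrong decoding failure. See the issues listed below for more information.</p>
<p>The only problem with this algorithm is that it does not describe how to compute Omega, the errors evaluator polynomial, at the same time (the book describes how to initialize Omega for errors only, but not when decoding errors and erasures). I tried several variations and the one above works, but not completely: at the end, Omega includes higher-order terms that should have been cancelled. Probably Omega or A, the errors evaluator support polynomial, is not initialized with the right value.</p>
<p>However, you can fix this by trimming the too-high-order terms off the Omega polynomial (since it should have the same degree as Lambda/Sigma):</p>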
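<p>As a rough illustration of that optimization, here is a sketch of Berlekamp-Massey that stores only the current state instead of full histories. It is deliberately simplified and rests on several assumptions: errors only (no erasures, no Omega), a prime field GF(p) rather than GF(2^8), and Massey's classic formulation rather than the Rule A / Rule B bookkeeping used above. The function name and variables are illustrative only.</p>

```python
def berlekamp_massey(seq, p):
    """Errors-only Berlekamp-Massey over the prime field GF(p).

    Returns (C, L): the connection (locator) polynomial, lowest degree
    first, and its length L. Only the current C and the last saved copy B
    are kept in memory, not one polynomial per iteration.
    """
    C, B = [1], [1]      # current locator and support polynomial
    L, m, b = 0, 1, 1    # length, shift since last update, last non-zero Delta
    for K in range(len(seq)):
        # Discrepancy: coefficient of z^K in seq(z) * C(z).
        delta = sum(C[i] * seq[K - i] for i in range(len(C)) if i <= K) % p
        if delta == 0:
            m += 1
            continue
        coef = delta * pow(b, -1, p) % p    # modular inverse (Python 3.8+)
        previous = C[:]
        C = C + [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):          # C <- C - (Delta / b) * z^m * B
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= K:                      # length change: refresh the support
            L, B, b, m = K + 1 - L, previous, delta, 1
        else:
            m += 1
    return C, L

seq = [1, 1, 2, 3, 5, 1]   # Fibonacci numbers mod 7
C, L = berlekamp_massey(seq, 7)
print(C, L)  # [1, 6, 6] 2
```

<p>The result encodes the recurrence s[K] = s[K-1] + s[K-2] mod 7, since 6 is -1 in GF(7). The same rolling-state idea should carry over to the errata version, but that is not shown here.</p>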
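<pre><code>if omega[-1].degree > sigma[-1].degree: omega[-1] = Polynomial(omega[-1].coefficients[-(sigma[-1].degree+1):])
</code></pre>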
<p>Or you can compute Omega from scratch after BM, using the errata locator Lambda/Sigma, which is always computed correctly:</p>
<pre><code>def _find_error_evaluator(self, synd, sigma, k=None):
    '''Compute the error (or erasures if you supply sigma=erasures locator polynomial) evaluator polynomial Omega from the syndrome and the error/erasures/errata locator Sigma. Omega is already computed at the same time as Sigma inside the Berlekamp-Massey implemented above, but in case you modify Sigma, you can recompute Omega afterwards using this method, or just ensure that Omega computed by BM is correct given Sigma (as long as syndrome and sigma are correct, omega will be correct).'''
    n = self.n
    if not k: k = self.k

    # Omega(x) = [ Synd(x) * Error_loc(x) ] mod x^(n-k+1) From Blahut, Algebraic codes for data transmission, 2003
    return (synd * sigma) % Polynomial([GF256int(1)] + [GF256int(0)] * (n-k+1)) # Note that you should NOT do (1+Synd(x)) as can be seen in some books because this won't work with all primitive generators.
</code></pre>
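<p>To see why recomputing Omega from the syndrome and Sigma is safe, here is a small self-contained check. It is only a sketch under assumptions that differ from the library above: a toy prime field GF(11) with alpha = 2, an unshifted syndrome made of power sums starting at exponent 0 (so the modulus is z^(n-k) here), and made-up errata magnitudes. It builds Omega as the truncated product, then uses the form of Forney's formula that matches this syndrome convention to recover the magnitudes.</p>

```python
P = 11  # toy prime field GF(11), alpha = 2 (assumption, not GF(2^8))

def poly_mul(a, b):
    """Multiply two polynomials (lowest degree first), coefficients mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def poly_eval(poly, x):
    """Evaluate a polynomial (lowest degree first) at x, using Horner's rule."""
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % P
    return acc

def find_error_evaluator(synd, sigma, nsym):
    """Omega(z) = synd(z) * sigma(z) mod z^nsym, trailing zeros removed."""
    omega = poly_mul(synd, sigma)[:nsym]
    while len(omega) > 1 and omega[-1] == 0:
        omega.pop()
    return omega

X = [2, 8]      # errata locators alpha^1 and alpha^3
Y = [3, 5]      # hypothetical errata magnitudes
nsym = 6        # n - k
synd = [sum(y * pow(x, j, P) for x, y in zip(X, Y)) % P for j in range(nsym)]
sigma = [1, 1, 5]   # (1 - 2z)(1 - 8z) mod 11

omega = find_error_evaluator(synd, sigma, nsym)

# Forney for this convention: Y_i = -X_i * Omega(1/X_i) / Sigma'(1/X_i).
sigma_prime = [(i * c) % P for i, c in enumerate(sigma)][1:]
magnitudes = []
for x in X:
    x_inv = pow(x, -1, P)               # modular inverse (Python 3.8+)
    num = poly_eval(omega, x_inv)
    den = poly_eval(sigma_prime, x_inv)
    magnitudes.append((-x * num * pow(den, -1, P)) % P)

print(omega, magnitudes)  # [8, 10] [3, 5]
```

<p>Omega comes out with degree lower than the number of errata, with no high-order terms to trim, and the magnitudes that were put in are recovered. With a different first consecutive root or a shifted syndrome, the modulus and the Forney scaling factor change accordingly.</p>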
<p>I am looking for a better solution in the <a href="https://cstheory.stackexchange.com/questions/31606/initialization-of-errata-evaluator-polynomial-for-simultaneous-computation-in-be">following question on CSTheory</a>.</p>
<p><strong>/EDIT:</strong> I will describe some of the issues I ran into and how to fix them:</p>
<ul>
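<li>Don't forget to initialize the errors locator polynomial with the erasures locator polynomial (which you can easily compute from the syndromes and the erasure positions).</li>
<li>If you can decode errors only and erasures only flawlessly, but only up to <code>2*errors + erasures <= (n-k)/2</code>, then you forgot to skip the first v iterations.</li>
<li>If you can decode both erasures and errors, but only up to <code>2*(errors+erasures) <= (n-k)</code>, then you forgot to update the assignment of L: <code>L = i+1 - L - erasures_count</code> instead of <code>L = i+1 - L</code>. But this may actually make your decoder fail in some cases, depending on how you implemented it; see the next point.</li>
<li>My first decoder was limited to a single generator/prime polynomial/fcr, but when I updated it to be universal and added strict unit tests, the decoder failed when it should not have. Blahut's schema above seems to be wrong about L (the update flag): it must be updated using <code>L = K - L</code> and not <code>L = K - L - erasures_count</code>, because the latter sometimes makes the decoder fail even though we are under the Singleton bound (and thus should decode correctly!). This seems to be confirmed by the fact that computing <code>L = K - L</code> not only fixes those decoding issues, but also gives exactly the same result as the alternative way of updating that does not use the update flag L at all (i.e. the condition <code>if Delta == ZERO or len(sigma[l+1]) <= len(sigma[l]):</code>). But this is weird: in my past implementation, <code>L = K - L - erasures_count</code> was mandatory for errors-and-erasures decoding, whereas now it seems to produce wrong failures. So you should try with and without it on your own implementation and see whether one or the other produces wrong failures for you.</li>
<li>Note that the condition <code>2*L[l] > K</code> becomes <code>2*L[l] > K+erasures_count</code>. You may not notice any side effect at first if you do not add <code>+erasures_count</code>, but in some cases the decoding will fail when it should not.</li>
<li>If you can fix only exactly one error or erasure, check that your condition is <code>2*L[l] > K+erasures_count</code> and not <code>2*L[l] >= K+erasures_count</code> (note the <code>></code> instead of <code>>=</code>).</li>
<li>If you can only correct up to <code>2*errors + erasures <= (n-k-2)</code> (just below the limit, e.g. with 10 ecc symbols you can correct only 4 errors instead of the normal 5), then check your syndrome and the loop inside your BM algorithm: if the syndrome starts with a 0 coefficient for the constant term x^0 (which is sometimes advised in books), then your syndrome is shifted, and the loop inside BM must start at <code>1</code> and end at <code>n-k+1</code>, instead of running from <code>0</code> to <code>n-k</code> as it does when the syndrome is not shifted.</li>
<li>If you can correct every symbol except the last one (the last ecc symbol), then check your ranges, particularly in your Chien search: you should not evaluate the error locator polynomial from alpha^0 to alpha^255 but from alpha^1 to alpha^256.</li>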
</ul>
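<p>The first and last points above can be checked in isolation with a small sketch. It builds an erasure locator in GF(2^8) and then scans every position to find its roots, in the spirit of a Chien search. It assumes the generator 3 mentioned in the code comments and the prime polynomial 0x11b (an assumption, it is not stated above), stores polynomials lowest degree first (the opposite of the <code>Polynomial</code> class used above), and all helper names are illustrative.</p>

```python
PRIM = 0x11b   # assumed prime polynomial for GF(2^8)
ALPHA = 3      # generator, as mentioned in the code comments above

def gf_mul(a, b):
    """Multiply two GF(2^8) elements (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return r

def gf_pow(a, e):
    """Raise a to the non-negative integer power e by repeated multiplication."""
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def poly_mul(p, q):
    """Multiply two polynomials (lowest degree first) over GF(2^8)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf_mul(pi, qj)
    return out

def poly_eval(poly, x):
    """Evaluate a polynomial (lowest degree first) at x, using Horner's rule."""
    acc = 0
    for c in reversed(poly):
        acc = gf_mul(acc, x) ^ c
    return acc

def erasures_locator(positions):
    """Product of (1 - alpha^i * x); minus is XOR in characteristic 2."""
    loc = [1]
    for i in positions:
        loc = poly_mul(loc, [1, gf_pow(ALPHA, i)])
    return loc

def chien_search(locator, n=255):
    """Return every position j whose inverse locator alpha^(-j) is a root."""
    return [j for j in range(n)
            if poly_eval(locator, gf_pow(ALPHA, (255 - j) % 255)) == 0]

loc = erasures_locator([18, 15])
print(chien_search(loc))  # [15, 18]
```

<p>If the scan covers one position too few, or the exponent-to-position mapping is off by one, exactly one symbol becomes uncorrectable, which is the symptom described in the last point. Which exponent range is correct depends on how positions map to powers of alpha in your own implementation.</p>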