Inconsistent Python calculation

Posted 2024-09-29 22:32:52

Male | A programmer who enjoys writing Python code.

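I am writing a function that calculates the chi-squared statistic of a piece of ciphertext, in order to determine the most likely key for a Caesar cipher.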

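To calculate the chi-squared statistic, I compute the relative frequency of each English letter from a file provided by PracticalCryptography.com and store the results in a dictionary. I then count the occurrences of each English letter in the ciphertext, ignoring spaces and punctuation, and store those in a second dictionary. Finally, I calculate the chi-squared statistic using the formula given on Wikipedia and PracticalCryptography.com.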
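For reference, the statistic is commonly written as below. The symbols are notation assumed here, not taken from the question: $O_i$ is the observed count of letter $i$ in the ciphertext, $p_i$ is its relative frequency in English, $N$ is the number of letters in the ciphertext (spaces and punctuation excluded), and $E_i$ is the expected count.

```latex
\chi^2 = \sum_{i=\mathrm{A}}^{\mathrm{Z}} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = p_i \, N
```

Note that the denominator is the expected count $E_i$, not the relative frequency $p_i$.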

Code:

def ChiSquared(Ciphertext):
    ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

    ExpOccurences = {}
    with open('english_monograms.txt', 'r') as File:
        for Line in File:
            Key, Count = Line.split(' ')
            ExpOccurences[Key] = int(Count)

    for Key, Count in ExpOccurences.items():
        ExpOccurences[Key] = Count / sum(ExpOccurences.values())

    ActOccurences = {}
    for Char in Ciphertext.upper():
        if Char in ALPHABET:
            if Char in ActOccurences.keys():
                ActOccurences[Char] += 1
            else:
                ActOccurences[Char] = 1

    Score = 0.0
    for Char in ActOccurences.keys():
        Score += (ActOccurences[Char] - (ExpOccurences[Char] * len(Ciphertext))) ** 2 / (ExpOccurences[Char])

    return Score

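My problem is that I get a different Score each time I call the function with the same string. Why does this happen, and how can I prevent or fix it? Should I hardcode the frequencies? Is it caused by underflow errors when computing the relative frequencies?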

Example of the inconsistency:

>>> ChiSquared('HELLO WORLD')
139.3055836714786
>>>
>>> ChiSquared('HELLO WORLD')
140.48594498993816
>>> 
>>> ChiSquared('HELLO WORLD')
132.3702317119948
>>>
>>> ChiSquared('HELLO WORLD')
65.840138355439
>>>
>>> ChiSquared('HELLO WORLD')
129.22608450306808
>>>
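
A likely cause, judging from the code alone: the normalization loop calls sum(ExpOccurences.values()) again on every iteration while it is overwriting entries, so each letter is divided by a different, partially normalized total. The result then depends on the order in which the dict is iterated. Before Python 3.7 that order was not guaranteed, and with string hash randomization (on by default since Python 3.3) it can change from one interpreter run to the next, which would explain the varying scores. Underflow is unlikely to be involved, and hardcoding the frequencies should not be necessary.

The sketch below is one possible fix, not a definitive implementation. The names normalize, buggy_normalize, and chi_squared and the three-letter toy counts are hypothetical, not from the question. It also assumes the usual form of the statistic, dividing by the expected count and using the number of letters rather than len(Ciphertext).

```python
import string
from collections import Counter


def normalize(counts):
    """Convert raw counts to relative frequencies.

    The total is computed once, before any value is changed,
    so the result does not depend on iteration order.
    """
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def buggy_normalize(counts, order):
    """Mimic the loop from the question for a given iteration order.

    The sum is recomputed after each overwrite, so later keys are
    divided by a total that already contains normalized values.
    """
    freqs = dict(counts)
    for key in order:
        freqs[key] = freqs[key] / sum(freqs.values())
    return freqs


def chi_squared(ciphertext, expected_freqs):
    """Chi-squared statistic of letter counts against expected frequencies.

    Assumes every value in expected_freqs is greater than zero.
    """
    letters = [c for c in ciphertext.upper() if c in string.ascii_uppercase]
    n = len(letters)  # letters only; spaces and punctuation are excluded
    if n == 0:
        return 0.0
    observed = Counter(letters)  # missing letters count as 0
    score = 0.0
    # Sum over every letter with an expected frequency,
    # including letters that never occur in the text.
    for char, freq in expected_freqs.items():
        expected = freq * n
        score += (observed[char] - expected) ** 2 / expected
    return score


# Hypothetical toy counts standing in for english_monograms.txt
toy_counts = {'A': 3, 'B': 2, 'C': 5}

# The question's loop gives different frequencies for different orders
print(buggy_normalize(toy_counts, 'ABC'))
print(buggy_normalize(toy_counts, 'CBA'))

# The fixed version is order-independent and sums to 1
toy_freqs = normalize(toy_counts)
print(toy_freqs)
print(chi_squared('AABCC', toy_freqs))
```

With the real data, the counts read from english_monograms.txt could be passed through normalize once and the resulting dictionary reused for every call, which also avoids re-reading the file each time.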
