Incorrect use of byte strings in a Python CRC16 implementation?

Posted 2024-10-02 00:41:25


I am trying to implement my own cyclic redundancy check (CRC) in Python. My program is laid out as follows:

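  1. random_message(n) generates a random byte message of length n
  2. The CRC code crc16 generates a checksum value
  3. The corruption code corrupt_data is run on the generated message
  4. Check whether the checksum is different (I did this with ==)
  5. Repeat steps 1 to 4 many times to see how often errors (i.e. corruptions) go undetected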

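I believe the crc16 and corrupt_data functions are correct, so I don't think there is much reason to analyse them closely. I think the problem starts where I use byte strings in the latter part of the program, after these two functions.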

My code is as follows:

from random import random
from random import choice
from string import ascii_uppercase

CORRUPTION_RATE = 0.25

def crc16(data: bytes):
    xor_in = 0x0000  # initial value
    xor_out = 0x0000  # final XOR value
    poly = 0x8005  # generator polynomial (normal form)

    reg = xor_in
    for octet in data:
        # reflect in
        for i in range(8):
            topbit = reg & 0x8000
            if octet & (0x80 >> i):
                topbit ^= 0x8000
            reg <<= 1
            if topbit:
                reg ^= poly
        reg &= 0xFFFF
        # reflect out
    return reg ^ xor_out

from random import randbytes


def corrupt_data(data : bytes):
    '''
    some random corruption of byte data
    can be modified as needed using the CORRUPTION_RATE global constant
    ''' 
    temp = data[:]
    while True:
        location = int(len(temp) * random())
        data_list = list(temp)
        if random() < 0.5:
            data_list[location] = (data_list[location] + 1) % 256
        else: 
            data_list[location] = (data_list[location] - 1) % 256
        temp = bytes(data_list)
        if random() < CORRUPTION_RATE and temp != data:
            break
    return temp

# Generate random byte message of length n
def random_message(n):
    
    randomBytes = ''.join(choice(ascii_uppercase) for i in range(n)).encode()
    print("randomBytes is " + str(randomBytes))
    print("The class type of randomBytes is " + str(type(randomBytes)))
    return randomBytes

    
    
numberOfErrors = 0;

for i in range(10000):

    # generating random byte message of length n
    randomMessage = random_message(5)

    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))

    # running the corruption on the generated message
    #print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
    corrupt = corrupt_data(b"checksumValue")
    #print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))

    #print("Checking whether the checksum is different ... ")
    different = (b"checksumValue" == corrupt)
    #print("bchecksumValue == corrupt is " + str(different))
    #print("bchecksumValue was " + str(b"checksumValue") + ", and corrupt was " + str(corrupt))
    
    if(different == False):
        numberOfErrors += 1
        
print("numberOfErrors is " + str(numberOfErrors))

As you can see, I inserted various print statements (now commented out) to help me debug.

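The problem is that when I run the code above, I get numberOfErrors is 10000. Clearly this cannot be right: we expect at least some of the checks to come out correct, so numberOfErrors should be somewhat less than 10000.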

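As I said, I am fairly sure the crc16 and corrupt_data functions are correct, and I suspect that something goes wrong where I use byte strings in the for loop: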

numberOfErrors = 0;

for i in range(10000):

    # generating random byte message of length n
    randomMessage = random_message(5)

    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))

    # running the corruption on the generated message
    #print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
    corrupt = corrupt_data(b"checksumValue")
    #print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))

    #print("Checking whether the checksum is different ... ")
    different = (b"checksumValue" == corrupt)
    #print("bchecksumValue == corrupt is " + str(different))
    #print("bchecksumValue was " + str(b"checksumValue") + ", and corrupt was " + str(corrupt))
    
    if(different == False):
        numberOfErrors += 1
        
print("numberOfErrors is " + str(numberOfErrors))

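I have never really programmed with bytes/byte strings, and I only recently started learning Python, so I don't understand what I am doing wrong. Where is my mistake, and how can I fix it?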


Edit

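As user2357112 mentioned in the comments, the problem may be the b"checksumValue" in corrupt = corrupt_data(b"checksumValue"). The issue I ran into is that crc16 returns an int, so to convert it back to bytes for passing to corrupt_data(data : bytes), I tried using the b prefix. I suppose this shows my inexperience with Python.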


Edit 2

OK, I am trying the solution offered in this answer. The modified code is as follows:

# running the corruption on the generated message
bs = str(checksumValue).encode('ascii')
print("str(checksumValue).encode('ascii') is " + str(bs))
#print("The class type of bchecksumValue is " + str(type(b"checksumValue")))
print("The class type of str(checksumValue).encode('ascii') is " + str(type(bs)))
#corrupt = corrupt_data(b"checksumValue")
corrupt = corrupt_data(bs)
#print("The class type of corrupt_data(bchecksumValue) is " + str(type(corrupt)))
print("The class type of corrupt_data(bs) is " + str(type(corrupt)))

The output is:

randomBytes is b'BBVFC'
The class type of randomBytes is <class 'bytes'>
checksumValue is 10073
The class type of checksumValue is <class 'int'>
str(checksumValue).encode('ascii') is b'10073'
The class type of str(checksumValue).encode('ascii') is <class 'bytes'>
The class type of corrupt_data(bs) is <class 'bytes'>

So the types appear to be what we expect.
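
For reference, the three ways of getting bytes that appear above behave quite differently. The sketch below is only illustrative: the snake_case variable names are made up for this example, the value 10073 is taken from the sample output above, and big-endian byte order is an assumption, not something the original code specifies.

```python
checksum_value = 10073  # sample value from the output above

# The b prefix builds a literal from the characters typed in the source.
# The variable named checksumValue is never read.
literal = b"checksumValue"
print(literal)       # the ASCII characters of the name itself
print(len(literal))  # 13

# Encoding the decimal string gives the ASCII digits of the number.
as_digits = str(checksum_value).encode("ascii")
print(as_digits)     # b'10073'

# int.to_bytes gives the raw 16-bit value (byte order is an assumption here).
as_raw = checksum_value.to_bytes(2, "big")
print(as_raw.hex())                   # 2759
print(int.from_bytes(as_raw, "big"))  # 10073
```

All three results have type bytes, so a type check alone cannot tell them apart.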


Edit 3

After applying the changes from Edit 2 inside the for loop, I still get numberOfErrors is 10000 as output. The code is as follows:

numberOfErrors = 0;

for i in range(10000):

    # generating random byte message of length n
    randomMessage = random_message(5)

    # generating the checksum value using the CRC code
    checksumValue = crc16(randomMessage)
    #print("checksumValue is " + str(checksumValue))
    #print("The class type of checksumValue is " + str(type(checksumValue)))

    # running the corruption on the generated message
    bs = str(checksumValue).encode('ascii')
    #print("str(checksumValue).encode('ascii') is " + str(bs))
    #print("The class type of str(checksumValue).encode('ascii') is " + str(type(bs)))
    corrupt = corrupt_data(bs)
    #print("The class type of corrupt_data(bs) is " + str(type(corrupt)))
    
    #print("Checking whether the checksum is different ... ")
    different = (bs == corrupt)
    #print("bs == corrupt is " + str(different))
    #print("bs was " + str(bs) + ", and corrupt was " + str(corrupt))
    
    if(different == False):
        numberOfErrors += 1
        
print("numberOfErrors is " + str(numberOfErrors))

1 answer
Answer 1 · posted 2024-10-02 00:41:25

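Your problem is not byte strings but a logic error. You are corrupting the wrong thing. You don't want to corrupt the checksum; you want to corrupt the original message and then compute the checksum of the corrupted version. Then you can compare whether the two checksums match.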

Try:

undetected_errors = 0

for i in range(10000):
    good_message = random_message(5)
    good_checksum = crc16(good_message)

    corrupted_message = corrupt_data(good_message)
    corrupted_checksum = crc16(corrupted_message)

    if good_checksum == corrupted_checksum:
        undetected_errors += 1
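
It may also help to see why the original loop always reported 10000: corrupt_data only returns once its output differs from its input, so comparing a value with its own corrupted copy can never match. Below is a self-contained sketch of the corrected experiment. The crc16 logic is copied from the question. The names count_undetected and rng, the seed parameter, and the use of bytearray and random.Random are assumptions added here for reproducibility, not part of the original code.

```python
import random
from string import ascii_uppercase

CORRUPTION_RATE = 0.25


def crc16(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x8005, zero initial value, as in the question."""
    reg = 0x0000
    for octet in data:
        for i in range(8):
            topbit = reg & 0x8000
            if octet & (0x80 >> i):
                topbit ^= 0x8000
            reg <<= 1
            if topbit:
                reg ^= 0x8005
        reg &= 0xFFFF
    return reg


def random_message(n: int, rng: random.Random) -> bytes:
    """Random message of n uppercase ASCII letters, encoded as bytes."""
    return "".join(rng.choice(ascii_uppercase) for _ in range(n)).encode()


def corrupt_data(data: bytes, rng: random.Random) -> bytes:
    """Nudge random bytes by +/-1; only return a result that differs from data."""
    temp = bytearray(data)
    while True:
        location = rng.randrange(len(temp))
        temp[location] = (temp[location] + rng.choice((1, -1))) % 256
        if rng.random() < CORRUPTION_RATE and bytes(temp) != data:
            return bytes(temp)


def count_undetected(trials: int, length: int, seed: int = 0) -> int:
    """Count corruptions whose CRC equals the CRC of the original message."""
    rng = random.Random(seed)
    undetected = 0
    for _ in range(trials):
        good_message = random_message(length, rng)
        corrupted_message = corrupt_data(good_message, rng)
        # Compare the checksums of the two messages, not the messages themselves.
        if crc16(good_message) == crc16(corrupted_message):
            undetected += 1
    return undetected


if __name__ == "__main__":
    print("undetected errors:", count_undetected(10000, 5))
```

Passing a seeded random.Random instance makes repeated runs comparable; drop the seed if fresh randomness on every run is preferred.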
