Serialized PyTorch state dict changes after it is loaded into a new model instance

import binascii
import torch.nn as nn
import pickle

lin1 = nn.Linear(1, 1, bias=False)
lin1s = pickle.dumps(lin1.state_dict())
print("--- original model ---")
print(f"hash of state dict: {hex(binascii.crc32(lin1s))}")
print(f"weight: {lin1.state_dict()['weight'].item()}")

lin2 = nn.Linear(1, 1, bias=False)
lin2.load_state_dict(pickle.loads(lin1s))
lin2s = pickle.dumps(lin2.state_dict())
print("\n--- model from deserialized state dict ---")
print(f"hash of state dict: {hex(binascii.crc32(lin2s))}")
print(f"weight: {lin2.state_dict()['weight'].item()}")

1 answer

User

#1 · posted 2024-10-02 10:32:55

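This is probably because pickle is not designed to produce a representation that is suitable for hashing (see "Using pickle.dumps to hash mutable objects"). It may be a better idea to compare the keys first, and then check whether the tensors stored under each key are equal or close.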
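To illustrate why pickled bytes make a poor fingerprint, here is a small standard-library sketch. It does not reproduce the exact cause in the question, which the answer only guesses at; it only shows two general cases where objects that compare equal still pickle to different bytes, namely insertion order in a dict and whether a container holds one shared object or two equal copies.

```python
import pickle

# Case 1: equal dicts built in a different insertion order.
d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}
same_value_dicts = d1 == d2
same_bytes_dicts = pickle.dumps(d1) == pickle.dumps(d2)

# Case 2: equal lists, but one holds the same object twice and the
# other holds two separate copies. pickle writes a back-reference for
# the repeated object, so the byte streams differ.
x = [1, 2]
y = [1, 2]
shared = [x, x]
copies = [x, y]
same_value_lists = shared == copies
same_bytes_lists = pickle.dumps(shared) == pickle.dumps(copies)

print(same_value_dicts, same_bytes_dicts)
print(same_value_lists, same_bytes_lists)
```

In both cases the values compare equal while the pickled bytes do not, so a checksum over `pickle.dumps(...)` can change even when nothing meaningful has changed.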

Here is a rough implementation of the key-and-tensor comparison:

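import torch

def compare_state_dict(dict1, dict2):
    # compare keys: every key of dict1 must also be in dict2 ...
    for key in dict1:
        if key not in dict2:
            return False

    # ... and every key of dict2 must also be in dict1
    for key in dict2:
        if key not in dict1:
            return False

    # compare the tensors stored under each key, element-wise within tolerance
    for (k, v) in dict1.items():
        if not torch.all(torch.isclose(v, dict2[k])):
            return False

    return True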

However, if you would still rather hash the state dict than use an isclose-style comparison like the one above, you can use the following function:

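def dict_hash(dictionary):
    # hash(tensor) depends on object identity, not on the values, so hash the
    # values instead; collect them in a new dict to leave the input untouched
    hashed = {}
    for (k, v) in dictionary.items():
        hashed[k] = hash(tuple(v.detach().flatten().tolist()))

    # dictionaries are not hashable and need to be converted to frozenset
    # (str hashes are salted per process, so compare results within one run)
    return hash(frozenset(hashed.items()))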
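If the fingerprint also needs to be stable across separate Python processes, the built-in `hash()` is not a good fit, because string hashes are salted per process by default. One possible alternative, sketched below, is to feed the keys and the raw tensor bytes into `hashlib.sha256`. The helper name `state_dict_digest` is hypothetical and not part of PyTorch, and the sketch assumes NumPy is installed and that every tensor has a dtype NumPy can represent (so not `bfloat16`, for example).

```python
import hashlib

import torch.nn as nn


def state_dict_digest(state_dict):
    """Hypothetical helper: SHA-256 over keys, dtypes, shapes and raw values."""
    digest = hashlib.sha256()
    # Visit keys in sorted order so the digest does not depend on dict order.
    for key in sorted(state_dict):
        tensor = state_dict[key].detach().cpu().contiguous()
        digest.update(key.encode("utf-8"))
        digest.update(str(tensor.dtype).encode("utf-8"))
        digest.update(str(tuple(tensor.shape)).encode("utf-8"))
        # Assumes the dtype is convertible to a NumPy array.
        digest.update(tensor.numpy().tobytes())
    return digest.hexdigest()


lin1 = nn.Linear(1, 1, bias=False)
lin2 = nn.Linear(1, 1, bias=False)
lin2.load_state_dict(lin1.state_dict())

digest1 = state_dict_digest(lin1.state_dict())
digest2 = state_dict_digest(lin2.state_dict())
print(digest1 == digest2)
```

This digest compares values bit for bit, so it is stricter than the `isclose` comparison: two state dicts that differ only by floating-point rounding will get different digests.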
