Flatten a dictionary by selecting/ignoring specific keys

Posted 2024-06-28 19:35:06

Question

Given a dict with multiple levels of nesting, flatten it according to the specified key paths.

Sample input data

input_data = [
    {
        "CreatedBy": {"Name":"User001"},
        "Lookup": {
            "TextField": "Some text",
            "UserField": {"Id": "ID001", "Name": "Name001"},
            "CreatedBy": {"Name": "User001"},
        },
        "Image": {"a": "b"},
    }
]

Test cases

Test case 1

Flatten only where a specified path matches

output = flatten_dict(input_data, use_keys=["Image", "Lookup.CreatedBy", "CreatedBy"])

expected = [{
    "CreatedBy.Name": "User001",
    "Lookup": {
        "TextField": "Some text",
        "UserField": {"Id": "ID001", "Name": "Name001"},
        "CreatedBy.Name": "User001",
    },
    "Image.a": "b",
}]

Test case 2

output = flatten_dict(input_data, use_keys=["Image", "Lookup.CreatedBy"])

expected = [{
    "CreatedBy": {"Name":"User001"},
    "Lookup": {
        "TextField": "Some text",
        "UserField": {"Id": "ID001", "Name": "Name001"},
        "CreatedBy.Name": "User001",
    },
    "Image.a": "b",
}]

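Test case 3: top-level keys take precedence

Flatten every sub-path under a given parent path; i.e., as soon as "Lookup" is given, the solution should flatten all the way down to CreatedBy.Name without it being mentioned explicitly.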

output = flatten_dict(input_data, use_keys=["Image", "Lookup.CreatedBy", "Lookup"])

expected = [{
    "CreatedBy": {"Name":"User001"},
    "Lookup.TextField": "Some text",
    "Lookup.UserField.Id": "ID001", 
    "Lookup.UserField.Name": "Name001",
    "Lookup.CreatedBy.Name": "User001",
    "Image.a": "b",
}]

Here is what I tried

For now I have limited the solution to a single dict; later I would like to extend it to a list of dicts.

def flatten(data, prev_key="", level=0, use_keys=["Image", "CreatedBy"]):
    if isinstance(data, list):
        data = data[0]
    res = {}
    for k, v in data.items():

        if level == 0:
            newkey = k
        else:
            newkey = prev_key + "." + k

        if isinstance(v, dict):
            flattened_val = flatten(data=v, prev_key=newkey, level=level + 1)
            if newkey in use_keys:
                res.update(flattened_val)
            else:
                res.update({".".join(newkey.split(".")[level-2:]): flattened_val})

        else:
            if newkey.split(".")[-2] in use_keys:
                res.update({".".join(newkey.split(".")[level-1:]): v})
            else:
                res.update({k: v})
    return res

1 answer
User
#1 · Posted 2024-06-28 19:35:06

You can use recursion with a generator:

[data] = [{'CreatedBy': {'Name': 'User001'}, 'Lookup': {'TextField': 'Some text', 'UserField': {'Id': 'ID001', 'Name': 'Name001'}, 'CreatedBy': {'Name': 'User001'}}, 'Image': {'a': 'b'}}]

def flatten_dict(d, use_keys=()):
    # Split each dotted key once, e.g. "Lookup.CreatedBy" -> ["Lookup", "CreatedBy"].
    key_paths = [key.split('.') for key in use_keys]

    def new_lookup(_d, c=()):
        # Yield [key, ..., key, value] for every leaf below _d.
        for a, b in _d.items():
            if not isinstance(b, dict):
                yield [*c, a, b]
            else:
                yield from new_lookup(b, [*c, a])

    def flatten(_d, c=()):
        new_d = {}
        for a, b in _d.items():
            path = [*c, a]
            # A key path matches when it equals the tail of the current path.
            # Only dict values are flattened; plain values are copied as-is.
            if isinstance(b, dict) and any(path[-len(p):] == p for p in key_paths):
                for *j, k in new_lookup(b):
                    new_d['.'.join([a, *j])] = k
            else:
                new_d[a] = flatten(b, path) if isinstance(b, dict) else b
        return new_d

    return flatten(d)

#test case 1
print([flatten_dict(data, use_keys = ["Image", "Lookup.CreatedBy", "CreatedBy"])])

Output:

[
  {'CreatedBy.Name': 'User001', 
   'Lookup': 
      {'TextField': 'Some text', 
      'UserField': {'Id': 'ID001', 'Name': 'Name001'}, 
      'CreatedBy.Name': 'User001'}, 
    'Image.a': 'b'}
]

#test case 2
print([flatten_dict(data, use_keys=["Image", "Lookup.CreatedBy"])])

Output:

[
   {'CreatedBy': {'Name': 'User001'}, 
   'Lookup': {'TextField': 'Some text', 
   'UserField': {'Id': 'ID001', 'Name': 'Name001'}, 
   'CreatedBy.Name': 'User001'}, 
   'Image.a': 'b'}
]
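
In test case 2 the top-level CreatedBy stays nested because only "Lookup.CreatedBy" is listed. The matching test inside `flatten` compares each dotted key with the tail of the current path; the helper below isolates that rule as a rough sketch. The name `matches_key_path` is hypothetical and is not part of the answer's code.

```python
def matches_key_path(path, use_keys):
    """Return True if any dotted key in use_keys equals the tail of path."""
    for key in use_keys:
        parts = key.split(".")
        # Compare the last len(parts) components of the current path.
        if path[-len(parts):] == parts:
            return True
    return False


use_keys = ["Image", "Lookup.CreatedBy"]
print(matches_key_path(["CreatedBy"], use_keys))            # False
print(matches_key_path(["Lookup", "CreatedBy"], use_keys))  # True
# Tail matching also means a bare "CreatedBy" matches at any depth:
print(matches_key_path(["Lookup", "CreatedBy"], ["CreatedBy"]))  # True
```

The last call shows a side effect worth knowing about: because only the tail is compared, listing "CreatedBy" alone would also flatten Lookup.CreatedBy.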

#test case 3
print([flatten_dict(data, use_keys=["Image", "Lookup.CreatedBy", "Lookup"])])

Output:

[
   {'CreatedBy': {'Name': 'User001'}, 
   'Lookup.TextField': 'Some text', 
   'Lookup.UserField.Id': 'ID001', 
   'Lookup.UserField.Name': 'Name001', 
   'Lookup.CreatedBy.Name': 'User001', 
   'Image.a': 'b'}
]
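
The question also asks about extending the solution to a list of dicts. One possible sketch is below. It is a separate variant, not the answer's code: `flatten_records` is a hypothetical name, and it assumes a key path must match exactly from the root of each record (rather than matching the tail of the path), so listing "CreatedBy" alone would leave Lookup.CreatedBy nested.

```python
def flatten_records(records, use_keys=()):
    """Selectively flatten each dict in a list (hypothetical helper)."""
    # Assumption: every entry in use_keys is a full path from the record root.
    selected = {tuple(key.split(".")) for key in use_keys}

    def leaves(node, prefix):
        # Yield (dotted_key, value) for every non-dict value below node.
        for key, value in node.items():
            dotted = f"{prefix}.{key}"
            if isinstance(value, dict):
                yield from leaves(value, dotted)
            else:
                yield dotted, value

    def walk(node, path=()):
        result = {}
        for key, value in node.items():
            current = path + (key,)
            if not isinstance(value, dict):
                result[key] = value
            elif current in selected:
                # Exact match from the root: collapse the whole subtree.
                result.update(leaves(value, key))
            else:
                result[key] = walk(value, current)
        return result

    return [walk(record) for record in records]


records = [
    {
        "CreatedBy": {"Name": "User001"},
        "Lookup": {
            "TextField": "Some text",
            "UserField": {"Id": "ID001", "Name": "Name001"},
            "CreatedBy": {"Name": "User001"},
        },
        "Image": {"a": "b"},
    }
]
print(flatten_records(records, use_keys=["Image", "Lookup.CreatedBy", "Lookup"]))
```

For the three test cases in the question this variant should give the same results as the answer above; the two only differ when a listed key also appears deeper in the structure under a different parent.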
