Mapping a dictionary's values to a new dictionary with new keys

Posted 2024-09-24 12:35:07


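I read tabular data from a PDF file, convert it to a dataframe, and then convert that to a dictionary. My problem is that the dictionary's keys are not fixed from one table to the next: sometimes they look like {'Sno': 1, 'itemsdescription': 'ABC'}, and sometimes like {'Sl No': 1, 'Description': 'XYZ'}. I want to create a new dictionary with fixed keys, as listed below. The name on the left is the fixed key, and the list on the right holds the keys that may be extracted from the dataframe. If an extracted key matches an entry in a list, its value should be mapped to the corresponding new key.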

Srno = ["Sno", "Sl No.", "Order No.", "PO No."]
Productdescription = ["Item Code / Product Description", "Description", "Description of Goods", "Particulars"]
HSNCode = ["HSN / SAC\nCode", "HSN Code", "HSN", "HSN/SAC"]
Quantity = ["Quantity"]
ASIN = ["ASIN"]
ISBN = ["ISBN/EAN/UPC"]
Rate = ["Unit Price\n[INR]", "Rate", "Unit cost", "List price"]
Tax = ["IGST[INR]\nAmount", "Tax rate", "Tax type", "Tax amount"]
Discount = ["Discount", "Disc. %"]
Total = ["Total amount", "Amount", "Total", "Total\n[INR]", "Line Total\n[INR]"]
Model = ["Model #"]

Below is a sample dictionary created from the dataframe:

{'item': [{'Sno': 1,
   'ItemCodeProductDescription': 'TGMOCL0015CORSAIRMOUSE,M55RGBPRO,PART#CH-9308011-AP',
   'HSNSACCode': '8471.60.60',
   'Quantity': 7,
   'UnitPrice': 1741,
   'Total': 12187,
   'Rate': 18,
   'LineTotal': 14380.66},
  {'Sno': 2,
   'ItemCodeProductDescription': 'TGMOCL0013CORSAIRMOUSE,HARPOONPRO-BLK-RGB,PART#CH-9301111-AP',
   'HSNSACCode': '8471.60.60',
   'Quantity': 8,
   'UnitPrice': 1200,
   'Total': 9600,
   'Rate': 18,
   'LineTotal': 11328.0},
  {'Sno': 3,
   'ItemCodeProductDescription': 'TGCBCL0029CORSAIRCABINETSPEC-05,BLK-PART#CC-9011138-WW',
   'HSNSACCode': '8473.30.99',
   'Quantity': 37,
   'UnitPrice': 2225,
   'Total': 82325,
   'Rate': 18,
   'LineTotal': 97143.5},
  {'Sno': 4,
   'ItemCodeProductDescription': 'TGHSCL0003CORSAIRGAMINGHEADSETHS50StereoCarbonPART#CA-9011170-AP',
   'HSNSACCode': '8518.30.00',
   'Quantity': 92,
   'UnitPrice': 3000,
   'Total': 276000,
   'Rate': 18,
   'LineTotal': 325680.0},
  {'Sno': 5,
   'ItemCodeProductDescription': 'TGMOCL0001CORSAIRMOUSE,HARPOON-BLK-RGB,PART#CH-9301011-AP',
   'HSNSACCode': '8471.60.60',
   'Quantity': 43,
   'UnitPrice': 1018,
   'Total': 43774,
   'Rate': 18,
   'LineTotal': 51653.32},
  {'Sno': 6,
   'ItemCodeProductDescription': 'TGKBCL0001CORSAIRKEYBOARDK95PLTN-BLK-MXSpeed-RGBPART#CH-9127014-NA',
   'HSNSACCode': '8471.60.40',
   'Quantity': 8,
   'UnitPrice': 10750,
   'Total': 86000,
   'Rate': 18,
   'LineTotal': 101480.0},
  {'Sno': 7,
   'ItemCodeProductDescription': 'TGKBCL0007CORSAIRKEYBOARDK55-BLK-RBRDME-RGBPART#CH-9206015-NA',
   'HSNSACCode': '8471.60.40',
   'Quantity': 14,
   'UnitPrice': 2400,
   'Total': 33600,
   'Rate': 18,
   'LineTotal': 39648.0}]}

The final dictionary should look like this:

{'item': [{'Srno': 1,
       'ProductDescription': 'TGMOCL0015CORSAIRMOUSE,M55RGBPRO,PART#CH-9308011-AP',
       'HSNCode': '8471.60.60',
       'Quantity': 7,
       'ASIN': None,
       'ISBN': None,
       'Rate': 1741,
       'Discount': None,
       'Model': None,
       'Tax': 18,
       'Total': 14380.66},
      # ... the remaining items follow the same pattern
      ]}

Please suggest an efficient way to create the new dictionary from the old one.


1 answer
User
#1 · Posted 2024-09-24 12:35:07

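Because a dictionary's keys cannot safely be changed while you are iterating over it, the renaming has to go through a new dictionary. The correct new key is found by checking each source key against the lists of possible names: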

# `items` is the dictionary built from the dataframe, as shown in the question
result_list = []

for i in items['item']:
    result = {}

    for key, value in i.items():
        if key in Srno:
            result['Srno'] = value
        elif key in Productdescription:
            result['ProductDescription'] = value
        elif key in HSNCode:
            result['HSNCode'] = value
        elif key in Quantity:
            result['Quantity'] = value
        elif key in ASIN:
            result['ASIN'] = value
        elif key in ISBN:
            result['ISBN'] = value
        elif key in Rate:
            result['Rate'] = value
        elif key in Tax:
            result['Tax'] = value
        elif key in Discount:
            result['Discount'] = value
        elif key in Total:
            result['Total'] = value
        elif key in Model:
            result['Model'] = value

    # Skip rows where no key matched; `result` is rebuilt on every pass, so no copy is needed
    if result:
        result_list.append(result)
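
The if/elif chain above compares keys exactly, so a key such as 'Sl No' (no trailing period) or 'ItemCodeProductDescription' (spaces and slashes stripped) would not match any list entry, and keys missing from a row are simply absent rather than None. A table-driven variant might look like the sketch below. The names `ALIASES`, `LOOKUP`, `normalize` and `remap` are hypothetical, and the normalization rule (drop a bracketed unit such as [INR], then keep only lowercase letters and digits) is an assumption about how the extracted keys were cleaned, not something the question states.

```python
import re

# Target key -> accepted source headers (copied from the lists in the question).
ALIASES = {
    "Srno": ["Sno", "Sl No.", "Order No.", "PO No."],
    "ProductDescription": ["Item Code / Product Description", "Description",
                           "Description of Goods", "Particulars"],
    "HSNCode": ["HSN / SAC\nCode", "HSN Code", "HSN", "HSN/SAC"],
    "Quantity": ["Quantity"],
    "ASIN": ["ASIN"],
    "ISBN": ["ISBN/EAN/UPC"],
    "Rate": ["Unit Price\n[INR]", "Rate", "Unit cost", "List price"],
    "Tax": ["IGST[INR]\nAmount", "Tax rate", "Tax type", "Tax amount"],
    "Discount": ["Discount", "Disc. %"],
    "Total": ["Total amount", "Amount", "Total", "Total\n[INR]",
              "Line Total\n[INR]"],
    "Model": ["Model #"],
}


def normalize(header):
    """Hypothetical helper: drop bracketed units, keep lowercase letters/digits."""
    header = re.sub(r"\[.*?\]", "", header)
    return re.sub(r"[^0-9a-z]", "", header.lower())


# Reverse index: normalized source header -> target key.
LOOKUP = {normalize(alias): target
          for target, aliases in ALIASES.items()
          for alias in aliases}


def remap(row):
    """Return a new dict holding every target key; unmatched targets stay None."""
    result = dict.fromkeys(ALIASES)  # every target key starts as None
    for key, value in row.items():
        target = LOOKUP.get(normalize(key))
        # Assumption: if two source keys map to the same target, the first wins.
        if target is not None and result[target] is None:
            result[target] = value
    return result


rows = [{"Sl No": 1, "Description": "XYZ"},
        {"Sno": 2, "itemsdescription": "ABC"}]
remapped = [remap(row) for row in rows]
```

With this sketch, 'itemsdescription' stays unmapped until it is added to an alias list. Also note that in the sample from the question, 'UnitPrice' and 'Rate' both resolve to Rate, and 'Total' and 'LineTotal' both resolve to Total, so the alias lists alone cannot reproduce the expected output there; the first-match rule is only one possible tie-break.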
