将包含列表的嵌套dict导入csv

2024-05-18 09:10:09 发布

您现在位置:Python中文网/ 问答频道 /正文

我正在尝试将以下数据导入CSV:

{'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}

我几乎成功地使用了以下代码:

#!/bin/python3

## Import ##
# Offical
import csv

### Main ###
if __name__ == '__main__':
  ## Variables
  csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
  final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}

  # Open the csv file in "write mode"
  with open(file_name, mode='w') as file:
      # Prepare the writer to add a dict into the csv file
      csv_writer = csv.DictWriter(file, fieldnames=headers)

      # Write the columns header into the csv file
      csv_writer.writeheader()
      # Write the dict into the file
      for key, val in nest_dict.items():
          row = {'FQDN': key}
          row.update(val)
          csv_writer.writerow(row)

结果是:

domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,['UNKNOWN'],"[443, 8443]"

但我想:

domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','443'
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','8443'

看到区别了吗? 我有一个“服务”列表(这里不需要治疗)和一个“端口”列表。 如果“端口”列中有多个端口,我需要为列表中的每个端口打印新行

我很难做到这一点,因为我没有完全理解以下代码:

# Write the dict into the file
      for key, val in nest_dict.items():
          row = {'FQDN': key}
          row.update(val)
          csv_writer.writerow(row)

你能帮我一下吗


Tags: csvthetestipcomfoodomainas
2条回答

这是最后的代码

不确定这是不是最好的方式,但对我来说已经足够好了

#!/bin/python3

## Import ##
# Offical
import csv

### Main ###
if __name__ == '__main__':
  ## Variables
  csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
  final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}

# Open the csv file in "write mode"
with open(file_name, mode='w') as file:
    # Prepare the writer to add a dict into the csv file
    csv_writer = csv.DictWriter(file, fieldnames=csv_headers)

    # Write the columns header into the csv file
    csv_writer.writeheader()

    for key, val in final_data.items():
        # ?
        row = {'FQDN': key}
        # Update the row with all columns values
        row.update(val)
        # If service contains multiple elements it will transform the list into a string with each string separate by a space
        # If service contains just one element, it will transform the list into a string (no space) added
        row['service'] = ' '.join(val['service'])

        # Write a row for each value in the port list
        for port in val['port']:
            row['port'] = port
            csv_writer.writerow(row)

结果输出:

domain,FQDN,AS,ressource_type,nb_ip,IP,service,port
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','443'
foo.com,test.foo.com,AS1111,A,1,1.1.1.1,'UNKNOWN','8443'

不要投票支持我的答案,我把它作为知识的目的。 所有奖励都应归@bartonstanley所有

这将给出给定数据的预期结果:

### Main ###
if __name__ == '__main__':
  ## Variables
  csv_headers = ['domain', 'FQDN', 'AS', 'ressource_type', 'nb_ip', 'IP', 'service', 'port']
  final_data = {'test.foo.com': {'domain': 'foo.com','FQDN': 'test.foo.com', 'AS': 'AS1111', 'ressource_type': 'A', \
  'nb_ip': '1', 'IP': '1.1.1.1', 'service': ['UNKNOWN'], 'port': '[443, 8443]'}}

  # Open the csv file in "write mode"
  with open('out.csv', mode='w') as file:
      # Prepare the writer to add a dict into the csv file
      csv_writer = csv.DictWriter(file, fieldnames=csv_headers)

      # Write the columns header into the csv file
      csv_writer.writeheader()
      # Write the dict into the file
      for key, val in final_data.items():
          row = {'FQDN': key}
          # Assume that service is always a list of one value and replace it with the one value
          # it contains.
          val['service'] = val.pop('service')[0]
          row.update(val)
          # Since the value of port is quoted it will be a string, but we wat a list. Remove the
          # value of 'port' from the dict and put it in 'port_string' (= '[443, 8443'')
          port_string = val.pop('port')
          # Remove the opening and closing brackets from the port_string (= '443, 8443').
          port_string = port_string.replace('[', '')
          port_string = port_string.replace(']', '')
          # Now we can split the string into a python list (= ['443', ' 8443'])
          port_list = port_string.split(',')
          # Write a csv row for each value in the port list
          for port in port_list:
              row['port'] = port.strip()
              csv_writer.writerow(row)

(顺便说一句,原始帖子中的代码没有运行。此代码包括使其运行的编辑。)

请注意,由于引用了“port”的值(与“service”的值不同),因此它将作为字符串读入,因此必须首先将其转换为列表。如果删除[443,8443]周围的单引号,则代码的端口部分简化为:

          port_list = val.pop('port')
          # Write a csv row for each value in the port list
          for port in port_list:
              row['port'] = port
              csv_writer.writerow(row)

另一个潜在问题是“服务”。它是一个列表,所以它可以有多个值吗?如果是这样,那么需要修改代码来解决这个问题

最后,我在这里展示的代码可以更具python风格,但我希望确保它对初学者来说尽可能可读。一旦它完全按照需要工作,它就可以变得更像Python

相关问题 更多 >