pd.DataFrame with scalar values

with open(path_to_read_csv_file, "r") as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for line in csv_reader:
        # if validation(line[specific_column]):
        try:
            df = pd.DataFrame(line)
            df.to_csv(path_to_save_csv_file)
        except Exception as e:
            print('Something Happend!')
            print(e)
            continue

inFile = open(path_to_read_csv_file, 'r')
outFile = open(path_to_save_csv_file, 'w')
for line in inFile:
    try:
        print('Analysing:', line)
        # HERE, how can I get the specific column value? I used to use line[specific_column] in the last version
        if validation(line[specific_column]):
            outFile.write(line)
        else:
            continue
    except Exception as e:
        print('Something Happend!')
        print(e)
        continue
outFile.close()
inFile.close()

2 Answers

User

#1 · edited 2024-09-27 23:23:46

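The pd.DataFrame constructor expects you to tell it how the data you provide should be indexed. This is documented here.

When no fieldnames are given, csv.DictReader takes them from the file itself: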

the values in the first row of file f will be used as the fieldnames.

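For more information, see the csv documentation.

So each line parsed by csv_reader is a dictionary whose keys are the CSV headers and whose values are the fields of that particular row.

For example, if my CSV is: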

Header1,Header2,Header3
1,2,3
11,11,33

Then on the first iteration, the line object will be:

{'Header1': '1', 'Header2': '2', 'Header3': '3'}
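A minimal sketch to check this yourself. It assumes the sample CSV above is held in an in-memory string (via io.StringIO) instead of a file on disk; csv_text and rows are names made up for illustration.

```python
import csv
import io

# Assumption: the sample CSV from above, kept in memory instead of on disk.
csv_text = "Header1,Header2,Header3\n1,2,3\n11,11,33\n"

# DictReader takes the field names from the first row,
# then yields one dictionary per remaining row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for line in rows:
    print(line)
```

Note that every value comes back as a string, so numeric checks need an explicit conversion.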

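Now, when you pass this to pd.DataFrame, you need to specify what the data is and what the header/index is. In this case, the data is ['1', '2', '3'] and the header/index is ['Header1', 'Header2', 'Header3']. These can be extracted by calling line.values() and line.keys() respectively.

Here is the change I made: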

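import csv

import pandas as pd

with open(path_to_read_csv_file, "r") as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=',')
    for line in csv_reader:
        try:
            # validation ...
            # The headers become the index; the values form a single column.
            df = pd.DataFrame(list(line.values()), index=list(line.keys()))
            # Note: this overwrites the output file on every iteration.
            df.to_csv(path_to_save_csv_file)

        except Exception as e:
            print('Something Happened!')
            print(e)
            continue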
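If the goal is to keep only the rows that pass validation and save them to one file, one possible variation is to collect the row dictionaries and build the DataFrame once at the end. This is only a sketch under assumptions: the validation function below is a hypothetical stand-in (it keeps values greater than 5), the column name "Header1" is chosen arbitrarily, and the CSV is read from an in-memory string instead of path_to_read_csv_file.

```python
import csv
import io

import pandas as pd


def validation(value):
    # Hypothetical check, for illustration only: keep values greater than 5.
    return int(value) > 5


# Assumption: the sample CSV from above, kept in memory instead of on disk.
csv_text = "Header1,Header2,Header3\n1,2,3\n11,11,33\n"

kept_rows = []
for line in csv.DictReader(io.StringIO(csv_text)):
    # Each line is a dict, so a column is looked up by its header name.
    if validation(line["Header1"]):
        kept_rows.append(line)

# A list of dicts gives one DataFrame row per dict, with the keys as columns.
df = pd.DataFrame(kept_rows)

# With no path given, to_csv returns the CSV text instead of writing a file.
output = df.to_csv(index=False)
print(output)
```

Passing a file path to to_csv instead would write the filtered rows to disk in a single call.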

User

#2 · edited 2024-09-27 23:23:46

This should help you. Basically, you can't create a DataFrame from scalar values alone. They have to be wrapped in, for example, a list.
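A small sketch of the difference, using the row dictionary from the first answer as sample data. The names error_message and df are made up for illustration.

```python
import pandas as pd

# Sample row, as csv.DictReader would produce it: every value is a scalar.
line = {"Header1": "1", "Header2": "2", "Header3": "3"}

# Passing a dict of scalars directly raises ValueError,
# because pandas cannot infer an index from scalar values.
try:
    pd.DataFrame(line)
    error_message = None
except ValueError as e:
    error_message = str(e)
print(error_message)

# Wrapping the dict in a list yields a one-row DataFrame
# whose columns are the dictionary keys.
df = pd.DataFrame([line])
print(df)
```

Wrapping the whole dictionary in a list is one option; wrapping each value in its own list, or passing an explicit index, should work as well.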
