How do I loop over multiple DataFrames and write multiple CSVs?

Posted 2024-10-01 13:24:18


Coming to Python from R, I'm having trouble using pandas to write multiple CSVs from a list of DataFrames:

import pandas
from dplython import (DplyFrame, X, diamonds, select, sift, sample_n,
                  sample_frac, head, arrange, mutate, group_by, summarize,
                  DelayFunction)

diamonds = [diamonds, diamonds, diamonds]
path = "/user/me/" 

def extractDiomands(path, diamonds):
    for each in diamonds:
    df = DplyFrame(each) >> select(X.carat, X.cut, X.price) >> head(5)
    df = pd.DataFrame(df) # not sure if that is required
    df.to_csv(os.path.join('.csv', each))

extractDiomands(path,diamonds)

But this produces an error. Thanks for any suggestions!


1 answer
User
#1 · Posted 2024-10-01 13:24:18

Welcome to Python! First, I'll load two libraries and download a sample dataset.

import os
import pandas as pd

example_data =  pd.read_csv("http://www.ats.ucla.edu/stat/data/binary.csv")
print(example_data.head(5))

This prints the first five rows of the sample data, which include the `gre` and `admit` columns used below.

Now, I think what you want to do is this:

# spawn a few datasets to loop through
df_1, df_2, df_3 = example_data.head(20), example_data.tail(20), example_data.head(10)
list_of_datasets = [df_1, df_2, df_3]

output_path = 'scratch'
# In Python you can loop through collections of items directly; it's pretty cool.
# With enumerate(), each step through the loop gives you both the index and the item.
for index, dataset in enumerate(list_of_datasets):

    # Filter to keep just a couple columns
    keep_columns =   ['gre', 'admit']
    dataset = dataset[keep_columns]

    # Export to CSV
    filepath = os.path.join(output_path, 'dataset_'+str(index)+'.csv')
    dataset.to_csv(filepath)

In the end, my 'scratch' folder has three new CSVs: dataset_0.csv, dataset_1.csv, and dataset_2.csv.
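
Applied to the function in the question, the same pattern might look like the sketch below. It uses plain pandas instead of dplython, and it builds each file name from the loop index, since `os.path.join('.csv', each)` in the question passes a DataFrame where a file name is expected. The name `export_datasets`, the toy DataFrame, and the temporary output folder are all hypothetical, made up only for illustration.

```python
import os
import tempfile

import pandas as pd


def export_datasets(output_path, datasets, columns, n_rows=5):
    """Write the first n_rows of the chosen columns of each DataFrame to its own CSV."""
    os.makedirs(output_path, exist_ok=True)  # create the folder if it is missing
    written = []
    for index, dataset in enumerate(datasets):
        subset = dataset[columns].head(n_rows)
        filepath = os.path.join(output_path, f"dataset_{index}.csv")
        subset.to_csv(filepath, index=False)  # index=False leaves out the row labels
        written.append(filepath)
    return written


# Made-up stand-in for the diamonds data (values are illustrative only)
toy = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29, 0.31, 0.24, 0.26],
    "cut": ["Ideal", "Premium", "Good", "Good", "Ideal", "Fair"],
    "price": [326, 326, 334, 335, 336, 337],
    "depth": [61.5, 59.8, 62.4, 63.3, 62.8, 61.9],
})

# Write into a throwaway temporary folder so nothing existing is overwritten
out_dir = os.path.join(tempfile.mkdtemp(), "scratch")
paths = export_datasets(out_dir, [toy, toy, toy], ["carat", "cut", "price"])
print([os.path.basename(p) for p in paths])
```

Returning the list of written paths is a design choice for this sketch; it makes it easy to check afterwards which files were produced.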
