How can I create a new DataFrame for each column by looping over a DataFrame's columns?

Posted 2024-06-28 19:22:19


I have a DataFrame named data with the following columns:

'ContextID', 'strategyname', 'Date', 'Time_ms', 'Time_Elapsed', 
'StepID', 'WfrCntSinceLastClean', 'Ar_Flow_sccm', 'BacksGas_Flow_sccm', 
'BacksGas_Prs_Torr', 'EscAct_Curr_A', 'EscAct_Volt_V',
'EscRF_P2P_Volt_V', 'Mano100mTorr_Prs_Torr'

Every column from Ar_Flow_sccm onward is a parameter.

I want to create one DataFrame per parameter. Each of these DataFrames must have the columns ContextID, the parameter column, StepID, and Time_Elapsed.

I wrote the following function for this:

def param(df, col_name):
    d = df.loc[:, ['ContextID', col_name, 'StepID', 'Time_Elapsed']]
    return d

When I call it like this:

BacksGas_Flow_sccm  = param(data, 'BacksGas_Flow_sccm')

I get a DataFrame named BacksGas_Flow_sccm with the columns ContextID, BacksGas_Flow_sccm, StepID, and Time_Elapsed.

I could do this by hand for every parameter column, but is there a simpler way? Perhaps something like:

for col in data.columns[7:]:
    'create the dataframes of the col'

Edit: part of my DataFrame:

 ContextID   strategyname   Date   Time_ms    Time_Elapsed   StepID    WfrCntSinceLastCount    Ar_Flow_sccm     BacksGas_Flow_sccm     BascksGas_Prs_Torr    EscAct_Curr_A    EscAct_Volt_V    EscRF_P2P_Volt_V         Mano100mTorr_Prs_Torr
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:48.502000000   0.0 1   0   49.560546875    1.953125    1.00000001335143e-10    0.122100122272968   1.22100126743317    12.4542121887207    0.00263671879656613
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:48.603000000   0.101   2   0   49.560546875    2.05078125  0.00244140625   0.0 0.0 12.4542121887207    0.00234375009313226
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:48.934000000   0.43200000000000005 2   0   99.853515625    2.05078125  0.00244140625   0.0 0.0 12.4542121887207    0.00234375009313226
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:49.924000000   1.4220000000000002  2   0   351.318359375   2.05078125  0.00244140625   0.122100122272968   2.44200253486633    12.4542121887207    0.00380859384313226
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:50.924000000   2.422   2   0   382.8125    1.953125    1.00000001335143e-10    0.122100122272968   0.0 12.4542121887207    0.004321289248764511
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:51.924000000   3.422   2   0   382.8125    1.7578125   1.00000001335143e-10    0.122100122272968   1.8315018415451 13.1868133544922    0.004321289248764511
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:52.934000000   4.432   2   0   382.8125    1.7578125   1.00000001335143e-10    0.122100122272968   0.0 12.4542121887207    0.004321289248764511
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:54.440000000   5.938000000000001   2   0   382.8125    1.85546875  1.00000001335143e-10    0.122100122272968   0.610500633716583   12.4542121887207    0.004321289248764511
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:54.992000000   6.49    2   0   382.8125    1.7578125   1.00000001335143e-10    0.122100122272968   0.0 12.4542121887207    0.004321289248764511
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:56.430000000   7.928000000000001   5   0   382.8125    9.08203125  0.13671875  0.122100122272968   1.8315018415451 12.4542121887207    0.00437011709436774
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:57.440000000   8.938   5   0   382.8125    46.19140625 2.109375    0.122100122272968   3.05250310897827    12.4542121887207    0.00437011709436774
    7289973 Speed2_Gas_Basics   2018-07-09  0 days 09:12:58.440000000   9.938   5   0   382.8125    46.19140625 2.109375    0.122100122272968   0.610500633716583   13.1868133544922    0.00437011709436774

1 answer
Answer #1 · posted 2024-06-28 19:22:19

IIUC, you can change your function to:

def param(df, col_name):
    d= (df.loc[:, ['ContextID']+
        [col_name]+['StepID', 'Time_Elapsed']])
    return d

Then use a dict comprehension to build a dict of DataFrames:

d = {'df_{}'.format(col): param(df, col)
     for col in df.columns[df.columns.get_loc('Ar_Flow_sccm'):]}
print(d)

This stores the DataFrames in a dict. The keys are named df_Ar_Flow_sccm and so on, and each value is a DataFrame with columns such as ['ContextID', 'Ar_Flow_sccm', 'StepID', 'Time_Elapsed'].

You can look up any key in the dict to see its DataFrame, for example:

print(d['df_Ar_Flow_sccm'])
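
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The column names come from the question; the two rows of values are made-up placeholders, and the names `frames` and `first_param` are only illustrative.

```python
import pandas as pd

# Column names taken from the question; the cell values below are dummy placeholders.
columns = [
    'ContextID', 'strategyname', 'Date', 'Time_ms', 'Time_Elapsed',
    'StepID', 'WfrCntSinceLastClean', 'Ar_Flow_sccm', 'BacksGas_Flow_sccm',
    'BacksGas_Prs_Torr', 'EscAct_Curr_A', 'EscAct_Volt_V',
    'EscRF_P2P_Volt_V', 'Mano100mTorr_Prs_Torr',
]
# Two rows of dummy data: every cell in row r holds the value r.
data = pd.DataFrame([[r] * len(columns) for r in range(2)], columns=columns)


def param(df, col_name):
    """Return ContextID, the given parameter column, StepID and Time_Elapsed."""
    return df.loc[:, ['ContextID', col_name, 'StepID', 'Time_Elapsed']]


# Parameter columns are everything from 'Ar_Flow_sccm' to the last column.
first_param = data.columns.get_loc('Ar_Flow_sccm')
frames = {'df_{}'.format(col): param(data, col)
          for col in data.columns[first_param:]}

# Show each key together with the columns of its DataFrame.
for name, frame in frames.items():
    print(name, list(frame.columns))
```

Keeping the results in a dict rather than creating one variable per column means the set of parameters can change without touching the code, and you can still loop over all of them with `frames.items()`.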

Note: df.columns.get_loc('Ar_Flow_sccm') returns 7.
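
A quick way to check that position, sketched here with an empty DataFrame that only carries the column labels listed in the question (the variable name `start` is illustrative). Looking the position up by name gives the same slice as the hard-coded `data.columns[7:]` from the question, but keeps working if columns are added before Ar_Flow_sccm.

```python
import pandas as pd

# Only the labels matter here, so an empty DataFrame is enough.
columns = [
    'ContextID', 'strategyname', 'Date', 'Time_ms', 'Time_Elapsed',
    'StepID', 'WfrCntSinceLastClean', 'Ar_Flow_sccm', 'BacksGas_Flow_sccm',
    'BacksGas_Prs_Torr', 'EscAct_Curr_A', 'EscAct_Volt_V',
    'EscRF_P2P_Volt_V', 'Mano100mTorr_Prs_Torr',
]
data = pd.DataFrame(columns=columns)

# Zero-based position of the first parameter column.
start = data.columns.get_loc('Ar_Flow_sccm')
print(start)  # 7

# Slicing by the looked-up position matches the hard-coded slice.
print(list(data.columns[start:]) == list(data.columns[7:]))  # True
```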
