Inserting new rows into a pandas DataFrame based on a conditional loop

Posted 2024-10-02 22:33:26


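I have a set of CSVs that need to be modified. The code below finds the places that need changing: rows where the "Markers" column has two consecutive 4s, two consecutive 3s, a 5 followed by a 3, or a 4 followed by a 3. I need to insert a 2 between the two markers of each such pattern (i.e. 3,3 should become 3,2,3; 5,3 should become 5,2,3; and so on).

The code below finds these patterns by adding a copy of the marker column shifted down by one row: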

import re
import pandas as pd

columns = ['TwoThrees', 'TwoFours', 'FiveThree', 'FourThree']

PVTdfs = []

def PVTscore(pdframe):
    # Relies on a global `file` holding the name of the CSV being processed
    Taskname = 'PVT_'
    ID = re.findall('\\d+', file)
    # One row of pattern counters, indexed by the digits found in the file name
    dfName = pd.DataFrame([[0, 0, 0, 0]], columns=columns, index=ID)
    # Previous row's marker, so each row can be compared with its predecessor
    pdframe['ShiftedMarkers'] = pdframe.Markers.shift()
    for index, row in pdframe.iterrows():
        current, previous = row['Markers'], row['ShiftedMarkers']
        if current == previous:
            if current == 3:
                print("looks like two threes")
                print(index, current, previous)
                dfName.loc[ID[0], 'TwoThrees'] += 1
            elif current == 4:
                print("looks like two fours")
                print(index, current, previous)
                dfName.loc[ID[0], 'TwoFours'] += 1
        if current == 3 and previous == 5:
            print("looks like a five then a three")
            print(index, current, previous)
            dfName.loc[ID[0], 'FiveThree'] += 1
        if current == 3 and previous == 4:
            print("looks like a four then a three")
            print(index, current, previous)
            dfName.loc[ID[0], 'FourThree'] += 1
    if 'post' in file:
        print('Looks like a Post')
        PrePost = 'Post_'
        dfName.columns = [Taskname + PrePost + x for x in columns]
    elif 'pre' in file:
        print('Looks like a PRE')
        PrePost = 'Pre_'
        dfName.columns = [Taskname + PrePost + x for x in columns]
    PVTdfs.append(dfName)
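
The same pattern counts can probably be obtained without a row-by-row loop, by comparing the marker column with its shifted copy directly. The following is only a sketch under assumptions: `count_patterns` is a hypothetical helper name, and the sample markers are the ones shown in the desired output further down with the inserted 2s left out.

```python
import pandas as pd

def count_patterns(markers: pd.Series) -> dict:
    """Count adjacent (previous, current) marker pairs of interest."""
    prev = markers.shift()  # previous row's marker; NaN for the first row
    return {
        "TwoThrees": int(((prev == 3) & (markers == 3)).sum()),
        "TwoFours": int(((prev == 4) & (markers == 4)).sum()),
        "FiveThree": int(((prev == 5) & (markers == 3)).sum()),
        "FourThree": int(((prev == 4) & (markers == 3)).sum()),
    }

# Hypothetical sample: markers before any 2s are inserted
markers = pd.Series([1, 2, 5, 2, 3, 2, 3, 2, 3, 3, 3, 2, 3, 2])
counts = count_patterns(markers)
print(counts)
```

Each boolean mask marks the second row of a matching pair, so the same masks can later be reused to decide where a new row has to go.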

A sample of the CSV:

^{pr2}$

Desired output:

    Relative Time  Markers
1             928        1
2            1312        2
3            1364        5
4            3092        2
5            3167        3
6            5072        2
7            5147        3
8            5908        2
9            5969        3
10            NaN        2
11           7955        3 <-- fixed
12            NaN        2
13           9560        3 <-- fixed
14          10313        2
15          10391        3
16          11354        2

I tried np.insert and loc-style assignment, but they only overwrite existing rows; I need to insert a new row and update the index.


2 Answers

Here is the CSV sample I used:

    Relative    Time    Markers
0   928     1   NaN
1   1312    2   NaN
2   1364    5   NaN
3   3092    2   NaN
4   3167    3   NaN
5   5072    2   NaN
6   5147    3   NaN
7   5908    2   NaN
8   5969    3   NaN
9   7955    3   1.0
10  9560    3   1.0
11  10313   2   NaN
12  10391   3   NaN
13  11354   2   NaN
14  12322   5   NaN
15  12377   5   1.0

And the code that works:

^{pr2}$

which gives the output:

[9L, 10L, 15L]
   Markers  Relative Time
0      NaN       NaN    2

    Markers    Relative    Time
0   NaN     928.0       1
1   NaN     1312.0      2
2   NaN     1364.0      5
3   NaN     3092.0      2
4   NaN     3167.0      3
5   NaN     5072.0      2
6   NaN     5147.0      3
7   NaN     5908.0      2
8   NaN     5969.0      3
9   NaN     NaN     2
10  1.0     7955.0      3
11  NaN     NaN     2
12  1.0     9560.0      3
13  NaN     10313.0     2
14  NaN     10391.0     3
15  NaN     11354.0     2
16  NaN     12322.0     5
17  NaN     NaN     2
18  1.0     12377.0     5
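
An insertion of this kind can be sketched as follows, using the column names and the four patterns from the question. This is not the answerer's code: `insert_twos` is a hypothetical helper, and the half-step index trick (giving each new row an index between its two neighbours and then sorting) is just one assumed way to do it.

```python
import numpy as np
import pandas as pd

def insert_twos(df, marker_col="Markers", time_col="Relative Time"):
    """Insert a marker-2 row between each flagged pair of adjacent markers."""
    df = df.reset_index(drop=True)
    cur = df[marker_col]
    prev = cur.shift()  # marker of the preceding row
    flagged = (
        ((prev == 3) & (cur == 3))    # 3,3
        | ((prev == 4) & (cur == 4))  # 4,4
        | ((prev == 5) & (cur == 3))  # 5,3
        | ((prev == 4) & (cur == 3))  # 4,3
    )
    positions = np.flatnonzero(flagged.to_numpy())
    if len(positions) == 0:
        return df
    # Each new row gets an index halfway between its neighbours, so that
    # sorting the index drops it into place.
    new_rows = pd.DataFrame(
        {time_col: np.nan, marker_col: 2}, index=positions - 0.5
    )
    return pd.concat([df, new_rows]).sort_index().reset_index(drop=True)

# Sample values taken from the question's desired output, minus the inserted rows
df = pd.DataFrame({
    "Relative Time": [928, 1312, 1364, 3092, 3167, 5072, 5147,
                      5908, 5969, 7955, 9560, 10313, 10391, 11354],
    "Markers": [1, 2, 5, 2, 3, 2, 3, 2, 3, 3, 3, 2, 3, 2],
})
out = insert_twos(df)
print(out)
```

Because all new rows are built at once and merged with a single `pd.concat`, the frame is not re-sliced for every insertion, and the final `reset_index(drop=True)` renumbers the rows.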

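Why not use the pd.concat() method? (see doc)

Depending on your workflow, you can slice the DataFrame at the index where the new row should go, then insert the row like this: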

>>> import pandas as pd
>>> d = {'col1': ['A', 'B', 'D'], 'col2': [1, 2, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
  col1  col2
0    A     1
1    B     2
2    D     4

>>> row = {'col1':['C'], 'col2': [3]}  
>>> row = pd.DataFrame(data=row)

>>> new_df = pd.concat([df.iloc[:2], row, df.iloc[2:]]).reset_index(drop=True)
>>> new_df
  col1  col2
0    A     1
1    B     2
2    C     3
3    D     4

Note that you need to pass drop=True to the chained reset_index() call; otherwise your "old" index is added as a new column.
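
The slice-and-concat step can be wrapped in a small function if it is needed in several places. The sketch below is only an illustration: `insert_row` is a hypothetical helper name, and the last lines show what happens to the index when `drop=True` is left out.

```python
import pandas as pd

def insert_row(df, pos, values):
    """Return a copy of df with `values` (a dict) inserted at position pos."""
    row = pd.DataFrame([values], columns=df.columns)
    return pd.concat([df.iloc[:pos], row, df.iloc[pos:]]).reset_index(drop=True)

df = pd.DataFrame({'col1': ['A', 'B', 'D'], 'col2': [1, 2, 4]})
new_df = insert_row(df, 2, {'col1': 'C', 'col2': 3})
print(new_df)

# Without drop=True, the old index comes back as an extra column named 'index'
row = pd.DataFrame({'col1': ['C'], 'col2': [3]})
kept = pd.concat([df.iloc[:2], row, df.iloc[2:]]).reset_index()
print(kept.columns.tolist())
```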

Hope this helps.
