How to take data from one DataFrame and place it into another DataFrame at the cell level

Posted 2024-09-30 22:14:03


I have two DataFrames, df_criterias and df_tofill.

df_criterias

     goto_emptycol1     goto_emptycol2    data1     data2
0    some value1        another value1    a         val1
1    some value2        another value2    b         val2
2    some value3        another value3    c         val3
3    some value4        another value4    d         val4
4    some value5        another value5    e         val5
5    some value6        another value6    f         val6
6    some value7        another value7    g         val7

df_tofill

     emptycol1          emptycol2         data1     data2
0                                         f         val6
1                                         nok       nok
2                                         nok       nok
3                                         a         val1
4                                         nok       nok
5                                         g         val7
6                                         d         val4

Expected result

     emptycol1          emptycol2         data1     data2
0    some value6        another value6    f         val6
1                                         nok       nok
2                                         nok       nok
3    some value1        another value1    a         val1
4                                         nok       nok
5    some value7        another value7    g         val7
6    some value4        another value4    d         val4

From these two DataFrames I created two lists of row indexes, one per DataFrame, holding the rows where the columns "data1" and "data2" match in both.

list_fill = [0,3,5,6] #from df_tofill
list_crt = [5,0,6,3] #from df_criterias

Entries at the same position belong together: list_crt[0] (row 5 of df_criterias) matches list_fill[0] (row 0 of df_tofill).

To get the expected result, I tried the following:

for i, icrt in enumerate(list_crt):
    # Get the values from the matching row of df_criterias
    val1 = df_criterias.loc[icrt, "goto_emptycol1"]
    val2 = df_criterias.loc[icrt, "goto_emptycol2"]
    # Set the values in the corresponding row of df_tofill
    df_tofill.loc[list_fill[i], "emptycol1"] = val1
    df_tofill.loc[list_fill[i], "emptycol2"] = val2
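
The same assignment can probably be done in a single step. The following is a minimal sketch, assuming both DataFrames have the default 0..n-1 index and that list_fill and list_crt are paired position by position as described above. The frames are rebuilt here from the tables in the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question.
df_criterias = pd.DataFrame({
    "goto_emptycol1": [f"some value{i}" for i in range(1, 8)],
    "goto_emptycol2": [f"another value{i}" for i in range(1, 8)],
    "data1": list("abcdefg"),
    "data2": [f"val{i}" for i in range(1, 8)],
})
df_tofill = pd.DataFrame({
    "emptycol1": [""] * 7,
    "emptycol2": [""] * 7,
    "data1": ["f", "nok", "nok", "a", "nok", "g", "d"],
    "data2": ["val6", "nok", "nok", "val1", "nok", "val7", "val4"],
})

list_fill = [0, 3, 5, 6]  # row labels in df_tofill
list_crt = [5, 0, 6, 3]   # matching row labels in df_criterias

# .to_numpy() drops the source index and column names, so the values are
# written by position instead of being aligned on labels.
df_tofill.loc[list_fill, ["emptycol1", "emptycol2"]] = (
    df_criterias.loc[list_crt, ["goto_emptycol1", "goto_emptycol2"]].to_numpy()
)

print(df_tofill)
```

Without .to_numpy(), pandas would try to align the right-hand side on its own row labels and column names, which differ from those of the target here.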

I'm struggling to get the "expected result" DataFrame. Is this algorithm correct?

Update: I managed to get it working. .at gave me some strange errors, so I replaced it with .loc. A .reset_index() is also needed before creating the lists of indexes.
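
A likely reason for the .reset_index() requirement (this is an assumption, since the question does not show how the DataFrames were built) is that enumerate() yields positions, while .loc looks rows up by label. After filtering, the two no longer coincide. A small sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: a frame that has been filtered, leaving gaps in the index.
df = pd.DataFrame({"data1": ["x", "a", "y", "b"]})
filtered = df[df["data1"].isin(["a", "b"])]  # keeps rows labelled 1 and 3

# enumerate() counts positions, which is what an index-list builder sees.
positions = [i for i, _ in enumerate(filtered["data1"])]
labels_before = filtered.index.tolist()

# After reset_index the labels are 0..n-1 again and match the positions.
filtered = filtered.reset_index(drop=True)
labels_after = filtered.index.tolist()

print(positions, labels_before, labels_after)
```

With drop=True the old labels are discarded instead of being kept as a new column.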

The index lists were created with the following function:

def common_elements(crtlist, radlist):
    # crtlist holds all criteria, radlist holds all values to be checked.
    # Returns two lists with the positions of the elements that matched,
    # i.e. where an element of radlist starts with an element of crtlist.
    crtli_idx = []
    radli_idx = []
    for idx1, crt in enumerate(crtlist):
        for idx2, rad in enumerate(radlist):
            if rad.startswith(crt):
                crtli_idx.append(idx1)
                radli_idx.append(idx2)
    return crtli_idx, radli_idx


crtlist = ['1', '21', '444']
radlist = ['asda', 'aererv', '1vrvssq', '4447676767']
idxcrt, idxrad = common_elements(crtlist, radlist)
print(idxcrt, idxrad)
Output:
[0, 2] [2, 3]

1 answer
User
Answer 1 · Posted 2024-09-30 22:14:03

One approach is to align the indexes and columns, replace '' with np.nan in the target DataFrame, and then assign one DataFrame to the other via .loc.

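import numpy as np

# Give the source columns the same names as the target columns and
# index both frames by the matching keys.
df_criterias = df_criterias.rename(columns={'goto_emptycol1': 'emptycol1',
                                            'goto_emptycol2': 'emptycol2'})\
                           .set_index(['data1', 'data2'])

# Turn the empty strings into NaN so unmatched rows show up as missing.
df_tofill = df_tofill.replace('', np.nan)\
                     .set_index(['data1', 'data2'])

# Assignment through .loc aligns on the (data1, data2) index.
df_tofill.loc[:] = df_criterias.loc[df_criterias.index.isin(df_tofill.index)]
df_tofill = df_tofill.reset_index()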

#   data1 data2   emptycol1      emptycol2
# 0     f  val6  somevalue6  anothervalue6
# 1   nok   nok         NaN            NaN
# 2   nok   nok         NaN            NaN
# 3     a  val1  somevalue1  anothervalue1
# 4   nok   nok         NaN            NaN
# 5     g  val7  somevalue7  anothervalue7
# 6     d  val4  somevalue4  anothervalue4
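
A left merge on the two key columns is another way to get the same kind of result. This is a sketch of an alternative, not the approach used in the answer above. It assumes the (data1, data2) pairs are unique in df_criterias, and it rebuilds the sample data from the question so it runs standalone.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question.
df_criterias = pd.DataFrame({
    "goto_emptycol1": [f"some value{i}" for i in range(1, 8)],
    "goto_emptycol2": [f"another value{i}" for i in range(1, 8)],
    "data1": list("abcdefg"),
    "data2": [f"val{i}" for i in range(1, 8)],
})
df_tofill = pd.DataFrame({
    "emptycol1": [""] * 7,
    "emptycol2": [""] * 7,
    "data1": ["f", "nok", "nok", "a", "nok", "g", "d"],
    "data2": ["val6", "nok", "nok", "val1", "nok", "val7", "val4"],
})

# Rename the source columns so they land under the target column names.
lookup = df_criterias.rename(columns={"goto_emptycol1": "emptycol1",
                                      "goto_emptycol2": "emptycol2"})

# how="left" keeps every row of df_tofill in its original order;
# rows without a match (the "nok" rows) get NaN in the filled columns.
result = df_tofill[["data1", "data2"]].merge(
    lookup, on=["data1", "data2"], how="left"
)

# Restore the column order used in the question.
result = result[["emptycol1", "emptycol2", "data1", "data2"]]
print(result)
```

If a key pair could occur more than once in df_criterias, the merge would duplicate the corresponding rows of df_tofill, so the uniqueness assumption matters.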
