Group a DataFrame, apply a function with inputs, then add the result back to the original

Given:

benchmark =

   x  y  z field_1
   1  1  3       a
   1  2  5       b
   9  2  4       a
   1  2  5       c
   4  6  1       c

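Elaboration:

I also have a second DataFrame, separate_data, which holds separate values for x:

separate_data =

   x  a  b   c
   1  1  3   7
   2  2  5   6
   3  2  4   4
   4  2  5   9
   5  6  1  10

These values need to be interpolated into the existing benchmark DataFrame. Which column of separate_data is used for the interpolation depends on the field_1 column in benchmark (that is, one of the values a, b, c above). The interpolated value in the new column is based on the x value in benchmark.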

Result:

benchmark =

   x  y  z field_1  field_new
   1  1  3       a  interpolate using separate_data with x=1 and col=a
   1  2  5       b  interpolate using separate_data with x=1 and col=b
   9  2  4       a  ... etc
   1  2  5       c  ...
   4  6  1       c  ...
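The lookup described in the question can be sketched as follows. This is only one possible reading: it assumes linear interpolation via `numpy.interp`, and it fills one group at a time so that each `field_1` value selects its own column of `separate_data`. Note that `numpy.interp` clamps to the end values outside the known x range, so x=9 receives the value at x=5 rather than an extrapolation.

```python
import numpy as np
import pandas as pd

benchmark = pd.DataFrame({
    "x": [1, 1, 9, 1, 4],
    "y": [1, 2, 2, 2, 6],
    "z": [3, 5, 4, 5, 1],
    "field_1": ["a", "b", "a", "c", "c"],
})
separate_data = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "a": [1, 2, 2, 2, 6],
    "b": [3, 5, 4, 5, 1],
    "c": [7, 6, 4, 9, 10],
})

# Start with an empty float column, then fill it group by group.
benchmark["field_new"] = np.nan
for col, group in benchmark.groupby("field_1"):
    # col is the field_1 value, which names the column to interpolate from.
    benchmark.loc[group.index, "field_new"] = np.interp(
        group["x"], separate_data["x"], separate_data[col]
    )

print(benchmark)
```

With the sample data, `field_new` comes out as 1.0, 3.0, 6.0, 7.0, 9.0; the 6.0 for x=9 is the clamped end value of column a.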

Does that make sense?

3 answers

User

#1 · edited 2024-05-09 01:25:49

Here is a working example:

# Sample function that sums x and y, then append the field as string.
def func(x, y, z):
    return (x + y).astype(str) + z

benchmark['new_field'] = benchmark.groupby('field_1')\
                                  .apply(lambda x: func(x['x'], x['y'], x['field_1']))\
                                  .reset_index(level = 0, drop = True)

Result:

(The output block was not preserved in the source.)

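User

#2 · edited 2024-05-09 01:25:49

Edit:

I think you first need to reshape separate_data with set_index + stack, set the index names with rename_axis, and set the name of the Series with rename.

After that, you can groupby both levels with some function.

Then join the result to benchmark, which uses a left join by default. The reshaped Series looks like this: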

separate_data1 = separate_data.set_index('x').stack().rename_axis(('x','field_1')).rename('d')
print(separate_data1)
x  field_1
1  a           1
   b           3
   c           7
2  a           2
   b           5
   c           6
3  a           2
   b           4
   c           4
4  a           2
   b           5
   c           9
5  a           6
   b           1
   c          10
Name: d, dtype: int64
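A minimal sketch of the reshape followed by the join, assuming the sample frames from the question (the `result` name is only illustrative). It performs an exact lookup on the (x, field_1) pair rather than an interpolation, so a pair that does not occur in separate_data, such as x=9, ends up as NaN.

```python
import pandas as pd

benchmark = pd.DataFrame({
    "x": [1, 1, 9, 1, 4],
    "y": [1, 2, 2, 2, 6],
    "z": [3, 5, 4, 5, 1],
    "field_1": ["a", "b", "a", "c", "c"],
})
separate_data = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "a": [1, 2, 2, 2, 6],
    "b": [3, 5, 4, 5, 1],
    "c": [7, 6, 4, 9, 10],
})

# Long format: one value per (x, field_1) pair, stored in a Series named "d".
separate_data1 = (
    separate_data.set_index("x")
    .stack()
    .rename_axis(["x", "field_1"])
    .rename("d")
)

# Left join: match benchmark's x and field_1 columns against the MultiIndex.
result = benchmark.join(separate_data1, on=["x", "field_1"])
print(result)
```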

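If necessary, aggregate with some function, mainly when (x, field_1) pairs are duplicated, so that each pair is unique:

(The code block was not preserved in the source.)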
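One way the deduplication step might look, as a sketch only: the small Series below is made-up data with a repeated (x, field_1) pair, and `mean` is an arbitrary choice of aggregation, not necessarily the function the answer had in mind.

```python
import pandas as pd

# Hypothetical long-format data in which the pair (1, "a") appears twice.
s = pd.Series(
    [1, 3, 5],
    index=pd.MultiIndex.from_tuples(
        [(1, "a"), (1, "a"), (1, "b")], names=["x", "field_1"]
    ),
    name="d",
)

# Group on both index levels and aggregate, leaving one row per pair.
unique_pairs = s.groupby(level=["x", "field_1"]).mean()
print(unique_pairs)
```

After this step the index is unique, so a later join cannot multiply rows in benchmark.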

I don't think you can use transform here, because several columns have to be read together.

So use apply:

df1 = benchmark.groupby(['field_1']).apply(func)

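For the new column there are then several options, for example join (a left join by default); the original answer linked a second option whose name did not survive extraction.

Sample solutions for both approaches were linked in the original answer.

The closing sentence is garbled in the source beyond recovery; it mentions using a new column.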

User

#3 · edited 2024-05-09 01:25:49

Try something like this:

groups = benchmark.groupby(benchmark["field_1"])    
benchmark = benchmark.join(groups.apply(your_function), on="field_1")

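Inside your function you can use whichever other columns you need to build the new column, for example to compute a mean, a sum, and so on.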
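A self-contained sketch of this pattern, with a hypothetical `your_function` that returns one number per group. The column name `xy_mean` is an assumption for illustration; the Series is given a name because `DataFrame.join` requires a named Series.

```python
import pandas as pd

benchmark = pd.DataFrame({
    "x": [1, 1, 9, 1, 4],
    "y": [1, 2, 2, 2, 6],
    "z": [3, 5, 4, 5, 1],
    "field_1": ["a", "b", "a", "c", "c"],
})

def your_function(group):
    # Hypothetical aggregation: mean of x + y within the group.
    return (group["x"] + group["y"]).mean()

# One value per field_1, as a named Series indexed by field_1.
per_group = (
    benchmark.groupby("field_1")[["x", "y"]]
    .apply(your_function)
    .rename("xy_mean")
)

# Broadcast the per-group value back onto every row of its group.
result = benchmark.join(per_group, on="field_1")
print(result)
```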

See the documentation for apply and for join.
