groupby and statistics on lists

   Location     Dir  Set  H1      H2
0  Chicago      H1   4    *LIST*  *LIST*
1  Houston      H2   4    *LIST*  *LIST*
2  Los Angeles  H2   4    *LIST*  *LIST*
3  Boston       H1   0    *LIST*  *LIST*
4  NYC          H2   0    *LIST*  *LIST*
5  Seattle      H1   0    *LIST*  *LIST*

2 answers

User

#1 · edited 2024-10-01 09:39:07

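You can get the element-wise means of your filtered groups as follows. A few intermediate steps are needed (reshaping the data and converting the lists to numpy arrays), but they should produce the list (or array) of means you want.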

import numpy as np
import pandas as pd

# melt the H1 and H2 columns into key-value columns;
# this makes it easier to select either the H1 or the H2 list
df = pd.melt(df, id_vars=['Location', 'Set', 'Dir'],
             value_vars=['H1', 'H2'], var_name='Target_Dir', value_name='Values')

# convert lists to numpy arrays
# in order to be able to specify the axis for the mean calculation
df['Values'] = df['Values'].apply(np.array)

# filter df to your target Dirs, group by Set
# and calculate element-wise means (stacking assumes equal-length lists)
df[df['Dir'] == df['Target_Dir']].groupby('Set')['Values'].apply(
    lambda s: np.stack(s.to_numpy()).mean(axis=0))
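For anyone who wants to run the steps above end to end, here is a minimal sketch. The question only shows *LIST* placeholders, so the list values below are made up for illustration, and the names `long_df`, `selected`, `elementwise_mean` and `means` are hypothetical. It collects the results in a plain dict rather than relying on how `groupby(...).apply` wraps array results.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the list contents are invented, the question only shows *LIST*.
df = pd.DataFrame({
    "Location": ["Chicago", "Houston", "Boston", "NYC"],
    "Dir": ["H1", "H2", "H1", "H2"],
    "Set": [4, 4, 0, 0],
    "H1": [[4, 8, 10], [8, 4, 12], [6, 7, 9], [0, 0, 0]],
    "H2": [[1, 2, 3], [2, 4, 6], [5, 5, 5], [3, 6, 9]],
})

# Reshape so each row carries one list plus the name of the column it came from.
long_df = pd.melt(df, id_vars=["Location", "Set", "Dir"],
                  value_vars=["H1", "H2"],
                  var_name="Target_Dir", value_name="Values")

# Keep only the list that matches each row's Dir.
selected = long_df[long_df["Dir"] == long_df["Target_Dir"]]


def elementwise_mean(lists):
    """Stack equal-length lists into a 2-D array and average down the rows."""
    return np.stack([np.asarray(v, dtype=float) for v in lists]).mean(axis=0)


# Iterating over a grouped column yields (group key, Series of lists) pairs.
means = {int(key): elementwise_mean(group)
         for key, group in selected.groupby("Set")["Values"]}

for key, value in means.items():
    print(key, value)
```

With this data, Set 4 averages Chicago's H1 list with Houston's H2 list, because those are the lists their Dir values point at.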

User

#2 · edited 2024-10-01 09:39:07

Try this:

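import pandas as pd

x = pd.DataFrame({'Location': ['Chicago','Houston','Los Angeles','Boston','NYC','Seattle'],
                  'Dir':      ['H1','H2','H2','H1','H2','H1'],
                  'Set':      [4,4,4,0,0,0],
                  'SetCopy':  [4,4,4,0,0,0]})
# numeric_only=True skips the string column 'Location',
# which recent pandas versions would otherwise reject
mean = x.groupby(['Set','Dir']).mean(numeric_only=True)
sd = x.groupby(['Set','Dir']).std(numeric_only=True)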

Edit, following the comments:

import itertools

import numpy as np
import pandas as pd

x = pd.DataFrame({'Location': ['Chicago','Houston','Los Angeles','Boston','NYC','Seattle'],
                  'Dir':      ['H1','H2','H2','H1','H2','H1'],
                  'Set':      [4,4,4,0,0,0],
                  'H1':       [[4,8,10],[8,4,12],[6,9,5],[6,7,9],[0,0,0],[0,0,0]]})

# collect each group's lists, flatten them into one sequence,
# then compute a single mean per (Set, Dir) group
mean = x.groupby(['Set','Dir']).H1.apply(list).apply(
    lambda lists: np.mean(list(itertools.chain.from_iterable(lists))))

# same for the standard deviation; np.std defaults to ddof=0 (population)
sd = x.groupby(['Set','Dir']).H1.apply(list).apply(
    lambda lists: np.std(list(itertools.chain.from_iterable(lists))))
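Note that the code above flattens every list in a group and returns one scalar per (Set, Dir) pair. If per-position statistics are wanted instead, something like the following sketch might help. It reuses the data from this answer; the names `stats` and `stack_group` are hypothetical, and it assumes all lists in a group have the same length.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({
    "Location": ["Chicago", "Houston", "Los Angeles", "Boston", "NYC", "Seattle"],
    "Dir": ["H1", "H2", "H2", "H1", "H2", "H1"],
    "Set": [4, 4, 4, 0, 0, 0],
    "H1": [[4, 8, 10], [8, 4, 12], [6, 9, 5], [6, 7, 9], [0, 0, 0], [0, 0, 0]],
})


def stack_group(lists):
    """Turn a group's equal-length lists into a 2-D float array (one row per list)."""
    return np.stack([np.asarray(v, dtype=float) for v in lists])


stats = {}
for (set_id, direction), lists in x.groupby(["Set", "Dir"])["H1"]:
    values = stack_group(lists)
    stats[(int(set_id), direction)] = {
        "flat_mean": values.mean(),              # one scalar, as in the answer above
        "elementwise_mean": values.mean(axis=0),  # one value per list position
        "elementwise_std": values.std(axis=0),    # ddof=0, matching np.std's default
    }

for key, value in stats.items():
    print(key, value)
```

Pass `ddof=1` to `values.std` if the sample standard deviation is preferred, which is what pandas' own `std` uses by default.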
