Assigning values to a column based on groups in the data

Posted 2024-09-30 19:31:42


I have a dataset that looks roughly like this:

import pandas as pd

data_set = pd.DataFrame([
    {'img_type': 'bias', 'CCD-TEMP': -10, 'explen': 0, 'mean': 1023.4234},
    {'img_type': 'bias', 'CCD-TEMP': -10, 'explen': 0, 'mean': 1024.4334},
    {'img_type': 'bias', 'CCD-TEMP': -15, 'explen': 0, 'mean': 1022.2344},
    {'img_type': 'bias', 'CCD-TEMP': -15, 'explen': 0, 'mean': 1021.1031},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 30, 'mean': 1025.9959},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 30, 'mean': 1023.3434},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 60, 'mean': 1020.1234},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 60, 'mean': 1022.4234},
    {'img_type': 'dark', 'CCD-TEMP': -15, 'explen': 30, 'mean': 1025.9959},
    {'img_type': 'dark', 'CCD-TEMP': -15, 'explen': 30, 'mean': 1023.3434},
    {'img_type': 'dark', 'CCD-TEMP': -15, 'explen': 60, 'mean': 1020.1234},
    {'img_type': 'dark', 'CCD-TEMP': -15, 'explen': 60, 'mean': 1022.4234},
    ])

What I want to do is isolate the rows where img_type == 'bias', group them by CCD-TEMP, and compute the mean() of 'mean' for each group. This seems to do the job:

>>> data_set[data_set['img_type'].isin(['bias'])].groupby('CCD-TEMP')['mean'].mean()
... 
CCD-TEMP
-15    1021.66875
-10    1023.92840
Name: mean, dtype: float64

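What I need to do now is put these values into a new column named 'Offset', applying each value to every row that shares the same CCD-TEMP. I have tried a few approaches so far; the most recent attempt was: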

>>> data_set['Offset'] = data_set[data_set['img_type'].isin(['bias'])].groupby('CCD-TEMP')['mean'].mean()
>>> data_set
    CCD-TEMP  explen img_type       mean  Offset
0        -10       0     bias  1023.4234     NaN
1        -10       0     bias  1024.4334     NaN
2        -15       0     bias  1022.2344     NaN
3        -15       0     bias  1021.1031     NaN
4        -10      30     dark  1025.9959     NaN
5        -10      30     dark  1023.3434     NaN
6        -10      60     dark  1020.1234     NaN
7        -10      60     dark  1022.4234     NaN
8        -15      30     dark  1025.9959     NaN
9        -15      30     dark  1023.3434     NaN
10       -15      60     dark  1020.1234     NaN
11       -15      60     dark  1022.4234     NaN

Obviously, NaN is not what I want.

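What is the best way to do this kind of thing with pandas? Once I get past this hurdle, I will need to do a similar operation for groups of (CCD-TEMP, explen). Any suggestions for that are welcome.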


1 answer
User
#1 · Posted 2024-09-30 19:31:42

I think you need GroupBy.transform if you want to assign the mean only to the bias rows:

data_set['Offset'] = data_set[data_set['img_type'].isin(['bias'])].groupby('CCD-TEMP')['mean'].transform('mean')
print (data_set)
    CCD-TEMP  explen img_type       mean      Offset
0        -10       0     bias  1023.4234  1023.92840
1        -10       0     bias  1024.4334  1023.92840
2        -15       0     bias  1022.2344  1021.66875
3        -15       0     bias  1021.1031  1021.66875
4        -10      30     dark  1025.9959         NaN
5        -10      30     dark  1023.3434         NaN
6        -10      60     dark  1020.1234         NaN
7        -10      60     dark  1022.4234         NaN
8        -15      30     dark  1025.9959         NaN
9        -15      30     dark  1023.3434         NaN
10       -15      60     dark  1020.1234         NaN
11       -15      60     dark  1022.4234         NaN
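
The difference from the attempt in the question appears to come down to index alignment: mean() returns one value per group, indexed by CCD-TEMP, while transform('mean') returns one value per input row, indexed like the filtered frame. A minimal sketch of that contrast follows, on a shortened copy of the question's data; the names df, bias, agg, per_row, from_agg and from_transform are introduced here for illustration only.

```python
import pandas as pd

# Shortened copy of the question's data (values taken from the question).
df = pd.DataFrame({
    'img_type': ['bias', 'bias', 'bias', 'bias', 'dark', 'dark'],
    'CCD-TEMP': [-10, -10, -15, -15, -10, -15],
    'mean': [1023.4234, 1024.4334, 1022.2344, 1021.1031, 1025.9959, 1023.3434],
})

bias = df[df['img_type'] == 'bias']

# mean() aggregates: one value per group, indexed by the group key.
agg = bias.groupby('CCD-TEMP')['mean'].mean()
print(agg.index.tolist())      # [-15, -10]

# transform('mean') broadcasts: one value per input row, indexed like `bias`.
per_row = bias.groupby('CCD-TEMP')['mean'].transform('mean')
print(per_row.index.tolist())  # [0, 1, 2, 3]

# Column assignment aligns on index labels, so only shared labels receive values.
df['from_agg'] = agg            # labels -15/-10 match no row label: all NaN
df['from_transform'] = per_row  # labels 0..3 match the bias rows only
print(df)
```

The dark rows stay NaN in from_transform because they were filtered out before grouping, which matches the output shown above.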

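Or, if you need every row to receive the bias mean for its CCD-TEMP, aggregate with mean() and then map the result by the CCD-TEMP column: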

s = data_set[data_set['img_type'].isin(['bias'])].groupby('CCD-TEMP')['mean'].mean()
print (s)
CCD-TEMP
-15    1021.66875
-10    1023.92840
Name: mean, dtype: float64

data_set['Offset'] = data_set['CCD-TEMP'].map(s)
print (data_set)
    CCD-TEMP  explen img_type       mean      Offset
0        -10       0     bias  1023.4234  1023.92840
1        -10       0     bias  1024.4334  1023.92840
2        -15       0     bias  1022.2344  1021.66875
3        -15       0     bias  1021.1031  1021.66875
4        -10      30     dark  1025.9959  1023.92840
5        -10      30     dark  1023.3434  1023.92840
6        -10      60     dark  1020.1234  1023.92840
7        -10      60     dark  1022.4234  1023.92840
8        -15      30     dark  1025.9959  1021.66875
9        -15      30     dark  1023.3434  1021.66875
10       -15      60     dark  1020.1234  1021.66875
11       -15      60     dark  1022.4234  1021.66875
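
For the follow-up case in the question, groups keyed by (CCD-TEMP, explen), Series.map on a single column no longer applies directly. One possible approach is to aggregate on both keys and left-merge the result back. The sketch below is not from the original answer: it assumes the value wanted is the mean of the 'dark' rows per (CCD-TEMP, explen), and the column name dark_level is hypothetical.

```python
import pandas as pd

# Shortened copy of the question's data (values taken from the question).
df = pd.DataFrame([
    {'img_type': 'bias', 'CCD-TEMP': -10, 'explen': 0, 'mean': 1023.4234},
    {'img_type': 'bias', 'CCD-TEMP': -15, 'explen': 0, 'mean': 1022.2344},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 30, 'mean': 1025.9959},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 30, 'mean': 1023.3434},
    {'img_type': 'dark', 'CCD-TEMP': -10, 'explen': 60, 'mean': 1020.1234},
    {'img_type': 'dark', 'CCD-TEMP': -15, 'explen': 30, 'mean': 1025.9959},
])

keys = ['CCD-TEMP', 'explen']

# Assumption: the reference value is the mean of the dark rows per key pair.
dark_mean = (
    df[df['img_type'] == 'dark']
    .groupby(keys)['mean']
    .mean()
    .rename('dark_level')   # hypothetical column name
    .reset_index()          # turn the two group keys back into columns
)

# A left merge on both keys plays the role that map() plays for one key.
# Rows whose key pair has no dark frames (here the bias rows) get NaN.
out = df.merge(dark_mean, on=keys, how='left')
print(out)
```

If every row in a group should contribute to its own group's mean, with no filtering by img_type, data_set.groupby(['CCD-TEMP', 'explen'])['mean'].transform('mean') would do the same job in one step.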
