Sorting a large pandas Series within each index level

Posted 2024-07-07 01:32:51


I have a DataFrame called mydf:

mydf


I ran the following operation, which turned it into a Series:

mydf.groupby([mydf.type,mydf.name]).size()

Now I have a Series with a two-level index (type and name), where type is either actor or actress:

    type      name               
    actor    'Big' Ben Moroz        1
             'Ducky' Louie          3
             'Fast' Eddie Mahler    1
             'King Kong' Kashey     1
             'Muddy' Berry          1

    actress   Zedra Conde           3
              Zena Marshall         1
              Zinaida Morskaya      1
              Zoe Holland           1
              Zoia Karabanova       2

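Now I want the result sorted in descending order of the count (the third, unnamed column) within actor. Where two actors have the same count, they should be ordered by name. The same pattern should then apply within actress: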

type      name               
actor    'Ducky' Louie          3
         'Big' Ben Moroz        1
         'Fast' Eddie Mahler    1
         'King Kong' Kashey     1
         'Muddy' Berry          1

actress   Zedra Conde           3
          Zoia Karabanova       2
          Zena Marshall         1
          Zinaida Morskaya      1
          Zoe Holland           1

Note: please avoid loops.


1 answer

Answer 1 · posted 2024-07-07 01:32:51

Unfortunately, everything I can think of needs a second grouping or sorting step. Suppose we have this DataFrame:

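import random

import pandas as pd

# Five actors and five actresses, each given a randomly chosen name.
# The names are random, so a fresh run may not reproduce the output shown here.
d = pd.DataFrame({'type': ['actor']*5 + ['actress']*5,
                  'name': [random.choice(['a', 'b', 'c']) for _ in range(10)]})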
d


    name    type
0   c   actor
1   c   actor
2   a   actor
3   b   actor
4   a   actor
5   c   actress
6   c   actress
7   c   actress
8   a   actress
9   a   actress


d.groupby([d.type,d.name]).size()

type     name
actor    a       2
         b       1
         c       2
actress  a       2
         c       3
dtype: int64

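Method 1 (sorts by count within each type; as the output shows, the type level is repeated and ties are not ordered by name):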

d.groupby([d.type,d.name]).size().groupby(level=[0]).apply(lambda x: x.sort_values(ascending=False))

type     type     name
actor    actor    c       2
                  a       2
                  b       1
actress  actress  c       3
                  a       2
dtype: int64
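The output above repeats the type level and leaves tied counts in no particular name order. The following is a minimal sketch of one way to address both within the same groupby/apply pattern; it is not part of the original answer. It assumes a fixed sample (the hard-coded names below replace the random ones so the result is reproducible), passes group_keys=False so the group key is not prepended again, and uses a stable ascending sort followed by a reversal to get counts descending with names ascending inside ties. The helper name sort_one_type is hypothetical.

```python
import pandas as pd

# Fixed sample data (an assumption, standing in for the random names).
d = pd.DataFrame({
    "type": ["actor"] * 5 + ["actress"] * 5,
    "name": ["c", "c", "a", "b", "a", "c", "c", "c", "a", "a"],
})

counts = d.groupby(["type", "name"]).size()


def sort_one_type(s):
    """Sort one type's counts: count descending, name ascending on ties."""
    # 1. Order names in reverse, 2. stable-sort counts ascending,
    # 3. flip the whole thing. After the flip the counts are descending
    #    and tied names come out in ascending order.
    by_name_desc = s.sort_index(level="name", ascending=False)
    by_count_asc = by_name_desc.sort_values(kind="mergesort")
    return by_count_asc.iloc[::-1]


# group_keys=False keeps the result at two index levels (type, name).
result = counts.groupby(level=0, group_keys=False).apply(sort_one_type)
print(result)
```

This still groups twice, so on a very large Series the column-based sort in Method 2 is likely to be the simpler option.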

Method 2 (sorts on columns, so ties in the count are broken by name):

d1 = d.groupby([d.type, d.name]).size()
# Move both index levels into columns and call the count column 'sz'
d2 = d1.reset_index(name='sz')
# type ascending, count descending, name ascending as the tie-breaker
d2.sort_values(by=['type', 'sz', 'name'], ascending=[True, False, True])

    type    name    sz
0   actor   a   2
2   actor   c   2
1   actor   b   1
4   actress c   3
3   actress a   2
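Method 2 returns a DataFrame, whereas the question asks for a Series indexed by type and name. As a rough sketch (not part of the original answer), the sorted frame can be turned back into such a Series with set_index. The sample names are hard-coded here as an assumption so the result is reproducible, and the column name sz is just an illustrative label for the counts.

```python
import pandas as pd

# Fixed sample data (an assumption, standing in for the random names).
d = pd.DataFrame({
    "type": ["actor"] * 5 + ["actress"] * 5,
    "name": ["c", "c", "a", "b", "a", "c", "c", "c", "a", "a"],
})

counts = d.groupby(["type", "name"]).size()

sorted_counts = (
    counts.reset_index(name="sz")          # index levels -> columns
          .sort_values(by=["type", "sz", "name"],
                       ascending=[True, False, True])
          .set_index(["type", "name"])["sz"]  # back to a two-level Series
)
print(sorted_counts)
```

No explicit loop or per-group apply is involved; the ordering comes from a single multi-column sort.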
