Assigning one index to multiple rows in Pandas

Posted 2024-10-03 09:08:59


I have a DataFrame in Pandas that looks like this:

           Activity Name Activity Start Activity End
0                  Phone          04:00        08:00
1                  Lunch          08:00        08:30
2                 Coffee          08:30        08:45
3                  Phone          08:45        10:30
4         WrittenSupport          10:30        12:30
5                  Phone          04:00        08:00
6                  Lunch          08:00        08:30
7                 Coffee          08:30        08:45
8                  Phone          08:45        09:00
9                  Phone          06:00        09:00

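The data in my DataFrame describes the different activities assigned to agents during their shifts. The problem is that the other DataFrame, the one with the agents, has only 57 names, while each person usually has 4-5 activities assigned. When I merge the two DataFrames, I end up with 57 agents and 265 activities that obviously do not line up with the right people.

One thing that may help: each person works 8 hours.

How can I convert it into something like this: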

^{pr2}$

3 answers

Perhaps this can be done by building a separate list of index labels, like this:

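# your_array holds one row per activity: [name, start, end].
# Take the start hour of each row, e.g. '04:00' -> 4.
times = [int(x[1][:2]) for x in your_array]
previous = times[0]
index = [1]          # the first row belongs to agent 1
next_agent = 2
for time in times[1:]:
    if time >= previous:
        # Start hour did not go backwards: same agent, leave the label blank.
        index.append('')
    else:
        # Start hour went backwards: a new agent's shift begins here.
        index.append(next_agent)
        next_agent += 1
    previous = time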

Then set it on the df:

^{pr2}$
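A minimal sketch of how such a list might be attached to the frame, assuming the DataFrame is called `df` and has the question's three columns (the name `df` and the shortened sample rows are illustrative assumptions). Assigning a list with one entry per row to `df.index` replaces the row labels:

```python
import pandas as pd

# Shortened sample in the shape of the question's data (illustrative only).
df = pd.DataFrame({
    "Activity Name": ["Phone", "Lunch", "Coffee", "Phone", "WrittenSupport", "Phone", "Lunch"],
    "Activity Start": ["04:00", "08:00", "08:30", "08:45", "10:30", "04:00", "08:00"],
    "Activity End": ["08:00", "08:30", "08:45", "10:30", "12:30", "08:00", "08:30"],
})

# Start hour of each activity, e.g. '04:00' -> 4.
times = [int(start[:2]) for start in df["Activity Start"]]

index = [1]          # first row belongs to agent 1
next_agent = 2
for previous, current in zip(times, times[1:]):
    if current >= previous:
        index.append("")           # same agent: blank label
    else:
        index.append(next_agent)   # start hour went backwards: new agent
        next_agent += 1

# One label per row, so the list can replace the index directly.
df.index = index
print(df)
```

Because only the hour is compared, two consecutive shifts whose boundary falls within the same hour would not be separated; comparing full times would be needed in that case.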

If your agents and activities are in separate rows, you can create a MultiIndex like this:

import pandas as pd

# This is the dataframe data with activities you got from a single agent
agent_1 = [['Phone', 'Phone', 'Coffee', 'Lunch', 'Phone', 'Phone', 'Lunch', 'Lunch'],
           ['04:00', '08:30', '10:30', '04:00', '10:30', '04:00', '08:30', '10:30']]

# This is the dataframe data from a second agent
agent_2 = [['Phone', 'Pooping', 'Coffee', 'Lunch', 'Phone', 'Meeting', 'Lunch', 'Lunch'],
           ['08:45', '08:50', '10:30', '04:00', '10:30', '04:00', '08:30', '10:30']]

# We create the dataframe for agent 1
df1 = pd.DataFrame(agent_1).T
df1.columns = ['activity', 'time']


# We create the dataframe for agent 2
df2 = pd.DataFrame(agent_2).T
df2.columns = ['activity', 'time']

# Now we have two dataframes we can't really put together
print(df1)
print("  ")
print(df2)
print("  ")

# So we should give each dataframe a column with its agent.
df1['agent'] = "Agent_1"
df2['agent'] = "Agent_2"

# Now each dataframe has data on its agent
print(df1)
print("  ")
print(df2)
print("  ")

# Let's combine them
overview = pd.concat([df1, df2])
print(overview)
print("  ")

# To make it even better, we could make a multi-index so we can index both agents AND activities
overview.set_index(['agent', 'activity'], inplace=True)
print(overview)

Output:

^{pr2}$
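Once the frame carries an `agent`/`activity` MultiIndex, rows can be selected on either level. The sketch below rebuilds a small `overview` frame with the same column names (the four rows are made up for illustration) and shows `.loc` for the outer level and `.xs` for the inner one:

```python
import pandas as pd

# Small stand-in for the `overview` frame built above (illustrative rows).
overview = pd.DataFrame({
    "agent": ["Agent_1", "Agent_1", "Agent_2", "Agent_2"],
    "activity": ["Lunch", "Phone", "Meeting", "Phone"],
    "time": ["08:30", "04:00", "04:00", "08:45"],
}).set_index(["agent", "activity"])

# All rows for one agent: select on the outer index level.
agent_1_rows = overview.loc["Agent_1"]
print(agent_1_rows)

# All 'Phone' rows across agents: select on the inner index level.
phone_rows = overview.xs("Phone", level="activity")
print(phone_rows)
```

In both cases the level that was selected on is dropped from the result, so `agent_1_rows` is indexed by activity and `phone_rows` by agent.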

Consider the following data (a few rows were added for validation):

print(df)
     Activity Name Activity Start Activity End
0            Phone       04:00:00     08:00:00
1            Lunch       08:00:00     08:30:00
2           Coffee       08:30:00     08:45:00
3            Phone       08:45:00     10:30:00
4   WrittenSupport       10:30:00     12:30:00
5            Phone       04:00:00     08:00:00
6            Lunch       08:00:00     08:30:00
7           Coffee       08:30:00     08:45:00
8            Phone       08:45:00     09:00:00
9            Phone       06:00:00     09:00:00
10  Someother Name       10:30:00     12:30:00
11           Phone       04:00:00     08:00:00
12           Lunch       08:00:00     08:30:00
13          Coffee       08:30:00     08:45:00
14           Phone       08:45:00     09:00:00
15           Phone       06:00:00     09:00:00

Use the following approach:

^{pr2}$
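One possible vectorized way to label the rows, not necessarily the method this answer used, is to treat every row whose start time is earlier than the previous row's start time as the beginning of a new shift and take a running count of those rows. The column name `agent_id`, the `agents` frame, and its placeholder names are assumptions made for illustration, as is the idea that agents are listed in the same order as the shift blocks:

```python
import pandas as pd

# Shortened sample in the shape of the data above (illustrative only).
df = pd.DataFrame({
    "Activity Name": ["Phone", "Lunch", "Coffee", "Phone", "Phone", "Lunch", "Phone"],
    "Activity Start": ["04:00:00", "08:00:00", "08:30:00", "08:45:00",
                       "04:00:00", "08:00:00", "06:00:00"],
    "Activity End": ["08:00:00", "08:30:00", "08:45:00", "10:30:00",
                     "08:00:00", "08:30:00", "09:00:00"],
})

# Parse 'HH:MM:SS' strings so they compare as durations, not as text.
start = pd.to_timedelta(df["Activity Start"])

# True where the start time goes backwards, i.e. a new shift begins.
# The first row compares against NaT, which evaluates to False.
new_shift = start < start.shift()

# Running count of shift starts gives 1, 1, ..., 2, 2, ..., 3, ...
df["agent_id"] = new_shift.cumsum() + 1

# Hypothetical agents frame, assumed to be ordered like the shift blocks.
agents = pd.DataFrame({
    "agent_id": [1, 2, 3],
    "Agent Name": ["Agent A", "Agent B", "Agent C"],
})

# Each activity row now picks up exactly one agent name.
merged = df.merge(agents, on="agent_id", how="left")
print(merged)
```

If two consecutive shifts happen to start at increasing times, this rule will not split them, so the 8-hour shift length mentioned in the question may be needed as an additional check.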
