How do I make a Pandas plot with two-level weekday/hour ticks?

Posted 2024-09-29 01:34:26


I have a DataFrame with a structure similar to the following:

from datetime import datetime
import pandas as pd
from mpu.datetime import generate  # pip install mpu

mind, maxd = datetime(2018, 1, 1), datetime(2018, 12, 30)
df = pd.DataFrame({'datetime': [generate(mind, maxd) for _ in range(10)]})

I want to see how this data is distributed over the hours of the day and the days of the week. I can extract both with:

df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour

Finally, I plot the counts:

import matplotlib.pyplot as plt

ax = df.groupby(['weekday', 'hour'])['datetime'].count().plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
plt.show()

This gives me a single line plotted against the combined (weekday, hour) index.

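However, it is hard to tell the weekdays and hours apart on that axis; the boundaries between days are not obvious at all. How can I get two-level tick labels, with one level for the weekday and one for the hour?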


3 answers

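If you assume that every possible weekday and hour actually occurs in the data, the axis units are simply hours, with Monday midnight at 0 and Sunday 23:00 at 24*7-1 = 167. You can then place a major tick every 24 hours and use the minor ticks to label each day of the week at noon.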
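The assumption that every (weekday, hour) pair is present will not hold for small datasets such as the 10-row example in the question. A minimal sketch of one way to make it hold, assuming that missing combinations should count as zero: reindex the grouped counts against the full 7 x 24 grid. The names `full_index` and `counts_full` and the random sample data are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small made-up sample: 50 random timestamps in 2018 (far fewer than 7*24 bins)
rng = np.random.default_rng(0)
seconds = rng.integers(0, 365 * 24 * 3600, size=50)
df = pd.DataFrame({"datetime": pd.Timestamp("2018-01-01") + pd.to_timedelta(seconds, unit="s")})

df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour
counts = df.groupby(["weekday", "hour"])["datetime"].count()

# Build every (weekday, hour) pair and fill the gaps with 0,
# so that position i on the axis is always weekday*24 + hour
full_index = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["weekday", "hour"]
)
counts_full = counts.reindex(full_index, fill_value=0)

print(len(counts), len(counts_full))
```

After this step the series always has 168 entries, so tick positions measured in hours line up with the data regardless of how sparse the input is.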

import numpy as np; np.random.seed(42)
import datetime as dt
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, FuncFormatter, NullFormatter

# Generate example data: N random timestamps spread uniformly over 2018
N = 5030
delta = (dt.datetime(2019, 1, 1) - dt.datetime(2018, 1, 1)).total_seconds()
# Convert the random offsets with an explicit unit of seconds
offsets = pd.to_timedelta(delta * np.random.rand(N), unit="s")
df = pd.DataFrame({'datetime': pd.Timestamp("2018-01-01") + offsets})

# Group the data
df['weekday'] = df['datetime'].dt.weekday
df['hour'] = df['datetime'].dt.hour

counts = df.groupby(['weekday', 'hour'])['datetime'].count()

ax = counts.plot(kind='line', color='blue')
ax.set_ylabel("#")
ax.set_xlabel("time")
ax.grid()
# Now we assume that there is data for every hour and day present
assert len(counts) == 7*24
# Hence we can tick the axis with multiples of 24h
ax.xaxis.set_major_locator(MultipleLocator(24))
ax.xaxis.set_minor_locator(MultipleLocator(1))

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
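def tick(x, pos):
    # Label each day at its noon tick; leave every other minor tick blank.
    # The range check avoids mislabeling ticks outside the 0..167 data range.
    if 0 <= x < 7 * 24 and x % 24 == 12:
        return days[int(x) // 24]
    return ""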
ax.xaxis.set_major_formatter(NullFormatter())
ax.xaxis.set_minor_formatter(FuncFormatter(tick))
ax.tick_params(which="major", axis="x", length=10, width=1.5)
plt.show()
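The code above labels only the weekday. If the hour should appear as well, one possible approach is sketched below. It keeps the same hour-based axis, labels minor ticks every 6 hours with the hour of day, and gives the noon major tick a two-line label (hour on top, weekday underneath). The counts are made-up Poisson data, the 6-hour spacing is an arbitrary choice, and `hour_label` / `day_label` are hypothetical helper names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FixedLocator, FuncFormatter, MultipleLocator

# Made-up counts for each of the 7*24 hours of the week (Monday 00:00 = position 0)
hours = np.arange(7 * 24)
counts = np.random.default_rng(0).poisson(30, size=hours.size)

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def hour_label(x, pos):
    # First level: hour of day, derived from the position on the axis
    return f"{int(x) % 24:02d}"

def day_label(x, pos):
    # Second level: hour on the first line, weekday on the second
    if 0 <= x < 7 * 24:
        return f"{int(x) % 24:02d}\n{days[int(x) // 24]}"
    return ""

fig, ax = plt.subplots(figsize=(10, 3))
ax.plot(hours, counts, color="blue")
ax.set_xlim(0, 7 * 24 - 1)

# Minor ticks every 6 hours carry the hour; major ticks at noon carry hour + weekday
ax.xaxis.set_minor_locator(MultipleLocator(6))
ax.xaxis.set_minor_formatter(FuncFormatter(hour_label))
ax.xaxis.set_major_locator(FixedLocator(np.arange(12, 7 * 24, 24)))
ax.xaxis.set_major_formatter(FuncFormatter(day_label))
ax.tick_params(which="major", axis="x", length=10, width=1.5)

fig.tight_layout()
fig.savefig("weekday_hour_ticks.png")
```

The two-line label is used because matplotlib, by default, drops minor ticks that coincide with a major tick, so the noon hour would otherwise lose its label.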


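This is not exactly the plot you had in mind, but pandas can draw one line per weekday if you unstack the weekday level first: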

df.groupby(['weekday', 'hour'])['datetime'].count().unstack(level=0).plot()

With the data generated by the code in your question, this gives one line per weekday, with the hour of the day on the x-axis.
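To make the shape of the unstacked data concrete, here is a minimal sketch on five hand-picked timestamps (illustrative data, not from the question). It also renames the weekday columns to day names, which is an optional extra step not present in the one-liner above; `per_day` and `day_names` are hypothetical names.

```python
import pandas as pd

# Five hand-picked timestamps; 2018-01-01 was a Monday
stamps = pd.to_datetime([
    "2018-01-01 09:15",  # Monday
    "2018-01-01 09:40",  # Monday
    "2018-01-02 09:05",  # Tuesday
    "2018-01-02 17:30",  # Tuesday
    "2018-01-07 17:45",  # Sunday
])
df = pd.DataFrame({"datetime": stamps})
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour

# Rows become hours, columns become weekdays; absent pairs are filled with 0
per_day = (
    df.groupby(["weekday", "hour"])["datetime"]
      .count()
      .unstack(level=0, fill_value=0)
)

# Optional: replace the weekday numbers with readable names for the legend
day_names = {0: "Mon", 1: "Tue", 2: "Wed", 3: "Thu", 4: "Fri", 5: "Sat", 6: "Sun"}
per_day = per_day.rename(columns=day_names)

print(per_day)
```

Calling `per_day.plot()` then draws one line per column, so each weekday gets its own legend entry.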

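I could not test this with your dataset, and pandas datetimes sometimes do not cooperate with matplotlib datetimes. The idea, however, is to set up the major and minor ticks separately, each with its own locator, formatter, and grid properties: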

import pandas as pd
from matplotlib import pyplot as plt
from matplotlib import dates as mdates

#create sample data and plot it
from io import StringIO
data = StringIO("""
X,A,B
2018-11-21T12:04:20,1,8
2018-11-21T18:14:17,6,7
2018-11-22T02:18:21,8,14
2018-11-22T12:31:54,7,8
2018-11-22T20:33:20,5,5
2018-11-23T12:23:12,13,2
2018-11-23T21:31:05,7,12
""")
df = pd.read_csv(data, parse_dates = True, index_col = "X")
ax=df.plot()

#format major locator
ax.xaxis.set_major_locator(mdates.DayLocator())
#format minor locator with specific hours
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour = [8, 12, 18]))
#label major ticks
ax.xaxis.set_major_formatter(mdates.DateFormatter('%a %d %m'))
#label minor ticks
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:00"))
#set grid for major ticks
ax.grid(which = "major", axis = "x", linestyle = "-", linewidth = 2)
#set grid for minor ticks with different properties
ax.grid(which = "minor", axis = "x", linestyle = "--", linewidth = 1)

plt.show()
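On the caveat about pandas and matplotlib datetimes: for a regularly spaced datetime index, pandas may plot with its own time-axis handling, in which case the `matplotlib.dates` locators can land in the wrong places. A sketch of one common workaround, assuming the pandas plotting option `x_compat=True` is acceptable here (it is not used in the original answer); the hourly sample data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up, regularly spaced hourly data covering three days
idx = pd.date_range("2018-11-21", periods=72, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"A": np.arange(72) % 24}, index=idx)

fig, ax = plt.subplots(figsize=(10, 3))
# x_compat=True asks pandas not to apply its own time-series tick handling
df.plot(ax=ax, x_compat=True)

# Same major/minor setup as in the answer above
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=[8, 12, 18]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%a %d %m"))
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%H:00"))

fig.autofmt_xdate()
fig.savefig("datetime_ticks.png")
```

If the tick labels still look wrong with your data, plotting through `ax.plot(df.index, df["A"])` directly is another option worth trying.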
