Libraries used
pandas, matplotlib, numpy
Functions used
df.resample("H").sum()
Parameters (frequency aliases accepted by resample)
B business day frequency
C custom business day frequency (experimental)
D calendar day frequency
W weekly frequency
M month end frequency
BM business month end frequency
CBM custom business month end frequency
MS month start frequency
BMS business month start frequency
CBMS custom business month start frequency
Q quarter end frequency
BQ business quarter end frequency
QS quarter start frequency
BQS business quarter start frequency
A year end frequency
BA business year end frequency
AS year start frequency
BAS business year start frequency
BH business hour frequency
H hourly frequency
T minutely frequency
S secondly frequency
L milliseconds
U microseconds
N nanoseconds
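The alias is passed as the first argument to resample and decides how wide each time bin is. Below is a minimal sketch, assuming a hypothetical series with one event every six hours, that compares "D" and "W". Newer pandas releases (2.2 onward, as far as I know) deprecate several spellings from the list above in favor of "h", "min", "s", "ME", "QE" and "YE", so the sketch only uses aliases that behave the same across versions.

```python
import pandas as pd

# Hypothetical data: one event every 6 hours over four days.
idx = pd.date_range("2023-01-01", periods=16, freq="6h")
events = pd.Series(1, index=idx)

# "D" groups by calendar day; "W" groups by week (weeks end on Sunday by default).
daily = events.resample("D").sum()
weekly = events.resample("W").sum()

print(daily)
print(weekly)
```

Both results count the same 16 events; only the bin width differs.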
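Main approach
- Read the data from a CSV file
- Convert the columns that contain times
- Convert the object-typed strings to time values: first string to datetime64[ns], then to float64 (seconds since the Unix epoch). Note that resample itself needs datetime values, so keep a datetime column for the index.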
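# Convert the time-formatted strings to datetime64[ns]
df["报修时间"] = pd.to_datetime(df["报修时间"])
# Convert datetime64[ns] to float64 (seconds since 1970-01-01; the trailing "Z" is dropped because NumPy deprecates timezone suffixes)
df["报修时间"] = (df["报修时间"] - np.datetime64("1970-01-01T00:00:00")) / np.timedelta64(1, "s")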
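The two conversions can be checked on a tiny example. This is a sketch under stated assumptions: the two timestamps are made up, and the result is stored in a hypothetical "seconds" column so the datetime column stays available for later steps.

```python
import pandas as pd

# Hypothetical sample standing in for the CSV column "报修时间" (repair report time).
df = pd.DataFrame({"报修时间": ["2023-01-01 04:15:00", "2023-01-01 04:50:00"]})

# object (string) -> datetime64
df["报修时间"] = pd.to_datetime(df["报修时间"])

# datetime64 -> float64 seconds since the Unix epoch
epoch = pd.Timestamp("1970-01-01")
df["seconds"] = (df["报修时间"] - epoch) / pd.Timedelta(seconds=1)

print(df.dtypes)
print(df)
```

Dividing a timedelta column by a one-second Timedelta yields plain floats, which is the same idea as the np.timedelta64(1, 's') division in the text.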
- Set the time column as the index of df
df = df.set_index(df["date"])
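resample needs to know which values are the timestamps. Setting a DatetimeIndex is one way; as an alternative, resample also accepts an on= argument naming a datetime column. The sketch below uses a hypothetical three-row frame with the "date" column name from the text and compares both routes.

```python
import pandas as pd

# Hypothetical frame; "date" is the column name used in the text.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-01-01 04:15", "2023-01-01 04:50", "2023-01-01 08:05"]
    ),
    "new": 1,
})

# Option 1: move the time column into the index, then resample.
by_index = df.set_index("date")["new"].resample("h").sum()

# Option 2: leave the index alone and point resample at the column.
by_column = df.resample("h", on="date")["new"].sum()

print(by_index)
print(by_column)
```

Setting the index is convenient when several later steps (slicing by date, plotting) rely on it; on= is handy for a one-off aggregation.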
- Use resample to compute the statistics
Add a new column to make counting easier
df["new"] = 1
# Counting command
df["new"].resample("H").sum().head(40)
Output
>>> df["new"].resample("H").sum().head(40)
date
2023-01-01 04:00:00 1
2023-01-01 05:00:00 0
2023-01-01 06:00:00 0
2023-01-01 07:00:00 0
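The steps so far can be put together as one small script. This is only a sketch: the CSV content and the "item" column are invented for illustration, and the data is read from an in-memory string instead of a file.

```python
import io
import pandas as pd

# Hypothetical CSV content; a real script would pass a file path to read_csv.
csv_text = """date,item
2023-01-01 04:15:00,lamp
2023-01-01 08:05:00,door
2023-01-01 08:40:00,pipe
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
df = df.set_index("date")

# Helper column of ones: summing it per bin counts the rows in that bin.
df["new"] = 1
hourly = df["new"].resample("h").sum()
print(hourly)

# size() counts rows per bin directly, without the helper column.
sizes = df.resample("h").size()
print(sizes)
```

Hours with no rows still appear in the result with a count of 0, which matches the output shown above.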
- Plot (df2 is not defined above; presumably it holds the resampled counts from the previous step)
>>> df2["2023-01-02"].plot
<pandas.plotting._core.PlotAccessor object at 0x0000027582973910>
>>> plt.show()
>>> df2["2023-01-02"].plot()
<Axes: xlabel='date'>
>>> plt.show()
>>> df2["2023-01-02"].plot()
<Axes: xlabel='date'>
>>> df2["2023-01-03"].plot()
<Axes: xlabel='date'>
>>> df2["2023-01-01"].plot()
<Axes: xlabel='date'>
>>> plt.show()
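The session above can be condensed into a short script. This is a sketch under assumptions: df2 is replaced by a made-up hourly series, the "Agg" backend and savefig are used so it runs without a display, and the output file name is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical hourly counts standing in for df2 (not defined in the text).
idx = pd.date_range("2023-01-01", periods=72, freq="h")
df2 = pd.Series(range(72), index=idx, name="new")

# Partial-string indexing with .loc selects every row on that calendar day.
one_day = df2.loc["2023-01-02"]

# Note the parentheses: .plot without them only returns the accessor object.
ax = one_day.plot()
ax.set_ylabel("count")
plt.savefig("2023-01-02.png")  # use plt.show() in an interactive session
```

Calling .plot() several times before plt.show(), as in the session above, draws all selected days on the same axes.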
References
https://blog.csdn.net/weixin_42357472/article/details/115301527
https://blog.csdn.net/AlexTan_/article/details/89763389