1. Prophet Theory Summary
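Prophet is an open-source time series forecasting algorithm from Facebook [1][2]. It is designed mainly for time series that show periodicity and trend changes and that contain missing values and outliers. It is suited to daily (or higher-frequency) data, and its design reflects features typical of business time series, such as seasonal variation, holiday effects, and trend changes. Its core idea is to decompose a time series into three parts: trend, seasonality, and holiday effects.
Prophet automatically detects the trend and seasonality in the data and combines them to produce a forecast. It is based on an additive model that decomposes the series into a trend term, periodic terms, a holiday/special-event term, and a residual term. Prophet also provides visualization tools for examining the trend and the separate contributions of each period and each holiday or special event, which makes the model fairly interpretable [3].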
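Advantages
- Suited to time series with seasonality and trend changes.
- Robust to missing values and outliers.
- Easy to use, including for non-specialists.
Disadvantages
- Fitting can become slow on very large datasets.
- Non-stationary data is handled in a fairly simple way, which may not be enough for complex non-stationary behavior.
Use cases
- Data of many kinds with strong seasonality and trend [4].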
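Prophet can use either an additive model or a multiplicative model.
Additive model
- y(t) = g(t) + s(t) + h(t) + e(t)
- g(t) is the trend of the series; it fits the non-periodic changes.
- s(t) is the seasonality of the series.
- h(t) is the holiday effect: changes caused by holidays and other special events.
- e(t) is the error term; it represents random fluctuations that cannot be predicted.
When to use: the additive model generally fits cases where the size of the trend and seasonal effects does not depend on the level of the series, for example temperature and rainfall.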
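To make the decomposition concrete, the sketch below builds a synthetic series from the four additive terms. The component shapes (a linear trend, a weekly sine wave, a one-day holiday bump, Gaussian noise) and all numeric values are assumptions chosen for illustration; they are not Prophet's internal parameterization.

```python
import math
import random

random.seed(0)
n_days = 28

def trend(t):
    """g(t): slow, non-periodic growth (assumed linear here)."""
    return 100.0 + 0.5 * t

def seasonality(t):
    """s(t): weekly cycle with a fixed amplitude of 10 units."""
    return 10.0 * math.sin(2 * math.pi * t / 7)

def holiday(t):
    """h(t): one-off bump on a hypothetical holiday at day 14."""
    return 25.0 if t == 14 else 0.0

# e(t): random fluctuation that the other terms cannot explain.
noise = [random.gauss(0.0, 1.0) for _ in range(n_days)]

# Additive model: the components are simply summed.
y = [trend(t) + seasonality(t) + holiday(t) + noise[t] for t in range(n_days)]

# Subtracting the known components leaves only the noise term.
residual = [y[t] - trend(t) - seasonality(t) - holiday(t) for t in range(n_days)]
```

Because the terms are summed, each one can be subtracted out and inspected on its own, which is what makes the additive form easy to interpret.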
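Multiplicative model
- In the multiplicative model, the forecast is the product of the trend, seasonality, and holiday effects. In Prophet it is selected with seasonality_mode='multiplicative'.
- y(t) = g(t) * s(t) * h(t) * e(t)
When to use: cases where the size of the trend and seasonal effects scales with the level of the series, for example product sales and stock prices.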
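The practical difference between the two models is whether the seasonal swing stays constant or grows with the level of the series. The sketch below compares the two on a rising trend; the seasonal term is written as a factor around 1 in the multiplicative case, and every number is an assumption for illustration.

```python
import math

def trend(t):
    """g(t): assumed linear growth from 100 upward."""
    return 100.0 + 2.0 * t

def additive(t):
    # Seasonal swing is a fixed number of units, whatever the trend level.
    return trend(t) + 20.0 * math.sin(2 * math.pi * t / 7)

def multiplicative(t):
    # Seasonal swing is a fixed fraction (here 20%) of the trend level.
    factor = 1.0 + 0.2 * math.sin(2 * math.pi * t / 7)
    return trend(t) * factor

def swing(values):
    """Peak-to-trough range of a list of values."""
    return max(values) - min(values)

first_week = range(0, 7)
late_week = range(350, 357)

# Measure the seasonal swing around the trend early and late in the series.
add_early = swing([additive(t) - trend(t) for t in first_week])
add_late = swing([additive(t) - trend(t) for t in late_week])
mul_early = swing([multiplicative(t) - trend(t) for t in first_week])
mul_late = swing([multiplicative(t) - trend(t) for t in late_week])

print("additive swing:      ", round(add_early, 2), "->", round(add_late, 2))
print("multiplicative swing:", round(mul_early, 2), "->", round(mul_late, 2))
```

The additive swing is the same in both weeks, while the multiplicative swing widens as the trend rises. That widening pattern in real data is the usual sign that the multiplicative mode is worth trying.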
2. Importing the Module in Python
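When importing the module in a program, the import kept failing with the following error, even though repeated checks showed that the package was installed [6]:

ModuleNotFoundError: No module named 'Prophet'

After several attempts, the cause turned out to be the module name. Starting with version 1.0 the package was renamed from fbprophet to prophet, and Python module names are case-sensitive: the module is the lowercase prophet, while the class it exports is Prophet. The correct import is therefore:

```python
from prophet import Prophet
```

rather than:

```python
import fbprophet
```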
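When the import fails even though installation seemed to succeed, it can help to check which module name is actually importable from the interpreter that runs the script. The helper below, `find_prophet_module`, is a hypothetical name introduced for this sketch; it relies only on `importlib.util.find_spec` from the standard library.

```python
import importlib.util
import sys

def find_prophet_module():
    """Return the first importable module name, or None if neither is found.

    'prophet' is the module name used since version 1.0;
    'fbprophet' is the name used by earlier releases.
    """
    for name in ("prophet", "fbprophet"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None

found = find_prophet_module()
print("interpreter:", sys.executable)
print("importable module:", found)
```

If the result is None, the package is probably installed into a different Python environment than the one printed on the first line.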
3. Python Implementation Example
3.1 Help Information
Calling help(Prophet) in Python prints the help text below, which shows the parameters the Prophet constructor accepts and what each one means.
```text
Prophet(
    growth='linear',
    changepoints=None,
    n_changepoints=25,
    changepoint_range=0.8,
    yearly_seasonality='auto',
    weekly_seasonality='auto',
    daily_seasonality='auto',
    holidays=None,
    seasonality_mode='additive',
    seasonality_prior_scale=10.0,
    holidays_prior_scale=10.0,
    changepoint_prior_scale=0.05,
    mcmc_samples=0,
    interval_width=0.8,
    uncertainty_samples=1000,
    stan_backend=None,
    scaling: str = 'absmax',
    holidays_mode=None,
)
Docstring:
Prophet forecaster.

Parameters
----------
growth: String 'linear', 'logistic' or 'flat' to specify a linear, logistic or
    flat trend.
changepoints: List of dates at which to include potential changepoints. If
    not specified, potential changepoints are selected automatically.
n_changepoints: Number of potential changepoints to include. Not used
    if input `changepoints` is supplied. If `changepoints` is not supplied,
    then n_changepoints potential changepoints are selected uniformly from
    the first `changepoint_range` proportion of the history.
changepoint_range: Proportion of history in which trend changepoints will
    be estimated. Defaults to 0.8 for the first 80%. Not used if
    `changepoints` is specified.
yearly_seasonality: Fit yearly seasonality.
    Can be 'auto', True, False, or a number of Fourier terms to generate.
weekly_seasonality: Fit weekly seasonality.
    Can be 'auto', True, False, or a number of Fourier terms to generate.
daily_seasonality: Fit daily seasonality.
    Can be 'auto', True, False, or a number of Fourier terms to generate.
holidays: pd.DataFrame with columns holiday (string) and ds (date type)
    and optionally columns lower_window and upper_window which specify a
    range of days around the date to be included as holidays.
    lower_window=-2 will include 2 days prior to the date as holidays. Also
    optionally can have a column prior_scale specifying the prior scale for
    that holiday.
seasonality_mode: 'additive' (default) or 'multiplicative'.
seasonality_prior_scale: Parameter modulating the strength of the
    seasonality model. Larger values allow the model to fit larger seasonal
    fluctuations, smaller values dampen the seasonality. Can be specified
    for individual seasonalities using add_seasonality.
holidays_prior_scale: Parameter modulating the strength of the holiday
    components model, unless overridden in the holidays input.
changepoint_prior_scale: Parameter modulating the flexibility of the
    automatic changepoint selection. Large values will allow many
    changepoints, small values will allow few changepoints.
mcmc_samples: Integer, if greater than 0, will do full Bayesian inference
    with the specified number of MCMC samples. If 0, will do MAP
    estimation.
interval_width: Float, width of the uncertainty intervals provided
    for the forecast. If mcmc_samples=0, this will be only the uncertainty
    in the trend using the MAP estimate of the extrapolated generative
    model. If mcmc.samples>0, this will be integrated over all model
    parameters, which will include uncertainty in seasonality.
uncertainty_samples: Number of simulated draws used to estimate
    uncertainty intervals. Settings this value to 0 or False will disable
    uncertainty estimation and speed up the calculation.
stan_backend: str as defined in StanBackendEnum default: None - will try to
    iterate over all available backends and find the working one
holidays_mode: 'additive' or 'multiplicative'. Defaults to seasonality_mode.
```
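The help text says that, when `changepoints` is not supplied, `n_changepoints` candidates are placed uniformly within the first `changepoint_range` share of the history. The function below is a simplified illustration of that rule under my own assumptions about rounding; `candidate_changepoints` is a hypothetical name, and this is not Prophet's actual code.

```python
from datetime import date, timedelta

def candidate_changepoints(dates, n_changepoints=25, changepoint_range=0.8):
    """Pick evenly spaced candidate changepoints from the early part of `dates`."""
    # Only the first `changepoint_range` share of the history is eligible.
    hist_size = int(len(dates) * changepoint_range)
    # Never ask for more candidates than there are eligible positions.
    n = min(n_changepoints, hist_size - 1)
    if n <= 0:
        return []
    # Evenly spaced positions in [0, hist_size - 1]; position 0 is skipped
    # because a trend change at the very first observation is meaningless.
    step = (hist_size - 1) / n
    indexes = [round(i * step) for i in range(1, n + 1)]
    return [dates[i] for i in indexes]

history = [date(2023, 1, 1) + timedelta(days=d) for d in range(365)]
cps = candidate_changepoints(history)

print("number of candidates:", len(cps))
print("first and last:", cps[0], cps[-1])
```

With the defaults, no candidate falls in the last 20% of the history, so a trend change in the most recent data can only be picked up by raising `changepoint_range` or by passing explicit `changepoints`.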
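3.2 Example
In the example script below, the data is prepared as a two-column table, and the overall workflow resembles that of other machine learning models: construct the model, fit it, then predict. One point to note: the two columns must be named ds and y, so after preprocessing real data the columns need to be renamed.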
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from prophet import Prophet  # use prophet instead of fbprophet

# Generate sample data: a time series with seasonality and trend
np.random.seed(1024)
dates = pd.date_range('2023-01-01', periods=365)
data = np.linspace(10, 50, 365) + 10 * np.sin(np.linspace(0, 10 * np.pi, 365)) + np.random.randn(365) * 5

# Create the DataFrame
df = pd.DataFrame({'ds': dates, 'y': data})

# Fit the Prophet model
model = Prophet(yearly_seasonality=True)
model.fit(df)

# Forecast the next 30 days
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)

# Visualize the forecast
fig = model.plot(forecast)
plt.title('Prophet Model Demo')
plt.xlabel('Date')
plt.ylabel('Value')
plt.show()
```
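The generated data above already uses the column names ds and y, but real data rarely arrives that way. A minimal sketch of the renaming step, assuming a hypothetical input table whose columns are called `order_date` and `sales`:

```python
import pandas as pd

# Hypothetical raw data; the column names and values are placeholders.
raw = pd.DataFrame({
    "order_date": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "sales": [120, 135, 128],
    "region": ["north", "north", "north"],
})

# Keep only the two columns Prophet needs and rename them to ds and y.
df = raw[["order_date", "sales"]].rename(columns={"order_date": "ds", "sales": "y"})

# Parse ds as datetimes so the dates are unambiguous.
df["ds"] = pd.to_datetime(df["ds"])

print(df.dtypes)
```

The resulting `df` has the same shape as the DataFrame passed to `model.fit` in the script above.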
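4. References
[1] Facebook official documentation
[2] GitHub documentation
[3] Prophet Quick Start
[4] The Ten Major Time Series Models, Ultimate Summary (6): Prophet
[5] A Detailed Guide to Using the Prophet Time Series Model
[6] How to Install the fbprophet Module on Windows 10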