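This function seems to be rarely used, and there is very little material explaining it. I spent a full day on it today and finally worked out what it actually does.
First, let's look at what the help documentation says.
The syntax of x2fx() is: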
D = x2fx(X,model)
D = x2fx(X,model,categ)
D = x2fx(X,model,categ,catlevels)
The description is:
D = x2fx(X,model) converts a matrix of predictors X to a design matrix D for regression analysis.
Distinct predictor variables should appear in different columns of X.
The optional input model controls the regression model. By default, x2fx returns the design matrix for a linear additive model with a constant term. model is one of the following:
'linear': Constant and linear terms. This is the default.
'interaction': Constant, linear, and interaction terms
'quadratic': Constant, linear, interaction, and squared terms
'purequadratic': Constant, linear, and squared terms
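What does that mean? Given a matrix of predictor variables and a model type, the function builds the design matrix that the regression is then fitted against.
What is a design matrix? It is easy to understand with an example, where constant stands for a column of ones: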
X = [x1, x2];
D = x2fx(X) = [constant, x1, x2] (linear model, the default)
D = x2fx(X,'purequadratic') = [constant, x1, x2, x1^2, x2^2] (pure quadratic model)
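To make the two cases above concrete, here is a minimal sketch with made-up numbers. It assumes the Statistics and Machine Learning Toolbox (which provides x2fx) is installed; the variable names D_lin, D_pq and c are only for illustration.

```matlab
% Two predictors, three observations (values invented for illustration)
x1 = [1; 2; 3];
x2 = [4; 5; 6];
X  = [x1, x2];

D_lin = x2fx(X);                    % no model given, so 'linear' is used
D_pq  = x2fx(X, 'purequadratic');   % constant, linear, and squared terms

% Build the same matrices by hand; the constant term is a column of ones
c = ones(size(X, 1), 1);
same_lin = isequal(D_lin, [c, x1, x2]);
same_pq  = isequal(D_pq,  [c, x1, x2, x1.^2, x2.^2]);
disp([same_lin, same_pq])           % both should be true per the documented column order
```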
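Different regression models contain different terms. For example, suppose X = [x1,x2,x3,x4] has four variables. Then:
**Coefficients are omitted**
model='linear': constant and linear terms, i.e. constant+x1+x2+x3+x4, 5 terms in total
model='interaction': constant, linear, and interaction terms, i.e. constant+x1+x2+x3+x4+x1x2+x1x3+x1x4+x2x3+x2x4+x3x4, 11 terms in total
model='quadratic': constant, linear, interaction, and squared terms, i.e. constant+x1+x2+x3+x4+x1x2+x1x3+x1x4+x2x3+x2x4+x3x4+x1^2+x2^2+x3^2+x4^2, 15 terms in total
model='purequadratic': constant, linear, and squared terms, i.e. constant+x1+x2+x3+x4+x1^2+x2^2+x3^2+x4^2, 9 terms in total
The columns are ordered as follows: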
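The term counts for four variables can be checked by looking at the number of columns x2fx returns. This is a small sketch with random placeholder data; only the column counts matter here, and it assumes x2fx from the Statistics and Machine Learning Toolbox is available.

```matlab
X = rand(10, 4);   % 10 observations, 4 predictors (placeholder data)
models = {'linear', 'interaction', 'quadratic', 'purequadratic'};
ncols  = zeros(1, numel(models));

for k = 1:numel(models)
    D = x2fx(X, models{k});
    ncols(k) = size(D, 2);          % one column per term in the model
    fprintf('%-14s %d columns\n', models{k}, ncols(k));
end
```

For n variables the counts follow 1+n, 1+n+n(n-1)/2, 1+2n+n(n-1)/2 and 1+2n, which gives 5, 11, 15 and 9 for n = 4.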
If X has n columns, the order of the columns of D for a full quadratic model is:
1. The constant term
2. The linear terms (the columns of X, in order 1, 2, ..., n)
3. The interaction terms (pairwise products of the columns of X, in order (1, 2), (1, 3), ..., (1, n), (2, 3), ..., (n–1, n))
4. The squared terms (in order 1, 2, ..., n)
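The documented column order can be reproduced by hand. The sketch below uses made-up data with n = 3 and builds the pair list with nchoosek, which returns the pairs in the same (1,2), (1,3), (2,3) order; D_manual and pairs are illustrative names.

```matlab
X = [1 2 3; 4 5 6; 7 8 10; 2 0 1];   % made-up data, 4 observations, n = 3
n = size(X, 2);

pairs = nchoosek(1:n, 2);            % rows: (1,2), (1,3), (2,3)

D_manual = [ones(size(X, 1), 1), ...            % 1. constant term
            X, ...                              % 2. linear terms
            X(:, pairs(:,1)) .* X(:, pairs(:,2)), ...  % 3. interaction terms
            X.^2];                              % 4. squared terms

D = x2fx(X, 'quadratic');
disp(isequal(D, D_manual))           % expected to be true if the order matches
```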
Note that model can also be a matrix that specifies the terms explicitly:
Alternatively, model can be a matrix specifying polynomial terms of arbitrary order. In this case, model should have one column for each column in X and one row for each term in the model.
The entries in any row of model are powers for the corresponding columns of X. For example, if X has columns X1, X2, and X3, then a row [0 1 2] in model specifies the term (X1.^0).*(X2.^1).*(X3.^2).
A row of all zeros in model specifies a constant term, which can be omitted.
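As a quick illustration of the [0 1 2] row described above, the following sketch builds a model matrix with and without the all-zero (constant) row. The data values are invented, and it assumes x2fx is on the path.

```matlab
X = [1 2 3; 2 3 4; 3 1 2];           % columns play the roles of X1, X2, X3

model = [0 0 0;                      % all zeros: constant term
         0 1 2];                     % (X1.^0).*(X2.^1).*(X3.^2)
D = x2fx(X, model);

% The same term computed directly
term = X(:,2) .* X(:,3).^2;
disp(isequal(D, [ones(3,1), term]))

% Leaving out the all-zero row drops the constant column
D_noconst = x2fx(X, [0 1 2]);
disp(isequal(D_noconst, term))
```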
For example:
Example 2
The following converts 2 predictors X1 and X2 (the columns of X) into a design matrix for a quadratic model with terms constant, X1, X2, X1.*X2, and X1.^2.
X = [1 10
     2 20
     3 10
     4 20
     5 15
     6 15];
model = [0 0
         1 0
         0 1
         1 1
         2 0];
D = x2fx(X,model)
D =
     1     1    10    10     1
     1     2    20    40     4
     1     3    10    30     9
     1     4    20    80    16
     1     5    15    75    25
     1     6    15    90    36
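The way a model matrix is turned into columns can be written out as a short loop: each row of model gives the powers, and the column is the row-wise product of the powered predictors. This is only a sketch of the idea, not the toolbox's actual implementation; it reuses the X and model from Example 2 and uses bsxfun so it also runs on older MATLAB releases.

```matlab
X = [1 10; 2 20; 3 10; 4 20; 5 15; 6 15];
model = [0 0; 1 0; 0 1; 1 1; 2 0];

% One column of D per row of model
D_manual = zeros(size(X, 1), size(model, 1));
for t = 1:size(model, 1)
    powered = bsxfun(@power, X, model(t, :));   % raise each column to its power
    D_manual(:, t) = prod(powered, 2);          % multiply across the columns
end

D = x2fx(X, model);
disp(isequal(D, D_manual))
```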
The other two call forms work similarly (I am not sure "overload" is the right word in MATLAB, but I will call them that for convenience):
D = x2fx(X,model,categ) treats columns with numbers listed in the vector categ as categorical variables. Terms involving categorical variables produce dummy variable columns in D.
Dummy variables are computed under the assumption that possible categorical levels are completely enumerated by the unique values that appear in the corresponding column of X.
D = x2fx(X,model,categ,catlevels) accepts a vector catlevels the same length as categ, specifying the number of levels in each categorical variable.
In this case, values in the corresponding column of X must be integers in the range from 1 to the specified number of levels. Not all of the levels need to appear in X.
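A small sketch of the two categorical call forms follows. The data are invented: column 2 of X holds category codes 1 to 3. I have not listed the expected output because the exact dummy-variable coding is not spelled out in the text quoted here, so run it and inspect D1 and D2 yourself.

```matlab
X = [1 1; 2 3; 3 1; 4 3; 5 2; 6 2];  % column 2: made-up category codes 1..3

% Treat column 2 as categorical; levels are inferred from the unique values
D1 = x2fx(X, 'linear', 2);

% Declare 4 levels for column 2, even though only 1..3 appear in X
D2 = x2fx(X, 'linear', 2, 4);

disp(size(D1))
disp(size(D2))                       % declaring an extra level should add a dummy column
```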