3.1.3.2 多元回归: 包含多因素
考虑用2个变量x和y来解释变量z的线性模型:
$z = x \, c_1 + y \, c_2 + i + e$
这个模型可以被视为在3D世界中用一个平面去拟合 (x, y, z) 的点云。
实例: 鸢尾花数据 (examples/iris.csv)
萼片和花瓣的大小似乎是相关的: 越大的花越大! 但是,在不同的种之间是否有额外的系统效应?
In [33]:
data = pandas.read_csv('examples/iris.csv')
model = ols('sepal_width ~ name + petal_length', data).fit()
print(model.summary())
OLS Regression Results
==============================================================================
Dep. Variable: sepal_width R-squared: 0.478
Model: OLS Adj. R-squared: 0.468
Method: Least Squares F-statistic: 44.63
Date: Thu, 19 Nov 2015 Prob (F-statistic): 1.58e-20
Time: 09:56:04 Log-Likelihood: -38.185
No. Observations: 150 AIC: 84.37
Df Residuals: 146 BIC: 96.41
Df Model: 3
Covariance Type: nonrobust
======================================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------- ------------- ------------- ------------- ------------- ---------------------
Intercept 2.9813 0.099 29.989 0.000 2.785 3.178
name[T.versicolor] -1.4821 0.181 -8.190 0.000 -1.840 -1.124
name[T.virginica] -1.6635 0.256 -6.502 0.000 -2.169 -1.158
petal_length 0.2983 0.061 4.920 0.000 0.178 0.418
==============================================================================
Omnibus: 2.868 Durbin-Watson: 1.753
Prob(Omnibus): 0.238 Jarque-Bera (JB): 2.885
Skew: -0.082 Prob(JB): 0.236
Kurtosis: 3.659 Cond. No. 54.0
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.