Multiple Regression Analysis in Matrix Form

Define three vectors (y, b and e) and a matrix (X) as follows:

\(\mathop {\bf{y}}\limits_{(N \times 1)} = \left[ {\begin{array}{*{20}{c}} {{y_1}}\\ {{y_2}}\\ \vdots \\ {{y_N}} \end{array}} \right],\) \(\mathop {\bf{X}}\limits_{(N \times (p + 1))} = \left[ {\begin{array}{*{20}{c}} 1&{{x_{11}}}&{{x_{12}}}& \cdots &{{x_{1p}}}\\ 1&{{x_{21}}}&{{x_{22}}}& \cdots &{{x_{2p}}}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1&{{x_{N1}}}&{{x_{N2}}}& \cdots &{{x_{Np}}} \end{array}} \right],\) \(\mathop {\bf{b}}\limits_{((p + 1) \times 1)} = \left[ {\begin{array}{*{20}{c}} {{b_0}}\\ {{b_1}}\\ \vdots \\ {{b_p}} \end{array}} \right]\) and \(\mathop {\bf{e}}\limits_{(N \times 1)} = \left[ {\begin{array}{*{20}{c}} {{e_1}}\\ {{e_2}}\\ \vdots \\ {{e_N}} \end{array}} \right].\)

The standard multiple (linear) regression equation with p predictor variables and N observations

\[{y_i} = {b_0} + {b_1}{x_{i1}} + {b_2}{x_{i2}} + \ldots + {b_p}{x_{ip}} + {e_i}{\rm{, where }}i = 1, \ldots ,N,\]

can then be rewritten using these vectors and the matrix as

\[\mathop {\bf{y}}\limits_{(N \times 1)} = \mathop {\bf{X}}\limits_{(N \times (p + 1))} \mathop {\bf{b}}\limits_{((p + 1) \times 1)} + \mathop {\bf{e}}\limits_{(N \times 1)} ,\]

or, more compactly but still making explicit the dimensions of the vectors and the matrix (and thereby illustrating their conformability for matrix multiplication), as

\[{}_N{{\bf{y}}_1} = {}_N{{\bf{X}}_{p + 1}}\,{}_{p + 1}{{\bf{b}}_1} + {}_N{{\bf{e}}_1},\]

or even more compactly (dropping the explicit indication of the size of the vectors and matrix) as \({\bf{y}} = {\bf{Xb}} + {\bf{e}}.\)
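To make the layout concrete, here is a minimal numerical sketch (not part of the original derivation) that builds the design matrix \({\bf{X}}\) with a leading column of ones for the intercept; the NumPy arrays and their values are hypothetical and serve only to illustrate the \({\bf{y}} = {\bf{Xb}} + {\bf{e}}\) structure.

```python
import numpy as np

# Hypothetical data: N = 5 observations, p = 2 predictors (values made up).
x = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0],
              [5.0, 2.5]])                 # N x p matrix of predictor values
y = np.array([2.1, 2.9, 4.2, 5.8, 6.1])    # N-vector of responses

N, p = x.shape

# Design matrix X: a column of ones (for b0) followed by the p predictor
# columns, so X has shape N x (p + 1), matching the definition above.
X = np.column_stack([np.ones(N), x])

# For any candidate coefficient vector b of length p + 1, the residual vector
# is e = y - Xb, so that y = Xb + e holds by construction.
b = np.zeros(p + 1)                        # placeholder coefficients
e = y - X @ b
```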

The optimization problem in regression analysis is to minimize the residual sum of squares S:

\[\begin{array}{c} {\rm{Min }}\space S = \sum\limits_{i = 1}^N {e_i^2} = {\bf{e'e}} = ({\bf{y}} - {\bf{Xb}})'({\bf{y}} - {\bf{Xb}})\\ = {\bf{y'y}} - {\bf{b'X'y}} - {\bf{y'Xb}} + {\bf{b'X'Xb}}\\ = {\bf{y'y}} - 2{\bf{b'X'y}} + {\bf{b'X'Xb}}, \end{array}\]

where the last equality uses the fact that \({\bf{y'Xb}}\) is a scalar and therefore equals its transpose, \({\bf{b'X'y}}.\)

Because S is a convex (quadratic) function of \({\bf{b}},\) it can be minimized by setting the vector of partial derivatives of S with respect to \({\bf{b}}\) equal to 0:

\[\frac{{\partial S}}{{\partial {\bf{b}}}} = - 2{\bf{X'y}} + 2{\bf{X'Xb}} = 0.\]
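As an optional sanity check, the sketch below compares this analytic gradient with a finite-difference approximation of S for a small hypothetical data set; the array values are made up purely for illustration.

```python
import numpy as np

# Hypothetical design matrix X (intercept column plus two predictors) and response y.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 0.5],
              [1.0, 3.0, 1.5],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 2.5]])
y = np.array([2.1, 2.9, 4.2, 5.8, 6.1])

def S(b):
    """Residual sum of squares e'e for coefficients b."""
    e = y - X @ b
    return e @ e

b = np.array([0.5, 1.0, -0.2])             # arbitrary trial coefficients

# Analytic gradient: dS/db = -2 X'y + 2 X'X b
grad_analytic = -2 * X.T @ y + 2 * X.T @ X @ b

# Central-difference approximation of the same gradient.
h = 1e-6
grad_numeric = np.array([
    (S(b + h * np.eye(3)[j]) - S(b - h * np.eye(3)[j])) / (2 * h)
    for j in range(3)
])

print(np.allclose(grad_analytic, grad_numeric, atol=1e-4))  # True
```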

Rearranging and cancelling the 2 gives the normal equations

\[{\bf{X'Xb}} = {\bf{X'y}},\] and by pre-multiplying both sides of this equation by \({({\bf{X'X}})^{ - 1}}\) (which exists provided the columns of \({\bf{X}}\) are linearly independent, so that \({\bf{X'X}}\) is nonsingular), i.e.,

\[{({\bf{X'X}})^{ - 1}}{\bf{X'Xb}} = {({\bf{X'X}})^{ - 1}}{\bf{X'y}},\]

the regression coefficients, \({\bf{b}},\) can be obtained (because \({({\bf{X'X}})^{ - 1}}{\bf{X'X}} = {\bf{I}}\)) as follows:

\[{\bf{b}} = {({\bf{X'X}})^{ - 1}}{\bf{X'y}}.\]
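In practice this closed-form solution is usually computed by solving the normal equations \({\bf{X'Xb}} = {\bf{X'y}}\) numerically rather than by forming the inverse explicitly; the sketch below does so with hypothetical data and cross-checks the result against a standard least-squares routine.

```python
import numpy as np

# Hypothetical design matrix X (intercept column plus two predictors) and response y.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 0.5],
              [1.0, 3.0, 1.5],
              [1.0, 4.0, 3.0],
              [1.0, 5.0, 2.5]])
y = np.array([2.1, 2.9, 4.2, 5.8, 6.1])

# b = (X'X)^{-1} X'y, obtained by solving the linear system X'X b = X'y
# instead of explicitly inverting X'X.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against a library least-squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```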