Notation catalog#
Notation |
Meaning |
|---|---|
\(i = 1, 2, \ldots, n\) |
Index of an example in the dataset |
\((\quad)\) |
Parentheses indicate an ordered set of elements (a tuple) |
\((\vec{x}_i, y_i)\) |
Ordered pair of a data example and its answer/label |
\(j = 1, 2, \ldots, p\) |
Index of a feature in a data example |
\(\vec{x} = (x_1, x_2, \ldots, x_p)\) |
A vector with \(p\) features \((x_1, x_2, \ldots, x_p)\) |
\(\vec{x}_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,p})\) |
A vector with \(p\) features \((x_{i,1}, x_{i,2}, \ldots, x_{i,p})\) for the \(i\)-th data example |
\(f(\vec{x})\) |
A function that takes in a data example \(\vec{x}\) and outputs a prediction \(\hat{y}\) |
\(\hat{y}\) |
A prediction (we want this to be close to the true answer) |
\(\mathcal{L}(\vec{w})\) |
A loss function, a function of the weights \(\vec{w} = (w_0, w_1, \ldots, w_p)\), which quantifies the error of the model |
\(\nabla \mathcal{L}(\vec{w})\) = \(\left(\frac{\partial \mathcal{L}}{\partial w_0}, \frac{\partial \mathcal{L}}{\partial w_1}, \ldots, \frac{\partial \mathcal{L}}{\partial w_p}\right)\) |
The gradient of \(\mathcal{L}(\vec{w})\) with respect to the weights \(\vec{w}\), a vector of partial derivatives |
\(\ell_i\) |
The loss function calculated for a single data example \((\vec{x}_i, y_i)\) |
\(\sigma(h) = \frac{1}{1 + e^{-h}}\) |
The sigmoid function (also called the logistic function), a function that takes in a linear combination of features \(h\) and outputs a probability in the range \((0, 1)\) |
\(\mathbf{1}[\text{condition}]\) |
The indicator function: equals 1 if the condition is true, 0 if false |