Notation catalog

Notation	Meaning
\(i = 1, 2, \ldots, n\)	Index of an example in the dataset
\((\quad)\)	Parentheses indicate an ordered sequence of elements (a tuple)
\((\vec{x}_i, y_i)\)	Ordered pair of a data example and its answer/label
\(j = 1, 2, \ldots, p\)	Index of a feature in a data example
\(\vec{x} = (x_1, x_2, \ldots, x_p)\)	A data example written as a vector of \(p\) features
\(\vec{x}_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,p})\)	The \(i\)-th data example, a vector of \(p\) features; \(x_{i,j}\) is the value of feature \(j\) for example \(i\)
\(f(\vec{x})\)	A function that takes in a data example \(\vec{x}\) and outputs a prediction \(\hat{y}\)
\(\hat{y}\)	A prediction (we want this to be close to the true answer)
\(\mathcal{L}(\vec{w})\)	A loss function, a function of the weights \(\vec{w} = (w_0, w_1, \ldots, w_p)\), which quantifies the error of the model
\(\nabla \mathcal{L}(\vec{w}) = \left(\frac{\partial \mathcal{L}}{\partial w_0}, \frac{\partial \mathcal{L}}{\partial w_1}, \ldots, \frac{\partial \mathcal{L}}{\partial w_p}\right)\)	The gradient of \(\mathcal{L}(\vec{w})\) with respect to the weights \(\vec{w}\), a vector of partial derivatives
\(\ell_i\)	The loss calculated for a single data example \((\vec{x}_i, y_i)\)
\(\sigma(h) = \frac{1}{1 + e^{-h}}\)	The sigmoid function (also called the logistic function), which takes in a real number \(h\) (here, a linear combination of features) and outputs a value in the open interval \((0, 1)\) that can be read as a probability
\(\mathbf{1}[\text{condition}]\)	The indicator function: equals 1 if the condition is true, 0 if false
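
To see how the symbols in the table fit together, here is a minimal Python sketch. It rests on assumptions that the catalog itself does not state: the model is \(f(\vec{x}) = \sigma(w_0 + w_1 x_1 + \ldots + w_p x_p)\) with \(w_0\) acting as an intercept, \(\ell_i\) is the log loss, and \(\mathcal{L}(\vec{w})\) is the mean of the \(\ell_i\). The function names and the toy dataset are made up for illustration.

```python
import math


def sigmoid(h):
    """sigma(h) = 1 / (1 + e^{-h}); maps a real number h into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-h))


def f(x, w):
    """Prediction y_hat for one example x = (x_1, ..., x_p).

    Assumption: w = (w_0, w_1, ..., w_p), with w_0 used as an intercept.
    """
    h = w[0] + sum(w_j * x_j for w_j, x_j in zip(w[1:], x))
    return sigmoid(h)


def example_loss(x, y, w):
    """ell_i for a single pair (x_i, y_i); log loss is an assumed choice."""
    y_hat = f(x, w)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


def loss(data, w):
    """L(w): the mean of ell_i over i = 1, ..., n (assumed aggregation)."""
    return sum(example_loss(x, y, w) for x, y in data) / len(data)


def gradient(data, w):
    """Gradient (dL/dw_0, ..., dL/dw_p) of the mean log loss above."""
    n = len(data)
    grad = [0.0] * len(w)
    for x, y in data:
        error = f(x, w) - y          # y_hat - y for this example
        grad[0] += error / n         # partial derivative w.r.t. w_0
        for j, x_j in enumerate(x, start=1):
            grad[j] += error * x_j / n
    return grad


def accuracy(data, w):
    """Mean of the indicator 1[predicted label == y_i], thresholding y_hat at 0.5."""
    hits = sum(int(int(f(x, w) >= 0.5) == y) for x, y in data)
    return hits / len(data)


# Toy dataset: n = 3 examples (x_i, y_i), each with p = 2 features.
data = [((0.0, 1.0), 0), ((1.0, 0.0), 1), ((2.0, 1.0), 1)]
w = [0.0, 0.0, 0.0]  # p + 1 = 3 weights, starting at zero

print(loss(data, w))
print(gradient(data, w))
print(accuracy(data, w))
```

With all weights at zero, every prediction \(\hat{y}\) equals \(\sigma(0) = 0.5\), which makes the sketch easy to check by hand before trying other weights.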