Linear Algebra
Theorem. Every finite-dimensional Hilbert space has scalar field either $\R$ or $\C$.
Question. Is this true? What about $\R[X]$ or quaternions?
Definition. Given a symmetric positive semidefinite matrix $K ∈ \R^{n × n}$, the Mahalanobis norm $\norm{\dummyarg}_K$ is
$$ \norm{\vec x}_K ≜ \sqrt{\vec x^T ⋅ K ⋅ \vec x} $$
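Example. A minimal numpy sketch of the definition above (the function name is mine, assuming a symmetric positive semidefinite $K$):

```python
import numpy as np

def mahalanobis_norm(x, K):
    """sqrt(x^T K x) for a symmetric positive semidefinite K."""
    return float(np.sqrt(x @ K @ x))

K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
x = np.array([1.0, -1.0])
print(mahalanobis_norm(x, K))          # sqrt(2 - 0.5 - 0.5 + 1) = sqrt(2) ≈ 1.4142
print(mahalanobis_norm(x, np.eye(2)))  # with K = I this is just the Euclidean norm
```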
Definition. Given a symmetric positive semidefinite matrix $K ∈ \R^{n × n}$, a whitening transform is a matrix $W ∈ \R^{n × n}$ such that $W ⋅ K ⋅ W^T = I$.
Note. There are many solutions $W$ that satisfy the above. In particular, given any whitening matrix $W$ and orthogonal matrix $U ∈ \R^{n × n}$, the matrix $U ⋅ W$ is also a whitening matrix.
Definition. The Mahalanobis whitening matrix is $W = K^{-\frac 12}$.
Definition. The Cholesky transform is $W = L^T$ where $L ⋅ L^T = K^{-1}$ is the Cholesky decomposition of $K^{-1}$.
Definition. The ZCA-cor transform is $W = P^{-\frac 12} ⋅ V^{-\frac 12}$ where $V$ is the diagonal of $K$ and $P = V^{-\frac 12} ⋅ K ⋅ V^{-\frac 12}$ is the correlation matrix.
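Example. A numpy sketch of the three whitening transforms above (function names mine, assuming $K$ positive definite); each satisfies $W ⋅ K ⋅ W^T = I$:

```python
import numpy as np

def mahalanobis_whitening(K):
    """W = K^{-1/2}, via an eigendecomposition (assumes K positive definite)."""
    vals, vecs = np.linalg.eigh(K)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def cholesky_whitening(K):
    """W = L^T where L is the lower Cholesky factor of K^{-1} = L L^T."""
    return np.linalg.cholesky(np.linalg.inv(K)).T

def zca_cor_whitening(K):
    """W = P^{-1/2} V^{-1/2} with V = diag(K) and P the correlation matrix."""
    v_inv_sqrt = np.diag(np.diag(K) ** -0.5)
    P = v_inv_sqrt @ K @ v_inv_sqrt
    return mahalanobis_whitening(P) @ v_inv_sqrt

K = np.array([[2.0, 0.5],
              [0.5, 1.0]])
for whiten in (mahalanobis_whitening, cholesky_whitening, zca_cor_whitening):
    W = whiten(K)
    print(np.allclose(W @ K @ W.T, np.eye(2)))  # True for all three
```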
Theorem. ZCA-cor maximizes the correlation between the original and whitened values. See KLS16.
To do. Prove that this actually is a norm. Explain https://en.wikipedia.org/wiki/Whitening_transformation and the relationship to the Euclidean norm.
To do. The whitening matrix is not unique; see here for two solutions.
Note. When $K = I$ the Mahalanobis norm equals the Euclidean norm.
To do. Summarize the SVD and how it relates to almost everything else.
To do. Discuss the generalized SVD (GSVD) and its potential application here.
To do. Discuss the Moore-Penrose pseudoinverse.
https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant
https://en.wikipedia.org/wiki/Hessian_matrix
Norms and means
Definition. Given $p ∈ \R ∪ \set{-∞, +∞}$, the $p$-norm is
$$ \norm{\vec x}_p ≜ \p{\sum_i \abs{x_i} ^p}^{\frac{1}{p}} $$
Note. The limiting cases $\set{-∞, 0, +∞}$ are not true norms.
Note. The special case $p = 1$, the $L^1$-norm, is also known as the *Taxicab norm*, the *Manhattan norm*, or, in the case of one dimension, the *Absolute-value norm*.
Note. The limiting case of the $0$-norm is the Hamming weight. The corresponding distance is the Hamming distance.
Note. The $2$-norm is also known as the Euclidean norm.
Note. The $+∞$-norm is also known as the $∞$-norm, the infinity norm, or the maximum norm.
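Example. A numpy sketch of the $p$-norm including its limiting cases (function name mine; the $0$ and $±∞$ cases are the limits noted above, not true norms):

```python
import numpy as np

def p_norm(x, p):
    """p-norm for p >= 1; the 0 and ±inf cases are limits, not true norms."""
    x = np.abs(np.asarray(x, dtype=float))
    if p == np.inf:
        return float(x.max())          # maximum norm
    if p == -np.inf:
        return float(x.min())          # limiting "minimum norm"
    if p == 0:
        return float(np.count_nonzero(x))  # Hamming weight
    return float((x ** p).sum() ** (1.0 / p))

x = [3.0, -4.0, 0.0]
print(p_norm(x, 1))       # 7.0 (Taxicab)
print(p_norm(x, 2))       # 5.0 (Euclidean)
print(p_norm(x, np.inf))  # 4.0 (maximum)
print(p_norm(x, 0))       # 2.0 (number of nonzero entries)
```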
To do. What happens with negative values of $p$?
Definition. Each norm has an associated distance.
$$ d(\vec a, \vec b) = \norm{\vec a - \vec b} $$
Note. Each distance has an associated nearest scalar value in the following sense:
$$ s^* = \arg \min_s d\p{\vec x, s ⋅ \vec 1} $$
Note. The summary statistic from the $0$-norm is called the mode and is not unique.
Note. The summary statistic from the $1$-norm is called the median.
Note. The summary statistic from the $2$-norm is called the mean.
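Example. A brute-force numpy sketch (function name mine) of the nearest-scalar minimization above, recovering the median at $p = 1$ and the mean at $p = 2$:

```python
import numpy as np

def nearest_scalar(x, p, num=100001):
    """Brute-force arg min over s of sum_i |x_i - s|^p, on a dense grid."""
    x = np.asarray(x, dtype=float)
    grid = np.linspace(x.min(), x.max(), num)
    costs = (np.abs(x[None, :] - grid[:, None]) ** p).sum(axis=1)
    return float(grid[np.argmin(costs)])

x = [1.0, 2.0, 2.0, 3.0, 10.0]
print(nearest_scalar(x, 1))  # ≈ 2.0, the median
print(nearest_scalar(x, 2))  # ≈ 3.6, the mean
```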
https://www.johnmyleswhite.com/notebook/2013/03/22/modes-medians-and-means-an-unifying-perspective/
https://www.johnmyleswhite.com/notebook/2013/03/22/using-norms-to-understand-linear-regression/
Definition. For values $\vec x ∈ \R^n$ and $p ∈ \R$, the generalized mean is
$$ \p{\frac{1}{n} \sum_i x_i^p}^{\frac{1}{p}} $$
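Example. A numpy sketch of the generalized mean (function name mine; the $p = 0$ case is taken as the geometric-mean limit, assuming positive values):

```python
import numpy as np

def generalized_mean(x, p):
    """Power mean: p=1 arithmetic, p=-1 harmonic, p=0 geometric (as a limit)."""
    x = np.asarray(x, dtype=float)
    if p == 0:
        return float(np.exp(np.log(x).mean()))  # geometric mean, the p -> 0 limit
    return float((x ** p).mean() ** (1.0 / p))

x = [1.0, 4.0]
print(generalized_mean(x, 1))   # 2.5, arithmetic mean
print(generalized_mean(x, 0))   # 2.0, geometric mean
print(generalized_mean(x, -1))  # 1.6, harmonic mean
```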
https://en.wikipedia.org/wiki/Lehmer_mean
https://en.wikipedia.org/wiki/Quasi-arithmetic_mean
Question. Is the $-∞$-norm also known as the minimum norm?
https://www.johnmyleswhite.com/notebook/2014/03/24/a-note-on-the-johnson-lindenstrauss-lemma/
Matrix Calculus
$$ \pder{\vec x} \vec y $$
References

KLS16 Agnan Kessy, Alex Lewin, Korbinian Strimmer (2018). “Optimal whitening and decorrelation”. https://arxiv.org/abs/1512.00809

Terence Parr, Jeremy Howard (2018). “The Matrix Calculus You Need For Deep Learning”. https://arxiv.org/abs/1802.01528 https://explained.ai/matrix-calculus/