Non-Linear Programming
Tip: Vectorial product rule

Let $F, G \colon \mathbb{R}^n \to \mathbb{R}^n$. Then,

$$ \nabla (F(X)^\top G(X)) = \nabla F(X)^\top G(X) + \nabla G(X)^\top F(X) $$

$$ \DeclareMathOperator*{\argmax}{arg\,max\,} \DeclareMathOperator*{\argmin}{arg\,min\,} $$

Without constraints

We are interested in minimizing an arbitrary function $f \colon \mathbb{R}^n \to \mathbb{R}$. To do so, we will use the gradient descent method. Let $\mathbf{x}^{(0)} \in \mathbb{R}^n$ be a starting point (often chosen at random). Then, for $k \in \mathbb{N}$,

$$ \mathbf{x}^{(k + 1)} = \mathbf{x}^{(k)} - s^{(k)} \nabla f(\mathbf{x}^{(k)}), $$

where $s^{(k)} > 0$ is the step size at iteration $k$.
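As a quick sanity check of the product rule (this example is not in the original notes), take $F(X) = X$ and $G(X) = AX$ for a constant matrix $A \in \mathbb{R}^{n \times n}$, so that $F(X)^\top G(X) = X^\top A X$ is the usual quadratic form:

$$ \nabla (X^\top A X) = \underbrace{\nabla F(X)^\top}_{I} \underbrace{G(X)}_{AX} + \underbrace{\nabla G(X)^\top}_{A^\top} \underbrace{F(X)}_{X} = (A + A^\top) X, $$

which matches the well-known gradient of a quadratic form.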
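The iteration above can be sketched in a few lines of code. This is a minimal illustration with a fixed step size $s^{(k)} = s$ (the function names and the test problem are my own, not from the notes):

```python
import numpy as np

def gradient_descent(grad_f, x0, step=0.1, iters=100):
    """Fixed-step gradient descent: x^{k+1} = x^k - s * grad_f(x^k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad_f(x)
    return x

# Example: minimize f(x) = ||x - c||^2, whose gradient is 2 (x - c);
# the unique minimizer is x* = c.
c = np.array([1.0, -2.0])
x_star = gradient_descent(lambda x: 2 * (x - c), x0=[0.0, 0.0])
```

On this strongly convex quadratic the iterates contract toward $c$ geometrically, so `x_star` is numerically equal to `c` after 100 iterations; for general $f$, convergence depends on the choice of $s^{(k)}$.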